Work Packages: Namespace-Aware Artifact Body Sync

Inputs: Design documents from /kitty-specs/047-namespace-aware-artifact-body-sync/
Prerequisites: plan.md (required), spec.md (user stories), data-model.md, contracts/, research.md, quickstart.md

Tests: Required per NFR-004 (90%+ line coverage for new modules, mypy --strict).

Organization: Fine-grained subtasks (Txxx) roll up into work packages (WPxx). Each work package is independently deliverable and testable.

Prompt Files: Each work package references a matching prompt file in /tasks/ generated by /spec-kitty.tasks.

Subtask Format: [Txxx] [P?] Description

  • [P] indicates the subtask can proceed in parallel (different files/components).
  • Precise file paths follow the plan.md project structure.

Path Conventions

  • Source: src/specify_cli/sync/
  • Tests: tests/specify_cli/sync/

Work Package WP01: Foundation - NamespaceRef & Shared Types (Priority: P0)

Goal: Establish the foundational data types that all other WPs depend on: NamespaceRef, SupportedInlineFormat, and UploadOutcome.
Independent Test: pytest tests/specify_cli/sync/test_namespace.py passes; mypy --strict passes on namespace.py.
Prompt: /tasks/WP01-foundation-namespace-and-shared-types.md
Estimated Prompt Size: ~350 lines

Included Subtasks

  • ✅ T001 Create NamespaceRef dataclass in src/specify_cli/sync/namespace.py with 5 namespace fields
  • ✅ T002 Add NamespaceRef.from_project_identity() factory method consuming ProjectIdentity + feature metadata + manifest version
  • ✅ T003 [P] Create SupportedInlineFormat enum (.md, .json, .yaml, .yml, .csv) and is_supported_format() helper
  • ✅ T004 [P] Create UploadOutcome dataclass with status codes (uploaded, already_exists, queued, skipped, failed) and reason strings
  • ✅ T005 Write tests/specify_cli/sync/test_namespace.py covering construction, validation, factory, and edge cases

Implementation Notes

  • NamespaceRef carries project_uuid, feature_slug, target_branch, mission_key, manifest_version
  • Factory reads ProjectIdentity (from project_identity.py), feature metadata (from meta.json), and manifest_version (from ExpectedArtifactManifest)
  • SupportedInlineFormat and UploadOutcome can live in the same namespace.py or a small body_types.py — plan.md places them in namespace.py
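
The three shared types described above could look roughly like the sketch below. The field names and status codes come from the notes; everything else is an assumption made for illustration: the frozen dataclasses, the string type of manifest_version, the non-empty validation rule, the UploadStatus enum name, and the is_supported_format() signature. It is not the shipped namespace.py.

```python
from __future__ import annotations

from dataclasses import dataclass, fields
from enum import Enum
from pathlib import PurePosixPath


@dataclass(frozen=True)
class NamespaceRef:
    """Five-field namespace tuple (field names taken from the notes above)."""

    project_uuid: str
    feature_slug: str
    target_branch: str
    mission_key: str
    manifest_version: str  # assumed to be a string; the real type may differ

    def __post_init__(self) -> None:
        # Assumed validation rule: every field must be a non-empty string.
        for f in fields(self):
            value = getattr(self, f.name)
            if not isinstance(value, str) or not value.strip():
                raise ValueError(f"{f.name} must be a non-empty string")


class SupportedInlineFormat(str, Enum):
    MARKDOWN = ".md"
    JSON = ".json"
    YAML = ".yaml"
    YML = ".yml"
    CSV = ".csv"


def is_supported_format(path: str) -> bool:
    """Return True when the file suffix is one of the inline text formats."""
    suffix = PurePosixPath(path).suffix.lower()
    return suffix in {fmt.value for fmt in SupportedInlineFormat}


class UploadStatus(str, Enum):  # hypothetical name for the status codes
    UPLOADED = "uploaded"
    ALREADY_EXISTS = "already_exists"
    QUEUED = "queued"
    SKIPPED = "skipped"
    FAILED = "failed"


@dataclass(frozen=True)
class UploadOutcome:
    artifact_path: str
    status: UploadStatus
    reason: str = ""
```

Freezing both dataclasses keeps a namespace hashable, which is convenient if it is later used as part of a queue key.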

Parallel Opportunities

  • T003 and T004 are independent of T001/T002 and can be developed simultaneously.

Dependencies

  • None (starting package).

Risks & Mitigations

  • Risk: manifest_version sourcing may be ambiguous. Mitigation: plan.md specifies ExpectedArtifactManifest.manifest_version from expected-artifacts.yaml.

Work Package WP02: Body Queue - SQLite Persistence Layer (Priority: P0)

Goal: Create the durable body_upload_queue SQLite table and OfflineBodyUploadQueue class with idempotent enqueue, per-task backoff drain, and lifecycle methods.
Independent Test: pytest tests/specify_cli/sync/test_body_queue.py passes; queue survives process restart.
Prompt: /tasks/WP02-body-queue-sqlite-persistence.md
Estimated Prompt Size: ~450 lines

Included Subtasks

  • ✅ T006 Add body_upload_queue SQLite table schema + migration in queue.py shared DB infrastructure
  • ✅ T007 Create OfflineBodyUploadQueue class in src/specify_cli/sync/body_queue.py with __init__() and schema creation
  • ✅ T008 Implement enqueue() with 7-field idempotent unique constraint (project_uuid, feature_slug, target_branch, mission_key, manifest_version, artifact_path, content_hash)
  • ✅ T009 Implement drain(limit) with per-task backoff filtering (WHERE next_attempt_at <= now())
  • ✅ T010 Implement mark_uploaded(), mark_failed(), update_backoff() methods
  • ✅ T011 [P] Implement get_stats() returning queue count, age distribution, retry histogram
  • ✅ T012 Write tests/specify_cli/sync/test_body_queue.py covering CRUD, idempotent enqueue, drain ordering, backoff

Implementation Notes

  • Table lives in the same SQLite DB file as the event queue table (shared DB connection)
  • Unique constraint prevents duplicate tasks for same content+namespace; changed content (new hash) creates new task
  • next_attempt_at column enables per-task exponential backoff (1s initial, 5min cap)
  • Queue capacity limit: 100,000 rows by default, reusing the shared sync queue policy/configuration
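
To make the unique constraint and the backoff filter concrete, here is a small sketch using the standard sqlite3 module. The seven constraint columns and next_attempt_at come from the notes above; the remaining columns (content_body, retry_count, created_at), the index name, and the function signatures are assumptions, and the capacity limit is left out.

```python
from __future__ import annotations

import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS body_upload_queue (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    project_uuid TEXT NOT NULL,
    feature_slug TEXT NOT NULL,
    target_branch TEXT NOT NULL,
    mission_key TEXT NOT NULL,
    manifest_version TEXT NOT NULL,
    artifact_path TEXT NOT NULL,
    content_hash TEXT NOT NULL,
    content_body TEXT NOT NULL,
    retry_count INTEGER NOT NULL DEFAULT 0,
    next_attempt_at REAL NOT NULL DEFAULT 0,
    created_at REAL NOT NULL,
    UNIQUE (project_uuid, feature_slug, target_branch, mission_key,
            manifest_version, artifact_path, content_hash)
);
CREATE INDEX IF NOT EXISTS idx_body_queue_next_attempt
    ON body_upload_queue (next_attempt_at);
"""


def open_queue(path: str = ":memory:") -> sqlite3.Connection:
    """Open the DB and apply the non-destructive schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def enqueue(
    conn: sqlite3.Connection,
    namespace: tuple[str, str, str, str, str],
    artifact_path: str,
    content_hash: str,
    content_body: str,
    now: float | None = None,
) -> bool:
    """Insert a task; return False when the same content is already queued."""
    now = time.time() if now is None else now
    cur = conn.execute(
        "INSERT OR IGNORE INTO body_upload_queue ("
        "project_uuid, feature_slug, target_branch, mission_key, "
        "manifest_version, artifact_path, content_hash, content_body, "
        "created_at) VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?)",
        (*namespace, artifact_path, content_hash, content_body, now),
    )
    conn.commit()
    return cur.rowcount == 1


def drain(conn: sqlite3.Connection, limit: int, now: float | None = None) -> list[tuple]:
    """Return up to `limit` tasks whose backoff window has elapsed, oldest first."""
    now = time.time() if now is None else now
    return conn.execute(
        "SELECT id, artifact_path, content_hash, retry_count "
        "FROM body_upload_queue WHERE next_attempt_at <= ? "
        "ORDER BY created_at, id LIMIT ?",
        (now, limit),
    ).fetchall()
```

Because content_hash is part of the constraint, re-enqueuing unchanged content is a no-op, while an edited file (new hash) produces a second row.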

Parallel Opportunities

  • T011 (stats) is independent once schema exists.

Dependencies

  • Depends on WP01 (uses NamespaceRef for field definitions).

Risks & Mitigations

  • Risk: Schema migration on existing DB files. Mitigation: CREATE TABLE IF NOT EXISTS + CREATE INDEX IF NOT EXISTS, no destructive DDL.

Work Package WP03: Body Upload Preparation & Filtering (Priority: P1)

Goal: Implement prepare_body_uploads() in body_upload.py — the function that filters ArtifactRef objects from the indexer, reads file content, validates hashes, and enqueues upload tasks.
Independent Test: pytest tests/specify_cli/sync/test_body_upload.py passes; filters match FR-004/FR-005/FR-006 exactly.
Prompt: /tasks/WP03-body-upload-preparation-and-filtering.md
Estimated Prompt Size: ~450 lines

Included Subtasks

  • ✅ T013 Implement _is_supported_surface() matching FR-004 artifact surfaces (spec.md, plan.md, tasks.md, research.md, quickstart.md, data-model.md, research/, contracts/, checklists/*, tasks/WP.md)
  • ✅ T014 Implement _is_supported_format() for FR-005 inline text formats (.md, .json, .yaml, .yml, .csv)
  • ✅ T015 Implement _check_size_limit() for 512 KiB inline limit + binary skip (FR-006)
  • ✅ T016 Implement _read_and_rehash() — read file content as UTF-8, compute SHA-256, compare to ArtifactRef.content_hash_sha256, skip on mismatch
  • ✅ T017 Implement prepare_body_uploads(artifacts, namespace_ref, body_queue) orchestrating all filters with per-file UploadOutcome diagnostics
  • ✅ T018 Write tests/specify_cli/sync/test_body_upload.py covering filter logic, re-hash guard, skip reasons, and integration with body queue

Implementation Notes

  • Input: List[ArtifactRef] from Indexer.index_feature() + NamespaceRef + OfflineBodyUploadQueue
  • Surface matching uses ArtifactRef.relative_path (feature-relative, same as dossier indexer convention per FR-014)
  • Re-hash guard: if file changed between index scan and body read, skip with UploadOutcome(skipped, "content_hash_mismatch")
  • Deleted files: if file gone after index, skip with UploadOutcome(skipped, "deleted_after_scan")
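
The re-hash guard and the deleted-file case can be illustrated with the sketch below. The skip reasons content_hash_mismatch and deleted_after_scan and the 512 KiB limit come from this work package; the function signature, the "oversized" and "binary_content" reason strings, and the choice to hash the raw bytes are assumptions that would need checking against the indexer.

```python
from __future__ import annotations

import hashlib
from pathlib import Path

INLINE_SIZE_LIMIT_BYTES = 512 * 1024  # 512 KiB inline limit (FR-006)


def read_and_rehash(
    feature_dir: Path, relative_path: str, expected_sha256: str
) -> tuple[str | None, str]:
    """Return (content, skip_reason); content is None when the file is skipped."""
    path = feature_dir / relative_path
    try:
        raw = path.read_bytes()
    except FileNotFoundError:
        # File was indexed but removed before the body read.
        return None, "deleted_after_scan"
    if len(raw) > INLINE_SIZE_LIMIT_BYTES:
        return None, "oversized"  # hypothetical reason string
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return None, "binary_content"  # hypothetical reason string
    # Assumption: the indexer hashes the raw file bytes with SHA-256.
    if hashlib.sha256(raw).hexdigest() != expected_sha256:
        return None, "content_hash_mismatch"
    return text, ""
```

A skipped file is not an error here: the next sync re-indexes it and picks up the fresh hash.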

Parallel Opportunities

  • T013, T014, T015 are independent filter functions that can be developed simultaneously.

Dependencies

  • Depends on WP01 (NamespaceRef, SupportedInlineFormat, UploadOutcome).
  • Depends on WP02 (OfflineBodyUploadQueue.enqueue()).

Risks & Mitigations

  • Risk: Race between indexer scan and content read. Mitigation: re-hash guard (T016) detects stale data; next sync picks up fresh content.

Work Package WP04: HTTP Transport - Body Push Client (Priority: P1)

Goal: Implement push_content() HTTP transport in body_transport.py with response classification and 404 sub-code dispatch.
Independent Test: pytest tests/specify_cli/sync/test_body_transport.py passes; all HTTP response codes correctly classified.
Prompt: /tasks/WP04-http-transport-body-push-client.md
Estimated Prompt Size: ~400 lines

Included Subtasks

  • ✅ T019 Implement push_content(task, auth_token, server_url) HTTP POST to /api/dossier/push-content/
  • ✅ T020 Implement response classification into UploadOutcome (201 stored, 200 already_exists, 400 bad_request, 401 unauthorized, 429 rate_limited, 5xx server_error)
  • ✅ T021 Implement 404 sub-code dispatch: index_entry_not_found (retryable per FR-008) vs namespace_not_found (fatal/poison)
  • ✅ T022 Build request body with full namespace tuple + artifact payload (artifact_path, content_hash, hash_algorithm, content_body per FR-002/FR-003)
  • ✅ T023 Write tests/specify_cli/sync/test_body_transport.py with requests mocking for all response codes

Implementation Notes

  • Uses existing AuthClient and requests — no new auth flow (C-002)
  • Response body JSON must include status field for classification; 404 includes error_code for dispatch
  • Auth expiry (401) is retryable — queued task stays pending until auth refresh
  • Request includes hash_algorithm: "sha256" explicitly
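
One way to picture the response classification and the 404 sub-code dispatch is a pure function over the status code and parsed JSON body, as sketched below. The status codes, the two error_code values, and the retryability of 401 and index_entry_not_found come from this work package. The Classification type, the treatment of 429 and 5xx as retryable, 400 as fatal, and the handling of unknown codes are assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Classification:  # hypothetical result type for this sketch
    status: str
    reason: str
    retryable: bool


def classify_response(status_code: int, body: dict) -> Classification:
    """Map an HTTP response to an upload status plus a retry decision."""
    if status_code == 201:
        return Classification("uploaded", "stored", False)
    if status_code == 200:
        return Classification("already_exists", "already_exists", False)
    if status_code == 401:
        # Retryable: the task stays pending until auth is refreshed.
        return Classification("failed", "unauthorized", True)
    if status_code == 429:
        return Classification("failed", "rate_limited", True)  # assumed retryable
    if 500 <= status_code < 600:
        return Classification("failed", "server_error", True)  # assumed retryable
    if status_code == 404:
        error_code = body.get("error_code", "")
        if error_code == "index_entry_not_found":
            # Remote index may not exist yet; retry later (FR-008).
            return Classification("failed", error_code, True)
        if error_code == "namespace_not_found":
            return Classification("failed", error_code, False)  # poison row
        return Classification("failed", "unknown_404", False)  # assumed fatal
    if status_code == 400:
        return Classification("failed", "bad_request", False)  # assumed fatal
    return Classification("failed", f"unexpected_status_{status_code}", False)
```

Keeping classification free of I/O means every response code can be tested without mocking the HTTP layer.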

Parallel Opportunities

  • This entire WP can run in parallel with WP02 and WP03 (only depends on WP01 types).

Dependencies

  • Depends on WP01 (NamespaceRef, UploadOutcome).

Risks & Mitigations

  • Risk: SaaS endpoint not yet available. Mitigation: develop against contract in contracts/push-content-api.md; tests use HTTP mocks.

Work Package WP05: Dossier Pipeline Orchestration (Priority: P1)

Goal: Create sync_feature_dossier() in dossier_pipeline.py — the orchestration entrypoint that wires indexer output through event emission and body upload preparation.
Independent Test: pytest tests/specify_cli/sync/test_dossier_pipeline.py passes; full pipeline runs index → emit → enqueue.
Prompt: /tasks/WP05-dossier-pipeline-orchestration.md
Estimated Prompt Size: ~350 lines

Included Subtasks

  • ✅ T024 Create sync_feature_dossier(feature_dir, namespace_ref, body_queue, emitter) function signature and control flow
  • ✅ T025 Wire Indexer.index_feature() → emit_artifact_indexed() (existing) → prepare_body_uploads() (new from WP03)
  • ✅ T026 Handle partial failures: event emission success + body enqueue failure = log warning, continue (non-fatal)
  • ✅ T027 Write tests/specify_cli/sync/test_dossier_pipeline.py with integration tests for full pipeline

Implementation Notes

  • This is the ONLY place body uploads are prepared — BackgroundSyncService only drains already-enqueued work
  • Invoked by feature-aware sync commands, NOT by BackgroundSyncService (which has no "active feature" concept)
  • Must handle: indexer returns empty dossier (no artifacts), partial indexer failures, missing feature_dir
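
The control flow (index, then emit, then enqueue, with body enqueue failures treated as non-fatal) might be sketched as follows. This is deliberately not the real sync_feature_dossier() signature: the three collaborators are passed in as hypothetical callables and artifacts are plain strings so the example stays self-contained.

```python
from __future__ import annotations

import logging
from typing import Callable, Iterable, Sequence

logger = logging.getLogger(__name__)


def sync_feature_dossier_sketch(
    index_feature: Callable[[], Sequence[str]],
    emit_artifact_indexed: Callable[[str], None],
    prepare_body_uploads: Callable[[Sequence[str]], Iterable[str]],
) -> list[str]:
    """Run index -> emit -> enqueue and return per-artifact outcome strings."""
    artifacts = index_feature()
    if not artifacts:
        return []  # empty dossier: nothing to emit or enqueue
    for artifact in artifacts:
        emit_artifact_indexed(artifact)
    try:
        return list(prepare_body_uploads(artifacts))
    except Exception as exc:
        # Events were already emitted; a body enqueue failure is logged
        # and the sync continues (non-fatal, per T026).
        logger.warning("body upload preparation failed: %s", exc)
        return []
```

Emitting events before preparing bodies mirrors the drain ordering used later by the background service.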

Parallel Opportunities

  • None within this WP — subtasks are sequential.

Dependencies

  • Depends on WP03 (prepare_body_uploads()).
  • Transitively depends on WP01 and WP02.

Risks & Mitigations

  • Risk: Coupling with existing dossier event emission. Mitigation: dossier_pipeline.py is a new module that imports both; no modification to existing dossier/events.py.

Work Package WP06: Background Sync & Runtime Integration (Priority: P1)

Goal: Extend BackgroundSyncService to drain body queue after event queue, and wire OfflineBodyUploadQueue into SyncRuntime lifecycle.
Independent Test: pytest tests/specify_cli/sync/test_background_body.py passes; body queue drains after event queue; runtime starts/stops body queue correctly.
Prompt: /tasks/WP06-background-sync-and-runtime-integration.md
Estimated Prompt Size: ~400 lines

Included Subtasks

  • ✅ T028 Extend BackgroundSyncService._sync_once() to drain body_upload_queue after event queue drain completes
  • ✅ T029 Wire push_content() from body_transport into body drain loop with per-task result handling
  • ✅ T030 Implement per-task exponential backoff calculation (1s initial → 5min cap) and update_backoff() on retry
  • ✅ T031 Wire OfflineBodyUploadQueue into SyncRuntime.start() / SyncRuntime.stop() lifecycle
  • ✅ T032 Ensure shared DB connection between event queue and body queue (same .db file)
  • ✅ T033 Write tests/specify_cli/sync/test_background_body.py covering drain ordering, backoff behavior, and runtime wiring

Implementation Notes

  • Drain ordering invariant: events first, then body uploads (maximizes chance remote index exists before body arrives)
  • Backoff formula: min(1 * 2^retry_count, 300) seconds; stored as next_attempt_at timestamp
  • On successful upload or already_exists: remove from queue. On retryable failure: update backoff. On fatal failure: remove (poison row).
  • Runtime lifecycle: create OfflineBodyUploadQueue in start(), pass to BackgroundSyncService, close in stop()
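
The backoff formula above works out as in this short sketch. The function names are hypothetical, and whether retry_count is read before or after it is incremented is an assumption (here, 0 means the first retry waits 1 second).

```python
from __future__ import annotations


def backoff_seconds(retry_count: int, initial: float = 1.0, cap: float = 300.0) -> float:
    """min(initial * 2^retry_count, cap), i.e. 1s doubling up to 5 minutes."""
    # Clamp the exponent so a very large retry_count cannot overflow the
    # int-to-float conversion before the cap is applied.
    exponent = min(max(retry_count, 0), 32)
    return min(initial * 2**exponent, cap)


def next_attempt_at(now: float, retry_count: int) -> float:
    """Timestamp to store in the next_attempt_at column."""
    return now + backoff_seconds(retry_count)
```

With these defaults the delays run 1, 2, 4, ..., 256 seconds, and from the ninth retry onward every delay is capped at 300 seconds.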

Parallel Opportunities

  • T031/T032 (runtime wiring) can proceed alongside T028-T030 (background service changes).

Dependencies

  • Depends on WP02 (OfflineBodyUploadQueue).
  • Depends on WP04 (push_content()).

Risks & Mitigations

  • Risk: Thread safety when draining both queues. Mitigation: existing _lock in BackgroundSyncService already serializes sync cycles.

Work Package WP07: Diagnostics, Logging & End-to-End Tests (Priority: P2)

Goal: Add body upload diagnostics to diagnose.py, implement per-artifact result logging (FR-012), and write end-to-end integration tests covering online, offline, retry, and idempotency scenarios.
Independent Test: pytest tests/specify_cli/sync/test_body_integration.py passes; mypy --strict passes on all new modules.
Prompt: /tasks/WP07-diagnostics-logging-and-e2e-tests.md
Estimated Prompt Size: ~400 lines

Included Subtasks

  • ✅ T034 Add body upload queue stats and inspection to src/specify_cli/sync/diagnose.py
  • ✅ T035 Implement per-artifact upload result logging per FR-012 (uploaded, already_exists, queued, skipped, failed with reason)
  • ✅ T036 Write tests/specify_cli/sync/test_body_integration.py — end-to-end tests: online sync, offline queue+replay, retry with backoff, idempotency, cross-namespace isolation
  • ✅ T037 Run mypy --strict on all new modules and fix any type errors

Implementation Notes

  • Diagnostics should mirror existing event queue inspection pattern in diagnose.py
  • Logging uses rich console for human-readable output when running interactively
  • Integration tests need: mock SaaS server (httpx mock), real SQLite queue, real filesystem with feature artifacts
  • mypy --strict: all new modules must have full type annotations, no Any escape hatches

Parallel Opportunities

  • T034 and T035 are independent of T036/T037.

Dependencies

  • Depends on WP05 (dossier pipeline for e2e tests).
  • Depends on WP06 (background sync for e2e replay tests).

Risks & Mitigations

  • Risk: E2E test complexity. Mitigation: use pytest fixtures to compose mock SaaS + real queue + real filesystem.

Dependency & Execution Summary

        WP01 (foundation)
       /    \
    WP02    WP04          ← Parallel wave 2
   /    \      \
WP03  WP06*    |          ← WP03 parallel with WP06 (if WP04 done)
  |      |     |
WP05   WP06    |          ← WP06 needs WP02 + WP04
  \     /
   WP07                   ← Final integration

  • Wave 1: WP01 (foundation)
  • Wave 2: WP02 + WP04 (parallel — queue and transport are independent)
  • Wave 3: WP03 + WP06 (parallel — upload prep and background integration)
  • Wave 4: WP05 (pipeline orchestration, after WP03)
  • Wave 5: WP07 (diagnostics + e2e, after WP05 + WP06)

MVP Scope: WP01 → WP02 → WP03 → WP04 → WP05 (online body upload works). WP06 adds offline replay. WP07 adds diagnostics.
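
As a cross-check, the waves can be derived mechanically from the Dependencies sections of each work package. The sketch below assigns each package a level one greater than its deepest dependency; the function name is hypothetical and it assumes the graph has no cycles.

```python
from __future__ import annotations

# Direct dependencies as listed in each work package's Dependencies section.
DEPENDENCIES: dict[str, list[str]] = {
    "WP01": [],
    "WP02": ["WP01"],
    "WP03": ["WP01", "WP02"],
    "WP04": ["WP01"],
    "WP05": ["WP03"],
    "WP06": ["WP02", "WP04"],
    "WP07": ["WP05", "WP06"],
}


def execution_waves(deps: dict[str, list[str]]) -> list[list[str]]:
    """Group packages into waves; a package runs one wave after its last dependency."""
    level: dict[str, int] = {}

    def resolve(wp: str) -> int:
        if wp not in level:
            level[wp] = 1 + max((resolve(d) for d in deps[wp]), default=0)
        return level[wp]

    for wp in deps:
        resolve(wp)
    waves: dict[int, list[str]] = {}
    for wp, lvl in level.items():
        waves.setdefault(lvl, []).append(wp)
    return [sorted(waves[lvl]) for lvl in sorted(waves)]


print(execution_waves(DEPENDENCIES))
# [['WP01'], ['WP02', 'WP04'], ['WP03', 'WP06'], ['WP05'], ['WP07']]
```

The result matches the five waves listed above.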


Subtask Index (Reference)

| Subtask ID | Summary | Work Package | Priority | Parallel? |
|------------|---------|--------------|----------|-----------|
| T001 | Create NamespaceRef dataclass | WP01 | P0 | No |
| T002 | Add NamespaceRef.from_project_identity() factory | WP01 | P0 | No |
| T003 | Create SupportedInlineFormat enum + helper | WP01 | P0 | Yes |
| T004 | Create UploadOutcome dataclass | WP01 | P0 | Yes |
| T005 | Write test_namespace.py | WP01 | P0 | No |
| T006 | Add body_upload_queue SQLite schema | WP02 | P0 | No |
| T007 | Create OfflineBodyUploadQueue class | WP02 | P0 | No |
| T008 | Implement idempotent enqueue() | WP02 | P0 | No |
| T009 | Implement drain() with backoff | WP02 | P0 | No |
| T010 | Implement mark_uploaded/failed/backoff | WP02 | P0 | No |
| T011 | Implement queue stats | WP02 | P0 | Yes |
| T012 | Write test_body_queue.py | WP02 | P0 | No |
| T013 | Implement surface filtering (FR-004) | WP03 | P1 | Yes |
| T014 | Implement format filtering (FR-005) | WP03 | P1 | Yes |
| T015 | Implement size limit + binary skip (FR-006) | WP03 | P1 | Yes |
| T016 | Implement read + re-hash guard | WP03 | P1 | No |
| T017 | Implement prepare_body_uploads() orchestration | WP03 | P1 | No |
| T018 | Write test_body_upload.py | WP03 | P1 | No |
| T019 | Implement push_content() HTTP POST | WP04 | P1 | No |
| T020 | Implement response classification | WP04 | P1 | No |
| T021 | Implement 404 sub-code dispatch | WP04 | P1 | No |
| T022 | Build request body with namespace tuple | WP04 | P1 | No |
| T023 | Write test_body_transport.py | WP04 | P1 | No |
| T024 | Create sync_feature_dossier() entrypoint | WP05 | P1 | No |
| T025 | Wire indexer → events → body uploads | WP05 | P1 | No |
| T026 | Handle partial failures (non-fatal) | WP05 | P1 | No |
| T027 | Write test_dossier_pipeline.py | WP05 | P1 | No |
| T028 | Extend _sync_once() for body drain | WP06 | P1 | No |
| T029 | Wire push_content() into drain loop | WP06 | P1 | No |
| T030 | Implement per-task exponential backoff | WP06 | P1 | No |
| T031 | Wire body queue into SyncRuntime lifecycle | WP06 | P1 | Yes |
| T032 | Ensure shared DB connection | WP06 | P1 | Yes |
| T033 | Write test_background_body.py | WP06 | P1 | No |
| T034 | Add body queue stats to diagnose.py | WP07 | P2 | Yes |
| T035 | Implement per-artifact result logging (FR-012) | WP07 | P2 | Yes |
| T036 | Write test_body_integration.py (e2e) | WP07 | P2 | No |
| T037 | Run mypy --strict on all new modules | WP07 | P2 | No |

<!-- status-model:start -->

Canonical Status (Generated)

  • WP01: done
  • WP02: done
  • WP03: done
  • WP04: done
  • WP05: done
  • WP06: done
  • WP07: done

<!-- status-model:end -->