Work Packages: CLI 2.x Readiness Sprint

Inputs: Design documents from /kitty-specs/039-cli-2x-readiness/
Prerequisites: plan.md (required), spec.md (user stories), research.md, data-model.md, contracts/, quickstart.md

Tests: Included per WP — this sprint is "debug, fix, harden" and tests are required to validate fixes.

Organization: Fine-grained subtasks (Txxx) roll up into work packages (WPxx). Each work package must be independently deliverable and testable.

Prompt Files: Each work package references a matching prompt file in /tasks/ generated by /spec-kitty.tasks. Treat this file as the high-level checklist; keep deep implementation detail inside the prompt files.

Delivery Branch: All work targets the 2.x branch (588 commits diverged from main). Do NOT merge to main.

Subtask Format: [Txxx] [P?] Description

  • [P] indicates the subtask can proceed in parallel (different files/components).
  • Include precise file paths or modules.

Wave 1: Independent Work Packages (Parallel)


Work Package WP01: Fix setup-plan NameError on 2.x (Priority: P0)

Goal: Make spec-kitty agent feature setup-plan complete without import errors on the 2.x branch.
Independent Test: Run spec-kitty agent feature setup-plan in a test project and verify plan.md is created.
Prompt: /tasks/WP01-fix-setup-plan-nameerror.md

Included Subtasks

  • ✅ T001 Apply missing get_feature_mission_key import to src/specify_cli/cli/commands/agent/feature.py on 2.x
  • ✅ T002 Investigate test_full_planning_workflow_no_worktrees xfail — fix or re-document
  • ✅ T003 Verify all planning workflow tests pass (test_planning_workflow.py, test_task_workflow.py)

Implementation Notes

  • The fix is already on main at commit 5332408f — cherry-pick or manually apply the same import
  • On 2.x, feature.py may have diverged from main — check the actual file before applying
  • The xfail test may be related to typer availability in the test environment

Parallel Opportunities

  • Entirely independent of all other Wave 1 WPs

Dependencies

  • None

Risks & Mitigations

  • 2.x feature.py may differ significantly from main → diff both versions before applying fix

Work Package WP02: Fix batch error surfacing and diagnostics (Priority: P0)

Goal: Surface per-event error details from batch sync responses instead of the bare "Synced: 0 Errors: 105" summary.
Independent Test: Mock batch endpoint responses and verify grouped error summary output.
Prompt: /tasks/WP02-fix-batch-error-surfacing.md

Included Subtasks

  • ✅ T004 Parse per-event results[] array from HTTP 200 batch responses in src/specify_cli/sync/batch.py
  • ✅ T005 Parse details field from HTTP 400 error responses (not just error)
  • ✅ T006 [P] Implement error categorization: schema_mismatch, auth_expired, server_error, unknown
  • ✅ T007 Print actionable summary: Synced: N, Duplicates: N, Failed: N (schema_mismatch: X, auth_expired: Y)
  • ✅ T008 Selective queue removal: remove synced + duplicate events, retain failures with incremented retry_count in src/specify_cli/sync/queue.py
  • ✅ T009 [P] Add --report <file.json> flag for JSON dump of per-event failure details
  • ✅ T010 Write tests for batch response parsing, categorization, and queue operations

Implementation Notes

  • batch.py:135 currently only reads top-level error field, discarding details
  • Batch response format per contract: {"results": [{"event_id": "...", "status": "success|duplicate|rejected", "error": "..."}]}
  • Error categorization should inspect the error string in rejected results for keywords
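
The parsing and categorization described in these notes can be sketched as follows. This is a minimal illustration, not the code in batch.py: the keyword table, `categorize_error`, and `summarize_batch` are hypothetical names, and the substrings that mark each category are assumptions that would need checking against real server responses.

```python
from collections import Counter

# Hypothetical keyword table: the real substrings depend on what the server
# actually puts in the "error" field of a rejected result.
CATEGORY_KEYWORDS = {
    "schema_mismatch": ("schema", "validation", "field"),
    "auth_expired": ("token", "expired", "unauthorized"),
    "server_error": ("internal", "timeout", "unavailable"),
}


def categorize_error(error):
    """Map a rejected result's error string to one of the four categories."""
    lowered = (error or "").lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return category
    return "unknown"


def summarize_batch(response):
    """Build the one-line summary from a parsed HTTP 200 batch response."""
    statuses = Counter()
    failures = Counter()
    for result in response.get("results", []):
        status = result.get("status")
        statuses[status] += 1
        if status == "rejected":
            failures[categorize_error(result.get("error", ""))] += 1
    line = (
        f"Synced: {statuses['success']}, "
        f"Duplicates: {statuses['duplicate']}, "
        f"Failed: {statuses['rejected']}"
    )
    if failures:
        detail = ", ".join(
            f"{name}: {count}" for name, count in sorted(failures.items())
        )
        line += f" ({detail})"
    return line
```

Keeping the keyword table as data rather than branching logic makes it straightforward to adjust once real rejected payloads have been observed.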

Parallel Opportunities

  • T006 (categorization) and T009 (report flag) can run alongside T004/T005

Dependencies

  • None

Risks & Mitigations

  • Server response format may not match documented contract → test with mock responses, document delta in handoff doc

Work Package WP03: Fix sync status --check to use real token (Priority: P0)

Goal: Replace hardcoded test token in sync status --check with the user's real auth token.
Independent Test: Run sync status --check and verify it uses stored credentials (not a test token).
Prompt: /tasks/WP03-fix-sync-status-check.md

Included Subtasks

  • ✅ T011 Load real access token from ~/.spec-kitty/credentials in the status check path
  • ✅ T012 Attempt token refresh if expired; handle missing credentials with clear "run spec-kitty auth login" message
  • ✅ T013 Probe actual batch endpoint with real token instead of hardcoded test token
  • ✅ T014 Write tests for auth-aware status check (valid token, expired token, no credentials)

Implementation Notes

  • Current hardcoded test token is at approximately sync.py:531 on 2.x (may be in sync CLI commands or sync/runtime.py)
  • Use existing auth.py credential loading functions
  • Token refresh logic already exists in auth.py — reuse it

Parallel Opportunities

  • Entirely independent of other Wave 1 WPs

Dependencies

  • None

Risks & Mitigations

  • Test token location may have moved on 2.x → search for hardcoded Bearer/token strings

Work Package WP05: Extend sync status with queue health (Priority: P1)

Goal: Show queue depth, oldest event age, retry distribution, and top failing event types in sync status.
Independent Test: Populate a test queue and verify extended status output.
Prompt: /tasks/WP05-extend-sync-status.md

Included Subtasks

  • ✅ T020 Add aggregate query methods to src/specify_cli/sync/queue.py: total queued, oldest event age, retry-count distribution
  • ✅ T021 [P] Group pending events by event_type for top-failing-types display
  • ✅ T022 [P] Format extended status output with Rich tables/panels
  • ✅ T023 Integrate aggregate data into existing sync status command output
  • ✅ T024 Write tests for aggregate queries and formatted output

Implementation Notes

  • SQLite queries should target the actual queue table: SELECT COUNT(*) FROM queue, SELECT MIN(timestamp) FROM queue
  • Retry histogram buckets: 0 retries, 1-3 retries, 4+ retries
  • Use Rich Table for formatted output to match existing CLI style
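
A sketch of the aggregate queries, under assumptions: the column names `event_type`, `timestamp`, and `retry_count`, and the use of epoch seconds for `timestamp`, are guesses about the queue schema, which should be inspected on 2.x first. `queue_health` is a hypothetical helper name.

```python
import sqlite3


def queue_health(conn, now):
    """Collect the aggregates that the extended status output would display.

    `now` is the current time in the same unit as the timestamp column
    (assumed here to be epoch seconds).
    """
    total = conn.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
    oldest = conn.execute("SELECT MIN(timestamp) FROM queue").fetchone()[0]
    # Bucket retry counts as 0, 1-3, and 4+ in a single pass. SQLite evaluates
    # comparisons to 0 or 1, so SUM counts matching rows; COALESCE covers the
    # empty-table case, where SUM returns NULL.
    buckets = conn.execute(
        """
        SELECT
            COALESCE(SUM(retry_count = 0), 0),
            COALESCE(SUM(retry_count BETWEEN 1 AND 3), 0),
            COALESCE(SUM(retry_count >= 4), 0)
        FROM queue
        """
    ).fetchone()
    by_type = conn.execute(
        "SELECT event_type, COUNT(*) AS n FROM queue "
        "GROUP BY event_type ORDER BY n DESC, event_type LIMIT 5"
    ).fetchall()
    return {
        "total": total,
        "oldest_age_seconds": None if oldest is None else now - oldest,
        "retries": {"0": buckets[0], "1-3": buckets[1], "4+": buckets[2]},
        "top_event_types": by_type,
    }
```

Returning a plain dict keeps the query layer separate from the Rich formatting, which matches the T021/T022 split between data and presentation.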

Parallel Opportunities

  • T021 and T022 can proceed in parallel (data query vs. formatting)

Dependencies

  • None

Risks & Mitigations

  • Queue table schema on 2.x may differ from documented model → inspect actual schema first

Work Package WP06: Test and document 7-to-4 lane collapse mapping (Priority: P2)

Goal: Add comprehensive tests for the 7→4 lane mapping and verify the contract doc matches implementation.
Independent Test: Run parametrized tests covering all 7 input lanes.
Prompt: /tasks/WP06-lane-mapping-tests.md

Included Subtasks

  • ✅ T025 Add parametrized tests in tests/specify_cli/status/test_sync_lane_mapping.py covering all 7 lanes with expected 4-lane outputs
  • ✅ T026 Test that an invalid target lane passed to emit_status_transition(...) raises TransitionError
  • ✅ T027 Verify lane collapse mapping remains centralized in src/specify_cli/status/emit.py (_SYNC_LANE_MAP)
  • ✅ T028 Verify contracts/lane-mapping.md matches _SYNC_LANE_MAP in status/emit.py — flag any discrepancies

Implementation Notes

  • Current mapping at status/emit.py:46 on 2.x: planned→planned, claimed→planned, in_progress→doing, for_review→for_review, done→done, blocked→doing, canceled→planned
  • LANE_ALIASES = {"doing": "in_progress"} in status/transitions.py — test alias resolution separately
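
The mapping in the notes above can be turned into a case table of the kind a parametrized test would iterate over. The sketch below uses a local copy of the mapping for illustration; a real test would import `_SYNC_LANE_MAP` from status/emit.py. `collapse_lane` and `check_all_lanes` are hypothetical helpers, and resolving aliases before collapsing is an assumption about the intended order.

```python
# Local copy of the mapping listed in the notes, for illustration only.
SYNC_LANE_MAP = {
    "planned": "planned",
    "claimed": "planned",
    "in_progress": "doing",
    "for_review": "for_review",
    "done": "done",
    "blocked": "doing",
    "canceled": "planned",
}

# Alias table as quoted from status/transitions.py.
LANE_ALIASES = {"doing": "in_progress"}

# (input lane, expected sync lane) pairs: the table a parametrized test
# would feed to pytest.mark.parametrize.
CASES = [
    ("planned", "planned"),
    ("claimed", "planned"),
    ("in_progress", "doing"),
    ("for_review", "for_review"),
    ("done", "done"),
    ("blocked", "doing"),
    ("canceled", "planned"),
]


def collapse_lane(lane):
    """Resolve aliases, then collapse a 7-lane value to its 4-lane value."""
    canonical = LANE_ALIASES.get(lane, lane)
    return SYNC_LANE_MAP[canonical]


def check_all_lanes():
    """Assert full coverage of the 7 lanes and the 4 expected outputs."""
    assert {lane for lane, _ in CASES} == set(SYNC_LANE_MAP)
    for lane, expected in CASES:
        assert collapse_lane(lane) == expected, lane
    outputs = {expected for _, expected in CASES}
    assert outputs == {"planned", "doing", "for_review", "done"}
```

Writing the expected pairs out by hand, rather than deriving them from the map, is what lets the test catch an accidental change to the mapping.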

Parallel Opportunities

  • Entirely independent of other WPs

Dependencies

  • None

Risks & Mitigations

  • Mapping in status/emit.py may have changed since last inspection → read actual 2.x source first

Work Package WP08: Converge global runtime resolution (Priority: P1)

Goal: Make ~/.kittify the global runtime path with project-level overrides, and emit no legacy fallback warnings after migration.
Independent Test: After spec-kitty migrate, verify resolve_template_path() includes ~/.kittify/ in the resolution chain.
Prompt: /tasks/WP08-global-runtime-convergence.md

Included Subtasks

  • ✅ T034 Audit current resolution chain in src/specify_cli/core/project_resolver.py on 2.x
  • ✅ T035 Add ~/.kittify/ to resolution chain: project → global → package defaults
  • ✅ T036 Eliminate legacy fallback warnings when ~/.kittify/ exists
  • ✅ T037 Emit one-time "run spec-kitty migrate" message if ~/.kittify/ doesn't exist (not a warning flood)
  • ✅ T038 Make spec-kitty migrate idempotent for global runtime install
  • ✅ T039 Write tests for resolution chain with ~/.kittify/ (exists, doesn't exist, idempotent migrate)

Implementation Notes

  • 2.x has partial global runtime bootstrap — audit what already exists before adding
  • Resolution order: project .kittify/missions/{key}/templates/ → project .kittify/templates/ → ~/.kittify/missions/{key}/templates/ → ~/.kittify/templates/ → package defaults
  • Credential path: ~/.spec-kitty/credentials stays separate from ~/.kittify/ — document this decision
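
The resolution order in these notes can be illustrated with a first-match lookup. This is a sketch under assumptions, not the code in project_resolver.py: `candidate_dirs` and `resolve_template` are hypothetical names, and passing the home directory and package-defaults directory as parameters is a choice made here to keep the example testable.

```python
from pathlib import Path
from typing import List, Optional


def candidate_dirs(
    project_root: Path, home: Path, mission_key: str, package_defaults: Path
) -> List[Path]:
    """List template directories from most specific to least specific."""
    return [
        project_root / ".kittify" / "missions" / mission_key / "templates",
        project_root / ".kittify" / "templates",
        home / ".kittify" / "missions" / mission_key / "templates",
        home / ".kittify" / "templates",
        package_defaults,
    ]


def resolve_template(
    name: str,
    project_root: Path,
    home: Path,
    mission_key: str,
    package_defaults: Path,
) -> Optional[Path]:
    """Return the first existing template file in the chain, or None."""
    for directory in candidate_dirs(
        project_root, home, mission_key, package_defaults
    ):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None
```

Because the chain is an ordered list, the three test scenarios in T039 reduce to creating or omitting directories under a temporary home and asserting which path comes back.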

Parallel Opportunities

  • Entirely independent of sync WPs

Dependencies

  • None

Risks & Mitigations

  • ~/.kittify migration may break existing 2.x alpha users → make migrate idempotent

Wave 2: Dependent Work Packages


Work Package WP04: Add sync diagnose command (Priority: P1)

Goal: Add spec-kitty sync diagnose for local event schema validation before sending.
Independent Test: Queue malformed events and verify diagnose reports specific field errors.
Prompt: /tasks/WP04-sync-diagnose-command.md

Included Subtasks

  • ✅ T015 Create src/specify_cli/sync/diagnose.py with event validation logic
  • ✅ T016 Validate events against Pydantic Event model from spec_kitty_events.models
  • ✅ T017 [P] Validate WPStatusChanged payloads against StatusTransitionPayload
  • ✅ T018 Register sync diagnose CLI command in sync command group
  • ✅ T019 Write tests: valid events pass, malformed events report specific field errors

Implementation Notes

  • Reuse error categorization from WP02 (T006) for consistent error grouping
  • Read events from SQLite queue using existing queue.py read methods
  • Output format: per-event validation (event_id, valid/invalid, error list)
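
The per-event output shape (event_id, valid/invalid, error list) might look like the sketch below. To stay dependency-free it uses a stand-in validator instead of the Pydantic Event model; the real command would call the model and translate its validation errors into the same list of strings. `DiagnoseResult`, `standin_validator`, and `diagnose` are hypothetical names, and the two fields checked are placeholders, not the actual envelope schema.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List

Validator = Callable[[Dict[str, Any]], List[str]]


@dataclass
class DiagnoseResult:
    event_id: str
    valid: bool
    errors: List[str] = field(default_factory=list)


def standin_validator(event: Dict[str, Any]) -> List[str]:
    """Placeholder for the Pydantic check: require two string fields."""
    errors = []
    for name in ("event_id", "event_type"):
        value = event.get(name)
        if not isinstance(value, str) or not value:
            errors.append(f"{name}: missing or not a non-empty string")
    return errors


def diagnose(
    events: Iterable[Dict[str, Any]],
    validate: Validator = standin_validator,
) -> List[DiagnoseResult]:
    """Validate each queued event and record its per-field errors."""
    results = []
    for event in events:
        errors = validate(event)
        results.append(
            DiagnoseResult(
                event_id=str(event.get("event_id") or "<unknown>"),
                valid=not errors,
                errors=errors,
            )
        )
    return results
```

Injecting the validator as a callable keeps the reporting loop independent of the model library, so the Pydantic-backed check and the WPStatusChanged payload check can share it.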

Parallel Opportunities

  • T017 can proceed alongside T016 (different payload types)

Dependencies

  • Depends on WP02 (error categorization in T006)

Risks & Mitigations

  • Pydantic model validation may be strict in unexpected ways → test with real queue data from 2.x

Work Package WP07: SaaS handoff contract document (Priority: P0)

Goal: Produce a contract doc that enables the SaaS team to validate their batch endpoint against CLI payloads.
Independent Test: Fixture data validates against CLI-side Pydantic models.
Prompt: /tasks/WP07-saas-handoff-contract.md

Included Subtasks

  • ✅ T029 Document complete event envelope fields with types, constraints, and examples
  • ✅ T030 [P] Document batch request/response format: headers, compression, URL, body structure
  • ✅ T031 [P] Document authentication flow: JWT login, refresh, authorization header
  • ✅ T032 Create 3-5 complete fixture request/response examples covering success, duplicate, rejected, and mixed
  • ✅ T033 Write contract test: validate fixtures against Pydantic Event model

Implementation Notes

  • contracts/batch-ingest.md and contracts/lane-mapping.md already exist from Phase 1 planning — extend them with fixtures
  • Cross-reference lane mapping from WP06 testing (T025-T028)
  • Fixture data must include emitted event types: WPStatusChanged, WPCreated, WPAssigned, FeatureCreated, FeatureCompleted, HistoryAdded, ErrorLogged, DependencyResolved

Parallel Opportunities

  • T030 and T031 can proceed in parallel (request format vs. auth flow)

Dependencies

  • Depends on WP02 (error format from T006/T007)

Risks & Mitigations

  • SaaS endpoint may not match documented contract → fixtures enable the SaaS team to test independently

Wave 3: Integration


Work Package WP09: End-to-end CLI smoke test (Priority: P0)

Goal: Exercise the full create-feature → setup-plan → implement → review sequence against a temp repo.
Independent Test: Test is self-contained: it creates and cleans up its own temp repository.
Prompt: /tasks/WP09-e2e-smoke-test.md

Included Subtasks

  • ✅ T040 Create tests/e2e/ directory with __init__.py and conftest.py
  • ✅ T041 Write temp repo fixture: git init, spec-kitty init, .kittify setup
  • ✅ T042 Implement full test sequence: create-feature → setup-plan → finalize-tasks → implement → move-task
  • ✅ T043 [P] Add pytest.mark.e2e marker to pyproject.toml
  • ✅ T044 Verify test passes locally and document CI considerations

Implementation Notes

  • Use typer.testing.CliRunner or subprocess.run for CLI invocations
  • Test must verify intermediate artifacts exist: spec.md, plan.md, tasks/, worktree
  • Mark with pytest.mark.e2e for optional CI separation
  • Final state: WP01 in for_review lane, all artifacts exist
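
The intermediate-artifact check from these notes could be factored into a small helper that the smoke test calls after each step. This is a sketch: `missing_artifacts` is a hypothetical name, the feature directory layout is assumed to hold spec.md, plan.md, and tasks/ side by side, and the worktree check is left out because its location is not stated here.

```python
from pathlib import Path
from typing import List


def missing_artifacts(feature_dir: Path) -> List[str]:
    """Return the names of expected planning artifacts that are absent."""
    # Each entry pairs an artifact name with the Path predicate that must hold.
    expected = {
        "spec.md": Path.is_file,
        "plan.md": Path.is_file,
        "tasks": Path.is_dir,
    }
    return [
        name
        for name, exists in expected.items()
        if not exists(feature_dir / name)
    ]
```

Asserting `missing_artifacts(feature_dir) == []` gives a failure message that names the absent files, which is easier to debug in CI than a chain of separate existence checks.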

Parallel Opportunities

  • T043 is independent file edit, can proceed alongside T040-T042

Dependencies

  • Depends on WP01 (setup-plan must work)

Risks & Mitigations

  • E2E test may be flaky in CI → ensure robust cleanup; use pytest.mark.e2e for separation

Dependency & Execution Summary

  • Wave 1 (parallel): WP01, WP02, WP03, WP05, WP06, WP08 — all independent, run concurrently
  • Wave 2 (depends on Wave 1): WP04 (→WP02), WP07 (→WP02)
  • Wave 3 (integration): WP09 (→WP01)
  • MVP Scope: WP01 + WP02 + WP09 (planning workflow + sync diagnostics + smoke test)
  • Parallelization: Up to 6 agents can work simultaneously in Wave 1

Subtask Index (Reference)

| Subtask ID | Summary | Work Package | Priority | Parallel? |
|------------|---------|--------------|----------|-----------|
| T001 | Apply missing import fix to feature.py | WP01 | P0 | No |
| T002 | Investigate xfail planning test | WP01 | P0 | No |
| T003 | Verify planning workflow tests pass | WP01 | P0 | No |
| T004 | Parse per-event results[] from 200 responses | WP02 | P0 | No |
| T005 | Parse details field from 400 responses | WP02 | P0 | No |
| T006 | Implement error categorization | WP02 | P0 | Yes |
| T007 | Print actionable summary with counts | WP02 | P0 | No |
| T008 | Selective queue removal for synced/dup events | WP02 | P0 | No |
| T009 | Add --report flag for JSON failure dump | WP02 | P0 | Yes |
| T010 | Write batch parsing and queue tests | WP02 | P0 | No |
| T011 | Load real access token from credentials | WP03 | P0 | No |
| T012 | Handle token refresh and missing credentials | WP03 | P0 | No |
| T013 | Probe actual endpoint with real token | WP03 | P0 | No |
| T014 | Write auth-aware status check tests | WP03 | P0 | No |
| T015 | Create diagnose.py validation module | WP04 | P1 | No |
| T016 | Validate events against Event model | WP04 | P1 | No |
| T017 | Validate WPStatusChanged payloads | WP04 | P1 | Yes |
| T018 | Register sync diagnose CLI command | WP04 | P1 | No |
| T019 | Write diagnose validation tests | WP04 | P1 | No |
| T020 | Add aggregate query methods to queue.py | WP05 | P1 | No |
| T021 | Group events by event_type | WP05 | P1 | Yes |
| T022 | Format output with Rich tables/panels | WP05 | P1 | Yes |
| T023 | Integrate aggregates into sync status command | WP05 | P1 | No |
| T024 | Write aggregate query and output tests | WP05 | P1 | No |
| T025 | Parametrized tests for all 7 lanes | WP06 | P2 | No |
| T026 | Test invalid lane transition raises TransitionError | WP06 | P2 | No |
| T027 | Verify _SYNC_LANE_MAP is centralized | WP06 | P2 | No |
| T028 | Verify contract doc matches implementation | WP06 | P2 | No |
| T029 | Document event envelope fields | WP07 | P0 | No |
| T030 | Document batch request/response format | WP07 | P0 | Yes |
| T031 | Document auth flow (JWT login/refresh) | WP07 | P0 | Yes |
| T032 | Create fixture request/response examples | WP07 | P0 | No |
| T033 | Write contract test validating fixtures | WP07 | P0 | No |
| T034 | Audit resolution chain on 2.x | WP08 | P1 | No |
| T035 | Add ~/.kittify to resolution chain | WP08 | P1 | No |
| T036 | Eliminate legacy fallback warnings | WP08 | P1 | No |
| T037 | Emit one-time "run migrate" message | WP08 | P1 | No |
| T038 | Make spec-kitty migrate idempotent | WP08 | P1 | No |
| T039 | Write resolution chain tests | WP08 | P1 | No |
| T040 | Create tests/e2e/ directory structure | WP09 | P0 | No |
| T041 | Write temp repo fixture | WP09 | P0 | No |
| T042 | Implement full E2E test sequence | WP09 | P0 | No |
| T043 | Add pytest.mark.e2e marker | WP09 | P0 | Yes |
| T044 | Verify test passes locally | WP09 | P0 | No |

<!-- status-model:start -->

Canonical Status (Generated)

  • WP01: done
  • WP02: done
  • WP03: done
  • WP04: done
  • WP05: done
  • WP06: done
  • WP07: done
  • WP08: done
  • WP09: done

<!-- status-model:end -->