Work Packages: CLI 2.x Readiness Sprint
Inputs: Design documents from /kitty-specs/039-cli-2x-readiness/
Prerequisites: plan.md (required), spec.md (user stories), research.md, data-model.md, contracts/, quickstart.md
Tests: Included per WP — this sprint is "debug, fix, harden" and tests are required to validate fixes.
Organization: Fine-grained subtasks (Txxx) roll up into work packages (WPxx). Each work package must be independently deliverable and testable.
Prompt Files: Each work package references a matching prompt file in /tasks/ generated by /spec-kitty.tasks. Treat this file as the high-level checklist; keep deep implementation detail inside the prompt files.
Delivery Branch: All work targets the 2.x branch (588 commits diverged from main). Do NOT merge to main.
Subtask Format: [Txxx] [P?] Description
- [P] indicates the subtask can proceed in parallel (different files/components).
- Include precise file paths or modules.
Wave 1: Independent Work Packages (Parallel)
Work Package WP01: Fix setup-plan NameError on 2.x (Priority: P0)
Goal: Make spec-kitty agent feature setup-plan complete without import errors on the 2.x branch.
Independent Test: Run spec-kitty agent feature setup-plan in a test project and verify plan.md is created.
Prompt: /tasks/WP01-fix-setup-plan-nameerror.md
Included Subtasks
- ✅ T001 Apply missing get_feature_mission_key import to src/specify_cli/cli/commands/agent/feature.py on 2.x
- ✅ T002 Investigate test_full_planning_workflow_no_worktrees xfail; fix or re-document
- ✅ T003 Verify all planning workflow tests pass (test_planning_workflow.py, test_task_workflow.py)
Implementation Notes
- The fix is already on main at commit 5332408f — cherry-pick or manually apply the same import
- On 2.x, feature.py may have diverged from main; check the actual file before applying
- The xfail test may be related to typer availability in the test environment
Parallel Opportunities
- Entirely independent of all other Wave 1 WPs
Dependencies
- None
Risks & Mitigations
- 2.x feature.py may differ significantly from main → diff both versions before applying fix
Work Package WP02: Fix batch error surfacing and diagnostics (Priority: P0)
Goal: Surface per-event error details from batch sync responses instead of bare "Synced: 0 Errors: 105".
Independent Test: Mock batch endpoint responses and verify grouped error summary output.
Prompt: /tasks/WP02-fix-batch-error-surfacing.md
Included Subtasks
- ✅ T004 Parse per-event results[] array from HTTP 200 batch responses in src/specify_cli/sync/batch.py
- ✅ T005 Parse details field from HTTP 400 error responses (not just error)
- ✅ T006 [P] Implement error categorization: schema_mismatch, auth_expired, server_error, unknown
- ✅ T007 Print actionable summary: Synced: N, Duplicates: N, Failed: N (schema_mismatch: X, auth_expired: Y)
- ✅ T008 Selective queue removal: remove synced + duplicate events, retain failures with incremented retry_count in src/specify_cli/sync/queue.py
- ✅ T009 [P] Add --report <file.json> flag for JSON dump of per-event failure details
- ✅ T010 Write tests for batch response parsing, categorization, and queue operations
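The selective queue removal in T008 can be sketched as follows. This is a minimal illustration, not the code in queue.py: the two-column table and the apply_batch_results helper are hypothetical, and the real queue schema on 2.x should be inspected first.

```python
import sqlite3


def apply_batch_results(conn, results):
    """Drop synced/duplicate events and bump retry_count on rejected ones."""
    finished = [(r["event_id"],) for r in results
                if r["status"] in ("success", "duplicate")]
    rejected = [(r["event_id"],) for r in results if r["status"] == "rejected"]
    with conn:  # commit both statements together, roll back on error
        conn.executemany("DELETE FROM queue WHERE event_id = ?", finished)
        conn.executemany(
            "UPDATE queue SET retry_count = retry_count + 1 WHERE event_id = ?",
            rejected,
        )
    return len(finished), len(rejected)


# Hypothetical two-column schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue (event_id TEXT PRIMARY KEY, "
    "retry_count INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO queue (event_id) VALUES (?)",
                 [("e1",), ("e2",), ("e3",)])
conn.commit()

removed, retained = apply_batch_results(conn, [
    {"event_id": "e1", "status": "success"},
    {"event_id": "e2", "status": "duplicate"},
    {"event_id": "e3", "status": "rejected", "error": "schema validation failed"},
])
remaining = conn.execute("SELECT event_id, retry_count FROM queue").fetchall()
print(removed, retained, remaining)
```

Running both statements in one transaction means a crash midway cannot leave the queue with events deleted but retry counts not updated.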
Implementation Notes
- batch.py:135 currently only reads top-level error field, discarding details
- Batch response format per contract: {"results": [{"event_id": "...", "status": "success|duplicate|rejected", "error": "..."}]}
- Error categorization should inspect the error string in rejected results for keywords
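As a rough sketch of how the parsing and categorization notes fit together, the snippet below walks a response in the contract format and builds the summary line from T007. The keyword lists and both function names are assumptions made for illustration; the actual matching rules belong to the WP02 prompt file.

```python
from collections import Counter

# Hypothetical keyword lists; the real rules are defined by T006.
CATEGORY_KEYWORDS = {
    "schema_mismatch": ("schema", "validation", "field"),
    "auth_expired": ("token", "expired", "unauthorized"),
    "server_error": ("internal", "server error", "timeout"),
}


def categorize_error(message):
    """Map an error string to a category by keyword, else 'unknown'."""
    text = (message or "").lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "unknown"


def summarize_batch(response):
    """Build the one-line summary from a batch response body."""
    results = response.get("results", [])
    synced = sum(1 for r in results if r.get("status") == "success")
    duplicates = sum(1 for r in results if r.get("status") == "duplicate")
    failures = Counter(
        categorize_error(r.get("error"))
        for r in results
        if r.get("status") == "rejected"
    )
    line = (f"Synced: {synced}, Duplicates: {duplicates}, "
            f"Failed: {sum(failures.values())}")
    if failures:
        detail = ", ".join(f"{cat}: {n}" for cat, n in sorted(failures.items()))
        line += f" ({detail})"
    return line


response = {"results": [
    {"event_id": "a", "status": "success"},
    {"event_id": "b", "status": "duplicate"},
    {"event_id": "c", "status": "rejected", "error": "Schema validation failed"},
    {"event_id": "d", "status": "rejected", "error": "Token expired"},
]}
summary = summarize_batch(response)
print(summary)
```

Keeping the categorizer as a pure function of the error string makes it straightforward to reuse from other commands without pulling in any network code.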
Parallel Opportunities
- T006 (categorization) and T009 (report flag) can run alongside T004/T005
Dependencies
- None
Risks & Mitigations
- Server response format may not match documented contract → test with mock responses, document delta in handoff doc
Work Package WP03: Fix sync status --check to use real token (Priority: P0)
Goal: Replace hardcoded test token in sync status --check with the user's real auth token.
Independent Test: Run sync status --check and verify it uses stored credentials (not a test token).
Prompt: /tasks/WP03-fix-sync-status-check.md
Included Subtasks
- ✅ T011 Load real access token from ~/.spec-kitty/credentials in the status check path
- ✅ T012 Attempt token refresh if expired; handle missing credentials with clear "run spec-kitty auth login" message
- ✅ T013 Probe actual batch endpoint with real token instead of hardcoded test token
- ✅ T014 Write tests for auth-aware status check (valid token, expired token, no credentials)
Implementation Notes
- Current hardcoded test token is at approximately sync.py:531 on 2.x (may be in sync CLI commands or sync/runtime.py)
- Use existing auth.py credential loading functions
- Token refresh logic already exists in auth.py; reuse it
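The three cases exercised by T014 (valid token, expired token, no credentials) can be pictured with the sketch below. It is illustrative only: the JSON layout with access_token and expires_at fields is an assumption, as are the check_credentials and next_step names. The real status check should go through the existing auth.py loaders and refresh logic, not read the file directly.

```python
import json
import tempfile
import time
from pathlib import Path


def check_credentials(path, now=None):
    """Return (state, token); state is 'valid', 'expired', or 'missing'."""
    now = time.time() if now is None else now
    path = Path(path)
    if not path.is_file():
        return "missing", None
    try:
        data = json.loads(path.read_text())
    except ValueError:
        return "missing", None  # unreadable file is treated like no login
    if not isinstance(data, dict) or not data.get("access_token"):
        return "missing", None
    if data.get("expires_at", 0) <= now:
        return "expired", data["access_token"]
    return "valid", data["access_token"]


def next_step(state):
    """Describe what the status check should do for each state."""
    if state == "missing":
        return "No credentials found: run spec-kitty auth login"
    if state == "expired":
        return "Token expired: attempt refresh, then probe the batch endpoint"
    return "Probe the batch endpoint with the stored token"


with tempfile.TemporaryDirectory() as tmp:
    cred = Path(tmp) / "credentials"
    print(next_step(check_credentials(cred)[0]))
    cred.write_text(json.dumps({"access_token": "abc", "expires_at": 2000}))
    print(next_step(check_credentials(cred, now=1000)[0]))
    print(next_step(check_credentials(cred, now=3000)[0]))
```

Passing the current time as a parameter keeps the expiry branch testable without patching the clock.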
Parallel Opportunities
- Entirely independent of other Wave 1 WPs
Dependencies
- None
Risks & Mitigations
- Test token location may have moved on 2.x → search for hardcoded Bearer/token strings
Work Package WP05: Extend sync status with queue health (Priority: P1)
Goal: Show queue depth, oldest event age, retry distribution, and top failing event types in sync status.
Independent Test: Populate a test queue and verify extended status output.
Prompt: /tasks/WP05-extend-sync-status.md
Included Subtasks
- ✅ T020 Add aggregate query methods to src/specify_cli/sync/queue.py: total queued, oldest event age, retry-count distribution
- ✅ T021 [P] Group pending events by event_type for top-failing-types display
- ✅ T022 [P] Format extended status output with Rich tables/panels
- ✅ T023 Integrate aggregate data into existing sync status command output
- ✅ T024 Write tests for aggregate queries and formatted output
Implementation Notes
- SQLite queries should target the actual queue table: SELECT COUNT(*) FROM queue, SELECT MIN(timestamp) FROM queue
- Retry histogram buckets: 0 retries, 1-3 retries, 4+ retries
- Use Rich Table for formatted output to match existing CLI style
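The aggregate queries and retry buckets listed above might look roughly like this. The column names (event_type, timestamp, retry_count), the choice of epoch seconds for timestamp, and the queue_health helper are all assumptions; the actual 2.x schema should be checked before any of this is adapted.

```python
import sqlite3
import time


def queue_health(conn, now=None):
    """Collect queue depth, oldest event age, retry buckets, and top types."""
    now = time.time() if now is None else now
    total = conn.execute("SELECT COUNT(*) FROM queue").fetchone()[0]
    oldest = conn.execute("SELECT MIN(timestamp) FROM queue").fetchone()[0]

    buckets = {"0 retries": 0, "1-3 retries": 0, "4+ retries": 0}
    rows = conn.execute(
        """
        SELECT CASE
                 WHEN retry_count = 0 THEN '0 retries'
                 WHEN retry_count BETWEEN 1 AND 3 THEN '1-3 retries'
                 ELSE '4+ retries'
               END AS bucket,
               COUNT(*)
        FROM queue
        GROUP BY bucket
        """
    )
    for bucket, count in rows:
        buckets[bucket] = count

    top_types = conn.execute(
        "SELECT event_type, COUNT(*) AS n FROM queue "
        "GROUP BY event_type ORDER BY n DESC, event_type LIMIT 3"
    ).fetchall()

    return {
        "total": total,
        "oldest_age_seconds": None if oldest is None else now - oldest,
        "retries": buckets,
        "top_types": top_types,
    }


# Hypothetical schema with timestamps stored as epoch seconds.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE queue (event_id TEXT, event_type TEXT, "
    "timestamp REAL, retry_count INTEGER)"
)
conn.executemany("INSERT INTO queue VALUES (?, ?, ?, ?)", [
    ("e1", "WPStatusChanged", 100, 0),
    ("e2", "WPStatusChanged", 200, 2),
    ("e3", "WPCreated", 300, 5),
])
health = queue_health(conn, now=400)
print(health)
```

Returning a plain dict keeps the query layer (T020, T021) separate from the Rich formatting in T022, which matches the parallel split noted for this work package.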
Parallel Opportunities
- T021 and T022 can proceed in parallel (data query vs. formatting)
Dependencies
- None
Risks & Mitigations
- Queue table schema on 2.x may differ from documented model → inspect actual schema first
Work Package WP06: Test and document 7-to-4 lane collapse mapping (Priority: P2)
Goal: Add comprehensive tests for the 7→4 lane mapping and verify the contract doc matches implementation.
Independent Test: Run parametrized tests covering all 7 input lanes.
Prompt: /tasks/WP06-lane-mapping-tests.md
Included Subtasks
- ✅ T025 Add parametrized tests in tests/specify_cli/status/test_sync_lane_mapping.py covering all 7 lanes with expected 4-lane outputs
- ✅ T026 Test invalid target lane handling via emit_status_transition(...) raises TransitionError
- ✅ T027 Verify lane collapse mapping remains centralized in src/specify_cli/status/emit.py (_SYNC_LANE_MAP)
- ✅ T028 Verify contracts/lane-mapping.md matches _SYNC_LANE_MAP in status/emit.py; flag any discrepancies
Implementation Notes
- Current mapping at status/emit.py:46 on 2.x: planned→planned, claimed→planned, in_progress→doing, for_review→for_review, done→done, blocked→doing, canceled→planned
- LANE_ALIASES = {"doing": "in_progress"} in status/transitions.py; test alias resolution separately
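To make the expected table concrete, the sketch below restates the mapping from the note above as a local copy and checks every lane against it. The collapse_lane and resolve_alias helpers are hypothetical stand-ins; the real tests should import _SYNC_LANE_MAP and LANE_ALIASES from the package so that they fail when the source changes.

```python
# Local copy of the documented mapping, for illustration only.
SYNC_LANE_MAP = {
    "planned": "planned",
    "claimed": "planned",
    "in_progress": "doing",
    "for_review": "for_review",
    "done": "done",
    "blocked": "doing",
    "canceled": "planned",
}
LANE_ALIASES = {"doing": "in_progress"}


def collapse_lane(lane):
    """Collapse one of the 7 lanes to its 4-lane sync value (KeyError if unknown)."""
    return SYNC_LANE_MAP[lane]


def resolve_alias(lane):
    """Resolve an alias to its canonical lane name; other names pass through."""
    return LANE_ALIASES.get(lane, lane)


# One (input, expected) pair per lane; suitable as a parametrize table.
CASES = [
    ("planned", "planned"),
    ("claimed", "planned"),
    ("in_progress", "doing"),
    ("for_review", "for_review"),
    ("done", "done"),
    ("blocked", "doing"),
    ("canceled", "planned"),
]

for lane, expected in CASES:
    assert collapse_lane(lane) == expected, lane

output_lanes = sorted(set(SYNC_LANE_MAP.values()))
print(output_lanes)
```

In the real test module, CASES can be passed to pytest.mark.parametrize so that each lane reports as its own test case.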
Parallel Opportunities
- Entirely independent of other WPs
Dependencies
- None
Risks & Mitigations
- Mapping in status/emit.py may have changed since last inspection → read actual 2.x source first
Work Package WP08: Converge global runtime resolution (Priority: P1)
Goal: Make ~/.kittify the global runtime path with project-level overrides, no legacy fallback warnings after migration.
Independent Test: After spec-kitty migrate, verify resolve_template_path() includes ~/.kittify/ in the resolution chain.
Prompt: /tasks/WP08-global-runtime-convergence.md
Included Subtasks
- ✅ T034 Audit current resolution chain in src/specify_cli/core/project_resolver.py on 2.x
- ✅ T035 Add ~/.kittify/ to resolution chain: project → global → package defaults
- ✅ T036 Eliminate legacy fallback warnings when ~/.kittify/ exists
- ✅ T037 Emit one-time "run spec-kitty migrate" message if ~/.kittify/ doesn't exist (not a warning flood)
- ✅ T038 Make spec-kitty migrate idempotent for global runtime install
- ✅ T039 Write tests for resolution chain with ~/.kittify/ (exists, doesn't exist, idempotent migrate)
Implementation Notes
- 2.x has partial global runtime bootstrap — audit what already exists before adding
- Resolution order: project .kittify/missions/{key}/templates/ → project .kittify/templates/ → ~/.kittify/missions/{key}/templates/ → ~/.kittify/templates/ → package defaults
- Credential path: ~/.spec-kitty/credentials stays separate from ~/.kittify/; document this decision
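The resolution order above amounts to a first-match search over five directories. A minimal sketch, assuming hypothetical helper names and signatures (this is not the actual resolve_template_path() in project_resolver.py), could look like this:

```python
import tempfile
from pathlib import Path


def candidate_dirs(project_root, home, mission_key, package_defaults):
    """List template directories from most to least specific."""
    project = Path(project_root) / ".kittify"
    global_runtime = Path(home) / ".kittify"
    return [
        project / "missions" / mission_key / "templates",
        project / "templates",
        global_runtime / "missions" / mission_key / "templates",
        global_runtime / "templates",
        Path(package_defaults),
    ]


def resolve_template(name, project_root, home, mission_key, package_defaults):
    """Return the first existing template file in the chain, or None."""
    for directory in candidate_dirs(project_root, home, mission_key,
                                    package_defaults):
        candidate = directory / name
        if candidate.is_file():
            return candidate
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    project, home, defaults = root / "proj", root / "home", root / "pkg"
    # Seed a package default and a global template with the same name.
    for directory in (defaults, home / ".kittify" / "templates"):
        directory.mkdir(parents=True)
        (directory / "plan.md").write_text("template")
    found = resolve_template("plan.md", project, home, "software-dev", defaults)
    print(found.relative_to(root))  # the global copy wins over package defaults
```

Taking home as a parameter, instead of calling Path.home() inside the function, lets the T039 tests point the global tier at a temporary directory.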
Parallel Opportunities
- Entirely independent of sync WPs
Dependencies
- None
Risks & Mitigations
- ~/.kittify migration may break existing 2.x alpha users → make migrate idempotent
Wave 2: Dependent Work Packages
Work Package WP04: Add sync diagnose command (Priority: P1)
Goal: Add spec-kitty sync diagnose for local event schema validation before sending.
Independent Test: Queue malformed events and verify diagnose reports specific field errors.
Prompt: /tasks/WP04-sync-diagnose-command.md
Included Subtasks
- ✅ T015 Create src/specify_cli/sync/diagnose.py with event validation logic
- ✅ T016 Validate events against Pydantic Event model from spec_kitty_events.models
- ✅ T017 [P] Validate WPStatusChanged payloads against StatusTransitionPayload
- ✅ T018 Register sync diagnose CLI command in sync command group
- ✅ T019 Write tests: valid events pass, malformed events report specific field errors
Implementation Notes
- Reuse error categorization from WP02 (T006) for consistent error grouping
- Read events from SQLite queue using existing queue.py read methods
- Output format: per-event validation (event_id, valid/invalid, error list)
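The per-event output shape described above (event_id, valid/invalid, error list) is sketched below. To stay self-contained it swaps in a hand-written field check for the Pydantic Event model; REQUIRED_FIELDS and diagnose_event are hypothetical, and the real command would collect the validation errors reported by the model itself.

```python
# Hypothetical stand-in for the Event model: field name -> expected type.
REQUIRED_FIELDS = {"event_id": str, "event_type": str, "payload": dict}


def diagnose_event(event):
    """Validate one queued event and report specific field errors."""
    errors = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in event:
            errors.append(f"{field}: field required")
        elif not isinstance(event[field], expected):
            actual = type(event[field]).__name__
            errors.append(f"{field}: expected {expected.__name__}, got {actual}")
    return {
        "event_id": event.get("event_id", "<missing>"),
        "valid": not errors,
        "errors": errors,
    }


def diagnose_queue(events):
    """Validate every event and keep one report entry per event."""
    return [diagnose_event(event) for event in events]


report = diagnose_queue([
    {"event_id": "e1", "event_type": "WPStatusChanged", "payload": {}},
    {"event_id": "e2", "event_type": "WPCreated", "payload": "oops"},
    {"event_type": "WPAssigned"},
])
for entry in report:
    status = "valid" if entry["valid"] else "invalid"
    print(entry["event_id"], status, entry["errors"])
```

Reporting every event, including valid ones, lets the command print a complete picture of the queue instead of stopping at the first failure.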
Parallel Opportunities
- T017 can proceed alongside T016 (different payload types)
Dependencies
- Depends on WP02 (error categorization in T006)
Risks & Mitigations
- Pydantic model validation may be strict in unexpected ways → test with real queue data from 2.x
Work Package WP07: SaaS handoff contract document (Priority: P0)
Goal: Produce a contract doc that enables the SaaS team to validate their batch endpoint against CLI payloads.
Independent Test: Fixture data validates against CLI-side Pydantic models.
Prompt: /tasks/WP07-saas-handoff-contract.md
Included Subtasks
- ✅ T029 Document complete event envelope fields with types, constraints, and examples
- ✅ T030 [P] Document batch request/response format: headers, compression, URL, body structure
- ✅ T031 [P] Document authentication flow: JWT login, refresh, authorization header
- ✅ T032 Create 3-5 complete fixture request/response examples covering success, duplicate, rejected, and mixed
- ✅ T033 Write contract test: validate fixtures against Pydantic Event model
Implementation Notes
- contracts/batch-ingest.md and contracts/lane-mapping.md already exist from Phase 1 planning; extend them with fixtures
- Cross-reference lane mapping from WP06 testing (T025-T028)
- Fixture data must include emitted event types: WPStatusChanged, WPCreated, WPAssigned, FeatureCreated, FeatureCompleted, HistoryAdded, ErrorLogged, DependencyResolved
Parallel Opportunities
- T030 and T031 can proceed in parallel (request format vs. auth flow)
Dependencies
- Depends on WP02 (error format from T006/T007)
Risks & Mitigations
- SaaS endpoint may not match documented contract → fixtures enable the SaaS team to test independently
Wave 3: Integration
Work Package WP09: End-to-end CLI smoke test (Priority: P0)
Goal: Exercise the full create-feature → setup-plan → implement → review sequence against a temp repo.
Independent Test: Test is self-contained; it creates and cleans up its own temp repository.
Prompt: /tasks/WP09-e2e-smoke-test.md
Included Subtasks
- ✅ T040 Create tests/e2e/ directory with __init__.py and conftest.py
- ✅ T041 Write temp repo fixture: git init, spec-kitty init, .kittify setup
- ✅ T042 Implement full test sequence: create-feature → setup-plan → finalize-tasks → implement → move-task
- ✅ T043 [P] Add pytest.mark.e2e marker to pyproject.toml
- ✅ T044 Verify test passes locally and document CI considerations
Implementation Notes
- Use typer.testing.CliRunner or subprocess.run for CLI invocations
- Test must verify intermediate artifacts exist: spec.md, plan.md, tasks/, worktree
- Mark with pytest.mark.e2e for optional CI separation
- Final state: WP01 in for_review lane, all artifacts exist
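One way to keep the smoke test self-contained is a temp-repo helper that always cleans up, paired with an artifact check. The sketch below uses hypothetical names (temp_repo, missing_artifacts) and leaves the setup commands as a parameter, since the exact git and spec-kitty invocations are defined in the WP09 prompt file.

```python
import shutil
import subprocess
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def temp_repo(setup_commands=()):
    """Create a scratch directory, run setup commands in it, always clean up."""
    root = Path(tempfile.mkdtemp(prefix="spec-kitty-e2e-"))
    try:
        for command in setup_commands:
            # check=True makes a failed setup step abort the test early.
            subprocess.run(command, cwd=root, check=True, capture_output=True)
        yield root
    finally:
        shutil.rmtree(root, ignore_errors=True)


def missing_artifacts(root, expected):
    """Return the expected relative paths that do not exist under root."""
    return [rel for rel in expected if not (Path(root) / rel).exists()]


with temp_repo() as repo:
    (repo / "plan.md").write_text("# plan")
    (repo / "tasks").mkdir()
    missing = missing_artifacts(repo, ["spec.md", "plan.md", "tasks"])
    print(missing)

print(repo.exists())
```

In a pytest fixture the same body can sit around a yield, and the setup_commands argument is where the fixture would pass its git init and spec-kitty init invocations.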
Parallel Opportunities
- T043 is independent file edit, can proceed alongside T040-T042
Dependencies
- Depends on WP01 (setup-plan must work)
Risks & Mitigations
- E2E test may be flaky in CI → ensure robust cleanup; use pytest.mark.e2e for separation
Dependency & Execution Summary
- Wave 1 (parallel): WP01, WP02, WP03, WP05, WP06, WP08 — all independent, run concurrently
- Wave 2 (depends on Wave 1): WP04 (→WP02), WP07 (→WP02)
- Wave 3 (integration): WP09 (→WP01)
- MVP Scope: WP01 + WP02 + WP09 (planning workflow + sync diagnostics + smoke test)
- Parallelization: Up to 6 agents can work simultaneously in Wave 1
Subtask Index (Reference)
| Subtask ID | Summary | Work Package | Priority | Parallel? |
|---|---|---|---|---|
| T001 | Apply missing import fix to feature.py | WP01 | P0 | No |
| T002 | Investigate xfail planning test | WP01 | P0 | No |
| T003 | Verify planning workflow tests pass | WP01 | P0 | No |
| T004 | Parse per-event results[] from 200 responses | WP02 | P0 | No |
| T005 | Parse details field from 400 responses | WP02 | P0 | No |
| T006 | Implement error categorization | WP02 | P0 | Yes |
| T007 | Print actionable summary with counts | WP02 | P0 | No |
| T008 | Selective queue removal for synced/dup events | WP02 | P0 | No |
| T009 | Add --report flag for JSON failure dump | WP02 | P0 | Yes |
| T010 | Write batch parsing and queue tests | WP02 | P0 | No |
| T011 | Load real access token from credentials | WP03 | P0 | No |
| T012 | Handle token refresh and missing credentials | WP03 | P0 | No |
| T013 | Probe actual endpoint with real token | WP03 | P0 | No |
| T014 | Write auth-aware status check tests | WP03 | P0 | No |
| T015 | Create diagnose.py validation module | WP04 | P1 | No |
| T016 | Validate events against Event model | WP04 | P1 | No |
| T017 | Validate WPStatusChanged payloads | WP04 | P1 | Yes |
| T018 | Register sync diagnose CLI command | WP04 | P1 | No |
| T019 | Write diagnose validation tests | WP04 | P1 | No |
| T020 | Add aggregate query methods to queue.py | WP05 | P1 | No |
| T021 | Group events by event_type | WP05 | P1 | Yes |
| T022 | Format output with Rich tables/panels | WP05 | P1 | Yes |
| T023 | Integrate aggregates into sync status command | WP05 | P1 | No |
| T024 | Write aggregate query and output tests | WP05 | P1 | No |
| T025 | Parametrized tests for all 7 lanes | WP06 | P2 | No |
| T026 | Test invalid lane transition raises TransitionError | WP06 | P2 | No |
| T027 | Verify _SYNC_LANE_MAP is centralized | WP06 | P2 | No |
| T028 | Verify contract doc matches implementation | WP06 | P2 | No |
| T029 | Document event envelope fields | WP07 | P0 | No |
| T030 | Document batch request/response format | WP07 | P0 | Yes |
| T031 | Document auth flow (JWT login/refresh) | WP07 | P0 | Yes |
| T032 | Create fixture request/response examples | WP07 | P0 | No |
| T033 | Write contract test validating fixtures | WP07 | P0 | No |
| T034 | Audit resolution chain on 2.x | WP08 | P1 | No |
| T035 | Add ~/.kittify to resolution chain | WP08 | P1 | No |
| T036 | Eliminate legacy fallback warnings | WP08 | P1 | No |
| T037 | Emit one-time "run migrate" message | WP08 | P1 | No |
| T038 | Make spec-kitty migrate idempotent | WP08 | P1 | No |
| T039 | Write resolution chain tests | WP08 | P1 | No |
| T040 | Create tests/e2e/ directory structure | WP09 | P0 | No |
| T041 | Write temp repo fixture | WP09 | P0 | No |
| T042 | Implement full E2E test sequence | WP09 | P0 | No |
| T043 | Add pytest.mark.e2e marker | WP09 | P0 | Yes |
| T044 | Verify test passes locally | WP09 | P0 | No |
<!-- status-model:start -->
Canonical Status (Generated)
- WP01: done
- WP02: done
- WP03: done
- WP04: done
- WP05: done
- WP06: done
- WP07: done
- WP08: done
- WP09: done
<!-- status-model:end -->