Tasks: 3.2.0a5 Tranche 1 — Release Reset & CLI Surface Cleanup
- Mission ID: 01KQ7YXHA5AMZHJT3HQ8XPTZ6B (mid8 01KQ7YXH)
- Mission Slug: release-3-2-0a5-tranche-1-01KQ7YXH
- Branch contract: planning/base/merge target = release/3.2.0a5-tranche-1 (branch_matches_target = true)
- Spec: spec.md · Plan: plan.md · Research: research.md · Bulk-edit map: occurrence_map.yaml
Overview
Eight work packages, 39 subtasks total. Seven WPs are independent and lane-parallelizable; WP02 lands last as the CHANGELOG / release-metadata consolidator. Two WPs (WP04, WP06) are at the upper end of complexity for this tranche; the rest are focused 3–5 subtask packages. WP08 was added live during /spec-kitty.tasks when finalize-tasks rejected the mission's own DecisionPoint event — same class of "tooling that bites real workflows" bug as FR-002.
Subtask Index
| ID | Description | WP | Parallel |
|---|---|---|---|
| T001 | Swap call order in runner.py:163-164 so metadata.save() precedes _stamp_schema_version() | WP01 | |
| T002 | Extend tests/cross_cutting/versioning/test_upgrade_version_update.py with schema_version persistence assertion | WP01 | [D] |
| T003 | New tests/e2e/test_upgrade_post_state.py smoke covering upgrade → branch-context | WP01 | [D] |
| T004 | Run mypy --strict and ruff check on changed surfaces; address any drift | WP01 | |
| T005 | Bump pyproject.toml::[project].version from 3.2.0a4 → 3.2.0a5 | WP02 | |
| T006 | Split CHANGELOG.md heading: convert [Unreleased - 3.2.0] → [3.2.0a5] — <date> and insert new [Unreleased] placeholder above | WP02 | |
| T007 | Consolidate per-FR CHANGELOG entries under [3.2.0a5] (collected from each landed WP's PR description) | WP02 | |
| T008 | Run tests/release/test_dogfood_command_set.py and tests/release/test_release_prep.py; update fixtures if drifted | WP02 | |
| T009 | Verify spec-kitty --version reports 3.2.0a5 after editable reinstall | WP02 | |
| T010 | Replace .python-version contents with 3.11 | WP03 | |
| T011 | Run mypy --strict src/specify_cli/mission_step_contracts/executor.py; triage and fix any errors | WP03 | |
| T012 | Add new test tests/cross_cutting/test_mypy_strict_mission_step_contracts.py invoking mypy in-process and asserting clean exit | WP03 | [D] |
| T013 | Re-run tests/cross_cutting/ and tests/missions/ to confirm no regressions from python-version change | WP03 | |
| T014 | Run ruff check .python-version pyproject.toml src/specify_cli/mission_step_contracts/ | WP03 | [D] |
| T015 | Delete deprecated /spec-kitty.checklist source template AND its override copy | WP04 | |
| T016 | Remove /spec-kitty.checklist entries from .kittify/command-skills-manifest.json and _legacy_codex_hashes.py | WP04 | [D] |
| T017 | Delete every deprecated checklist snapshot, regression baseline, and upgrade fixture per occurrence_map.yaml | WP04 | |
| T018 | Update tests/specify_cli/skills/{test_registry,test_command_renderer,test_installer}.py and tests/missions/test_command_templates_canonical_path.py to drop checklist expectations | WP04 | |
| T019 | Add aggregate regression tests/specify_cli/test_no_checklist_surface.py (recursive grep for /spec-kitty.checklist and checklist* filenames across src/tests/docs/agent dirs) | WP04 | [D] |
| T020 | Add artifact-preservation test tests/missions/test_specify_creates_requirements_checklist.py proving kitty-specs/<slug>/checklists/requirements.md still gets created | WP04 | [D] |
| T021 | Update doc references: README.md, docs/reference/{slash-commands,file-structure,supported-agents}.md per occurrence_map (REMOVE surface mentions, KEEP artifact-name mentions) | WP04 | [D] |
| T022 | Add non-git-target detection in src/specify_cli/cli/commands/init.py near the existing git not detected branch (~line 360); print one yellow info line containing both "not a git repository" and "git init" | WP05 | |
| T023 | Append a "next: run git init" item to the post-init quick-start summary in init.py when target is not a git repo | WP05 | |
| T024 | Remove the /spec-kitty.checklist quick-start line at init.py:723 (FR-003 boundary owned by WP05 to keep init.py ownership single-WP) | WP05 | |
| T025 | Add tests/specify_cli/cli/commands/test_init_non_git_message.py covering both unit assertions and CliRunner-driven smoke | WP05 | |
| T026 | Create src/specify_cli/diagnostics/__init__.py and src/specify_cli/diagnostics/dedup.py exposing report_once, mark_invocation_succeeded, invocation_succeeded, reset_for_invocation | WP06 | |
| T027 | Wrap Not authenticated, skipping sync callsites at sync/background.py:270 and :325 with report_once("sync.unauthenticated") gate | WP06 | |
| T028 | Locate the token-refresh-failed logger in src/specify_cli/auth/ and wrap with report_once("auth.token_refresh_failed") | WP06 | |
| T029 | In the agent mission create JSON-payload writer ONLY, call mark_invocation_succeeded() immediately after the final print(json.dumps(...)). Auditing other JSON-emitting commands is explicitly out of scope. | WP06 | |
| T030 | Update atexit handlers at sync/background.py:456 and sync/runtime.py:381 to consult invocation_succeeded() and downgrade warnings on success | WP06 | |
| T031 | Add tests/sync/test_diagnostic_dedup.py covering ContextVar gate + reset behavior | WP06 | [D] |
| T032 | Add tests/e2e/test_mission_create_clean_output.py covering JSON cleanup + dedup + no-red-after-success | WP06 | [D] |
| T033 | Add tests/specify_cli/cli/test_no_visible_feature_alias.py (typer walk + --help grep + hidden=True assertion) | WP07 | [D] |
| T034 | Add tests/e2e/test_feature_alias_smoke.py (passing --feature to one historically-accepting command behaves identically to --mission) | WP07 | [D] |
| T035 | Add tests/specify_cli/cli/test_decision_command_shape_consistency.py (typer walk + multi-source grep + --help listing assertion) | WP07 | [D] |
| T036 | Add an event_type-presence guard in read_events() (src/specify_cli/status/store.py:209 per-line loop) that skips events carrying a top-level event_type field (the wire-format discriminator for mission-level events), with a # Why: comment naming Decision Moment Protocol as the cooperating writer. Preserves the existing fail-loud contract for malformed lane-transition events. | WP08 | |
| T037 | Add tests/status/test_read_events_tolerates_decision_events.py exercising mixed lane-transition + DecisionPoint event logs | WP08 | [D] |
| T038 | Re-run this mission's finalize-tasks against the fixed reader to confirm the live regression is closed (no bypass needed) | WP08 | |
| T039 | Run mypy --strict src/specify_cli/status/store.py and ruff check src/specify_cli/status/ tests/status/test_read_events_tolerates_decision_events.py | WP08 | [D] |
Work Packages
WP01 — FR-002 schema_version clobber fix + regression
- Goal: After `spec-kitty upgrade --yes` succeeds, `spec_kitty.schema_version` MUST persist in `.kittify/metadata.yaml`, and a subsequent `spec-kitty agent mission branch-context --json` MUST exit 0 (no `PROJECT_MIGRATION_NEEDED` block).
- Priority: P0 (foundational; the dev-experience blocker every other WP would hit).
- Independent test: `tests/e2e/test_upgrade_post_state.py` (new) drives a tmp project end-to-end.
- Subtasks:
  - ✅ T001 Swap call order in `runner.py:163-164` so `metadata.save()` precedes `_stamp_schema_version()` (WP01)
  - ✅ T002 Extend `tests/cross_cutting/versioning/test_upgrade_version_update.py` with schema_version persistence assertion (WP01)
  - ✅ T003 New `tests/e2e/test_upgrade_post_state.py` smoke covering upgrade → branch-context (WP01)
  - ✅ T004 Run `mypy --strict` and `ruff check` on changed surfaces; address any drift (WP01)
- Implementation sketch: One-line code change (move `_stamp_schema_version` call below `metadata.save`) plus two test files. Read-modify-atomic-write in `_stamp_schema_version` already handles the post-save file safely.
- Parallel opportunities: T002 and T003 can be drafted in parallel by the same agent after T001.
- Dependencies: none.
- Risks: minimal; covered by both unit and e2e tests.
- Estimated prompt size: ~350 lines.
- Prompt: tasks/WP01-fr002-schema-version-clobber-fix.md
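The ordering problem behind T001 can be illustrated with a minimal sketch. Everything here is a stand-in: `save_metadata` and `stamp_schema_version` are hypothetical simplifications of `metadata.save()` and `_stamp_schema_version()`, JSON replaces YAML to keep the example stdlib-only, and the schema version `3` is an arbitrary placeholder.

```python
import json
import os
import tempfile
from pathlib import Path


def save_metadata(path: Path, in_memory: dict) -> None:
    """Stand-in for metadata.save(): rewrites the whole file from memory."""
    path.write_text(json.dumps(in_memory))


def stamp_schema_version(path: Path, version: int) -> None:
    """Stand-in for _stamp_schema_version(): read, modify, atomic write."""
    data = json.loads(path.read_text()) if path.exists() else {}
    data.setdefault("spec_kitty", {})["schema_version"] = version
    fd, tmp_name = tempfile.mkstemp(dir=path.parent)
    with os.fdopen(fd, "w") as handle:
        handle.write(json.dumps(data))
    os.replace(tmp_name, path)  # atomic swap within the same directory


def run_upgrade(path: Path, *, stamp_last: bool) -> dict:
    """Run both steps in the chosen order and return what ended up on disk."""
    in_memory = {"spec_kitty": {"version": "3.2.0a5"}}
    if stamp_last:
        save_metadata(path, in_memory)
        stamp_schema_version(path, 3)  # stamp survives: nothing rewrites after it
    else:
        stamp_schema_version(path, 3)
        save_metadata(path, in_memory)  # full rewrite clobbers the stamp
    return json.loads(path.read_text())
```

Under these assumptions, only the `stamp_last=True` ordering leaves `schema_version` in the file, which is the persistence property T002 and T003 assert.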
WP02 — NFR-002 release metadata coherence (final consolidator)
- Goal: `pyproject.toml`, `CHANGELOG.md`, `.python-version`, and the release-prep test fixtures all agree on the next prerelease state (`3.2.0a5`).
- Priority: P0 (gates `tests/release/`; lands LAST so per-WP CHANGELOG entries can be consolidated).
- Independent test: `tests/release/test_dogfood_command_set.py` and `tests/release/test_release_prep.py` pass.
- Subtasks:
  - ✅ T005 Bump `pyproject.toml::[project].version` from `3.2.0a4` → `3.2.0a5` (WP02)
  - ✅ T006 Split `CHANGELOG.md` heading: convert `[Unreleased - 3.2.0]` → `[3.2.0a5] — <date>` and insert new `[Unreleased]` placeholder above (WP02)
  - ✅ T007 Consolidate per-FR CHANGELOG entries under `[3.2.0a5]` (collected from each landed WP's PR description) (WP02)
  - ✅ T008 Run `tests/release/test_dogfood_command_set.py` and `tests/release/test_release_prep.py`; update fixtures if drifted (WP02)
  - ✅ T009 Verify `spec-kitty --version` reports `3.2.0a5` after editable reinstall (WP02)
- Implementation sketch: Mechanical edits + run release-prep tests + add CHANGELOG entries summarizing each WP01/WP03..WP08 fix.
- Parallel opportunities: none — single-file consolidator.
- Dependencies: WP01, WP03, WP04, WP05, WP06, WP07, WP08 (lands last).
- Risks: release-prep fixtures may have hardcoded version strings that need updating beyond pyproject — covered by T008.
- Estimated prompt size: ~250 lines.
- Prompt: tasks/WP02-release-metadata-coherence.md
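The heading split in T006 could be scripted roughly as follows. This is a sketch under assumptions: the function name is hypothetical, the changelog is assumed to use `#`-style Markdown headings, and the caller supplies the released heading text (for example the exact string T006 specifies) rather than the function inventing a date.

```python
import re


def split_unreleased_heading(changelog: str, released_heading: str) -> str:
    """Rename the '[Unreleased - 3.2.0]' heading and add a fresh placeholder.

    `released_heading` is the full bracketed heading text supplied by the
    caller; the heading level (number of '#') is copied from the old heading.
    """
    pattern = re.compile(
        r"^(?P<prefix>#+ )\[Unreleased - 3\.2\.0\][ \t]*$", re.MULTILINE
    )
    match = pattern.search(changelog)
    if match is None:
        raise ValueError("no '[Unreleased - 3.2.0]' heading found")
    prefix = match.group("prefix")
    # New empty placeholder goes above the now-released section.
    replacement = f"{prefix}[Unreleased]\n\n{prefix}{released_heading}"
    return changelog[: match.start()] + replacement + changelog[match.end():]
```

Raising on a missing heading keeps the edit from silently doing nothing if the changelog has already been split.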
WP03 — FR-001 .python-version + restore strict mypy
- Goal: `.python-version` no longer pins a higher floor than `pyproject.toml::requires-python`; `mypy --strict` is clean on `mission_step_contracts/executor.py` and stays clean.
- Priority: P1 (local agent productivity).
- Independent test: `tests/cross_cutting/test_mypy_strict_mission_step_contracts.py` (new).
- Subtasks:
  - ✅ T010 Replace `.python-version` contents with `3.11` (WP03)
  - ✅ T011 Run `mypy --strict src/specify_cli/mission_step_contracts/executor.py`; triage and fix any errors (WP03)
  - ✅ T012 Add new test `tests/cross_cutting/test_mypy_strict_mission_step_contracts.py` invoking mypy in-process and asserting clean exit (WP03)
  - ✅ T013 Re-run `tests/cross_cutting/` and `tests/missions/` to confirm no regressions from python-version change (WP03)
  - ✅ T014 Run `ruff check .python-version pyproject.toml src/specify_cli/mission_step_contracts/` (WP03)
- Implementation sketch: One-byte file change for `.python-version`; minor type-annotation cleanups; one new test file.
- Parallel opportunities: T012 and T014 can be drafted in parallel after T011.
- Dependencies: none.
- Risks: T011 may surface latent type errors that pre-existed but were never enforced; budget extra time inside the WP.
- Decision: Decision Moment `01KQ7ZSQKT9DVH7B4GGXWS8DTW` chose `3.11` floor.
- Estimated prompt size: ~250 lines.
- Prompt: tasks/WP03-python-version-and-strict-mypy.md
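The WP03 goal (the `.python-version` pin must not exceed the `requires-python` floor) can be expressed as a small check. This is an illustrative sketch only: `pin_exceeds_floor` is a hypothetical helper, it understands nothing beyond a plain `>=X.Y` clause, and the `>=3.11` value used in the usage note is an assumption based on the Decision Moment above, not a value quoted from `pyproject.toml`.

```python
def _as_tuple(version: str) -> tuple:
    """Turn '3.11' or '3.11.9' into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.strip().split("."))


def pin_exceeds_floor(pinned: str, requires_python: str) -> bool:
    """Return True when the pinned interpreter is above the declared floor.

    Simplification: only the first '>=X.Y' clause is read; a real check
    would parse the full specifier set.
    """
    floors = [
        clause.strip()[2:]
        for clause in requires_python.split(",")
        if clause.strip().startswith(">=")
    ]
    if not floors:
        raise ValueError("requires-python has no '>=' clause")
    floor = _as_tuple(floors[0])
    pin = _as_tuple(pinned)
    # Compare at the floor's precision so '3.11.9' does not exceed '3.11'.
    return pin[: len(floor)] > floor
```

With an assumed floor of `>=3.11`, a pin of `3.11` passes while a pin such as `3.13` is flagged.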
WP04 — FR-003 + FR-004 /spec-kitty.checklist bulk removal
- Goal: Zero references to `/spec-kitty.checklist` across every supported agent's rendered surface; `kitty-specs/<mission>/checklists/requirements.md` still gets created by `/spec-kitty.specify`.
- Priority: P1 (largest WP; bulk-edit gated by `occurrence_map.yaml`).
- Independent test: `tests/specify_cli/test_no_checklist_surface.py` (new) + `tests/missions/test_specify_creates_requirements_checklist.py` (new).
- Subtasks:
  - ✅ T015 Delete deprecated `/spec-kitty.checklist` source template AND its override copy (WP04)
  - ✅ T016 Remove `/spec-kitty.checklist` entries from `.kittify/command-skills-manifest.json` and `_legacy_codex_hashes.py` (WP04)
  - ✅ T017 Delete every deprecated checklist snapshot, regression baseline, and upgrade fixture per `occurrence_map.yaml` (WP04)
  - ✅ T018 Update `tests/specify_cli/skills/{test_registry,test_command_renderer,test_installer}.py` and `tests/missions/test_command_templates_canonical_path.py` to drop checklist expectations (WP04)
  - ✅ T019 Add aggregate regression `tests/specify_cli/test_no_checklist_surface.py` (WP04)
  - ✅ T020 Add artifact-preservation test `tests/missions/test_specify_creates_requirements_checklist.py` (WP04)
  - ✅ T021 Update doc references per `occurrence_map.yaml` (WP04)
- Implementation sketch: Mechanical removal driven by `occurrence_map.yaml`. Implementing agent MUST load the `spec-kitty-bulk-edit-classification` skill before starting and verify the diff against the occurrence map before commit.
- Parallel opportunities: T016, T019, T020, T021 can be done in parallel after T015 lands.
- Dependencies: none.
- Risks: missed reference creates a DIRECTIVE_035 violation; mitigated by T019's aggregate scanner. Snapshot tests in T017/T018 must be regenerated for ALL 12 slash-command agents.
- Boundary: `init.py:723` (one occurrence) is owned by WP05 (T024) to keep `init.py` ownership single-WP.
- Estimated prompt size: ~500 lines.
- Prompt: tasks/WP04-checklist-surface-bulk-removal.md
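The aggregate scanner from T019 might look roughly like the sketch below. The function name and return format are hypothetical; the real test may scan different roots and apply exclusions from `occurrence_map.yaml`. Note that the artifact path `checklists/requirements.md` is deliberately not flagged, since only the command surface and `checklist*` filenames are being removed.

```python
from pathlib import Path

FORBIDDEN_COMMAND = "/spec-kitty.checklist"


def find_checklist_surface(roots: list) -> list:
    """Return 'path:line' hits for the command and 'path' hits for filenames."""
    hits = []
    for root in roots:
        for path in sorted(Path(root).rglob("*")):
            if not path.is_file():
                continue
            # Filename check: 'checklist*' files, not the 'checklists/' dir.
            if path.name.startswith("checklist"):
                hits.append(str(path))
            try:
                text = path.read_text(encoding="utf-8")
            except (UnicodeDecodeError, OSError):
                continue  # skip binary or unreadable files
            for lineno, line in enumerate(text.splitlines(), start=1):
                if FORBIDDEN_COMMAND in line:
                    hits.append(f"{path}:{lineno}")
    return hits
```

A real regression test would also need to exclude its own source file, which necessarily contains the forbidden string.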
WP05 — FR-005 init non-git message (+ FR-003 init.py boundary line)
- Goal: Running `spec-kitty init` in a non-git directory emits a single actionable message; the deprecated `/spec-kitty.checklist` quick-start line at `init.py:723` is removed in the same WP that owns `init.py`.
- Priority: P2 (UX polish).
- Independent test: `tests/specify_cli/cli/commands/test_init_non_git_message.py` (new).
- Subtasks:
  - ✅ T022 Add non-git-target detection in `src/specify_cli/cli/commands/init.py` near the existing `git not detected` branch; print one yellow info line containing both "not a git repository" and "git init" (WP05)
  - ✅ T023 Append a "next: run `git init`" item to the post-init quick-start summary in `init.py` when target is not a git repo (WP05)
  - ✅ T024 Remove the `/spec-kitty.checklist` quick-start line at `init.py:723` (FR-003 boundary owned by WP05 to keep `init.py` ownership single-WP) (WP05)
  - ✅ T025 Add `tests/specify_cli/cli/commands/test_init_non_git_message.py` covering both unit assertions and CliRunner-driven smoke (WP05)
- Implementation sketch: Single subprocess check using `git rev-parse --is-inside-work-tree`; small UX additions; one new test file.
- Parallel opportunities: T025 can be drafted in parallel with T022/T023/T024 by the same agent.
- Dependencies: none.
- Risks: subprocess to `git` may fail with the binary missing; existing `is_git_available()` branch already handles that case, so reuse it.
- Estimated prompt size: ~250 lines.
- Prompt: tasks/WP05-init-non-git-message.md
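The detection step in T022 can be sketched with a single `git rev-parse --is-inside-work-tree` call. The helper names below are hypothetical, `shutil.which` stands in for the project's existing `is_git_available()` check, and the message wording is only one possibility that satisfies the two required phrases.

```python
import shutil
import subprocess
from pathlib import Path
from typing import Optional


def is_inside_git_work_tree(target: Path) -> bool:
    """True when `target` sits inside a git work tree."""
    if shutil.which("git") is None:  # stand-in for is_git_available()
        return False
    result = subprocess.run(
        ["git", "rev-parse", "--is-inside-work-tree"],
        cwd=target,
        capture_output=True,
        text=True,
        check=False,  # non-zero exit simply means "not a repository"
    )
    return result.returncode == 0 and result.stdout.strip() == "true"


def non_git_notice(target: Path) -> Optional[str]:
    """Return the one-line notice for non-git targets, else None."""
    if is_inside_git_work_tree(target):
        return None
    return f"{target} is not a git repository; run `git init` to enable versioning."
```

Checking stdout as well as the exit code covers the bare-repository case, where the command exits 0 but prints `false`.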
WP06 — FR-008 + FR-009 diagnostic dedup + atexit success-flag
- Goal: One-per-cause diagnostic gating per CLI invocation; no red shutdown noise after a successful JSON-output command.
- Priority: P2 — visible noise but not blocking.
- Independent test: `tests/sync/test_diagnostic_dedup.py` (new) + `tests/e2e/test_mission_create_clean_output.py` (new).
- Subtasks:
  - ✅ T026 Create `src/specify_cli/diagnostics/__init__.py` and `src/specify_cli/diagnostics/dedup.py` exposing `report_once`, `mark_invocation_succeeded`, `invocation_succeeded`, `reset_for_invocation` (WP06)
  - ✅ T027 Wrap `Not authenticated, skipping sync` callsites at `sync/background.py:270` and `:325` with `report_once("sync.unauthenticated")` gate (WP06)
  - ✅ T028 Locate the token-refresh-failed logger in `src/specify_cli/auth/` and wrap with `report_once("auth.token_refresh_failed")` (WP06)
  - ✅ T029 In the `agent mission create` JSON-payload writer ONLY, call `mark_invocation_succeeded()` immediately after the final `print(json.dumps(...))`. Auditing other JSON-emitting commands is explicitly out of scope. (WP06)
  - ✅ T030 Update atexit handlers at `sync/background.py:456` and `sync/runtime.py:381` to consult `invocation_succeeded()` and downgrade warnings on success (WP06)
  - ✅ T031 Add `tests/sync/test_diagnostic_dedup.py` covering ContextVar gate + reset behavior (WP06)
  - ✅ T032 Add `tests/e2e/test_mission_create_clean_output.py` covering JSON cleanup + dedup + no-red-after-success (WP06)
- Implementation sketch: New `diagnostics` package using `contextvars.ContextVar` for dedup + module-level boolean for success flag. Wrap the existing log sites named in T027/T028; call `mark_invocation_succeeded()` from the `agent mission create` JSON writer (T029); consult `invocation_succeeded()` from atexit handlers.
- Parallel opportunities: T031 and T032 (test-only) can run in parallel after T026 + T030 land.
- Dependencies: none.
- Risks: `mark_invocation_succeeded()` must NOT be called on failure paths; tests in T032 cover the failure path explicitly.
- Estimated prompt size: ~500 lines.
- Prompt: tasks/WP06-diagnostic-dedup-and-atexit.md
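A minimal sketch of what `diagnostics/dedup.py` might contain, using the four names from T026. The signatures are assumptions (in particular, that `report_once` returns a bool the caller uses as a gate); the actual module may differ.

```python
from __future__ import annotations

import contextvars

# Causes already reported during the current invocation.
_reported: contextvars.ContextVar[frozenset] = contextvars.ContextVar(
    "reported_causes", default=frozenset()
)
# Module-level success flag consulted by atexit handlers.
_succeeded = False


def report_once(cause: str) -> bool:
    """Return True the first time `cause` is seen, False afterwards."""
    seen = _reported.get()
    if cause in seen:
        return False
    _reported.set(seen | {cause})
    return True


def mark_invocation_succeeded() -> None:
    """Record that the command emitted its final payload successfully."""
    global _succeeded
    _succeeded = True


def invocation_succeeded() -> bool:
    """Let shutdown handlers decide whether to downgrade their warnings."""
    return _succeeded


def reset_for_invocation() -> None:
    """Clear both the dedup set and the success flag."""
    global _succeeded
    _reported.set(frozenset())
    _succeeded = False
```

A callsite would then be gated as `if report_once("sync.unauthenticated"): logger.warning(...)`, where `logger` is whatever the surrounding module already uses.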
WP07 — FR-006 + FR-007 close-with-evidence regressions
- Goal: Lock down "already-fixed-on-main" status of `--feature` hidden alias (#790) and `spec-kitty agent decision` command shape (#774) by adding regression tests that prevent future drift.
- Priority: P3 (close-with-evidence per `start-here.md` "Done Criteria").
- Independent test: the three new test files themselves.
- Subtasks:
  - ✅ T033 Add `tests/specify_cli/cli/test_no_visible_feature_alias.py` (typer walk + `--help` grep + `hidden=True` assertion) (WP07)
  - ✅ T034 Add `tests/e2e/test_feature_alias_smoke.py` (passing `--feature` to one historically-accepting command behaves identically to `--mission`) (WP07)
  - ✅ T035 Add `tests/specify_cli/cli/test_decision_command_shape_consistency.py` (typer walk + multi-source grep + `--help` listing assertion) (WP07)
- Implementation sketch: Three new test files; no production code changes. Each test introspects the typer app and grep-checks docs/snapshots/templates.
- Parallel opportunities: All three tests are independent — agent can implement in any order.
- Dependencies: none.
- Risks: minimal — purely additive tests over already-working code.
- Estimated prompt size: ~150 lines.
- Prompt: tasks/WP07-close-with-evidence-regressions.md
WP08 — FR-010 status event reader robustness fix
- Goal: `read_events()` in `src/specify_cli/status/store.py` MUST tolerate non-lane-transition events (e.g. `DecisionPointOpened`, `DecisionPointResolved`) in `status.events.jsonl` instead of raising `KeyError('wp_id')`.
- Priority: P0. Currently blocks `finalize-tasks` (and every other reader) for any mission that has used the Decision Moment Protocol. Discovered live during this very `/spec-kitty.tasks` run; the mission's own DecisionPoint event triggered the bug.
- Independent test: `tests/status/test_read_events_tolerates_decision_events.py` (new).
- Subtasks:
  - ✅ T036 Add an `event_type`-presence guard in `read_events()` (per-line loop) that skips events carrying a top-level `event_type` field, with a `# Why:` comment naming Decision Moment Protocol as the cooperating writer. Preserves the existing fail-loud contract for malformed lane-transition events. (WP08)
  - ✅ T037 Add `tests/status/test_read_events_tolerates_decision_events.py` exercising mixed lane-transition + DecisionPoint event logs (WP08)
  - ✅ T038 Re-run this mission's `finalize-tasks` against the fixed reader to confirm the live regression is closed (WP08)
  - ✅ T039 Run `mypy --strict src/specify_cli/status/store.py` and `ruff check src/specify_cli/status/ tests/status/test_read_events_tolerates_decision_events.py` (WP08)
- Implementation sketch: ~5 LOC inside `read_events()` per-line loop + a sibling unit test. Uses the duck-type approach instead of an event-type allowlist for future-proofing.
- Parallel opportunities: T037 and T039 can be drafted in parallel after T036.
- Dependencies: none.
- Risks: low (additive guard; existing lane-transition path unaffected).
- Live evidence: this mission's own `kitty-specs/release-3-2-0a5-tranche-1-01KQ7YXH/status.events.jsonl` starts with a `DecisionPointOpened` event and `finalize-tasks` failed on it. After T036 lands, T038 confirms the live regression is closed.
- Estimated prompt size: ~250 lines.
- Prompt: tasks/WP08-status-event-reader-robustness.md
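The duck-typed guard from T036 can be sketched as follows. This is a simplified stand-in for the real `read_events()`: it takes an iterable of JSONL lines rather than a path, and `to_lane` is a hypothetical field name used only to give the lane-transition record some shape. Only `wp_id` and `event_type` come from the task description.

```python
import json
from typing import Iterable


def read_events(lines: Iterable[str]) -> list:
    """Parse lane-transition events, skipping mission-level events."""
    events = []
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        record = json.loads(raw)
        # Why: the Decision Moment Protocol writes mission-level events
        # (e.g. DecisionPointOpened) into the same log; they carry a
        # top-level event_type and are not lane transitions.
        if "event_type" in record:
            continue
        # Fail-loud contract: a lane-transition record without wp_id
        # still raises KeyError.
        events.append({"wp_id": record["wp_id"], "to_lane": record["to_lane"]})
    return events
```

Keying on the presence of `event_type` rather than an allowlist of names means newly introduced mission-level event types are skipped without further changes to the reader.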
MVP Scope
MVP = WP01 + WP08 + WP02. Without WP01, every other WP's implementer hits the PROJECT_MIGRATION_NEEDED gate documented in spec.md. Without WP08, every other WP's implementer hits the finalize-tasks reader bug as soon as their mission opens any Decision Moment. Without WP02, the release-prep tests fail. Everything else is desirable but not on the critical path for "the next prerelease tag exists and works".
Parallelization
WP01, WP03, WP04, WP05, WP06, WP07, WP08 are all dependency-free and can be lane-parallelized. WP02 lands last to consolidate CHANGELOG entries.
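The landing order implied by the dependency lists can be derived mechanically. The sketch below encodes the dependencies stated in each WP section and groups the packages into waves with the standard-library `graphlib` module; the `landing_waves` helper is hypothetical and not part of spec-kitty.

```python
from graphlib import TopologicalSorter

# Work package -> packages it depends on, as listed in the WP sections.
DEPENDENCIES = {
    "WP01": set(),
    "WP03": set(),
    "WP04": set(),
    "WP05": set(),
    "WP06": set(),
    "WP07": set(),
    "WP08": set(),
    "WP02": {"WP01", "WP03", "WP04", "WP05", "WP06", "WP07", "WP08"},
}


def landing_waves(dependencies: dict) -> list:
    """Group work packages into waves that can run in parallel."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything unblocked right now
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

For the table above this yields two waves: the seven independent packages first, then WP02 alone.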