Implementation Plan: Post-Merge Reliability And Release Hardening
Mission: 068-post-merge-reliability-and-release-hardening
Branch: main (planning, base, and merge target are all main)
Date: 2026-04-07
Spec: spec.md
Plan input: spec.md FR-001..FR-022 + FR-023, NFR-001..NFR-006, C-001..C-006
Validated against: Fresh clone /tmp/spec-kitty-20260407-090957 at commit 7307389a1f529dae9e90279ea972609bb0b420aa
Summary
Final workflow-stabilization mission for spec-kitty core. Five work packages drive the open backlog to zero:
| WP | Scope | Owns FRs | Issues |
|---|---|---|---|
| WP01 | Post-merge stale-assertion analyzer (new src/specify_cli/post_merge/ package + new agent tests CLI group) | FR-001..FR-004, FR-022 | #454 |
| WP02 | --strategy wiring + squash default + status-events safe_commit fix (all in _run_lane_based_merge) | FR-005..FR-009, FR-019, FR-020 | #456, #416 (fix) |
| WP03 | Diff-coverage policy validation + close-or-tighten | FR-010..FR-012 | #455 |
| WP04 | Release-prep CLI populating the existing agent/release.py stub | FR-013..FR-015, FR-023 | #457 |
| WP05 | scan_recovery_state + implement --base main fix (FR-021), verification report, mission-close ledger | FR-016..FR-018, FR-021 | #415, #416 (verification) |
The technical approach is locked by the spec; this plan captures the architectural decisions made during planning interrogation, the file/module layout, the contracts between WPs, and the cross-WP sequencing constraints.
Technical Context
Language/Version: Python 3.11+ (existing spec-kitty requirement)
Primary Dependencies:
- typer: CLI framework (existing)
- rich: console output (existing)
- ruamel.yaml: YAML parsing for .kittify/config.yaml (existing)
- Stdlib ast: Python AST parsing for WP01 stale-assertion analysis (no new dependency)
- Stdlib subprocess: git invocation (existing pattern in specify_cli.git.commit_helpers)
- safe_commit from specify_cli.git (existing helper, re-exported from commit_helpers.py:38)
- specify_cli.lanes.recovery.scan_recovery_state (existing, extended by FR-021)
- specify_cli.status.emit.emit_status_transition (existing, called by _mark_wp_merged_done)
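As a rough illustration of how the stdlib ast dependency could serve the WP01 analyzer, the sketch below extracts the names defined in a source file before and after a merge, then flags removed names that a test module still references. The function names (defined_names, referenced_names, possibly_stale) are hypothetical and not taken from the spec; the real analyzer's heuristics may differ.

```python
import ast


def defined_names(source: str) -> set[str]:
    """Collect the names of all functions and classes defined in source."""
    tree = ast.parse(source)
    return {
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef))
    }


def referenced_names(test_source: str) -> set[str]:
    """Collect identifiers a test module refers to via names, attributes, or imports."""
    tree = ast.parse(test_source)
    names: set[str] = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name):
            names.add(node.id)
        elif isinstance(node, ast.Attribute):
            names.add(node.attr)
        elif isinstance(node, ast.alias):
            # "import a.b" and "from m import b" both contribute "b".
            names.add(node.name.split(".")[-1])
    return names


def possibly_stale(old_source: str, new_source: str, test_source: str) -> set[str]:
    """Names removed between old and new source that a test still references."""
    removed = defined_names(old_source) - defined_names(new_source)
    return removed & referenced_names(test_source)
```

Matching on bare names is deliberately coarse, which is one reason a false-positive ceiling such as NFR-002 is needed.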
Storage: Filesystem only (no database). Mission state lives in:
- kitty-specs/<mission>/status.events.jsonl: append-only event log (canonical lane state)
- kitty-specs/<mission>/status.json: derived snapshot
- .kittify/config.yaml: project configuration (merge.strategy key added by WP02)
- .kittify/runtime/merge/<mission_id>/state.json: ephemeral runtime state (out of scope for the FR-019 fix)
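A minimal sketch of how a derived snapshot could be rebuilt from the append-only log, assuming each JSONL line is an object with wp_id and to_lane fields. Those field names and the materialize_snapshot helper are assumptions for illustration only; the actual event shape is specified in data-model.md.

```python
import json
from pathlib import Path


def materialize_snapshot(events_path: Path) -> dict[str, str]:
    """Fold an append-only JSONL event log into a {wp_id: lane} snapshot."""
    snapshot: dict[str, str] = {}
    with events_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank lines
            event = json.loads(line)
            # "wp_id" and "to_lane" are assumed field names for this sketch.
            # Later events overwrite earlier ones, so the last transition wins.
            snapshot[event["wp_id"]] = event["to_lane"]
    return snapshot
```

Because the log is the canonical state, a snapshot produced this way can always be discarded and regenerated.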
Testing: pytest. All new tests SHALL be added to the existing pytest suite and SHALL run without network access (NFR-005).
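One way the no-network requirement might be enforced inside a test is to make outbound connections fail loudly. The sketch below patches socket.socket.connect for the duration of a block; the no_network helper is hypothetical and is not something the spec prescribes. It does not cover connect_ex.

```python
import socket
from contextlib import contextmanager
from typing import Iterator


@contextmanager
def no_network() -> Iterator[None]:
    """Make socket.connect raise while the block runs, then restore it."""
    original_connect = socket.socket.connect

    def guarded_connect(self, *args, **kwargs):
        raise RuntimeError("network access is not allowed in this test")

    socket.socket.connect = guarded_connect
    try:
        yield
    finally:
        # Always restore the original method, even if the block raised.
        socket.socket.connect = original_connect
```

In a pytest suite this could be wrapped in an autouse fixture so every new test runs under the guard.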
Target Platform: Cross-platform CLI (macOS, Linux). FSEvents-specific timing concerns are explicitly out of scope per the Assumptions section.
Project Type: Single Python package (src/specify_cli/). No web frontend, no mobile target.
Performance Goals:
- WP01 stale-assertion analyzer: ≤ 30 seconds wall-clock on spec-kitty core (~9000+ tests) (NFR-001)
- WP04 release-prep command: ≤ 5 seconds wall-clock on a mission with up to 16 WPs (NFR-004)
- WP02 mission→target merge: 100% success against require_linear_history = true on the integration test matrix (NFR-003)
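The wall-clock goals above could be checked with a small timing helper along these lines. This is only a sketch: assert_within_budget is a hypothetical name, and a single run on a developer machine is a coarse proxy for the NFR thresholds.

```python
import time
from typing import Callable


def assert_within_budget(action: Callable[[], object], budget_seconds: float) -> float:
    """Run action once, fail if it exceeds the wall-clock budget, return elapsed seconds."""
    start = time.perf_counter()
    action()
    elapsed = time.perf_counter() - start
    assert elapsed <= budget_seconds, f"took {elapsed:.2f}s, budget {budget_seconds}s"
    return elapsed
```

A test for NFR-004 might, for example, pass the release-prep entry point as action with a budget of 5.0.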
Constraints:
- mypy --strict clean (NFR-006, charter)
- ruff clean (charter)
- Critical-path diff coverage threshold pinned at commit 7307389a (NFR-006, with WP03 carve-out)
- WP01 analyzer ≤ 5 false-positive findings per 100 LOC of merged change on a curated benchmark (NFR-002)
- No GitHub API calls (C-002)
- No re-implementation of existing recovery/merge subsystems (C-003)
Scale/Scope: Five work packages. ~8 new functional surfaces (analyzer module, agent tests CLI group, strategy wiring, safe_commit fix, scan_recovery_state extension, --base support, release-prep command, validation report). Two pre-identified residual gaps to fix; one verification report to author; one mission close ledger to maintain.
Charter Check
GATE STATUS: ✅ PASS (pre-Phase-0)
Charter file: .kittify/charter/charter.md (loaded via spec-kitty charter context --action plan --json)
| Charter requirement | Compliance | Notes |
|---|---|---|
| typer as CLI framework | ✅ | All new commands use typer (WP01 agent tests, WP04 agent release prep) |
| rich for console output | ✅ | Stale-assertion report and release-prep payload both use rich for human output |
| ruamel.yaml for YAML parsing | ✅ | .kittify/config.yaml merge.strategy key parsed via existing ruamel.yaml infrastructure |
| pytest with 90%+ test coverage for new code | ✅ | NFR-006 enforces critical-path coverage threshold; FR-020 and the FR-021 regression test land alongside production code |
| mypy --strict must pass | ✅ | NFR-006 |
| Integration tests for CLI commands | ✅ | FR-020 (lane merge end-to-end), FR-022 fallback test, FR-021 recovery integration test, FR-023 release-prep CLI test |
| DIRECTIVE_010 Specification Fidelity | ✅ | Plan derives from spec FRs verbatim; no deviations introduced |
| DIRECTIVE_003 Decision Documentation | ✅ | Architectural decisions captured in research.md; cross-WP sequencing captured here and in spec FR-019 |
Charter Check post-Phase-1 re-evaluation: see end of this document.
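To make the interaction between the --strategy flag and the merge.strategy config key concrete, here is a hedged sketch of one possible resolution order: CLI flag first, then config, then the squash default. The precedence order, the enum member values other than squash, and the resolve_strategy helper are all assumptions for illustration; the binding definitions live in contracts/merge_strategy.md. The config is passed as a plain mapping so the sketch does not depend on ruamel.yaml.

```python
from enum import Enum
from typing import Any, Mapping, Optional


class MergeStrategy(str, Enum):
    # Member values are assumptions; the plan only confirms squash as the default.
    MERGE = "merge"
    SQUASH = "squash"
    REBASE = "rebase"


DEFAULT_STRATEGY = MergeStrategy.SQUASH


def resolve_strategy(cli_value: Optional[str], config: Mapping[str, Any]) -> MergeStrategy:
    """Pick the merge strategy: CLI flag first, then config, then the squash default."""
    if cli_value is not None:
        return MergeStrategy(cli_value)  # raises ValueError on an unknown value
    merge_section = config.get("merge") or {}
    configured = merge_section.get("strategy")
    if configured is not None:
        return MergeStrategy(configured)
    return DEFAULT_STRATEGY
```

Rejecting unknown values with an exception keeps a typo in the config from silently falling back to the default.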
Project Structure
Documentation (this feature)
kitty-specs/068-post-merge-reliability-and-release-hardening/
├── spec.md # Mission spec (already authored)
├── plan.md # This file
├── research.md # Phase 0: architectural decisions and current-main analysis
├── data-model.md # Phase 1: dataclasses, event shapes, payload schemas
├── quickstart.md # Phase 1: maintainer-facing how-to for the new commands and bug fixes
├── contracts/ # Phase 1: CLI command + library function signatures
│ ├── stale_assertions.md # WP01 contract
│ ├── merge_strategy.md # WP02 contract (CLI flag, config schema, library functions)
│ ├── diff_coverage_policy.md # WP03 contract (validation report shape)
│ ├── release_prep.md # WP04 contract (CLI command + JSON payload)
│ └── recovery_extension.md # WP05 contract (scan_recovery_state + --base main)
├── meta.json # Mission identity (already authored)
├── checklists/
│ └── requirements.md # Spec quality checklist
├── tasks/ # Phase 2: WP files (NOT created by /spec-kitty.plan)
└── mission-close-ledger.md # Created by WP05 at mission close (per C-005)
Source Code (repository root)
src/specify_cli/
├── post_merge/ # NEW package (WP01)
│ ├── __init__.py # Re-exports run_check, StaleAssertionFinding, StaleAssertionReport
│ └── stale_assertions.py # AST-based source identifier extraction + AST-based test scan
├── cli/commands/
│ ├── merge.py # MODIFIED (WP02): wire --strategy, default to squash, safe_commit fix
│ ├── implement.py # MODIFIED (WP05 FR-021): accept --base main flag
│ └── agent/
│ ├── __init__.py # MODIFIED: register new `tests` subapp
│ ├── release.py # POPULATED (WP04): replace stub with real `prep` command
│ └── tests.py # NEW (WP01): `stale-check` subcommand
├── lanes/
│ ├── merge.py # MODIFIED (WP02): honor strategy parameter from upper layer
│ └── recovery.py # MODIFIED (WP05 FR-021): scan_recovery_state consults status events
├── git/
│ └── commit_helpers.py # USED AS-IS (safe_commit imported by WP02)
├── status/
│ ├── emit.py # USED AS-IS (emit_status_transition called by mark-done loop)
│ ├── store.py # USED AS-IS (append_event)
│ └── reducer.py # USED AS-IS (materialize for WP05 verification)
└── release/ # NEW package (WP04, locked — package split committed at plan time)
├── __init__.py
├── changelog.py # Build draft changelog from mission/WP artifacts
├── version.py # Version bump per channel
└── payload.py # Build structured release-prep payload
.kittify/
├── config.yaml # MODIFIED (WP02): merge.strategy schema added
└── charter/charter.md # USED AS-IS
.github/workflows/
└── ci-quality.yml # POSSIBLY MODIFIED (WP03, only if validation finds residual gap)
tests/
├── post_merge/
│ └── test_stale_assertions.py # NEW (WP01): FR-002, FR-022, NFR-001, NFR-002 coverage
├── cli/commands/
│ ├── test_merge_strategy.py # NEW (WP02 FR-005..FR-009): strategy wiring + push-error parser
│ ├── test_merge_status_commit.py # NEW (WP02 FR-019, FR-020): events committed to git
│ └── test_implement_base_flag.py # NEW (WP05 FR-021): --base main flag
├── lanes/
│ └── test_recovery_post_merge.py # NEW (WP05 FR-021): scan_recovery_state with merged-deleted branches
├── cli/commands/agent/
│ ├── test_release_prep.py # NEW (WP04 FR-013..FR-015, FR-023): release-prep CLI + JSON payload
│ └── test_tests_stale_check.py # NEW (WP01 FR-004): CLI subcommand wires through to library
└── (existing tests untouched unless WP03 changes ci-quality policy)
Structure Decision: This is a single Python project that extends an existing CLI tool. No web/mobile/multi-project structure. New code lands in:
- src/specify_cli/post_merge/: new package for WP01
- src/specify_cli/cli/commands/agent/tests.py: new CLI subgroup for WP01
- src/specify_cli/release/: new package for WP04 (package split is locked at plan time; not inlined)
- Existing files modified: cli/commands/merge.py (WP02), cli/commands/implement.py (WP05), cli/commands/agent/release.py (WP04), cli/commands/agent/__init__.py (WP01 + WP04 registrations), lanes/merge.py (WP02), lanes/recovery.py (WP05)
Cross-WP Sequencing & Dependencies
The lane-planning step that runs after /spec-kitty.tasks will use this dependency graph to compute parallelism. Critical sequencing constraints:
WP01 ────────────────────────► (independent, parallel-safe)
new files only: post_merge/, agent/tests.py
touches agent/__init__.py for registration
WP02 ────────────────────────► (sequential within itself)
FR-005/006/007/008 ─► FR-009 ─► FR-019 ─► FR-020
│
└── all edits land in _run_lane_based_merge
in cli/commands/merge.py and lanes/merge.py
WP03 ────────────────────────► (verification-first, low-risk)
FR-010 (validation report) ─► FR-011 OR FR-012
│
└── only touches .github/workflows/ if FR-012 fires
WP04 ────────────────────────► (independent, parallel-safe)
touches cli/commands/agent/release.py + agent/__init__.py registration
reads kitty-specs/ artifacts read-only
WP05 ────────────────────────► (independent of WP02 now that FR-019/020 moved)
FR-021 ─► FR-016 (verification report) ─► FR-018 (mission close ledger)
touches lanes/recovery.py + cli/commands/implement.py
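The intra-WP chains in the diagram can be expressed as a dependency graph and ordered with the stdlib graphlib module. This is only a sketch of the idea: the real lane planner's data structures are not described in this plan, and execution_order is a hypothetical helper.

```python
from graphlib import TopologicalSorter

# Each key maps to the set of steps that must finish before it can start.
# WP02's leading FR group is collapsed into one node, as in the diagram.
WP02_CHAIN: dict[str, set[str]] = {
    "FR-005..FR-008": set(),
    "FR-009": {"FR-005..FR-008"},
    "FR-019": {"FR-009"},
    "FR-020": {"FR-019"},
}
WP05_CHAIN: dict[str, set[str]] = {
    "FR-021": set(),
    "FR-016": {"FR-021"},
    "FR-018": {"FR-016"},
}


def execution_order(chain: dict[str, set[str]]) -> list[str]:
    """Return one valid order in which the steps of a chain can be executed."""
    return list(TopologicalSorter(chain).static_order())
```

For a strictly linear chain there is only one valid order, so the result simply reproduces the arrows in the diagram.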
Lane-conflict matrix (file-level)
| File | Touched by |
|---|---|
| src/specify_cli/cli/commands/merge.py | WP02 only |
| src/specify_cli/lanes/merge.py | WP02 only |
| src/specify_cli/lanes/recovery.py | WP05 only |
| src/specify_cli/cli/commands/implement.py | WP05 only |
| src/specify_cli/cli/commands/agent/__init__.py | WP01 + WP04 (both add registrations); shared edit |
| src/specify_cli/cli/commands/agent/release.py | WP04 only |
| src/specify_cli/cli/commands/agent/tests.py | WP01 only (new file) |
| src/specify_cli/post_merge/ | WP01 only (new package) |
| src/specify_cli/release/ | WP04 only (new package) |
| .kittify/config.yaml | WP02 only |
| .github/workflows/ci-quality.yml | WP03 only (and only if FR-012 fires) |
Only shared edit: agent/__init__.py (WP01 + WP04 both add subapp registrations). Each WP appends a single app.add_typer(...) line at the bottom of the file, registering a different subapp name (tests for WP01, release for WP04). This is a concatenation conflict with a mechanical resolution: keep both lines, in either order. Git's default three-way merge can still flag two appends at the same position as a textual conflict, so resolving it without human help assumes a union-style merge (for example merge=union in .gitattributes) or a resolver that keeps both sides. The lane planner MAY place WP01 and WP04 in separate lanes; because the two registrations use different subapp names, they do not overlap semantically.
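The claim that agent/__init__.py is the only shared file can be checked mechanically from the matrix. The sketch below encodes the table as a mapping and derives the shared files and the WP pairs that touch them; the helper names are hypothetical, and WP03's entry is listed even though it only applies if FR-012 fires.

```python
from itertools import combinations

# File-to-WP ownership copied from the lane-conflict matrix.
TOUCHED_BY: dict[str, set[str]] = {
    "src/specify_cli/cli/commands/merge.py": {"WP02"},
    "src/specify_cli/lanes/merge.py": {"WP02"},
    "src/specify_cli/lanes/recovery.py": {"WP05"},
    "src/specify_cli/cli/commands/implement.py": {"WP05"},
    "src/specify_cli/cli/commands/agent/__init__.py": {"WP01", "WP04"},
    "src/specify_cli/cli/commands/agent/release.py": {"WP04"},
    "src/specify_cli/cli/commands/agent/tests.py": {"WP01"},
    "src/specify_cli/post_merge/": {"WP01"},
    "src/specify_cli/release/": {"WP04"},
    ".kittify/config.yaml": {"WP02"},
    ".github/workflows/ci-quality.yml": {"WP03"},
}


def shared_files(touched_by: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return only the files touched by more than one work package."""
    return {path: sorted(wps) for path, wps in touched_by.items() if len(wps) > 1}


def conflicting_pairs(touched_by: dict[str, set[str]]) -> set[tuple[str, str]]:
    """Return every pair of work packages that share at least one file."""
    pairs: set[tuple[str, str]] = set()
    for wps in touched_by.values():
        pairs.update(combinations(sorted(wps), 2))
    return pairs
```

Any pair absent from conflicting_pairs can be placed in separate lanes with no file-level overlap.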
Recommended lane allocation (to be confirmed at /spec-kitty.tasks):
- Lane A: WP02 (merge command — large, sequential, longest chain)
- Lane B: WP01 (stale-assertion analyzer — new package, isolated)
- Lane C: WP04 (release-prep — populates existing stub, isolated)
- Lane D: WP05 (recovery + implement)
- Lane E: WP03 (verification-first, low-risk, can run last)
This gives 5 parallel lanes maximum, with the longest sequential chain inside Lane A (WP02's seven FRs — FR-005..FR-009 + FR-019/FR-020 — all touching the same file). The agent/__init__.py concatenation between lanes B and C is auto-resolvable.
If conflict-aversion is preferred over parallelism (e.g., to avoid even auto-resolvable merges), Lanes B and C can collapse into a single Lane B' that runs WP01 then WP04 sequentially. The default recommendation is full parallelism.
Phase 0 Output Pointer
Phase 0 research is delivered as kitty-specs/068-post-merge-reliability-and-release-hardening/research.md.
Phase 0 deliverables:
1. Decision log for the three planning answers (library choice, command surface, library-import wiring)
2. Current-main analysis for the existing modules WP01/WP04 will integrate with (stale_check.py, agent/release.py, commit_helpers.py)
3. Failure-mode reproduction for the FR-019 bug (recovered from session evidence and from the spec's Mission 067 Failure-Mode Evidence (A) section)
4. Failure-mode reproduction for the FR-021 bug (scan_recovery_state + --base main, from Mission 067 Failure-Mode Evidence (B))
5. Library-import wiring rationale: why the merge runner imports run_check directly rather than spawning a subprocess
No [NEEDS CLARIFICATION] markers remain after Phase 0.
Phase 1 Output Pointers
Phase 1 design artifacts:
1. data-model.md: dataclasses for StaleAssertionFinding, StaleAssertionReport, ReleasePrepPayload, MergeStrategy, MergeConfig, RecoveryVerificationEntry, MissionCloseLedgerRow, plus the canonical shape of the new done event the safe_commit fix persists
2. contracts/stale_assertions.md: WP01 library + CLI signatures
3. contracts/merge_strategy.md: WP02 CLI flag, config schema, library function signatures, push-error parser token list
4. contracts/diff_coverage_policy.md: WP03 validation report shape
5. contracts/release_prep.md: WP04 CLI command, JSON payload, integration with existing version-bump infrastructure
6. contracts/recovery_extension.md: WP05 scan_recovery_state extension surface, --base main flag, mission-close ledger schema
7. quickstart.md: maintainer-facing walkthrough (run a synthetic merge that exercises FR-019, run release-prep, run the stale-assertion analyzer, exercise the FR-021 post-merge unblocking path)
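For orientation, two of the dataclasses named in item 1 might look roughly like the sketch below. Only the class names come from this plan; every field, the Confidence alias, and the findings_per_100_loc method are assumptions made for illustration, and data-model.md remains the authoritative definition.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Literal

# Hypothetical confidence levels; not taken from the spec.
Confidence = Literal["high", "medium", "low"]


@dataclass(frozen=True)
class StaleAssertionFinding:
    test_file: Path        # test module containing the suspect reference
    line: int              # 1-based line number of the reference
    identifier: str        # source identifier that changed or disappeared
    confidence: Confidence


@dataclass(frozen=True)
class StaleAssertionReport:
    findings: tuple[StaleAssertionFinding, ...] = ()
    changed_loc: int = 0   # lines of merged change that were analyzed

    def findings_per_100_loc(self) -> float:
        """Findings per 100 changed LOC; an upper bound on the false-positive rate."""
        if self.changed_loc == 0:
            return 0.0
        return 100.0 * len(self.findings) / self.changed_loc
```

Frozen dataclasses with Literal and tuple fields keep the types precise enough for mypy --strict without resorting to Any.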
Agent context file update happens at the end of Phase 1 via the existing agent script.
Complexity Tracking
No charter violations. The spec was reviewed three times, and all "complexity" the plan inherits is justified by either a tracked GitHub issue or a reproduced 067 failure mode, so the table below has no entries.
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| (none) | n/a | n/a |
Post-Phase-1 Charter Re-evaluation
GATE STATUS: ✅ PASS
After Phase 1 artifacts (data-model.md, contracts/*, quickstart.md) landed, the charter check was re-run against the design surface:
| Charter requirement | Phase 1 verification | Status |
|---|---|---|
| typer as CLI framework | All new CLI surfaces (agent tests stale-check, agent release prep, --strategy, --base) use typer parameters in their contract files | ✅ |
| rich for console output | StaleAssertionReport and ReleasePrepPayload both render via rich Console; merge runner reuses existing rich infrastructure | ✅ |
| ruamel.yaml for YAML parsing | MergeConfig reads .kittify/config.yaml's merge.strategy key via the existing ruamel.yaml accessor in specify_cli.config | ✅ |
| pytest with 90%+ test coverage for new code | Each contract file lists a test surface table mapping FRs to tests; FR-020 has the explicit regression test pattern; FR-021 has Scenario 7 coverage; FR-022 has its own fallback test | ✅ |
| mypy --strict must pass | All new dataclasses in data-model.md are fully typed with Literal[...], Path, list[...], tuple[...], Enum — no Any leakage | ✅ |
| Integration tests for CLI commands | test_strategy_flag_flows_through, test_done_events_committed_to_git, test_implement_base_flag_creates_workspace_from_ref, test_prep_command_emits_json_with_flag, test_cli_subcommand_invokes_library — all integration-level | ✅ |
| DIRECTIVE_010 Specification Fidelity | Every contract file references the specific FR(s) it implements; no contract introduces behavior not in the spec | ✅ |
| DIRECTIVE_003 Decision Documentation | research.md captures all three planning decisions with rationale and rejected alternatives; cross-WP sequencing captured in plan.md and spec.md FR-019 | ✅ |
No new charter violations introduced by Phase 1 artifacts. Spec ↔ plan ↔ design alignment is verified.
NFR coverage map
| NFR | Threshold | Phase 1 verification location |
|---|---|---|
| NFR-001 | ≤ 30s wall clock | contracts/stale_assertions.md test test_runs_within_30s_on_spec_kitty_core |
| NFR-002 | ≤ 5 FP / 100 LOC | contracts/stale_assertions.md test test_fp_ceiling_under_5_per_100_loc + FR-022 fallback |
| NFR-003 | 100% success on protected linear-history matrix | contracts/merge_strategy.md test test_protected_linear_history_succeeds_default |
| NFR-004 | ≤ 5s for 16-WP missions | contracts/release_prep.md test test_runs_within_5s_for_16_wps |
| NFR-005 | 0 network calls in new tests | contracts/release_prep.md test test_payload_no_github_api_calls; charter requirement |
| NFR-006 | mypy strict + critical-path coverage at commit 7307389a | data-model.md types fully annotated; WP03 carve-out documented in contracts/diff_coverage_policy.md |