Implementation Plan: Coordination and Merge Stabilization
Branch: main (planning base & merge target; lane branches kitty/mission-coordination-merge-stabilization-01KTXRVR-lane-* at implement time) | Date: 2026-06-12 | Spec: spec.md Input: Feature specification from kitty-specs/coordination-merge-stabilization-01KTXRVR/spec.md
Summary
Fix the three live root-cause classes from the validated 3.2.0 Coordination & Merge cluster — (B) coordination worktree left behind its own branch after merge-pipeline update-ref calls (#1826), (C) finalize-tasks --validate-only mutating the git checkout (#1861 Part 1), (D) husk directories under .worktrees/ silently resolving as workspaces (#1833) — plus narrow residuals of two drained classes (A: finalize residue #1814, coord-unaware status reads #1735, baseline test gap #1827; F: merge-driver hardening #1736), and perform tracker hygiene closing seven stale-open fixed issues. Every fix is a small, localized guard or cleanup with a regression ratchet; architecture rework is explicitly out of scope (deferred to an umbrella under epic #1666). Authoritative defect locations with file:line evidence: validation/cluster-validation-brief.md, validation/debbie-analysis.md, validation/paula-analysis.md.
Technical Context
Language/Version: Python 3.11+ Primary Dependencies: typer (CLI), rich (console), ruamel.yaml (frontmatter), GitPython-free subprocess git plumbing (existing specify_cli.git helpers) Storage: Git repository state (branches, worktrees, index); JSONL append-only event log (status.events.jsonl); JSON metadata (meta.json, lanes.json) Testing: pytest (unit + integration with real temporary git repos, following existing tests/specify_cli/cli/commands/test_merge_coord_topology_1772.py fixture patterns); architectural ratchet tests in tests/architectural/; mypy --strict; ruff; ≥90% coverage on changed lines Target Platform: Developer workstations (macOS/Linux) running the spec-kitty CLI; CI (GitHub Actions, ci-quality.yml) Project Type: single (existing src/specify_cli + src/mission_runtime package layout — no new packages) Performance Goals: No regression in merge wall-clock; resync adds at most one git reset --hard + one git status --porcelain per ref-advance site (3 sites) Constraints: C-001 stability-only (no architecture rework); C-002 three known ref-advance sites only; C-003 cleanup-at-source, no exclusion-list widening; C-004 ordering (hygiene first; except-narrowing and backstop wording land with/after resync); C-005 PR-only landing on origin/main, terminology guard pre-push Scale/Scope: ~8 production files touched, ~10 new/extended test files, 13 GitHub issues dispositioned; 6 work-package shape
Charter Check
GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.
- CLI framework typer / console rich / YAML ruamel.yaml: PASS — no new dependencies introduced; all changes inside existing modules.
- pytest with 90%+ coverage for new code: PASS — plan mandates regression-test-first per class (NFR-004); every fix lands with its test.
- mypy --strict must pass: PASS — no suppressions planned (NFR-004); new error types are plain typed exceptions.
- Integration tests for CLI commands: PASS — end-to-end coordination-topology merge test (AC-B1), validate-only HEAD-stability test (AC-C1), husk-resolution test (AC-D1).
- No architecture rework (mission constraint C-001 reinforcing charter simplicity): PASS — guards and cleanups only; resolver APIs unchanged.
Post-design re-check (Phase 1 complete): PASS — design artifacts introduce no new entities requiring storage, no new public APIs, no new dependencies. The only new public surface is one structured error type per guard (Class B refusal, Class D resolution failure), consistent with existing error_code patterns.
Project Structure
Documentation (this mission)
kitty-specs/coordination-merge-stabilization-01KTXRVR/
├── plan.md # This file
├── research.md # Phase 0 output — root-cause verification & fix-shape decisions
├── data-model.md # Phase 1 output — state/invariant model for worktree & placement surfaces
├── quickstart.md # Phase 1 output — how to reproduce, fix, verify each class
├── contracts/ # Phase 1 output — behavioral contracts per fix class
│ ├── class-b-ref-advance-resync.md
│ ├── class-c-validate-only-readonly.md
│ ├── class-d-workspace-resolution.md
│ ├── class-a-residual-cleanups.md
│ └── class-f-merge-driver-hardening.md
├── validation/ # Committed Debbie/Paula source analyses (specify phase)
└── tasks.md # Phase 2 output (/spec-kitty.tasks — not created here)
Source Code (repository root)
src/specify_cli/
├── lanes/merge.py # Class B: update-ref sites :440,:474; Class F: _make_merge_env extraction
├── cli/commands/merge.py # Class B: bake update-ref :993-998; resync helper call sites
├── coordination/
│ ├── workspace.py # Class B: resolve() staleness awareness (read-only reference)
│ ├── transaction.py # Class B alt: self-heal hook in BookkeepingTransaction (decision: research.md R1)
│ └── status_transition.py # Class F: narrow except at :399-400
├── git/commit_helpers.py # FR-012: backstop message names divergence cause :321-339
├── cli/commands/agent/mission.py # Class C: validate-only guard :2462; Class A: finalize residue cleanup :99-131
├── cli/commands/agent/workflow.py # Class D: lock-after-create, worktree-add hard error :2237,:2243,:2265
├── cli/commands/agent/tasks.py # Class D: toplevel assertion before git calls :1346
├── workspace/context.py # Class D: ResolvedWorkspace.exists requires .git marker :148-150
├── retrospective/gate.py # Class A: route :597 through resolve_status_surface
├── cli/commands/agent_retrospect.py # Class A: route :432 through resolve_status_surface
├── cli/commands/upgrade.py # FR-013: dry-run message :987
└── doctor.py (or status/doctor.py) # FR-007: husk detection check (follow existing doctor-check registration pattern)
tests/
├── specify_cli/cli/commands/
│ ├── test_merge_coord_worktree_resync_1826.py # NEW — AC-B1/B2/B4
│ ├── test_finalize_tasks_validate_only_readonly.py # NEW — AC-C1/C2
│ ├── test_workspace_husk_resolution_1833.py # NEW — AC-D1/D2
│ └── test_merge_coord_topology_1772.py # EXTEND — AC-A3 baseline recording (unmock :224-225)
├── status/test_event_log_merge.py # EXTEND — AC-F2 mixed-timestamp ratchet
├── specify_cli/test_wp06_sc2_paused_mission_blockers.py # EXTEND — AC-A1 residue
└── architectural/
└── test_execution_context_parity.py # EXTEND — AC-A2 read-surface ratchet; AC-B3 no-raw-update-ref ratchet; AC-F1 env-helper ratchet
Structure Decision: Single-project layout (existing). All changes are in-place guards/cleanups inside src/specify_cli; no new modules except possibly one small git/ref_advance.py helper if research R1 selects the shared-helper shape (decision recorded in research.md).
Complexity Tracking
No charter violations. The only complexity decision is helper-vs-inline for the Class B resync (research.md R1) — both shapes respect C-001/C-002.
Implementation Concern Map
> Implementation concerns are NOT work packages. /spec-kitty.tasks translates these into executable WPs.
IC-01 — Tracker hygiene: close the stale-open fixed issues
- Purpose: Stop the tracker double-counting fixed defects; re-scope the four partially-fixed issues to residuals so triage reflects code reality.
- Relevant requirements: FR-011
- Affected surfaces: GitHub issues #1770, #1789, #1816, #1771, #1571, #1784, #1735 (close); #1814, #1736, #1833, #1861 (re-scope); file one follow-up umbrella issue under epic #1666 capturing the deferred non-goals (C-001 list).
- Sequencing/depends-on: none (do first — unblocks triage; per C-004)
- Risks: Closing requires citing the correct landed commits (8544012fa, 9c8bff06f, c5a10ce56, PR #1719); validation comments already posted 2026-06-12 carry the citations.
IC-02 — Class C: validate-only must not mutate the checkout
- Purpose:
finalize-tasks --validate-onlyis read-only; eliminates the operator-trust defect (#1861 Part 1). - Relevant requirements: FR-002; AC-C1..C3
- Affected surfaces:
src/specify_cli/cli/commands/agent/mission.py:2462(_ensure_branch_checked_outcall); new testtest_finalize_tasks_validate_only_readonly.py - Sequencing/depends-on: none (one-line guard + test; independent)
- Risks: Minimal. Must confirm no downstream validate-only step depends on being on the target branch (post-WP07 reads anchor on the primary feature dir — verified in debbie-analysis Class C section).
IC-03 — Class B: coordination worktree resync after ref advance
- Purpose: Unattended coordination-topology merges complete; the safe-commit backstop never fires on self-inflicted divergence (#1826 — the only fully-valid blocker).
- Relevant requirements: FR-001, FR-012, NFR-001, NFR-002, NFR-003; AC-B1..B4
- Affected surfaces:
src/specify_cli/lanes/merge.py:440,:474;src/specify_cli/cli/commands/merge.py:993-998; possiblycoordination/transaction.py(self-heal alternative);git/commit_helpers.py:321-339(backstop message names the divergence — FR-012, lands with this concern per C-004); new testtest_merge_coord_worktree_resync_1826.py; ratchet intests/architectural/(no raw update-ref outside the chosen path — AC-B3). - Sequencing/depends-on: IC-01 recommended first (hygiene); otherwise independent. IC-06's except-narrowing lands with/after this (C-004).
- Risks:
reset --harddiscards state — the guard MUST verify the coord worktree holds no unique uncommitted state and fail loudly otherwise (NFR-002, spec Assumption 2). Shape decision (per-site inline vs shared helper) is research R1.
IC-04 — Class D: workspace resolution fall-through is failure
- Purpose: A directory that is not a real git worktree is never silently used as one; husk recovery is self-serve (#1833 residuals).
- Relevant requirements: FR-003, FR-004, FR-005, FR-007, NFR-003; AC-D1..D3
- Affected surfaces:
workspace/context.py:148-150(.git-marker existence);cli/commands/agent/workflow.py:2237/2243/2265(lock-after-create, hard error on worktree-add failure);cli/commands/agent/tasks.py:1346(toplevel assertion); doctor husk check (FR-007); new testtest_workspace_husk_resolution_1833.py - Sequencing/depends-on: none; ship the doctor check in the same release (edge-case risk below)
- Risks: Pre-existing husks on operator machines start erroring explicitly — intended, but the doctor check must land in the same release so recovery is one command (spec Edge Cases).
IC-05 — Class A residuals: residue-free finalize + canonical read surfaces + baseline test
- Purpose: Coordination missions never deadlock on finalize residue (#1814); the last two coord-unaware status reads route through the canonical surface (#1735); the #1827 baseline-recording fix gets the regression test it lacks.
- Relevant requirements: FR-006, FR-009, FR-010; AC-A1..A3
- Affected surfaces:
cli/commands/agent/mission.py:99-131(_stage_finalize_artifacts_in_coord_worktreecleanup-at-source — C-003 forbids wideningCOORD_OWNED_STATUS_FILES);retrospective/gate.py:597;cli/commands/agent_retrospect.py:432; extendtest_wp06_sc2_paused_mission_blockers.py,test_merge_coord_topology_1772.py(unmock baseline helpers :224-225),test_execution_context_parity.py(AC10 read-surface ratchet) - Sequencing/depends-on: none strictly; AC10 ratchet must land after the two read-sites are routed (else the ratchet fails on arrival)
- Risks: Cleanup-at-source must not delete operator-authored files — scope cleanup strictly to artifacts the stager itself wrote (assert by listing before/after in the test).
IC-06 — Class F: merge-driver hardening
- Purpose: The JSONL merge driver's environment, exception discipline, and mixed-schema sorting are pinned by tests so the class cannot silently regress (#1736 residuals).
- Relevant requirements: FR-008(b,c,d); AC-F1..F3
- Affected surfaces:
lanes/merge.py(_make_merge_env()extraction + all subprocess call sites);coordination/status_transition.py:399-400(narrowexcept Exception→(ValueError, FileNotFoundError)with documented GENESIS fallback); extendtests/status/test_event_log_merge.py(mixedat/timestamp/neither ratchet); env-helper ratchet test - Sequencing/depends-on: lands with/after IC-03 (C-004: narrowing may surface previously-swallowed errors on stale-worktree reads)
- Risks: Newly-propagating exceptions in coordination status reads — mitigated by ordering and by the documented fallback for the two expected error types.
IC-07 — Polish: honest messages (small, ride-along)
- Purpose:
upgrade --dry-runno longer claims success; failure messages name the resolution used (#1784 P3 crumbs). - Relevant requirements: FR-012 (message part, shipped inside IC-03), FR-013, NFR-003
- Affected surfaces:
cli/commands/upgrade.py:987 - Sequencing/depends-on: FR-013 independent; FR-012 inside IC-03
- Risks: none material.
Phase 2 Approach (executed by /spec-kitty.tasks — not here)
The IC map translates to the 6-WP shape hinted in the brief: WP01=IC-01, WP02=IC-02 (+IC-07 ride-along), WP03=IC-03, WP04=IC-04, WP05=IC-05, WP06=IC-06. WP02..WP05 are mutually independent; WP06 soft-depends on WP03 (C-004). Each WP is regression-test-first.
> Errata (2026-06-12, post-tasks; analysis finding I1): /spec-kitty.tasks collapsed this to 5 WPs — the owned-files no-overlap rule forced IC-06 (Class F) into WP03 (both edit src/specify_cli/lanes/merge.py, which also satisfies C-004's ordering internally) and the #1814 residue fix (IC-05 part) into WP02 (shares src/specify_cli/cli/commands/agent/mission.py with the validate-only guard). tasks.md §WP Shaping Note is authoritative for execution.