Spec Kitty 3.2.3 patch — debugger-debbie investigation

Branch fix/3.2.3-coord-surface-regressions @ 775ec32da (off v3.2.2 main), editable install active. Lens: LIVE-confirm and root-cause each ticket; classify it as split-brain or not (coord/primary surface-resolution); decide whether SSOT/topology-alignment fixes it. All reproductions are real runs (CLI or live Python against the editable install), not static reads. Repo left clean (read-only).


Per-ticket verdict table

# | reproduced (Y/N + repro) | root cause (file:func:line) | split-brain? (Y/N + facet) | SSOT/topology fixes it? (Y/N + seam)
#2122
  reproduced: Y. pytest tests/specify_cli/test_acceptance_regressions.py::test_collect_feature_summary_anchors_feature_dir_on_primary_for_mid8_handle FAILS: AcceptanceError: Feature '01ABCDEF' has no tasks directory at .../kitty-specs/01ABCDEF/tasks. Real entry point collect_feature_summary(repo_root, mid8).
  root cause: acceptance/__init__.py:_iter_work_packages:406 (→ _wp_tasks_read_dir:827 → resolve_planning_read_dir) AND acceptance/__init__.py:normalize_feature_encoding:645 (→ _planning_read_dir → resolve_planning_read_dir). Both pass the raw handle to resolve_planning_read_dir(repo_root, feature, kind=…) → primary_feature_dir_for_mission:1212, which joins the handle LITERALLY to kitty-specs/<handle>/ (line 1240, no handle→slug step).
  split-brain?: Adjacent, NOT the stale-coord facet. Facet = handle-resolution-seam asymmetry: the gate has a handle-AWARE anchor (_primary_anchor_feature_dir:889 → resolve_mission) and a handle-BLIND read (resolve_planning_read_dir); two call sites use the blind one.
  SSOT/topology fixes it?: Y. Fix = resolve handle→canonical slug FIRST (via the same resolve_mission/_primary_anchor_feature_dir seam the summary already uses), then pass the slug to resolve_planning_read_dir. Restores the handle-awareness resolve_feature_dir_for_mission had pre-#2113.

#2120
  reproduced: Y. Built a real coord-topology mission (materialized -coord worktree, coord branch, lanes.json); ran spec-kitty mission close --mission demo-coord-feat --discard --force. Output: "✓ Mission … discarded." Yet git worktree list and git branch BOTH still show the coord worktree + branch.
  root cause: cli/commands/mission_type.py:close_cmd:595 resolves via resolve_feature_dir_for_mission (coord-aware) → returns the materialized coord dir (no meta.json, live-proved meta? False for all handle forms). Then _read_mission_mid8:632 returns "", so _teardown_coordination_worktree:717 (if not mid8_value: return) early-returns → no teardown. Line 626 prints unconditionally. Secondary: _discard_mission:658-659 deletes lane branches BEFORE removing the worktree they're checked out in (git branch -D fails, swallowed check=False).
  split-brain?: Y. THE canonical split-brain facet. close --discard resolves coord-aware; reopen resolves primary (resolve_mission/primary_feature_dir_for_mission) → two commands, same mission, different surfaces (DIRECTIVE_032 conceptual-alignment fork).
  SSOT/topology fixes it?: Y. LIVE-PROVED: resolving the discard path via primary_feature_dir_for_mission yields meta? True, mid8='01ABCDEF', and CoordinationWorkspace.teardown then succeeds (WORKTREE GONE). Seam = the same primary-anchor reopen already uses (ticket option A). Add ordering fix (B) + non-zero exit on incomplete teardown (C).

#2119
  reproduced: Y. (a) With a materialized coord worktree, canonical_record_path(repo, handle) resolves the retrospective WRITE target to .worktrees/<slug>-<mid8>-coord/kitty-specs/<slug>-<mid8>/retrospective.yaml (live-printed, IS on coord worktree? True), that is, on the ephemeral coord branch. (b) spec-kitty agent worktree --help → No such command 'worktree' (live).
  root cause: (1) retrospective/writer.py:canonical_record_path:36-49 resolves via resolve_feature_dir_for_slug (coord-topology-aware) → durable home lands on the ephemeral coord branch, never .kittify/missions/<mission_id>/. (2) Dead-end remediation text in coordination/surface_resolver.py:119 (_COORD_EMPTY_FALLBACK_WARNING) and :203 (CoordinationBranchDeleted.next_step) + cli/commands/doctor.py:3197/3221 all recommend the nonexistent spec-kitty agent worktree repair.
  split-brain?: Y (facet 1). Same write/read-resolved-through-coord-seam class as #2120, plus a teardown-vs-retro conflict (only copy lives in the worktree merge/close removes) and a dead-end-command facet (orthogonal docs bug).
  SSOT/topology fixes it?: Y (facet 1). Anchor the retrospective DURABLE home on PRIMARY (e.g. primary_feature_dir_for_mission, or the .kittify/missions/<mission_id>/ the mission-review skill already expects), so merge teardown neither blocks on nor destroys it; flatten meta on merge. Partial for the rest: (3) make retrospect create tolerate a flattened/torn-down topology (don't hard-block on CoordinationBranchDeleted for an already-merged mission); (4) fix the remediation text: implement worktree repair or point at the real flatten path.

#2112
  reproduced: Y. spec-kitty init <name> creates SUBDIR <name>/.kittify/config.yaml, never cds in and never runs git init. spec-kitty specify then errors SPEC_KITTY_REPO_NOT_INITIALIZED … Missing: …/.kittify/config.yaml, byte-matching the report (two modes: no .git anywhere → resolved-root=project but reported Missing; OR a parent .git exists → resolver walks UP to the parent, resolved-root=parent). /spec-kitty in Cursor works because the slash surface runs the agent prompt/scripts, not the Typer guard.
  root cause: core/paths.py:resolve_canonical_root:417-419 returns the first ancestor whose .git is a directory with no .kittify boundary check (walks past the project's own .kittify); and :437 raises WorkspaceRootNotFound when no git marker exists even with a valid .kittify/config.yaml. Surfaced via workspace/assert_initialized.py:assert_initialized:92-105 → cli/commands/lifecycle.py:_enforce_initialized:100-101. The sibling locate_project_root:254-265 DOES have the .kittify fallback: classic two-parser divergence.
  split-brain?: N. Repo-root detection, NOT coord/primary split-brain. No worktree/coord branch/kitty-specs involved; a plain ancestor-walk landing on the wrong root for a new single-checkout project. priti's triage CONFIRMED.
  SSOT/topology fixes it?: N (not the coord SSOT). Fix is upstream of topology: give resolve_canonical_root a .kittify-marker boundary/fallback so it agrees with locate_project_root (a root-authority dedup). Dedup: NOT a clean dup; it is the un-fixed residual of #2011 (same function; #2011's patch fixed the submodule pointer branch but left the .git-directory + no-git branches). #539 is a DIFFERENT bug (worktree-root misroute). Link to #2011, extend that fix; do not close as dup of #539.

#2116
  reproduced: N/A (tech-debt, not a live bug). wc -l tasks.py = 3365 LOC (vs ~1200 target); ruff … --select C901 = "All checks passed" (maxCC≤15 IS met).
  root cause: (a) body-thinning only: mega-commands move_task/mark_status/map_requirements undecomposed (agent/tasks.py, move_task def L899 ~770 LOC). (b) tasks.py:_skip_target_branch_commit:530 + _protected_branch_status_commit_error:497 govern a load-bearing exit-0 silent-skip + --json envelope reshape (status_events_path→coord, wp_file_update:"skipped", L1622-1644) that commit_for_mission cannot reproduce (its protected-primary path returns no_op_wrong_surface, an error-class refusal). (c) pre-existing fork: move_task skips exit-0 (L995/L1024) but mark_status:1798 / map_requirements:2425 REFUSE exit-1.
  split-brain?: N. Coord-ADJACENT exit-semantics, NOT the stale-surface-read class. Here surface resolution is CORRECT; topology is correctly detected and the commit is skipped/refused. The defect is inconsistent exit semantics across sibling commands (DIRECTIVE_032 fork): no stale read, no data loss.
  SSOT/topology fixes it?: Partial. (a) orthogonal (pure decomposition). (b)/(c) addressable by teaching the router ONE topology-aware protected+coord exit policy (a router-contract change affecting all callers, exactly what #2114 deferred to stay behavior-neutral).
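
The #2112 root cause above is an ancestor walk that returns the first .git directory without honoring the project's own .kittify boundary. A minimal sketch of a boundary-aware walk follows, assuming that .kittify/config.yaml marks an initialized project (the path the error message reports as Missing). find_project_root is a hypothetical name used only for illustration; it is not the actual body of resolve_canonical_root or locate_project_root.

```python
from pathlib import Path
from typing import Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Hypothetical sketch: return the nearest ancestor carrying a root marker.

    Markers are checked nearest-first, so a project's own .kittify/config.yaml
    wins over a .git directory that lives in some parent checkout.
    """
    start = start.resolve()
    for candidate in (start, *start.parents):
        # An initialized project is a boundary: never walk past it.
        if (candidate / ".kittify" / "config.yaml").is_file():
            return candidate
        # Otherwise fall back to the nearest git marker (directory or file).
        if (candidate / ".git").exists():
            return candidate
    # No marker found anywhere on the way up.
    return None
```

With this shape, a project created inside a parent git checkout resolves to itself, and a project with no .git at all still resolves as long as its config file exists.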

Cross-cutting structural read (DIRECTIVE_001 / 032 — divergence matrix)

The single recurring structural fault across #2120, #2119(facet 1), and #2122 is one boundary violation: a primary-anchored operation is routed through a coordination-topology-aware resolver, so it lands on the ephemeral -coord surface instead of the durable primary checkout.

Operation | Wrong (coord-aware) seam used | Correct (SSOT-aligned) seam | Symptom
close --discard identity/mid8 read | resolve_feature_dir_for_mission (mission_type.py:595) | primary_feature_dir_for_mission / resolve_mission (as reopen does) | silent no-op teardown, false success message
retrospective DURABLE write | resolve_feature_dir_for_slug (writer.py:48) | primary anchor / .kittify/missions/<id>/ | retro destroyed/blocked by teardown
accept-gate WP-task + encoding read | handle passed blind to resolve_planning_read_dir (acceptance:406/645) | resolve handle→slug FIRST, then primary read | --mission <mid8> accept-gate crash / silent no-op
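
The first row of the matrix can be modeled with a toy directory tree. The sketch below is illustrative only: coord_aware_dir, primary_dir, and read_mid8 are hypothetical stand-ins (not the real resolvers), the directory names are simplified, and the "mission_id" key in meta.json is an assumption. It shows how the same identity read returns an empty mid8 through a coord-aware seam and the real mid8 through the primary seam.

```python
import json
import tempfile
from pathlib import Path


def primary_dir(repo: Path, slug: str) -> Path:
    """Hypothetical primary resolver: always the primary checkout."""
    return repo / "kitty-specs" / slug


def coord_aware_dir(repo: Path, slug: str) -> Path:
    """Hypothetical coord-aware resolver: prefers a materialized -coord worktree."""
    coord = repo / ".worktrees" / f"{slug}-coord" / "kitty-specs" / slug
    return coord if coord.is_dir() else primary_dir(repo, slug)


def read_mid8(feature_dir: Path) -> str:
    """Return the first 8 chars of the mission id, or "" when meta.json is absent."""
    meta = feature_dir / "meta.json"
    if not meta.is_file():
        return ""
    # "mission_id" is an assumed key name for this sketch.
    return json.loads(meta.read_text())["mission_id"][:8]


def demo() -> tuple:
    slug = "demo-coord-feat"
    with tempfile.TemporaryDirectory() as tmp:
        repo = Path(tmp)
        primary = primary_dir(repo, slug)
        primary.mkdir(parents=True)
        (primary / "meta.json").write_text(
            json.dumps({"mission_id": "01ABCDEFGHJKMNPQRSTVWXYZ01"})
        )
        # Materialized coord worktree with no meta.json, as in the #2120 repro.
        (repo / ".worktrees" / f"{slug}-coord" / "kitty-specs" / slug).mkdir(parents=True)
        return (
            read_mid8(coord_aware_dir(repo, slug)),  # coord seam: identity lost
            read_mid8(primary_dir(repo, slug)),      # primary seam: identity found
        )


print(demo())  # → ('', '01ABCDEF')
```

An empty mid8 is exactly the value that lets a guard such as "if not mid8_value: return" skip teardown without reporting anything.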

primary_feature_dir_for_mission is topology-blind, which is correct for these operations, but it is also handle-blind (literal join, no mid8→slug step). So the unified fix has TWO obligations:

  1. Anchor primary-partition operations on the primary surface (#2120, #2119-write) — stop using the coord-aware resolver for durable/identity reads.
  2. Resolve handle→canonical slug at the input boundary before the literal primary join (#2122) — the gate already owns this seam (_primary_anchor_feature_dir/resolve_mission); the two stragglers must reuse it, not the blind primitive.
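
The second obligation can be sketched as a two-step read: resolve the handle to a canonical slug at the input boundary, then perform the literal primary join. This is a minimal sketch under stated assumptions: resolve_slug and primary_tasks_dir are hypothetical helpers (the real seam is resolve_mission/_primary_anchor_feature_dir, whose implementation is not shown in this report), and the "mission_id" key in meta.json is assumed.

```python
import json
from pathlib import Path


def resolve_slug(repo_root: Path, handle: str) -> str:
    """Hypothetical: map a slug or mid8 handle to the canonical mission slug."""
    if not handle:
        raise LookupError("empty handle")
    specs = repo_root / "kitty-specs"
    if (specs / handle).is_dir():
        return handle  # already a canonical slug
    matches = []
    for meta in sorted(specs.glob("*/meta.json")):
        # "mission_id" is an assumed key name for this sketch.
        mission_id = json.loads(meta.read_text()).get("mission_id", "")
        if mission_id.startswith(handle):
            matches.append(meta.parent.name)
    if len(matches) != 1:
        # Refuse to guess on zero or ambiguous matches.
        raise LookupError(f"handle {handle!r} matched {len(matches)} missions")
    return matches[0]


def primary_tasks_dir(repo_root: Path, handle: str) -> Path:
    """Resolve the handle FIRST, then do the literal join on the primary surface."""
    slug = resolve_slug(repo_root, handle)
    return repo_root / "kitty-specs" / slug / "tasks"
```

Passing the mid8 straight into the join would instead produce kitty-specs/<mid8>/tasks, the nonexistent path named in the #2122 AcceptanceError.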

This is the same single-authority discipline that the surface-resolver missions (01KVN754 / #2040) established; the three tickets above are residual non-adoption sites.

Coherent 3.2.3 SSOT/topology-alignment slice

  • Tier 1 — same-class split-brain (one coherent slice, ships together):
    • #2120 — discard path resolves PRIMARY (option A) + worktree-before-branch ordering (B) + non-zero exit on incomplete teardown (C). Red-first: the deterministic close-discard coord regression.
    • #2119 facet 1 — retrospective durable home anchored on PRIMARY (+ flatten meta on merge). Red-first: assert canonical_record_path for a materialized-coord mission is NOT under .worktrees/…-coord/.
    • #2122 — resolve handle→slug before resolve_planning_read_dir at _iter_work_packages + normalize_feature_encoding (and audit any other gate-read site for the same blind-handle bug). Red-first guard already exists (…_for_mid8_handle).
  • Tier 2 — coord-ADJACENT, fold in but separable:
    • #2119 facets 3/4: make retrospect create tolerate a flattened topology; replace the phantom agent worktree repair text (surface_resolver.py:119/203, doctor.py:3197/3221) with the real flatten path (or implement the command).
    • #2116 (b)/(c) — unify the protected+coord exit semantics into ONE router-owned policy (topology-aware exit, not stale-read); (a) body-thinning is orthogonal cleanup.
  • NOT in this slice — different root authority:
    • #2112 — repo-root detection (resolve_canonical_root .kittify boundary/fallback). Residual of #2011, NOT coord split-brain. Fix independently; link to #2011 (not a dup of #539).
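
Items (B) and (C) of the #2120 bullet in Tier 1 can be sketched as follows. This is an illustrative outline, not the project's _discard_mission: discard_coordination and the injectable run parameter are hypothetical, and only the two git subcommands (git worktree remove --force, git branch -D) are real. The ordering matters because git refuses to delete a branch that is still checked out in a worktree.

```python
import subprocess
from typing import Callable, List, Sequence

Runner = Callable[[Sequence[str]], int]


def _git(args: Sequence[str]) -> int:
    """Run a git command and return its exit code without raising on failure."""
    return subprocess.run(["git", *args], check=False).returncode


def discard_coordination(worktree: str, branches: Sequence[str], run: Runner = _git) -> int:
    """Hypothetical teardown: worktree first, then branches; non-zero if incomplete."""
    failures: List[str] = []
    # (B) Remove the worktree BEFORE its branches, so the later deletes can succeed.
    if run(["worktree", "remove", "--force", worktree]) != 0:
        failures.append(f"worktree {worktree}")
    for branch in branches:
        if run(["branch", "-D", branch]) != 0:
            failures.append(f"branch {branch}")
    # (C) Report every incomplete step instead of swallowing the exit codes.
    for item in failures:
        print(f"teardown incomplete: {item}")
    return 1 if failures else 0
```

A caller would print the success line only when the return value is 0 and otherwise exit with the returned code. The run parameter exists so the ordering can be asserted in a test without touching a real repository.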