CaaCS findings ↔ #822 open sub-issues cross-check
Inputs
- Audit:
/home/stijn/Documents/_code/SDD/fork/spec-kitty/docs/architecture/audits/2026-05-spec-kitty-caacs.md(commitbc64dec6ee37dbbd6bc21a0a1aa3195f2bab1b57, 2026-05-08). - Epic: Priivacy-ai/spec-kitty#822 — "Epic: 3.2.0 stabilization and release readiness", state OPEN.
- Issues read (state captured 2026-05-08/09 via
gh issue view):
| # | State at read | Title (truncated) |
|---|---|---|
| 822 | OPEN | Epic: 3.2.0 stabilization and release readiness |
| 971 | OPEN | mypy strict gate fails on current baseline |
| 889 | OPEN | sync now misclassifies teamspace ingress rejection as server_error |
| 952 | CLOSED | Successful agent state transitions still leak SaaS final-sync errors (was reopened post-PR#959 per epic comment, now closed again at read time) |
| 662 | OPEN | Resolve automated CI workflow duplication |
| 825 | OPEN | Restore push-time SonarCloud after project quality gate cleanup |
| 595 | OPEN | Backlog: close Sonar quality-gate debt on release-path coverage and hotspot review |
| 771 | OPEN | spec-kitty merge: auto-rebase stale lanes onto updated mission branch |
| 631 | OPEN | Document workaround for MCP agent root confusion with worktrees on Windows |
| 630 | OPEN | Replace shell=True subprocess calls in review/baseline.py and acceptance_matrix.py |
| 726 | OPEN | sorted() in scan_for_plans should key on filename, not full path |
| 728 | OPEN | clear_mission_brief should use unlink(missing_ok=True) |
| 729 | OPEN | --show truncates brief_hash to 16 chars |
| 629 | OPEN | Add @pytest.mark.windows_ci test for os.symlink fallback |
| 644 | OPEN | Encoding mixups: stop assuming UTF-8 by default |
| 740 | OPEN | Notify users when SpecKitty starts being used and no upgrade is available |
| 323 | OPEN | Printing a page from the dashboard loses the end of it |
| 306 | OPEN | Offline queue fills at 10,000 events and starts dropping new sync events |
| 260 | OPEN | Worktree 'incompatbility' when changing to worktree sub-directory (referenced from #631) |
| 253 | OPEN | Confusion with worktrees (referenced as "docs evidence only") |
| 303 | OPEN | Use node-id set union in CI selector audit (epic says "do not schedule without current repro") |
| 317 | OPEN | Unable to install via pip (epic says "do not schedule without current repro") |
Issues from the epic body that were already closed at read time and are therefore excluded from the main maps: #967, #966, #964, #968, #904, #848 (epic's "blocker tranche" — closed by mission stable-320-p1-release-confidence-01KQTPZC per epic comments). Note on #952 in the closed-issue addendum below.
Audit findings (canonical numbering for this doc)
The numbering follows the audit's executive summary, hotspot table, triage matrix, and cross-cutting observations. F1–F5 mirror the audit's own "Top findings" list; F6–F12 cover the additional structural items the audit raises explicitly elsewhere (triage matrix and cross-cutting observations).
- F1 — Bus factor effectively 1. 89.5% of
src/commits in 1y are by one author; 14 of 15 top hotspots >90% single-author. (Audit §1, "Bus factor / knowledge map".) - F2 —
cli/commands/agent/{tasks,workflow,mission}.pyis the strongest refactor target. Three files, ~7,955 SLOC, contain five of the seven worst-complexity functions in the repo (finalize_tasksCC=160,move_taskCC=139,statusCC=87,reviewCC=84,map_requirementsCC=74). Top-three temporal-coupling pair (45 co-changes betweentasks.pyandworkflow.py). Note:finalize_taskslives inmission.py, nottasks.py, despite the name — itself a smell. - F3 — Pipeline trust is healthy. Firefighting frequency is genuinely low (~0.3% of commits after stripping false positives). The team trusts the merge pipeline.
- F4 — Project is alive and accelerating. 1001 commits to
src/in 1y, 2026-Q1 exceeds the previous five months combined; last 30 days = 193 commits. - F5 — Three empty
src/leftover dirs.src/runtime/,src/dashboard/(top-level — distinct fromsrc/specify_cli/dashboard/), andsrc/constitution/contain only stale__pycache__/. Cleanup candidate. - F6 — Duplicated
task-prompt-template.md.missions/software-dev/templates/task-prompt-template.mdandtemplates/task-prompt-template.mdco-edited 15× in 1y. Likely connascence-of-meaning bug; one is dead-code-by-edit-count. - F7 — Template ↔ dispatcher coupling is load-bearing but high-volume. Pairs #11, #14, #18, #19 in the temporal-coupling table all show command-template
.mdfiles co-changing with their CLI dispatchers 12–18×/y. Designed coupling, but worth a template-loader abstraction. (Audit cross-cutting observation #2.) - F8 —
cli/commands/charter.pydecomposition. 2934 SLOC, MI=C, three E-rated functions (interviewE=38). Effectively four CLI verbs in one file. (Triage matrix "Important + not urgent".) - F9 —
cli/commands/init.pydecomposition. 1018 SLOC, F-94init. Worst MI in the CLI command layer aftercharter.py. (Triage matrix.) - F10 —
next/runtime_bridge.pyis a hub, not a bridge. 2552 SLOC, F-46, rank #21 by churn, rank #7 by SLOC. (Triage matrix.) - F11 — D-rated migrations need cleanup.
m_0_10_8_fix_memory_structure.py(F-47),m_3_1_1_charter_rename.py(D-27),m_3_2_0_codex_to_skills.py(D-24),m_3_2_3_unified_bundle.py(D-22). Parallelisable, low-risk. (Triage matrix.) - F12 —
sync/package is a coherent cluster needing a second maintainer, not a refactor. High internal coupling, low external coupling. (Cross-cutting observation #3 — explicit non-finding for refactor; finding for ownership.) - F13 — No
tests/overlay. Audit cannot tell a CC=160 function with 95% coverage from one with no tests. (Limitations #2 and follow-up #9 — explicit blind spot.) - F14 — No
kitty-specs/overlay. Strongest causal coupling in the codebase (feature-spec ↔ source) is invisible to the current run. (Limitations #1 and follow-up #10.)
Forward map: Finding → matching #822 sub-issues
For each F#, listing every OPEN #822 sub-issue that overlaps. STRONG = direct topical match; PARTIAL = same area / adjacent concern; WEAK = distant relationship that requires interpretation.
F1 — Bus factor = 1
No matching open sub-issue. The epic does not track the knowledge-transfer or co-maintainership question at all. Zero matches.
F2 — agent/{tasks,workflow,mission}.py refactor target
No matching open sub-issue. The epic frames 3.2.0 strictly as stabilization, not refactor; it does not track the high-CC concentration in agent/ despite this being the audit's strongest "both unstable and known-defective" signal. Zero matches.
F3 — Pipeline trust healthy
This is a non-finding (positive signal). No issue should match.
| Issue | State | Strength | Reason |
|---|---|---|---|
| #662 | OPEN | WEAK | "CI workflow duplication" is adjacent to pipeline trust — duplicate runs slow the gate but the gate itself is healthy. |
| #825 | OPEN | WEAK | Push-time SonarCloud being quarantined is a specific gate-quarantine, not a trust failure; matches the theme of pipeline maintenance. |
| #595 | OPEN | WEAK | Sonar quality-gate debt is gate hygiene, not trust failure. |
F4 — Project alive and accelerating
Non-finding (positive signal). Zero matches.
F5 — Three empty src/ leftover dirs
No matching open sub-issue. The epic does not track the cleanup. Zero matches.
F6 — Duplicate task-prompt-template.md
No matching open sub-issue. The epic body and currently-open issues do not mention the template duplication. Zero matches.
F7 — Template ↔ dispatcher coupling load-bearing
No matching open sub-issue. Zero matches.
F8 — cli/commands/charter.py decomposition
No matching open sub-issue. Zero matches.
F9 — cli/commands/init.py decomposition
No matching open sub-issue. Zero matches.
F10 — next/runtime_bridge.py is a hub
No matching open sub-issue. Zero matches.
F11 — D-rated migrations cleanup
| Issue | State | Strength | Reason |
|---|---|---|---|
| #629 | OPEN | WEAK | Touches a migration (m_0_8_0_worktree_agents_symlink.py) but for Windows test coverage, not complexity rating. Same area, different concern. |
F12 — sync/ cluster needs a second maintainer
| Issue | State | Strength | Reason |
|---|---|---|---|
| #889 | OPEN | PARTIAL | sync now misclassification touches sync/batch.py — one of the cluster files the audit calls out as healthy-but-monopolised. Issue is a behavioural bug, not an ownership question, but lives in the exact subsystem F12 names. |
| #306 | OPEN | PARTIAL | Offline queue drop-mode lives in sync/queue.py and sync/emitter.py — same cluster. Same caveat: behaviour bug, not ownership. |
F13 — No tests/ overlay (audit blind spot)
No matching open sub-issue. Zero matches. (Test-related issue #967 was closed before this read.)
F14 — No kitty-specs/ overlay (audit blind spot)
No matching open sub-issue. Zero matches.
Reverse map: Open #822 sub-issue → matching audit findings
For each OPEN #822 sub-issue, which F# (if any) it matches.
| Issue | Title | Matches | Strength | One-line reason |
|---|---|---|---|---|
| #971 | mypy strict gate fails on baseline | F13 (test/quality overlay) | WEAK | Strict-mypy is type-coverage; audit explicitly notes it cannot see test/coverage overlays. Adjacent concern, not a direct call-out. |
| #889 | sync now teamspace classification | F12 | PARTIAL | Lives in sync/batch.py, a file in the cluster F12 says is healthy-but-monopolised. |
| #952 | CLOSED at read time | — | — | Excluded from main maps (see addendum). |
| #662 | CI workflow duplication | F3 | WEAK | Pipeline-hygiene topic; audit confirms pipeline trust is healthy so duplication is wastage, not risk. |
| #825 | Restore push-time SonarCloud | F3 | WEAK | Same theme as #662. |
| #595 | Sonar quality-gate debt | F3, F13 | WEAK | Gate debt is pipeline hygiene; coverage portion overlaps F13 (audit could not see coverage). |
| #771 | merge auto-rebase stale lanes | F2 (peripherally) | WEAK | merge.py is in the audit's hottest cluster (rank #5 churn, rank #4 temporal coupling at 26 co-changes with implement.py). The audit flags merge.py as F-63 _run_lane_based_merge_locked. Issue is about UX behaviour, not the F-rated function, but lives in the same file. |
| #631 | MCP worktree root confusion (docs) | None | — | Docs/UX issue; no audit signal on the worktree-MCP boundary. |
| #630 | shell=True subprocess on Windows | None | — | review/baseline.py and acceptance_matrix.py are not in the audit's top-30 hotspots. No structural signal. |
| #726 | sorted() filename key | None | — | One-line nit; not in any hotspot. |
| #728 | unlink(missing_ok=True) | None | — | One-line nit; not in any hotspot. |
| #729 | brief_hash truncation UX | None | — | UX/docs question; not in any hotspot. |
| #629 | symlink fallback Windows test | F11 | WEAK | Touches a migration (m_0_8_0_…) — same area as F11, different concern. The migrations the audit flags as D-rated are different files (m_0_10_8, m_3_1_1, m_3_2_0, m_3_2_3). |
| #644 | Encoding policy / UTF-8 assumptions | None | — | Cross-cutting policy concern; the audit did not surface encoding-handling as a hotspot. |
| #740 | "no upgrade available" UX | None | — | UX-only; not in any hotspot. |
| #323 | Dashboard print clipping | None | — | UI bug; the dashboard JS file (dashboard/static/dashboard/dashboard.js) is rank #15 by churn but the audit treats it as expected coupling with dashboard/scanner.py (rank #16). The print-CSS issue is not surfaced by churn. |
| #306 | Offline queue drop mode | F12 | PARTIAL | Lives in sync/queue.py and sync/emitter.py — F12 cluster. Behaviour bug, but same subsystem. |
| #260 | Worktree sub-directory incompatibility | F2 (peripherally) | WEAK | core/worktree.py is in the audit's hot cluster (rank #24). Issue is an MCP/editor-config problem, not a code-structure problem. |
| #253 | "Confusion with worktrees" (docs/UX) | None | — | Pure UX/onboarding; the epic notes this as docs evidence only. |
| #303 | CI selector audit set-union | None | — | CI workflow logic; not in audit scope. Epic says "do not schedule without current repro". |
| #317 | pip install failure | None | — | Packaging issue; not in audit scope. Epic says "do not schedule without current repro". |
Two gap lists
Gap list 1 — Audit findings without an open #822 issue
These are forensic signals that nobody is currently tracking under the 3.2.0 epic.
- F1 — Bus factor. Audit's #1 finding. No issue tracking the knowledge-transfer / co-maintainership question.
- F2 —
agent/{tasks,workflow,mission}.pyrefactor target. Audit's strongest "both unstable and known-defective" signal. No issue tracking decomposition oftasks.py(3746 SLOC),workflow.py(1895 SLOC), ormission.py(2314 SLOC, contains the misnamedfinalize_tasksF-160). - F5 — Empty
src/leftover dirs (src/runtime/,src/dashboard/,src/constitution/). Trivial cleanup, not tracked. - F6 — Duplicate
task-prompt-template.mdat two paths, 15 co-edits/y. Connascence-of-meaning bug not tracked. - F7 — Template ↔ dispatcher coupling abstraction. Audit suggests evaluating a template-loader abstraction; not on the epic.
- F8 —
cli/commands/charter.pydecomposition (2934 SLOC, three E-rated functions). Not tracked. - F9 —
cli/commands/init.pydecomposition (F-94init, 1018 SLOC). Not tracked. - F10 —
next/runtime_bridge.py(2552 SLOC, F-46 — bridge or hub?). Not tracked. - F11 — D-rated migrations (
m_0_10_8,m_3_1_1,m_3_2_0,m_3_2_3). Migration cleanup not tracked. - F12 —
sync/cluster ownership. Audit explicitly recommends "a second maintainer" rather than refactor; not tracked as an ownership/onboarding issue. - F13 — Test-coverage overlay on F-rated functions. Audit follow-up #9 ("re-run with
tests/in scope"). Not tracked. - F14 —
kitty-specs/↔src/temporal coupling. Audit follow-up #10. Not tracked.
Gap list 2 — Open #822 sub-issues without forensic backing
These exist on the epic but the audit did not surface a structural signal under them. Some are valid scope-by-design (UX nits, docs); some may be over-tracked relative to the structural risk.
- #631 — MCP/worktree docs. UX/docs concern.
- #630 —
shell=Trueon Windows. Files not in top-30 hotspots. - #726 —
sorted()key. One-line nit. - #728 —
unlink(missing_ok=True). One-line nit. - #729 —
brief_hashtruncation. UX question. - #629 — Windows symlink test. Test gap (audit cannot see tests).
- #644 — Encoding policy. Cross-cutting policy; not a hotspot.
- #740 — Upgrade-available notification. UX-only.
- #323 — Dashboard print CSS. UI-only.
- #260 — Worktree sub-directory MCP confusion. Editor-config issue.
- #253 — Worktree onboarding confusion. Docs/UX.
- #303 — CI selector audit. CI logic, out-of-scope.
- #317 — pip install failure. Packaging, out-of-scope.
Items where the forensic backing is WEAK / PARTIAL rather than absent: #971 (F13 weak), #889 (F12 partial), #662 (F3 weak), #825 (F3 weak), #595 (F3+F13 weak), #771 (F2 weak), #306 (F12 partial). These are the only issues with any audit signal at all; everything else in gap list 2 has zero structural backing.
Methodology / caveats
- Issue state was checked at read time (2026-05-08/09). Two issues that the prior research pass treated as open are now closed:
- #952 ("Successful agent state transitions still leak SaaS final-sync errors") — per the 2026-05-04 epic comment, was reopened during mission
stable-320-p1-release-confidence-01KQTPZCafter WP02 hosted-sync drain proof. At read time it shows CLOSED again. Excluded from main maps. - The epic's "blocker tranche" (#967, #966, #964, #968, #904, #848) was closed in the same mission. These were prominent in the epic body but are not currently open work; they are excluded from the main maps.
- #952 ("Successful agent state transitions still leak SaaS final-sync errors") — per the 2026-05-04 epic comment, was reopened during mission
- No interpretive quoting of issue content was needed. Every match in the maps above is based on file/path references in the issue body or labels (e.g., #889 explicitly names
sync/batch.py, #306 namessync/queue.pyandsync/emitter.py, #629 namesm_0_8_0_…). - No issues failed to fetch. All 21 issues read returned full JSON via
gh issue view. - Match strength is conservative. Files-in-the-same-package counts as PARTIAL only when the audit explicitly names the package as a cluster (F12
sync/). Files-in-the-same-cluster but unnamed counts as WEAK (e.g., #771 formerge.pyin F2's broader cluster). - The audit's tactic is forensic-history, not behavioural. That is why so many open issues — UX bugs, one-line correctness nits, packaging concerns — have no audit backing. This is not evidence the issues are wrong; it is evidence that history-based forensics cannot see them. Cross-checking is direction-agnostic; a behavioural-test-oriented audit would produce a different gap list.
Closed-issue note (informational only)
Five of the six already-closed epic-blocker issues had clear behavioural roots that the audit would surface if it had tests/ and behavioural data in scope:
- #967 (status test suite hangs) → would map to F13 (test overlay missing).
- #966 (task-board progress reporting) → would map to F2 (
agent/status.pyis rank #30, in the same cluster). - #964 (skill YAML frontmatter) → no audit signal; mission template / packaging concern.
- #968 (retired checklist leftovers) → would map to F11 (migration / cleanup theme).
- #904 (stale rejected-review verdict) → would map to F2 (review-cycle policy, lives in
agent/workflow.pycluster). - #848 (uv.lock / installed pin drift) → no audit signal; packaging concern.
Listed here only because they are mentioned prominently in the epic body and a reader of #822 may expect them in the main maps. They are out of scope for the active forward/reverse maps because they are no longer open.