2026-05-11 — findings vs issues (#645, #822 refresh, open bugs)

Inputs

  • Audit referenced: docs/architecture/audits/2026-05-spec-kitty-caacs.md (commit bc64dec6ee37dbbd6bc21a0a1aa3195f2bab1b57, 2026-05-08).
  • Prior crosscheck: docs/architecture/audits/2026-05-822-crosscheck.md (dated 2026-05-08).
  • Multi-window refresh (2026-05-11) introduced two new slow-burn refactor candidates: src/specify_cli/orchestrator_api/commands.py (1097 SLOC, F#28 full-history, R#24 4-mo) and src/specify_cli/agent_utils/status.py (570 SLOC, F#29 full-history, R#26 4-mo, contains the _display_status_board F-53 renderer).
  • Provisional doctrine paradigm: brownfield-onboarding (investigate before changing; document/transfer first; hierarchical reference bundles).
  • Issue tracker reads (all via gh against Priivacy-ai/spec-kitty, 2026-05-11):
    • gh issue view 645 --comments --json …
    • gh issue view 822 --comments --json …
    • gh api repos/Priivacy-ai/spec-kitty/issues/822/comments --paginate (since-filter for created_at >= 2026-05-08)
    • gh issue list --label bug --state open --limit 100 --json …
    • body fetches for every open bug-labeled ticket (#1009, #992, #988, #989, #990, #991, #983, #984, #985, #986, #987, #971, #889, #822, #662, #644, #391)
    • searches: orchestrator_api, agent_utils/status, glossary middleware
    • gh release list --limit 15 (release-tag progression)
    • gh issue view 613 (glossary ownership)
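
The since-filter applied to the paginated comment read can be sketched as follows. This is a minimal sketch, assuming the comment payload is a list of objects carrying the ISO 8601 created_at field returned by the GitHub REST API; the function name comments_since and the sample records are hypothetical and are not taken from the actual read.

```python
from datetime import datetime, timezone


def comments_since(comments, cutoff):
    """Return the comments whose created_at is at or after the cutoff."""
    kept = []
    for comment in comments:
        # GitHub timestamps look like "2026-05-05T14:03:11Z". Map the
        # trailing "Z" to an explicit offset so fromisoformat accepts it
        # on Python versions older than 3.11.
        created = datetime.fromisoformat(
            comment["created_at"].replace("Z", "+00:00")
        )
        if created >= cutoff:
            kept.append(comment)
    return kept


# Hypothetical sample payload, for illustration only.
sample = [
    {"id": 1, "created_at": "2026-05-05T14:03:11Z"},
    {"id": 2, "created_at": "2026-05-04T09:00:00Z"},
]
cutoff = datetime(2026, 5, 8, tzinfo=timezone.utc)
recent = comments_since(sample, cutoff)
print(len(recent))
```

An empty result from a filter of this shape is what "no new comments since 2026-05-08" means in Section 2.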

Section 1: Issue #645

Title: Epic: Stable Application API Surface (UI / CLI / MCP / SDK)
State: OPEN
Labels: dashboard, epic
Author: stijn-dejongh
Created: 2026-04-15
Updated: 2026-05-04

Body summary (≤200 words)

Issue 645 is a multi-step architectural epic (rescoped 2026-05-03 from "Frontend Decoupling and Application API Platform" to "Stable Application API Surface"). Its aim is to give Spec Kitty a single, documented, stable retrieval surface that the dashboard UI, CLI, future MCP adapter, and future external SDKs all consume, instead of each independently walking the filesystem under kitty-specs/*/. The proof points cited as the current problem are the stdlib server bootstrap, hand-rolled routing, dashboard-local TypedDict response contracts, and monolithic frontend controller logic, all internal to the dashboard package. The epic sequences its work as steps 0 through 9: (0) codify single-entry-point doctrine + architectural tests, (1–4) terminology, handler-to-service extraction, contract hardening, transport migration (all now DONE on feature/650-dashboard-ui-ux-overhaul), (5) stable retrieval surface via MissionRegistry+cache (#956, DONE), (6) resource-oriented endpoints + WorkPackageAssignment schema (#957/#958, DONE in mission 01KQQRF2), (7) glossary/lint service-extraction follow-ups (#954/#955), (8) async update transport (WebSocket/SSE), (9) generated clients and new consumer slices. The HATEOAS-LITE _links convention is in scope. Out of scope: visual redesign (#650), public docs site (#651), DRG/doctrine refactors not tied to access layers.

Linked work

Reference | State (2026-05-11) | Role in epic
#459 .d.ts codegen | CLOSED | Folded into FastAPI mission
#460 FastAPI/OpenAPI migration | CLOSED | Done (mission frontend-api-fastapi-openapi-migration-01KQN2JA)
#447 weighted progress | CLOSED | Done
#537 WPState/Lane consumer migration | CLOSED | Done (mission 080)
#538 status emission / dirty worktree | OPEN | Partial; Mode B hardening + dirty-worktree recovery remain
#391 cross-cutting remediation umbrella | OPEN | Parent debt epic
#956 MissionRegistry + cache | DONE | in mission 01KQPDBB
#957 resource-oriented endpoints + WorkPackageAssignment | DONE | mission 01KQQRF2
#958 OpenAPI tag grouping | DONE | folded into 957's mission
#954 / #955 glossary & lint service extraction | OPEN | follow-ups
#650 UI/UX shared design system | OPEN | sibling epic
#361 historical predecessor | CLOSED |

Specific code paths referenced

  • src/specify_cli/dashboard/server.py:61 (legacy stdlib server, retained for rollback)
  • src/specify_cli/dashboard/handlers/router.py:14 (hand-rolled router)
  • src/specify_cli/dashboard/api_types.py:1 (TypedDicts — Phase 1)
  • src/specify_cli/dashboard/static/dashboard/dashboard.js:1 (monolithic frontend)
  • src/dashboard/ (new canonical service package: services/mission_scan.py, services/project_state.py, services/sync.py, file_reader.py)
  • src/dashboard/api/ (FastAPI subpackage with 12 routers)
  • src/charter/context.py, src/charter/sync.py (charter chokepoint, upstream dependency of #460)
  • tests/architectural/test_dashboard_boundary.py (FR-010 invariant guard)
  • tests/architectural/test_fastapi_handler_purity.py (≤6-LOC handler bodies)

Relationship to findings

Finding | Strength | Reason
F1 (Bus factor=1) | NONE | The epic is purely architectural; it does not address single-author concentration.
F2 (cli/commands/agent/{tasks,workflow,mission}.py) | NONE | The epic is dashboard/application-API-shaped. The F2 hotspot is the orchestration command layer, untouched by #645.
F15 (test-update lag on F2 hotspots) | NONE | Same scope mismatch as F2.
F16 (glossary/middleware.py under-tested) | WEAK | Step 7 (#954) extracts glossary handler from dashboard transport; this could surface but does not target middleware.py.
F17 (mission↔src/ co-change limited) | NONE | Structural pipeline observation, not addressed.
F18 (agent_utils/status.py under-tested) | PARTIAL | The epic's MissionRegistry (#956) and WorkPackageAssignment schema (#957) create a canonical materialization layer that agent_utils/status.py should eventually consume; if executed, it would naturally reduce that file's filesystem-walking footprint and make its renderer thinner. Not a direct target, but in the structural path.
orchestrator_api/commands.py slow-burn | WEAK | The epic explicitly says "do not overload the orchestrator API with product/application concerns"; it is orthogonal by design.
Brownfield-onboarding paradigm | PARTIAL | The epic's Step 0 explicitly "codifies doctrine + architectural tests" before refactoring further, which is the brownfield "document first, refactor second" pattern. Also: docs/architecture/05_ownership_map.md updates are part of each merged mission. The doctrine layer is exactly where brownfield-onboarding would land.

Overall: #645 is a strong-execution architectural epic, but its scope only weakly overlaps the audit's structural-remediation findings. It is also visibly close to done — Steps 1–6 have shipped, Steps 7–9 are the remainder.


Section 2: Issue #822 — delta since 2026-05-08 crosscheck

Title: Epic: 3.2.0 stabilization and release readiness
State: OPEN (unchanged)
Labels: bug, workflow, release, epic (unchanged)

New comments since 2026-05-08

Zero new comments on the epic body itself. The last visible comment is dated 2026-05-05 (the final-gate rerun + mission hygiene comment from robertDouglass). Both gh issue view 822 --comments and the paginated gh api .../issues/822/comments query produced no entries with created_at >= 2026-05-08.

Sub-issue state changes since 2026-05-08

Comparing the prior crosscheck's table against the current open-bug list:

# | Prior state (2026-05-08) | Current state (2026-05-11) | Delta
#967 | CLOSED | CLOSED |
#966 | CLOSED | CLOSED |
#964 | CLOSED | CLOSED |
#968 | CLOSED | CLOSED |
#904 | CLOSED | CLOSED |
#848 | CLOSED | CLOSED |
#971 mypy strict | OPEN | OPEN |
#889 sync misclassification | OPEN | OPEN |
#952 SaaS sync leak | reopened then closed at read | not in current open list | confirmed closed
#662, #825, #595, #771, #631, #630, #726, #728, #729, #629, #644, #740, #323, #306, #260, #253, #303, #317 | all OPEN | #662 and #644 confirmed still OPEN in current bug query; the rest are still OPEN (no closure event seen) | no change
NEW since prior crosscheck: #983, #984, #985, #986, #987, #988, #989, #990, #991, #992, #1009 | not present | all OPEN | +11 new open bug tickets
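
The "+11" delta can be reproduced as a set difference between the issue numbers in the prior crosscheck table and the current open bug list. This is a minimal sketch using only the numbers quoted in this document. Leaving #822 (the epic under review) and #391 (the umbrella epic) out of the "new" count is an assumption about how the delta was tallied, and the variable names are illustrative.

```python
# Issue numbers tracked by the prior crosscheck table (closed and open rows).
prior_tracked = {
    967, 966, 964, 968, 904, 848, 971, 889, 952,
    662, 825, 595, 771, 631, 630, 726, 728, 729, 629,
    644, 740, 323, 306, 260, 253, 303, 317,
}

# Open bug-labeled issues returned by the 2026-05-11 list query.
current_open_bugs = {
    1009, 992, 988, 989, 990, 991, 983, 984, 985, 986, 987,
    971, 889, 822, 662, 644, 391,
}

# Assumption: the two epics carrying the bug label are not counted as new.
epics_not_counted = {822, 391}

new_since_prior = sorted(current_open_bugs - prior_tracked - epics_not_counted)
still_open_in_both = sorted(current_open_bugs & prior_tracked)

print(len(new_since_prior), new_since_prior)
print(still_open_in_both)
```

The intersection is the set of previously tracked issues that the bug-label query can confirm as still open; the other previously open issues carry different labels and do not appear in that query.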

The largest delta is the opening of 11 new bug-labeled issues between 2026-05-05 and 2026-05-07, most of them surfaced during mission auth-local-trust-and-multi-process-hardening-01KQW587 and stable-320-release-blocker-cleanup-01KQW4DF. Notably, #992 ("Epic: drain the bug queue by repairing domain boundaries") was opened 2026-05-05 and is itself an architectural meta-epic — it diagnoses the same kind of structural seam problem the audit identified (commands independently inferring/mutating the same truth), though from a different angle (lifecycle invariants rather than file-level complexity).

Release-tag progression

Tag | Date | Type
v3.2.0a10 | 2026-05-04 | prerelease
v3.2.0rc1 | 2026-05-05 | prerelease
v3.2.0rc2 | 2026-05-06 | prerelease
v3.2.0rc3 | 2026-05-06 | prerelease
v3.2.0rc4 | 2026-05-10 | prerelease
v3.1.8 | 2026-04-29 | Latest stable

rc1, rc2, rc3, and rc4 have all been cut since 2026-05-05. The epic narrative has not been updated to reflect the rc progression; the epic body still anchors on "3.2.0a10" status. rc4 is the most recent, dated 2026-05-10 (the day before this read). Stable 3.2.0 has not been tagged. Four rc tags shipping in about five days while the epic body stays unchanged is itself a signal: either the rc bar is moving (each rc triggered by newly discovered blockers among the 11 newly opened bugs) or the epic is no longer the live coordination surface for the release.
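
The rc cadence can be checked directly from the tag dates in the table. This is a minimal sketch using only those dates; the variable names are illustrative.

```python
from datetime import date

# Tag dates as listed in the release-tag table.
tags = [
    ("v3.2.0a10", date(2026, 5, 4)),
    ("v3.2.0rc1", date(2026, 5, 5)),
    ("v3.2.0rc2", date(2026, 5, 6)),
    ("v3.2.0rc3", date(2026, 5, 6)),
    ("v3.2.0rc4", date(2026, 5, 10)),
]

# Keep only the release candidates, in the order they were cut.
rc_tags = [(name, day) for name, day in tags if "rc" in name]

# Days between the first and last rc, and between consecutive rcs.
span_days = (rc_tags[-1][1] - rc_tags[0][1]).days
gaps = [(later[1] - earlier[1]).days for earlier, later in zip(rc_tags, rc_tags[1:])]

print(span_days)  # 5
print(gaps)       # [1, 0, 4]
```

The gaps show that rc1 through rc3 landed within two days, followed by a four-day pause before rc4.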

Does the prior picture still hold?

Mostly yes, with one important shift:

  • The audit's framing — that #822's blocker tranche maps to mission-flow correctness issues and not to the structural F2 hotspot — remains correct.
  • However, #992 (opened 2026-05-05) materially changes the picture. It is an explicit epic to repair domain boundaries across the same command surfaces the audit identified as F2 (spec-kitty next, agent action implement/review, agent tasks move-task, agent tasks status, review, merge --dry-run, merge, dashboard materializers, SaaS sync). It is the first issue-tracker artifact that names the F2 structural concern as the root cause of a bug cluster rather than as individual bugs. The audit's F2 finding now has an issue-tracker analog, just not one that's filed under #822.

Section 3: Open "bug"-labeled issues — relating table

Open bug-labeled issues as of 2026-05-11 (17 total, including the two epics that carry the bug label):

# | Title (short) | Age (days) | F1 | F2 | F15 | F16 | F18 | orch_api | Brownfield | Strongest match
1009 | profile-invocation lifecycle records do not match issued step id | 4 | NONE | PARTIAL | NONE | NONE | NONE | NONE | NONE | F2 PARTIAL
992 | Epic: drain bug queue by repairing domain boundaries | 6 | WEAK | STRONG | PARTIAL | NONE | PARTIAL | PARTIAL | STRONG | F2 STRONG + Brownfield STRONG
991 | merge dry-run misses review artifact consistency failures | 6 | NONE | PARTIAL | NONE | NONE | NONE | NONE | NONE | F2 PARTIAL
990 | review-cycle artifact generation can wrap prior cycle frontmatter/body | 6 | NONE | PARTIAL | NONE | NONE | NONE | NONE | NONE | F2 PARTIAL
989 | new missions without baseline_merge_commit skip dead-code review | 6 | NONE | PARTIAL | NONE | NONE | NONE | NONE | NONE | F2 PARTIAL
988 | spec-kitty next --json can miss claimable WPs | 6 | NONE | PARTIAL | NONE | NONE | NONE | PARTIAL | NONE | F2 PARTIAL
987 | mission-review gate commands can fall through to global pytest in fresh clones | 6 | NONE | NONE | NONE | NONE | NONE | NONE | NONE | NONE
986 | contract and architectural gates race on shared pytest cache venv | 6 | NONE | NONE | NONE | NONE | NONE | NONE | NONE | NONE
985 | spec-kitty review does not enforce mission-review hard-gate artifacts | 6 | NONE | PARTIAL | NONE | NONE | NONE | NONE | NONE | F2 PARTIAL
984 | agent tasks status can resolve wrong checkout from detached worktree | 6 | NONE | PARTIAL | NONE | NONE | STRONG | NONE | NONE | F18 STRONG
983 | merge is not idempotent after partial mission-number assignment | 6 | NONE | PARTIAL | NONE | NONE | NONE | NONE | NONE | F2 PARTIAL
971 | mypy strict gate fails on current baseline | 7 | NONE | WEAK | NONE | NONE | NONE | NONE | NONE | NONE
889 | sync misclassifies teamspace ingress rejection as server_error | 11 | NONE | NONE | NONE | NONE | NONE | NONE | NONE | NONE
822 | Epic: 3.2.0 stabilization (carries bug label) | 14 | NONE | WEAK | NONE | NONE | NONE | NONE | WEAK | WEAK
662 | CI workflow duplication | 25 | NONE | NONE | NONE | NONE | NONE | NONE | NONE | NONE
644 | Encoding mixups: stop assuming UTF-8 | 26 | NONE | NONE | NONE | NONE | NONE | NONE | WEAK | WEAK
391 | EPIC: 3.x tech/functional debt remediation (carries bug label) | 36 | WEAK | PARTIAL | NONE | PARTIAL | NONE | NONE | PARTIAL | Brownfield PARTIAL
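
To show how concentrated the table is on F2, the F2 column can be tallied per rating. This is a minimal sketch with the ratings transcribed from the table above; the dictionary and variable names are illustrative.

```python
from collections import Counter

# F2 column transcribed from the table, keyed by issue number.
f2_rating = {
    1009: "PARTIAL", 992: "STRONG", 991: "PARTIAL", 990: "PARTIAL",
    989: "PARTIAL", 988: "PARTIAL", 987: "NONE", 986: "NONE",
    985: "PARTIAL", 984: "PARTIAL", 983: "PARTIAL", 971: "WEAK",
    889: "NONE", 822: "WEAK", 662: "NONE", 644: "NONE", 391: "PARTIAL",
}

# Count how many issues carry each rating in the F2 column.
tally = Counter(f2_rating.values())

# Issues rated PARTIAL or STRONG on F2.
partial_or_stronger = sorted(
    number for number, rating in f2_rating.items()
    if rating in {"PARTIAL", "STRONG"}
)

print(dict(tally))
print(len(partial_or_stronger), partial_or_stronger)
```

By this count, 10 of the 17 open bug-labeled issues are rated PARTIAL or STRONG on F2.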

Per-issue notes for PARTIAL/STRONG matches

  • #1009 — Lifecycle record write keyed off wrong identifier. Likely sits in or around src/specify_cli/next/runtime_bridge.py / lifecycle persistence (probably the next action emission seam). Adjacent to the F2 hotspot via the next/action-dispatch wiring.
  • #992 — Direct topical hit on F2. The body enumerates exactly the command surfaces the audit named (tasks, workflow, mission, review, merge, dashboard, sync) and proposes the same remedy (centralize invariants, route every surface through them). Also resonates with brownfield-onboarding: its "North Star Invariants" section is the kind of doctrine artifact the new paradigm would prescribe.
  • #991 — Merge dry-run/real-merge parity failure. Sits on the merge command surface, part of F2's cli/commands/agent/ cluster.
  • #990 — Review-cycle artifact generation bug. Lives in review command path; F2-adjacent.
  • #989 — Review command skip-path bug (missing baseline_merge_commit). F2-adjacent.
  • #988 — next --json claimability parity with agent action implement. Two surfaces independently inferring claimability — classic F2 symptom.
  • #985 — spec-kitty review not enforcing mission-review artifacts. F2-adjacent (review command surface).
  • #984 — agent tasks status reads from wrong checkout. STRONG match on F18 — this is the under-tested agent_utils/status.py (570 SLOC, F#29) failing precisely on a status-board reading invariant. Also PARTIAL on F2 (the bug surface is cli/commands/agent/tasks.py).
  • #983 — merge non-idempotent after partial mission-number assignment. F2-adjacent (merge command).
  • #391 — Generic tech-debt umbrella epic. Has a partial match on the orchestrator_api / domain-boundary work via its child #613 ("Establish glossary as a clearly owned functional module") and #645's chain of dashboard service extractions. Brownfield-onboarding resonates with #391's intent.
  • #644 — Encoding policy. Brownfield-adjacent only in the sense that it asks for explicit contracts at lifecycle boundaries (a brownfield-onboarding hallmark).

Notes on the WEAK/NONE rows

  • #987, #986 — Test infrastructure / fresh-clone hygiene. Orthogonal to the audit's structural findings.
  • #971 — mypy strict. Crosses many files but does not target any specific F-finding.
  • #889 — SaaS sync error classification. Cross-repo concern (SaaS service), orthogonal.
  • #662 — CI duplication, infrastructural.

Section 4: Gap analysis

Open bug tickets that STRONGLY match the new audit findings

  • #992 (STRONG on F2 + Brownfield): The team has already filed an architectural epic that names the F2 structural concern as the root cause of a bug cluster. The audit's F2 finding and #992 converge on the same diagnosis: multiple command surfaces independently owning the same truth. #992's proposed remedy (centralize invariants, single-execution per domain) is the issue-tracker form of an F2 refactor. This is strong confirmation that the audit agrees with the team's own structural read of the codebase. Importantly, #992 is not listed in #822's linked-work table; it is a sibling/successor epic that the prior crosscheck did not cover (it was opened 2026-05-05, before that crosscheck's 2026-05-08 date, but does not appear in it).
  • #984 (STRONG on F18): The audit's F18 (agent_utils/status.py under-tested at 19.0% ratio) has a concrete bug repro at #984 — the very file the audit flagged as under-tested is producing wrong-checkout reads from detached worktrees. This is direct forensic backing for F18.

The two new slow-burn candidates — issue-tracker status

Slow-burn candidate | Direct open bug tickets | Architectural tickets
src/specify_cli/orchestrator_api/commands.py (1097 SLOC, F#28) | None. No open issue names this file. Historical: #177 (CLOSED, "read commands mutate status.json; JSON-envelope contract inconsistencies"), closed, but the file remains a slow-burn refactor candidate per the multi-window analysis. | #391 (OPEN umbrella) touches orchestrator API concerns indirectly.
src/specify_cli/agent_utils/status.py (570 SLOC, F#29, contains F-53 _display_status_board) | #984 (STRONG, see above). | None named directly.

Net-new vs filed:

  • orchestrator_api/commands.py is net-new. No live issue tracks it; the only historical reference (#177) is closed.
  • agent_utils/status.py has partial issue-tracker backing via #984, but only on one symptom (wrong-checkout reads). The full slow-burn picture (complexity, churn, the F-53 renderer that's hard to test) is not filed.

Bug tickets without forensic backing

The audit did not examine the following bug clusters, and they appear legitimate as orthogonal scope:

  • #987, #986 — pytest/venv infrastructure hygiene. Test-infra concerns the audit explicitly did not have a tests/ overlay for.
  • #971 — mypy strict gate. Tooling/quality gate work, separate from forensics.
  • #889 — SaaS sync error classification. Cross-repo concern; the audit is single-repo.
  • #662 — CI workflow duplication. Infrastructural.
  • #644 — Encoding policy. Cross-cutting product correctness; audit is structural.

These are not contradicted by the audit; they are simply outside the scope of the file-level / churn-based forensic lens.

Newly-opened bug cluster (post-prior-crosscheck) and its shape

The 11 new open bugs (#983–#992, #1009) split into two clusters:

  1. F2 cluster (9 of 11): #983, #984, #985, #988, #989, #990, #991, #992, #1009. All touch the cli/commands/agent/{tasks,workflow,mission}.py complex or its lifecycle dependencies (next, review, merge). This is a strong post-hoc validation of F2 and F15. The "test-update lag" claim from F15 is consistent with so many lifecycle-correctness bugs shipping in the rc-cycle window.
  2. Test-infra cluster (2 of 11): #986, #987 — orthogonal to the audit.
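
The split can be checked mechanically from the issue numbers listed in each cluster. This is a minimal sketch using only those numbers; the cluster keys are illustrative labels.

```python
# The 11 new open bugs (#983 to #992, plus #1009).
new_bugs = set(range(983, 993)) | {1009}

# Cluster membership as enumerated in the list above.
clusters = {
    "F2": {983, 984, 985, 988, 989, 990, 991, 992, 1009},
    "test-infra": {986, 987},
}

# The clusters should not overlap and should cover every new bug.
overlap = clusters["F2"] & clusters["test-infra"]
covered = clusters["F2"] | clusters["test-infra"]
sizes = {name: len(members) for name, members in clusters.items()}

print(sizes)
print(covered == new_bugs, overlap == set())
```

The two clusters partition the 11 issues cleanly, with 9 in the F2 cluster and 2 in the test-infra cluster.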

Section 5: Methodology / caveats

Issues fetched successfully

  • #645 (full body + comments)
  • #822 (full body + comments, plus paginated comment API)
  • All 17 open bug-labeled issues (bodies for the 15 scored issues plus the two epics, #822 and #391, that carry the bug label)
  • #613 (glossary ownership, referenced from F16)

Issues NOT fetched

  • Sub-issues of #822 explicitly closed in earlier crosscheck (#967, #966, #964, #968, #904, #848) — relying on the prior crosscheck's record that they were closed; no re-verification done here.
  • Historical orchestrator_api ticket #177 — closed, body not deep-read (title alone was sufficient signal).
  • Older deferred items in #822 (#306, #303, #317, #260, #253, etc.) — relying on prior crosscheck's per-issue scoring; no rescoring done here as the prior crosscheck remains canonical.

State changes during the read

None observed. gh returned consistent state across the read window (a few minutes on 2026-05-11). One minor note: the gh issue list --label bug --state open count of 17 includes both the #822 and #391 epics (because they carry the bug label); these are excluded from per-issue forensic scoring but included in the table for completeness.

Limits

  • The bug label filter is not exhaustive. Issues like #538 (status emission), #954, #955 (service extraction follow-ups), #771 (auto-rebase), and #613 (glossary ownership) carry other labels and would not surface via the bug filter — but they are visible inside #645 and #822 linkage and are referenced in this document.
  • No PR-level scan was performed for in-flight work that might already address some of these. The audit references are file-level structural; the issue-tracker view is symptom-level. Where these two views agree (e.g., #992 ↔ F2; #984 ↔ F18), the agreement is high-confidence.
  • The four rc tags (rc1-rc4) in 5 days suggest the release coordination surface for 3.2.0 stable may have moved off #822 and onto release-PR conversations. This research did not enumerate those PRs.
  • No subjective product-priority judgment was applied: STRONG/PARTIAL/WEAK ratings reflect topical/structural overlap with the audit findings, not whether the team should prioritize the matched ticket.