Code Review — 2026-03-25

Field Value
Version 2.1.2
Date 2026-03-25
Reviewer OpenCode (claude-sonnet-4.6)
Scope Full source tree (src/) against architecture/2.x
Tools ruff, mypy --strict, pytest, manual architectural trace

Summary

Severity Count Category
Critical 3 Real bugs: dropped kwarg, type mismatch, dashboard blocking retry loop
High 14 Broken __all__ exports, undefined names, None-dereference
Medium ~104 Lost tracebacks (65), complexity violations (39)
Low / Style 823 Auto-fixable: deprecated typing, f-string, unused imports
Architecture 5 Boundary violations (4 documented, 1 undocumented)
Canon 5+ Feature terminology in non-deprecated user-facing surfaces

1. Critical Bugs

C1 — Dropped --mission argument on spec-kitty specify

File: src/specify_cli/cli/commands/lifecycle.py:33

# Current — WRONG
agent_feature.create_feature(mission_slug=slug, mission=mission, json_output=json_output)

create_feature (defined at cli/commands/agent/feature.py:501) has no mission kwarg. Its parameters are mission_type and mission_legacy.

The result: every call to spec-kitty specify --mission <type> silently drops the mission type. The feature is created with no mission type, falling back to whatever default the resolver provides. mypy confirms: error: Unexpected keyword argument "mission" for "create_feature".

C2 — Type mismatch in merge/executor.py

File: src/specify_cli/merge/executor.py:596–614

# Line 596 — no text=True → CompletedProcess[bytes]
result = subprocess.run(["git", "rev-parse", ...], capture_output=True, check=False)

# Line 606 — REASSIGNS to text=True → CompletedProcess[str]
result = subprocess.run(["git", "status", "--porcelain"], ..., text=True, encoding="utf-8")

# Line 614 — reads .stdout as str, but variable declared as CompletedProcess[bytes]
if result.stdout.strip():

Both runs reuse the same result variable. The first run is typed as CompletedProcess[bytes] by mypy, the second reassigns it to CompletedProcess[str]. The .stdout.strip() call on line 614 operates on the second (str) result and works at runtime, but the type contract is broken and the variable reuse obscures which result is being checked where.

C3 — Dashboard retry loop blocks on dead process; test timeout

File: src/specify_cli/dashboard/lifecycle.py:364–369

retry_delays = [0.1] * 10 + [0.25] * 40 + [0.5] * 20  # ~20 seconds total
for delay in retry_delays:
    if _check_dashboard_health(port, project_dir_resolved, token):
        ...
        return url, port, True
    time.sleep(delay)

The process liveness check (_is_process_alive(pid)) only runs after the full 20-second retry loop completes. When the dashboard process dies immediately after startup, the loop still runs to completion before reporting failure.

This causes the failing test test_health_timeout_with_dead_process_still_fails to time out (>10 s) inside time.sleep. It also means real users wait 20 seconds before seeing a startup failure.


2. High Severity

H1 — Broken __all__ exports in glossary/models.py

Files: src/specify_cli/glossary/models.py and 7 downstream modules

models.py does not export SemanticConflict, Severity, ConflictType, SenseRef, or TermSurface via __all__. mypy reports attr-defined errors propagating through:

  • glossary/strictness.py (3 errors)
  • glossary/scope.py (1 error)
  • glossary/checkpoint.py (1 error)
  • glossary/exceptions.py (2 errors)
  • glossary/resolution.py (1 error)
  • glossary/pipeline.py (3 errors)
  • glossary/conflict.py (5 errors, including Name "models.SemanticConflict" is not defined)
  • glossary/attachment.py (1 error)
  • glossary/__init__.py (9 errors)

The entire glossary subsystem has broken type exports. In strict mypy this is 27 errors from a single missing __all__.

H2 — Undefined names (F821/F822)

File Name Risk
cli/commands/agent/tasks.py:726 Any used in "Any \| None" annotation NameError if annotation evaluated at runtime
cli/commands/sync.py:39 timedelta in "timedelta" annotation Same risk
core/agent_config.py:26 Path used in annotation without import Same risk
scripts/tasks/acceptance_support.py:55–66 __all__ declares 10 names populated by globals().update() Breaks static analysis; any downstream from acceptance_support import * is opaque

H3 — Potential None dereference

File: src/specify_cli/merge/status_resolver.py:492

Expression type is str | None, variable type is str. No None-guard before use. mypy: error: Incompatible types in assignment.


3. Medium Severity

M1 — Lost exception tracebacks (65 instances, B904)

Every CLI command error handler pattern across cli/commands/ raises typer.Exit(1) bare inside an except block:

except SomeError as exc:
    console.print(f"[red]Error:[/red] {exc}")
    raise typer.Exit(1)   # loses exc as __cause__

The original exception is not chained. When a user reports an error with a log, the traceback has no cause chain to exc. Use raise typer.Exit(1) from exc or raise typer.Exit(1) from None (if intentional suppression).

M2 — Cyclomatic complexity violations (39 functions, C901)

The project threshold is 15. Worst offenders:

Function Complexity File
merge 75 cli/commands/agent/feature.py
run_enhanced_verify 68 verify_enhanced.py
move_task 63 cli/commands/agent/tasks.py
implement (workflow) 57 cli/commands/agent/workflow.py
merge_workspace_per_wp 49 cli/commands/agent/feature.py
map_requirements 47 requirement_mapping.py
execute_merge 36 merge/executor.py
review 32 cli/commands/agent/workflow.py
implement (cmd) 31 cli/commands/implement.py
status 30 cli/commands/agent/workflow.py

merge at complexity 75 is effectively untestable by mutation testing. The mutmut config does not cover cli/commands/agent/feature.py, meaning regressions in merge semantics are undetected by the mutation gate.

M3 — Silent try/except/pass (30 instances, S110)

Notable cases:

  • cli/commands/agent/feature.py:749 — auto-commit failure is silently swallowed. A failed commit produces no observable signal to the caller.
  • cli/commands/accept.py:32 — event emission failure suppressed entirely.

4. Low / Style (Auto-fixable)

Rule Count Category
UP045 186 Optional[X]X \| None
UP006 134 List[X], Dict[X,Y]list[X], dict[X,Y]
F541 109 f-strings without placeholders (f"Progress: ")
F401 85 Unused imports
UP035 43 Deprecated typing imports
UP017 36 timezone.utcdatetime.UTC
SIM10x 33 Collapsible conditions, reimplemented builtins
UP037 10 Quoted annotations

All 592 flagged as [*] are auto-fixable with ruff --fix src/.

Additionally, manifest.py:92 contains a logic error:

if not in_frontmatter and in_frontmatter == False:

The second clause is a tautology of the first. The condition is equivalent to if not in_frontmatter. This is either dead code or a miswritten transition guard.


5. Architecture Divergences vs. architecture/2.x

Four divergences are already documented in docs/architecture/04_implementation_mapping/README.md. One is not.

DIV-1 — CLI bypasses Control Plane → Orchestration boundary (documented)

lifecycle.py:33 calls agent_feature.create_feature() directly from the Control Plane layer without routing through Orchestration. Multiple handlers in cli/commands/agent/feature.py write directly to kitty-specs/ without going through status/emit.py.

DIV-2 — Kitty-core and Orchestration both write kitty-specs/ (documented)

Planning artifacts (spec.md, plan.md, meta.json) are written directly at lines 600–810 of cli/commands/agent/feature.py, bypassing the Event Store interface contract.

DIV-3 — Connector concept is implicit (documented)

No formal Connector interface. Current "adapter" is a rendered markdown template. No dispatch mechanism abstraction exists for SDK, shell, or remote API alternatives.

DIV-4 — Dashboard reads filesystem directly (documented)

dashboard/scanner.py reads WP frontmatter directly rather than querying through the Event Store interface.

DIV-5 — doctrine has undocumented reverse dependency on specify_cli (NOT documented)

The architecture/2.x documents state doctrine has "zero dependency on specify_cli". This is false:

  • src/doctrine/missions/glossary_hook.py:83 lazy-imports from specify_cli.glossary.attachment import GlossaryAwarePrimitiveRunner at call time.
  • src/charter/catalog.py:25 lazy-imports from specify_cli.runtime.home import get_package_asset_root at call time.

The glossary_hook.py case is partially addressed in ADR 2026-03-25-1-glossary-type-ownership and pr305-review-resolution-plan.md (Track 2), but the fix is not yet merged. The charter/catalog.py reverse dependency has no ADR and is not mentioned in the implementation mapping.


6. Terminology Canon Violations

Per AGENTS.md: canonical term is Mission; Feature is prohibited in non-deprecated user-facing language.

Active violations in user-visible help strings:

File Line Violation
cli/commands/lifecycle.py 27 help="Feature name or slug (e.g., user-authentication)"
cli/commands/agent/feature.py 49 help="Feature lifecycle commands for AI agents"
cli/commands/agent/feature.py 503 help="Feature slug (e.g., 'user-auth')"
cli/commands/research.py 44 tracker.add("feature", ...) — internal label, surfaces to users
cli/commands/implement.py 563 Example slug "001-my-feature" in --mission help — perpetuates the term

7. Security Notes

Rule File Assessment
S608 sync/queue.py:442,461,517,523 False positive — parameterised ? placeholders are safe. Add # noqa: S608 with justification comment.
S310 dashboard/lifecycle.py:146,231,460,475,482 All target http://127.0.0.1:<port> — localhost only, low risk. httpx (already a dependency) would be a cleaner replacement.
S108 dossier/tests/test_api.py, test_indexer.py Hardcoded /tmp/ in tests — non-portable on Windows. Use tmp_path pytest fixture.

8. mypy Quarantine Scope

The [tool.mypy.overrides] ignore_errors = true block in pyproject.toml covers 74 modules — the majority of production specify_cli code. Strict mode is effectively inactive for most of the codebase. The quarantine comment says "Transitional", but the list has grown with recent features (e.g., mission_v1, dossier, orchestrator_api) rather than shrinking.


Test Results

1701 passed, 7 skipped, 18 xfailed, 90 warnings
1 FAILED: test_health_timeout_with_dead_process_still_fails (C3 above)

90 warnings include 18 DeprecationWarning: --mission is deprecated; use --mission instead triggered by tests that still use the deprecated --mission flag.