Research — Phase 0: Decompose agent/tasks.py god-module

Mission: 01KVWVARJKSH9T2QNHJVE4ZC7Y · Date: 2026-06-24 · Branch: main (c3814ec5a) Method: 3 parallel read-only Explore agents (seam map · coupling/commit-router trace · test-coverage audit) over src/specify_cli/cli/commands/agent/tasks.py (4633 LOC) and its collaborators.


TL;DR / Decisions

#DecisionResolutionEvidence
D1Seam set (FR-003)5 extracted seam modules + a thinned command shim. See §2.seam-map agent
D2Verification strategy (resolves spec C-005)Capture golden CLI characterization tests BEFORE refactoring, then keep the existing 72 unit tests green. Existing coverage is seam/import-heavy but contract-light (only 1 help-text assertion, 2 smoke tests) — function-importing tests would survive a move silently while the CLI contract drifts undetected.coverage agent §3,§5
D3Residual-size target (resolves spec NFR-004)Binding target is maxCC ≤ 15 for every function (the real defect driver). LOC target: tasks.py ≤ ~1200 LOC residual (from 4633), achieved by extracting ~1800 LOC of helpers into 5 seams AND internally decomposing the 6 mega-functions. A literal "tiny registration shim" is NOT realistic in one mission — the command bodies carry genuine orchestration.seam-map agent §1,§5
D4Commit-routing scope re-grounding (FR-006/007/008)The issue/PR-2060 premise has drifted — see §3. There are 3 tails, not 4; protected-primary now refuses → feature branch (the silent-skip-to-coord behavior was already removed in WP02/WP03); and an FR-008-style test already exists. Re-scope to: route the 3 tails through the centralized commit_for_mission router and delete the now-redundant bespoke pre-checks.coupling agent §2

1. The public CLI surface (the frozen contract — C-001/FR-001)

Single typer app app (tasks.py:716, name="tasks", no_args_is_help=True) with 9 commands:

CommandHandlerLines~LOCComplexity
move-taskmove_task()1824–2601778EXTREME
mark-statusmark_status()2956–3220265HIGH
list-taskslist_tasks()3223–330583Low
add-historyadd_history()3308–338477Low
finalize-tasksfinalize_tasks()3387–3604218HIGH
map-requirementsmap_requirements()3607–3988382EXTREME
validate-workflowvalidate_workflow()3991–408292Low
statusstatus()4085–4567483EXTREME
list-dependentslist_dependents()4570–463263Low

Plus a non-command helper that is itself a god-function: _validate_ready_for_review() (1377–1724, 348 LOC) — touches research-artifact validation, worktree state, merge ancestry, and contamination checks. It must be sub-decomposed regardless of which seam it lands in.

The complexity is concentrated in 6 functions (move_task, status, map_requirements, _validate_ready_for_review, mark_status, finalize_tasks ≈ 2474 LOC / 53% of the file). File-splitting alone will NOT satisfy maxCC ≤ 15 — these bodies must be internally flattened/extracted too.

2. Proposed seams (D1 — resolves FR-003)

Five extracted modules adjacent to tasks.py, plus the residual shim. One-way dependency only (seams never import tasks.py).

Seam (proposed file)Members (approx)~LOCExisting test trust
tasks_outline.py — tasks.md/manifest parsing, WP-id resolution_normalize_task_id_input, _match_history_wp_heading, _extract_pipe_table_wp_id, _resolve_history_wp_id, _is_pipe_table_task_row, _parse_pipe_table_header, _wp_id_exists, _resolve_wp_id, WP regex consts~180MEDIUM (+30 LOC isolated parser test)
tasks_materialization.py — frontmatter/file persistence, markdown mutation_collect_status_artifacts, _persist_inline_subtask_status, _materialize_inline_subtask_status, _update_pipe_table_status, _resolve_checkbox, _resolve_pipe_table, _persist_review_artifact_override, _persist_review_feedback~185MEDIUM (+80 LOC error cases)
tasks_finalize_validation.py — dependency/cycle validation, lane metadata_is_backward_transition, _lane_targets_for_emit, _wp_lane_from_status_events, _read_transactional_wp_lane, + cycle/frontmatter loop extracted from finalize_tasks~230HIGH (no new tests)
tasks_dependency_graph.py — dependency readiness/gating glue_check_dependent_warnings, _behind_commits_touch_only_planning_artifacts, dependent-gating extracted from move_task~165LOW (+50 LOC readiness test)
tasks_parsing_validation.py — readiness/verdict/issue-matrix validation_issue_matrix_* (7 fns), _validate_ready_for_review (sub-split), _self_review_fallback_option_error, _get_latest_review_cycle_verdict, _apply_review_status_flags~485STRONG (no new tests)
tasks.py (residual shim) — typer app + 9 thinned command bodies + orchestration gluethe 9 @app.command handlers, reduced to dispatch/option-unpack/emit~800–1200LOW (needs golden CLI test)

Suggested extraction order (dependency-aware, lowest-risk first): outline → materialization → finalize-validation → dependency-graph → parsing-validation → final shim-thinning pass.

Shared module-level state to relocate carefully (touched by multiple seams): TASKS_MD_FILENAME, SPEC_MD_FILENAME, _QUALIFIED_TASK_ID_RE, _VALID_VERDICTS, _FORWARD_ORDER, _RUNTIME_STATE_DENY_LIST, WP/inline regexes, and the TaskIdResolutionOutcome/Format enums + TaskIdResult dataclass.

3. Commit-routing: the premise has DRIFTED (D4 — critical)

The issue comment (from PR #2060, written against an older tasks.py) said: "the planning-commit tails were NOT migrated to commit_for_mission … 4 sites … silently skip commits on protected primaries rather than routing to the coord worktree." The current code tells a different story:

1. There are 3 commit tails, not 4:

  • move_task()safe_commit(...) at tasks.py:2486 (WORK_PACKAGE_TASK; uses resolve_placement_only)
  • mark_status()safe_commit(...) at tasks.py:3131 (TASKS_INDEX; uses resolve_placement_only)
  • map_requirements()safe_commit(...) at tasks.py:3947 (WORK_PACKAGE_TASK; resolves worktree via _planning_commit_worktree imported from mission.py:3928)

2. The tails are already kind-aware — they call resolve_placement_only(...) / _planning_commit_worktree(...) with explicit MissionArtifactKind, so they are NOT the naive hardcoded CommitTarget(ref=target_branch) the comment implies. They were partially migrated by the "write-surface-coherence WP02/WP03" work.

3. Protected-primary behavior changed: commit_for_mission (coordination/commit_router.py:98–141) now refuses planning commits to a protected primary and tells the user to use a feature branch (status="no_op_wrong_surface", commit_router.py:160–178). The historical "route to coordination worktree" fallback was removed (WP02/WP03). So the comment's "should route to coord worktree" remedy is stale.

4. An FR-008-style regression test already exists: tests/specify_cli/cli/commands/test_wp03_bypass_writers_fr008.py:139–206 asserts the refusal diagnostic names the feature-branch remedy and does NOT mention the coord worktree.

Re-grounded scope for FR-006/007/008 (still valuable, just re-aimed):

  • FR-006 → Route the 3 tails through the single commit_for_mission entry point instead of open-coding resolve_placement_only+safe_commit (and, for tail 3, the _planning_commit_worktree import from mission.py). This is the real "centralize the router" win the comment wanted. Gap: tail 3 must thread target_branch= for the WP09 ff-advance (currently absent at the call site).
  • FR-007 → Once routed, the bespoke pre-checks become dead: _skip_target_branch_commit() (954–971), _protected_branch_status_commit_error() (921–932), their guard conditionals in move_task (2448–2470) and mark_status (3021–3029), and the _planning_commit_worktree import (3928). The is_protected reads at 924 and 970 go with them.
  • FR-008 → Extend the existing test rather than duplicate it: add a tasks.py-level regression proving each of the 3 tails routes through commit_for_mission (assert router invoked / refusal surfaced), since current tests still patch the direct safe_commit path.

> ⚠️ Behavior-change nuance: routing through commit_for_mission shifts the protected-primary error from a pre-check message to a router result (no_op_wrong_surface diagnostic). The user-visible message text changes. This must be acknowledged (it touches FR-001's "output unchanged" — these are error paths on protected primaries, an intended behavior change under FR-006, not a regression). Decision point for the user — see questions below.

4. Coupling & circular-import risk

  • dependency_graph (core/dependency_graph.py, 391 LOC) is imported at tasks.py:53 and called at 1215/1221 (move_task warnings) and 4603/4604 (validate_workflow). It imports only from status.py, never from cli/commands/no cycle; LOW extraction risk. The two call sites stay in the shim; the seam needn't import it.
  • An extracted commit seam would re-import: git.safe_commit, mission_runtime.{CommitTarget,MissionArtifactKind,resolve_placement_only,resolve_topology,routes_through_coordination}, core.commit_guard.GuardCapability, git.protection_policy.ProtectionPolicy, core.paths.*, status.{Lane,resolve_lane_alias,feature_status_lock}, missions._read_path_resolver.
  • Enforce one-way imports: seams must never import back into tasks.py. Seam↔seam imports (e.g. materialization↔outline) are fine across separate files.

5. Test-coverage baseline (informs D2)

  • ~72 core tests across 9 files (notably test_tasks.py 1054 LOC, test_tasks_canonical_cleanup.py 863 LOC, test_tasks_helpers.py 602 LOC).
  • Strong: parsing-validation helpers, finalize-validation. Weak: command-registration contract (1 help assertion), dependency readiness, package-materialization error paths.
  • No golden CLI contract test pinning command names / flags / exit codes / --json envelope. This is the single biggest refactor risk and the reason for D2.
  • New test work estimate: ~410 LOC (golden CLI ~150, protected-primary routing ~100, materialization errors ~80, readiness ~50, isolated parser ~30).

Open questions / risks feeding /spec-kitty.plan

1. R1 (commit-routing behavior change): Adopting commit_for_mission changes the protected-primary error message text on the 3 tails. Confirm this intended behavior change is acceptable, and that we update/extend the affected assertions rather than treat them as regressions. (User decision — surfaced below.) 2. R2 (mega-function decomposition): Hitting maxCC ≤ 15 requires internally decomposing move_task (778), status (483), map_requirements (382), _validate_ready_for_review (348), finalize_tasks (218), mark_status (265) — this is more than file-splitting and is the bulk of the effort. Plan must budget for it. 3. R3 (scope size): ~1800 LOC moved + 6 mega-functions decomposed + ~410 LOC new tests + golden-test capture is a large mission. The plan should slice this into independently-mergeable WPs (likely one per seam + one for the commit-routing fix + one for golden-test capture), respecting dependency order. 4. R4 (_planning_commit_worktree shared helper): tail 3 imports it from mission.py. If commit_for_mission fully internalizes worktree resolution, confirm mission.py's own use is unaffected (out of scope to change mission.py here). 5. R5 (golden-test framework): Decide capture mechanism (inline snapshots vs pytest-snapshot) — no snapshot dependency exists in the suite today.