Research — Phase 0: Decompose agent/tasks.py god-module
Mission: 01KVWVARJKSH9T2QNHJVE4ZC7Y · Date: 2026-06-24 · Branch: main (c3814ec5a) Method: 3 parallel read-only Explore agents (seam map · coupling/commit-router trace · test-coverage audit) over src/specify_cli/cli/commands/agent/tasks.py (4633 LOC) and its collaborators.
TL;DR / Decisions
| # | Decision | Resolution | Evidence |
|---|---|---|---|
| D1 | Seam set (FR-003) | 5 extracted seam modules + a thinned command shim. See §2. | seam-map agent |
| D2 | Verification strategy (resolves spec C-005) | Capture golden CLI characterization tests BEFORE refactoring, then keep the existing 72 unit tests green. Existing coverage is seam/import-heavy but contract-light (only 1 help-text assertion, 2 smoke tests) — function-importing tests would survive a move silently while the CLI contract drifts undetected. | coverage agent §3,§5 |
| D3 | Residual-size target (resolves spec NFR-004) | Binding target is maxCC ≤ 15 for every function (the real defect driver). LOC target: tasks.py ≤ ~1200 LOC residual (from 4633), achieved by extracting ~1800 LOC of helpers into 5 seams AND internally decomposing the 6 mega-functions. A literal "tiny registration shim" is NOT realistic in one mission — the command bodies carry genuine orchestration. | seam-map agent §1,§5 |
| D4 | Commit-routing scope re-grounding (FR-006/007/008) | The issue/PR-2060 premise has drifted — see §3. There are 3 tails, not 4; protected-primary now refuses → feature branch (the silent-skip-to-coord behavior was already removed in WP02/WP03); and an FR-008-style test already exists. Re-scope to: route the 3 tails through the centralized commit_for_mission router and delete the now-redundant bespoke pre-checks. | coupling agent §2 |
1. The public CLI surface (the frozen contract — C-001/FR-001)
Single typer app app (tasks.py:716, name="tasks", no_args_is_help=True) with 9 commands:
| Command | Handler | Lines | ~LOC | Complexity |
|---|---|---|---|---|
move-task | move_task() | 1824–2601 | 778 | EXTREME |
mark-status | mark_status() | 2956–3220 | 265 | HIGH |
list-tasks | list_tasks() | 3223–3305 | 83 | Low |
add-history | add_history() | 3308–3384 | 77 | Low |
finalize-tasks | finalize_tasks() | 3387–3604 | 218 | HIGH |
map-requirements | map_requirements() | 3607–3988 | 382 | EXTREME |
validate-workflow | validate_workflow() | 3991–4082 | 92 | Low |
status | status() | 4085–4567 | 483 | EXTREME |
list-dependents | list_dependents() | 4570–4632 | 63 | Low |
Plus a non-command helper that is itself a god-function: _validate_ready_for_review() (1377–1724, 348 LOC) — touches research-artifact validation, worktree state, merge ancestry, and contamination checks. It must be sub-decomposed regardless of which seam it lands in.
The complexity is concentrated in 6 functions (move_task, status, map_requirements, _validate_ready_for_review, mark_status, finalize_tasks ≈ 2474 LOC / 53% of the file). File-splitting alone will NOT satisfy maxCC ≤ 15 — these bodies must be internally flattened/extracted too.
2. Proposed seams (D1 — resolves FR-003)
Five extracted modules adjacent to tasks.py, plus the residual shim. One-way dependency only (seams never import tasks.py).
| Seam (proposed file) | Members (approx) | ~LOC | Existing test trust |
|---|---|---|---|
tasks_outline.py — tasks.md/manifest parsing, WP-id resolution | _normalize_task_id_input, _match_history_wp_heading, _extract_pipe_table_wp_id, _resolve_history_wp_id, _is_pipe_table_task_row, _parse_pipe_table_header, _wp_id_exists, _resolve_wp_id, WP regex consts | ~180 | MEDIUM (+30 LOC isolated parser test) |
tasks_materialization.py — frontmatter/file persistence, markdown mutation | _collect_status_artifacts, _persist_inline_subtask_status, _materialize_inline_subtask_status, _update_pipe_table_status, _resolve_checkbox, _resolve_pipe_table, _persist_review_artifact_override, _persist_review_feedback | ~185 | MEDIUM (+80 LOC error cases) |
tasks_finalize_validation.py — dependency/cycle validation, lane metadata | _is_backward_transition, _lane_targets_for_emit, _wp_lane_from_status_events, _read_transactional_wp_lane, + cycle/frontmatter loop extracted from finalize_tasks | ~230 | HIGH (no new tests) |
tasks_dependency_graph.py — dependency readiness/gating glue | _check_dependent_warnings, _behind_commits_touch_only_planning_artifacts, dependent-gating extracted from move_task | ~165 | LOW (+50 LOC readiness test) |
tasks_parsing_validation.py — readiness/verdict/issue-matrix validation | _issue_matrix_* (7 fns), _validate_ready_for_review (sub-split), _self_review_fallback_option_error, _get_latest_review_cycle_verdict, _apply_review_status_flags | ~485 | STRONG (no new tests) |
tasks.py (residual shim) — typer app + 9 thinned command bodies + orchestration glue | the 9 @app.command handlers, reduced to dispatch/option-unpack/emit | ~800–1200 | LOW (needs golden CLI test) |
Suggested extraction order (dependency-aware, lowest-risk first): outline → materialization → finalize-validation → dependency-graph → parsing-validation → final shim-thinning pass.
Shared module-level state to relocate carefully (touched by multiple seams): TASKS_MD_FILENAME, SPEC_MD_FILENAME, _QUALIFIED_TASK_ID_RE, _VALID_VERDICTS, _FORWARD_ORDER, _RUNTIME_STATE_DENY_LIST, WP/inline regexes, and the TaskIdResolutionOutcome/Format enums + TaskIdResult dataclass.
3. Commit-routing: the premise has DRIFTED (D4 — critical)
The issue comment (from PR #2060, written against an older tasks.py) said: "the planning-commit tails were NOT migrated to commit_for_mission … 4 sites … silently skip commits on protected primaries rather than routing to the coord worktree." The current code tells a different story:
1. There are 3 commit tails, not 4:
move_task()→safe_commit(...)attasks.py:2486(WORK_PACKAGE_TASK; usesresolve_placement_only)mark_status()→safe_commit(...)attasks.py:3131(TASKS_INDEX; usesresolve_placement_only)map_requirements()→safe_commit(...)attasks.py:3947(WORK_PACKAGE_TASK; resolves worktree via_planning_commit_worktreeimported frommission.py:3928)
2. The tails are already kind-aware — they call resolve_placement_only(...) / _planning_commit_worktree(...) with explicit MissionArtifactKind, so they are NOT the naive hardcoded CommitTarget(ref=target_branch) the comment implies. They were partially migrated by the "write-surface-coherence WP02/WP03" work.
3. Protected-primary behavior changed: commit_for_mission (coordination/commit_router.py:98–141) now refuses planning commits to a protected primary and tells the user to use a feature branch (status="no_op_wrong_surface", commit_router.py:160–178). The historical "route to coordination worktree" fallback was removed (WP02/WP03). So the comment's "should route to coord worktree" remedy is stale.
4. An FR-008-style regression test already exists: tests/specify_cli/cli/commands/test_wp03_bypass_writers_fr008.py:139–206 asserts the refusal diagnostic names the feature-branch remedy and does NOT mention the coord worktree.
Re-grounded scope for FR-006/007/008 (still valuable, just re-aimed):
- FR-006 → Route the 3 tails through the single
commit_for_missionentry point instead of open-codingresolve_placement_only+safe_commit(and, for tail 3, the_planning_commit_worktreeimport frommission.py). This is the real "centralize the router" win the comment wanted. Gap: tail 3 must threadtarget_branch=for the WP09 ff-advance (currently absent at the call site). - FR-007 → Once routed, the bespoke pre-checks become dead:
_skip_target_branch_commit()(954–971),_protected_branch_status_commit_error()(921–932), their guard conditionals inmove_task(2448–2470) andmark_status(3021–3029), and the_planning_commit_worktreeimport (3928). Theis_protectedreads at924and970go with them. - FR-008 → Extend the existing test rather than duplicate it: add a
tasks.py-level regression proving each of the 3 tails routes throughcommit_for_mission(assert router invoked / refusal surfaced), since current tests still patch the directsafe_commitpath.
> ⚠️ Behavior-change nuance: routing through commit_for_mission shifts the protected-primary error from a pre-check message to a router result (no_op_wrong_surface diagnostic). The user-visible message text changes. This must be acknowledged (it touches FR-001's "output unchanged" — these are error paths on protected primaries, an intended behavior change under FR-006, not a regression). Decision point for the user — see questions below.
4. Coupling & circular-import risk
- dependency_graph (
core/dependency_graph.py, 391 LOC) is imported attasks.py:53and called at1215/1221(move_task warnings) and4603/4604(validate_workflow). It imports only fromstatus.py, never fromcli/commands/→ no cycle; LOW extraction risk. The two call sites stay in the shim; the seam needn't import it. - An extracted commit seam would re-import:
git.safe_commit,mission_runtime.{CommitTarget,MissionArtifactKind,resolve_placement_only,resolve_topology,routes_through_coordination},core.commit_guard.GuardCapability,git.protection_policy.ProtectionPolicy,core.paths.*,status.{Lane,resolve_lane_alias,feature_status_lock},missions._read_path_resolver. - Enforce one-way imports: seams must never import back into
tasks.py. Seam↔seam imports (e.g. materialization↔outline) are fine across separate files.
5. Test-coverage baseline (informs D2)
- ~72 core tests across 9 files (notably
test_tasks.py1054 LOC,test_tasks_canonical_cleanup.py863 LOC,test_tasks_helpers.py602 LOC). - Strong: parsing-validation helpers, finalize-validation. Weak: command-registration contract (1 help assertion), dependency readiness, package-materialization error paths.
- No golden CLI contract test pinning command names / flags / exit codes /
--jsonenvelope. This is the single biggest refactor risk and the reason for D2. - New test work estimate: ~410 LOC (golden CLI ~150, protected-primary routing ~100, materialization errors ~80, readiness ~50, isolated parser ~30).
Open questions / risks feeding /spec-kitty.plan
1. R1 (commit-routing behavior change): Adopting commit_for_mission changes the protected-primary error message text on the 3 tails. Confirm this intended behavior change is acceptable, and that we update/extend the affected assertions rather than treat them as regressions. (User decision — surfaced below.) 2. R2 (mega-function decomposition): Hitting maxCC ≤ 15 requires internally decomposing move_task (778), status (483), map_requirements (382), _validate_ready_for_review (348), finalize_tasks (218), mark_status (265) — this is more than file-splitting and is the bulk of the effort. Plan must budget for it. 3. R3 (scope size): ~1800 LOC moved + 6 mega-functions decomposed + ~410 LOC new tests + golden-test capture is a large mission. The plan should slice this into independently-mergeable WPs (likely one per seam + one for the commit-routing fix + one for golden-test capture), respecting dependency order. 4. R4 (_planning_commit_worktree shared helper): tail 3 imports it from mission.py. If commit_for_mission fully internalizes worktree resolution, confirm mission.py's own use is unaffected (out of scope to change mission.py here). 5. R5 (golden-test framework): Decide capture mechanism (inline snapshots vs pytest-snapshot) — no snapshot dependency exists in the suite today.