description: "Work package task list — Model-Discipline Dispatch Binding"
Work Packages: Model-Discipline Dispatch Binding
Inputs: Design documents from /kitty-specs/model-discipline-dispatch-binding-01KWPW36/ Prerequisites: plan.md (required), spec.md rev 2 (full-evaluator scope), research.md, adversarial-review.md
Tests: Red-first mandatory (NFR-001) — behavior proven through spec-kitty dispatch / ProfileInvocationExecutor.invoke() payload contents, never the Pydantic model in isolation, never a stubbed recommendation. SC-001 requires the recommendation to VARY with catalog content (anti-fake).
Organization: Six ownership-disjoint work packages tracking IC-01…IC-06. Dependency DAG: WP01/WP04/WP05 are foundations (no deps); WP02 needs WP01+WP04; WP03 needs WP01+WP02+WP05; WP06 needs WP05.
Work Package Summary
| WP | Title | Priority | Dependencies | Subtasks | Authoritative surface |
|---|---|---|---|---|---|
| WP01 | Catalog loader + action→task_type map | P0 | none | T001–T005 | src/doctrine/model_task_routing/ |
| WP02 | Routing evaluator (objective scorer) | P0 | WP01, WP04 | T007–T011 | src/doctrine/model_task_routing/ |
| WP03 | Advisory recommendation on the dispatch payload | P0 | WP01, WP02, WP05 | T012–T019 | src/specify_cli/invocation/ |
| WP04 | Agent-profile model/effort field | P1 | none | T018–T022 | src/doctrine/agent_profiles/ |
| WP05 | Doctrine artifact + DRG resolution + catalog data | P1 | none | T023–T029 | src/doctrine/ |
| WP06 | Durable no-dangling-reference invariant | P1 | WP05 | T030–T032 | tests/architectural/ |
Work Packages
Work Package WP01: Catalog loader + action→task_type map (Priority: P0)
Goal: Turn the dead model_task_routing catalog into a loadable, freshness-checked runtime source, and bridge dispatch verbs to catalog task_type vocabulary.
Independent Test: loader.load(catalog_path: Path | None = None) resolves the catalog by default via importlib.resources.files("doctrine.model_task_routing")/"catalog"/"model-to-task_type.yaml" (with an injectable override) → parses → validates against ModelToTaskType → applies freshness; task_class_map maps a dispatch action verb to a catalog task_type. Missing/whole-file-invalid catalog → returns absent (non-fatal).
Prompt: /tasks/WP01-catalog-loader-and-task-class-map.md Requirement Refs: FR-001, FR-002, C-002, C-004 Dependencies: none
Included Subtasks
- ✅ T001 [red-first]
tests/doctrine/test_model_task_routing_loader.py: valid catalog loads + validates; missing → None; whole-file-invalid → None (non-fatal); stale (freshness) → flagged. RECORD pre-fix RED. - ✅ T002 [red-first]
tests/doctrine/test_task_class_map.py: known action verb → task_type; unknown → None; map stays in sync withDEFAULT_ROLE_CAPABILITIESverbs. RECORD RED. - ✅ T003 [FR-001]
src/doctrine/model_task_routing/loader.py:load(catalog_path: Path | None = None)resolves the catalog viaimportlib.resources.files("doctrine.model_task_routing")/"catalog"/"model-to-task_type.yaml"by default (injectable override for tests), YAML load,ModelToTaskType.model_validate,freshness_policycheck. Pure/deterministic. - ✅ T004 [FR-002]
src/specify_cli/invocation/task_class_map.py: action/role verb → catalogtask_type. - ✅ T005 Turn T001/T002 green;
ruff/mypyclean; complexity ≤ 15; no ≥3× literals.
Dependencies / Risks
- None (foundation). Risk: catalog resolution is Python package data (
importlib.resources), matched exactly by WP05's shipped file location. Note:loader.py/task_class_map.pyare ORPHANS until WP03 wires them intoinvoke()— expected; the dead-modules invariant (and its_ALLOWLISTde-listing) is validated/owned at the WP03 tip, not here. The action→task_type map is a live maintenance seam.
Work Package WP02: Routing evaluator (Priority: P0)
Goal: Deterministic recommendation from task_fit × weights under objective (quality_first = the capability lever), applying tier_constraints + override_policy precedence (advisory: emit catalog + profile candidates with provenance, enforce neither).
Independent Test: evaluator.recommend(catalog, task_type, profile) returns a deterministic recommendation; changing task_fit/weights changes the winner; a profile model surfaces as a provenance-tagged candidate under advisory.
Prompt: /tasks/WP02-routing-evaluator.md Requirement Refs: FR-003, NFR-004, C-004 Dependencies: WP01, WP04
Included Subtasks
- ✅ T007 [red-first]
tests/doctrine/test_model_task_routing_evaluator.py: quality_first ranks strongest-fit for a high-judgment task_type;tier_constraintscap respected; deterministic (same inputs → same output). RECORD RED. - ✅ T008 [red-first] override precedence under
advisory: catalog pick + profile declaration both surfaced with provenance, neither enforced. RECORD RED. - ✅ T009 [FR-003]
src/doctrine/model_task_routing/evaluator.py: pure scorer (no I/O) consuming WP01's loaded catalog + task_type + the WP04 profile field. - ✅ T010 [NFR-004] determinism + edge cases (no match, empty task_fit) covered.
- ✅ T011
ruff/mypyclean; complexity ≤ 15.
Dependencies / Risks
- WP01 (catalog/task_type), WP04 (profile field — read
profile.preferred_model/profile.effort). Risk: capability-via-quality-objective must actually rank strongest-fit; precedence must be deterministic. Note:evaluator.pyis an ORPHAN until WP03 wires it intoinvoke()— expected; dead-modules validated at the WP03 tip.
Work Package WP03: Advisory recommendation on the dispatch payload (Priority: P0)
Goal: Surface the evaluator's recommendation on InvocationPayload through invoke(), advisory + non-fatal, in both --json and rich render.
Independent Test: red-first through spec-kitty dispatch / invoke() — with a catalog, the payload carries a recommendation that VARIES with catalog content; without a catalog, absent + dispatch succeeds.
Prompt: /tasks/WP03-advisory-payload-wiring.md Requirement Refs: FR-004, NFR-001, NFR-002, C-001 Dependencies: WP01, WP02, WP05
Included Subtasks
- ✅ T012 [red-first]
tests/invocation/test_dispatch_recommendation.py: throughinvoke()/dispatch --json, recommendation present with a catalog + VARIES when catalog scoring changes (anti-fake, SC-001); absent when catalog removed, dispatch still succeeds (NFR-002). RECORD pre-fix RED (no slot exists). - ✅ T013 [FR-004]
InvocationPayloadnew__slots__recommendation field (+to_dict()), runtime imports as needed. - ✅ T014 [FR-004] wire the loader+evaluator call into
ProfileInvocationExecutor.invoke()after profile+action resolve (~executor.py:196/205), before payload construction (~:273); non-fatal (missing/stale → absent). MANDATORY: extract a pure helper_compute_recommendation(profile, action) -> Recommendation | Noneholding the loader+evaluator call and the non-fatal envelope soinvoke()stays ≤15 cognitive complexity. - ✅ T015 [FR-004]
_render_rich_payload(cli/commands/dispatch.py) recommendation line. - ✅ T016 [NFR-002] non-fatal paths asserted (missing/stale/unmatched catalog).
- ✅ T017
ruff/mypyclean; complexity ≤ 15. - ✅ T018 [dead-modules tip] Remove
doctrine.model_task_routing.modelsfrom_ALLOWLISTintests/architectural/test_no_dead_modules.py; confirm loader.py/task_class_map.py/evaluator.py all gain theirinvoke()caller here so the no-dead-modules gate is green at the integration tip. - ✅ T019 [integration] With NO fixture override, run
spec-kitty dispatch/invoke()and assert a recommendation IS produced from the REAL shipped WP05 catalog through the loader's default path.
Dependencies / Risks
- WP01, WP02, WP05 (the shipped catalog — untested agreement between WP01's default path and WP05's file location without T019). Risk: must never fail dispatch; red-first through the CLI/
invoke()payload, not the model in isolation. This WP ownstest_no_dead_modules.py(de-allowlistsmodels) and the complexity-extraction ofinvoke().
Work Package WP04: Agent-profile model/effort field (Priority: P1)
Goal: Optional per-profile model/effort field that reaches the evaluator — added to the DOMAIN model (profile.py) + the schema SOURCE (schema_models.py, regenerated), not the generated .yaml.
Independent Test: a profile declaring model exposes it on the loaded AgentProfile object (not dropped by extra='ignore'); an existing profile without it validates unchanged.
Prompt: /tasks/WP04-agent-profile-model-field.md Requirement Refs: FR-005, NFR-003, C-005 Dependencies: none
Included Subtasks
- ✅ T018 [red-first]
tests/doctrine/test_agent_profile_model_field.py: a YAML profile withmodel:/effort:→ the value is present on the loadedAgentProfile(proves it reaches the domain model); a profile without it loads unchanged (NFR-003). RECORD pre-fix RED (field dropped today). - ✅ T019 [FR-005] add optional fields to
AgentProfileinsrc/doctrine/agent_profiles/profile.py: EXACTLYpreferred_model: str | None = Field(default=None, alias="model")andeffort: str | None = Field(default=None, alias="effort")(mirror therouting_priorityalias pattern, :264);preferred_modelavoids Pydantic v2'smodel_protected namespace deterministically — no conditional aliasing. - ✅ T020 [FR-005] add the field to
src/doctrine/agent_profiles/schema_models.py(schema source) and REGENERATEsrc/doctrine/schemas/agent-profile.schema.yaml(do not hand-edit). - ✅ T021 [NFR-003] back-compat: existing profiles load; schema parity test green.
- ✅ T022
ruff/mypyclean; complexity ≤ 15.
Dependencies / Risks
- None. Consumed by WP02. Risk: schema-only add is a silent no-op — MUST land on
profile.py. Regenerate; never hand-edit the.yaml.
Work Package WP05: Doctrine artifact + DRG resolution + catalog data (Priority: P1)
Goal: Make the charter model-discipline references real activated doctrine (DRG-driven) and ship a populated catalog.
Independent Test: charter context surfaces the model-discipline tactic guidance; model-task-routing (repointed token) + autonomous-operation-protocol resolve in the regenerated references.yaml via directive-reachability; the catalog validates against the schema.
Prompt: /tasks/WP05-doctrine-artifact-and-drg.md Requirement Refs: FR-006, FR-007, FR-008, C-002 Dependencies: none
Included Subtasks
- ✅ T023 [red-first]
tests/charter/test_model_task_routing_resolves.py: themodel-task-routingtactic loads via DRG traversal (not just string-present) +charter contextsurfaces its body;autonomous-operation-protocolresolves. RECORD pre-fix RED (both dangle). - ✅ T024 [FR-006]
src/doctrine/tactics/built-in/model-task-routing.tactic.yaml(kebab) — routing guidance + catalog pointer; schema-valid. - ✅ T025 [FR-006]
src/doctrine/graph.yaml: addtactic:model-task-routingnode + an activated-directive →tactic:model-task-routingsuggestsedge (pattern of DIRECTIVE_043/044/045). - ✅ T026 [FR-006] repoint the charter prose token in
.kittify/charter/charter.mdfrommodel_task_routing→model-task-routing. - ✅ T027 [FR-007] one-line activated-directive →
tactic:autonomous-operation-protocolsuggestsedge ingraph.yaml(tactic already exists). - ✅ T028 [FR-008] populated
model-to-task_typecatalog instance at exactlysrc/doctrine/model_task_routing/catalog/model-to-task_type.yaml(package data — matches WP01'simportlib.resourcesdefault path) (models + task_fit +objective: quality_first+override_policy: advisory+freshness_policy); validate against the schema. - ✅ T029 [C-002] regenerate
references.yaml+ the_LIBRARY/synthesis-manifest bundle (generated-lockfile step);ruff/mypy/terminology-guard clean.
Dependencies / Risks
- None for authoring. Risk: DRG-driven resolution — a references.yaml row without a
suggestsedge still dangles. Bundle regen reds CI if skipped. Kind = tactic (confirmed).
Work Package WP06: Durable no-dangling-reference invariant (Priority: P1)
Goal: A refactor-stable test asserting EVERY charter → token reference (all sections) resolves — guarding against section-level dangling recurrence.
Independent Test: fails on pre-fix HEAD (both tokens dangle), passes post-WP05; asserts by parsing → \token\`` prose + references.yaml id-suffix match, no literal token list pinned.
Prompt: /tasks/WP06-no-dangling-reference-invariant.md Requirement Refs: FR-009, SC-004 Dependencies: WP05
Included Subtasks
- ✅ T030 [FR-009]
tests/architectural/test_charter_references_resolve.py: parse all backticked-after-→tokens across every charter.md section; assert each resolves to areferences.yamlid-suffix. Exclude non-backticked prose. - ✅ T031 confirm RED on pre-fix HEAD (both tokens absent) + GREEN post-WP05; refactor-stable (no literal token list). Since this WP depends on WP05 (tokens already resolve on its branch), capture RED against the pre-WP05 base (e.g.
git stashWP05'sgraph.yaml/charter.mdedits, or run on the mission base branch before WP05 landed); RECORD the pre-fix failure; then confirm GREEN on the integrated branch. - ✅ T032
ruff/mypyclean.
Dependencies / Risks
- WP05 (goes green once tokens resolve). Risk: must be refactor-stable — vocabulary-derived, not line-pinned.