Implementation Plan: Local Custom Mission Loader
Mission: local-custom-mission-loader-01KQ2VNJ
Date: 2026-04-25
Spec: spec.md
Branch contract: current=main, planning_base=main, merge_target=main, branch_matches_target=true.
Summary
Add a v1 local custom mission loader so that project-authored mission YAML under .kittify/missions/<key>/ and .kittify/overrides/missions/<key>/, as well as missions supplied through existing mission-pack hooks, can run as first-class peers to the four built-in missions. The loader extends the existing internal-runtime discovery (no parallel loader), adds a structural validator (with stable error codes and a fixed --json shape), introduces spec-kitty mission run <key> --mission <slug> as a thin CLI surface, and lifts the composition-dispatch hard-guard from software-dev-only to "any mission whose runtime template marks per-step profile resolution explicitly". Built-in mission behavior is preserved by leaving _ACTION_PROFILE_DEFAULTS untouched and gating composition on the new per-step agent_profile field: built-ins keep their current path, while custom missions take a parallel-but-shared composition path. Retrospective execution stays out of scope; v1 validates only the structural retrospective marker.
Technical Context
| Field | Value |
|---|---|
| Language / Version | Python 3.11+ (charter) |
| Primary Dependencies | typer, rich, ruamel.yaml, pydantic (existing); spec_kitty_events (PyPI). No new external deps. |
| Storage | Filesystem only (YAML under .kittify/, runtime snapshots under existing .kittify/runtime/). |
| Testing | pytest with pytest --cov enforcing ≥ 90% on new code; mypy --strict; ruff check. |
| Target Platform | macOS / Linux dev environments + CI. |
| Project Type | Single project (CLI). |
| Performance Goals | Loader p95 < 250ms (NFR-001); ERP fixture suite < 10s (NFR-004). |
| Constraints | No spec_kitty_runtime imports; no legacy DAG fall-through; no SaaS / install. |
| Scale / Scope | One reference custom mission (ERP fixture); validator handles ≤ 50 steps per mission. |
Charter Check
Charter mode at action plan: compact (already loaded for specify, action-scoped doctrine reused).
| Charter constraint | Stance in this plan |
|---|---|
| pytest with 90%+ test coverage for new code | Enforced via pytest --cov configured to fail under 90% on the new modules listed in §Source Code. |
| mypy --strict must pass | All new modules type-checked under mypy --strict; new fields use Pydantic v2 typed models. |
| Integration tests for CLI commands | A new integration suite under tests/integration/test_mission_run_command.py exercises spec-kitty mission run end-to-end against the ERP fixture. |
| DIRECTIVE_003 (Decision Documentation) | All planning decisions recorded in research.md. |
| DIRECTIVE_010 (Specification Fidelity) | Each FR / NFR / C is cross-referenced from the design artifacts; mission-review will verify FR coverage. |
| Tactic: premortem-risk-identification | §Risks below applies the premortem lens. |
| Tactic: requirements-validation-workflow | §Requirements traceability ties every FR to a verifying test. |
| Tactic: adr-drafting-workflow | An ADR-style decision capture in research.md covers the load-time contract synthesis decision and the composition-gate widening. |
No new charter conflicts surfaced. Charter Check: PASS.
Project Structure
Documentation (this feature)
kitty-specs/local-custom-mission-loader-01KQ2VNJ/
├── plan.md # This file
├── research.md # Phase 0 output
├── data-model.md # Phase 1 output
├── quickstart.md # Phase 1 output
├── contracts/
│ ├── mission-run-cli.md # CLI command contract (args + JSON shape)
│ └── validation-errors.md # Stable error code enumeration
├── spec.md # already exists
├── checklists/requirements.md
├── meta.json
└── tasks/ # populated by /spec-kitty.tasks
Source Code (repository root)
src/specify_cli/
├── next/_internal_runtime/
│ ├── discovery.py # extend: tier ordering kept; add reserved-key shadow rejection
│ ├── schema.py # extend: PromptStep gains optional agent_profile (alias agent-profile) + optional retrospective marker convention; MissionTemplate gains validation hook
│ └── loader.py # NEW: thin façade combining discover_missions + load_mission_template + validate_custom_mission
├── mission_loader/ # NEW package: validator + custom mission entry point
│ ├── __init__.py
│ ├── errors.py # closed enum of error codes; structured payload model
│ ├── validator.py # validate_custom_mission(template) -> ValidationReport
│ ├── retrospective.py # has_retrospective_marker(template) -> bool
│ ├── contract_synthesis.py # build MissionStepContract records from a custom MissionTemplate
│ └── command.py # spec-kitty mission run logic (callable, decoupled from Typer)
├── cli/commands/
│ └── mission_type.py # extend: register `mission run` subcommand wired to mission_loader.command
├── mission_step_contracts/
│ └── executor.py # extend: profile_hint sourced from per-step agent_profile when present (no _ACTION_PROFILE_DEFAULTS expansion)
└── next/runtime_bridge.py # extend: composition gate widens to "mission_template declares composed steps" rather than only software-dev
tests/
├── unit/mission_loader/ # NEW
│ ├── test_validator_errors.py # FR-004, NFR-002 (every error code reachable)
│ ├── test_retrospective_marker.py # FR-005
│ ├── test_contract_synthesis.py # FR-008 / FR-006 wiring
│ └── test_loader_facade.py # FR-002 / FR-003 precedence + shadow rules
├── integration/
│ ├── test_mission_run_command.py # FR-001, FR-013 (json shape), FR-009 (ERP)
│ └── test_custom_mission_runtime_walk.py # FR-006, FR-007, FR-009 (decision_required + composition + paired invocation)
├── architectural/
│ └── test_shared_package_boundary.py # already passing; this plan keeps it green (C-002)
├── fixtures/missions/
│ └── erp-integration/mission.yaml # NEW reference fixture (FR-009)
└── specify_cli/next/test_runtime_bridge_composition.py # extend: assert built-in dispatch unchanged when custom missions register
docs/
└── reference/missions.md # extend: author guide + closed error code table (NFR-002)
Phase 0 — Outline & Research
See research.md. Key resolved decisions:
1. R-001: Retrospective marker spelling. Lock to id == "retrospective" on the final declared step. No retrospective: true flag, no kind: retrospective. Rationale: minimum-invasive; uses existing PromptStep.id field; trivial to validate; no schema change. Alternatives considered: dedicated retrospective field, separate audit_steps re-purposing. Both rejected for being heavier and inviting silent drift.
2. R-002: Shadow-of-built-in policy (resolves FR-011). REJECT shadowing any of the four built-in mission keys (software-dev, research, documentation, plan) with a stable MISSION_KEY_RESERVED error. Non-built-in shadowing (project_override over project_legacy, user_global, packs) warns with MISSION_KEY_SHADOWED and uses the higher-precedence layer. Rationale: prevents accidentally breaking software-dev; preserves the existing override semantics for everything else.
3. R-003: Profile resolution surface. Add agent_profile: str | None = None to PromptStep with Pydantic field alias agent-profile. The composition dispatcher reads it and passes it as profile_hint into StepContractExecutionContext. _ACTION_PROFILE_DEFAULTS is not extended.
4. R-004: Custom mission step contracts. Synthesize MissionStepContract records from the loaded MissionTemplate at load time. Each composed step s produces a contract mission=<mission-key>, action=<s.id> bound to the loaded YAML's contract steps (a synthetic single-step contract). Allow YAML to optionally point to an existing contract via contract_ref: <id> for advanced authors; default is auto-synthesis. The repository remains the system of record at runtime; the synthesizer registers in-process for the lifetime of the run.
5. R-005: Composition gate widening. _COMPOSED_ACTIONS_BY_MISSION becomes a fallback table for built-ins. The new gate is: for any mission whose loaded runtime template has agent_profile populated on the just-completed step, dispatch via composition. Built-in dispatch path is unchanged (their templates still keep the legacy DAG path because they don't carry agent_profile). This gives custom missions composition without altering built-ins.
6. R-006: Decision-required step shape. Custom mission YAML uses the existing requires_inputs: [<key>] field on a step to mark it as a decision_required gate. No new field. The engine planner already routes such steps through decision_required.
7. R-007: Mission-pack discovery. No new code; mission packs already feed the project_config tier in _build_tiers. Tests cover that path.
8. R-008: --json envelope. Validation errors emit {"result": "error", "error_code": "<CODE>", "message": "<text>", "details": {...}} on --json; the human channel uses rich.panel.Panel with the same fields. CLI exit code = 2 for any validation error; 0 on success; 1 on infrastructure failure.
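The widened gate in R-005 can be sketched as follows. This is an illustration under stated assumptions, not the runtime_bridge code: `Step` is a stand-in for PromptStep, the function names are hypothetical, and the contents of the fallback table are placeholders because this plan does not list the real entries.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder entries: the real _COMPOSED_ACTIONS_BY_MISSION contents are
# not shown in this plan, so these action names are illustrative only.
_COMPOSED_ACTIONS_BY_MISSION: dict[str, frozenset[str]] = {
    "software-dev": frozenset({"specify", "plan"}),
}


@dataclass(frozen=True)
class Step:
    """Stand-in for PromptStep carrying only the fields the gate reads."""

    id: str
    agent_profile: Optional[str] = None


def should_dispatch_via_composition(mission_key: str, step: Step) -> bool:
    """Return True when the just-completed step should use composition."""
    # New rule: any step that declares agent_profile opts in explicitly.
    if step.agent_profile:
        return True
    # Fallback: built-ins keep whatever the existing table already says.
    return step.id in _COMPOSED_ACTIONS_BY_MISSION.get(mission_key, frozenset())


def profile_hint_for(step: Step) -> Optional[str]:
    """The per-step profile is forwarded unchanged as profile_hint."""
    return step.agent_profile
```

Because the first branch depends only on agent_profile, a template that never sets the field is decided by the table alone, which is the property the built-in regression tests rely on.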
Phase 1 — Design & Contracts
See:
- data-model.md — entities, fields, validation rules, invariants.
- contracts/mission-run-cli.md — CLI command shape + JSON envelope.
- contracts/validation-errors.md — closed enum of error codes.
- quickstart.md — operator-facing how-to.
Integration Points
| Surface | Integration |
|---|---|
| src/specify_cli/next/_internal_runtime/discovery.py | Add RESERVED_BUILTIN_KEYS = frozenset({"software-dev", "research", "documentation", "plan"}); add validate_no_reserved_shadow(result: DiscoveryResult); keep _build_tiers unchanged. |
| src/specify_cli/next/_internal_runtime/schema.py | Add `agent_profile: str \| None = None` to PromptStep with Pydantic field alias `agent-profile` (see R-003). |
| src/specify_cli/mission_loader/validator.py (NEW) | Composes existing discovery + load_mission_template + structural validator; returns ValidationReport(template=..., errors=[...], warnings=[...]). |
| src/specify_cli/mission_loader/contract_synthesis.py (NEW) | synthesize_contracts(template) -> list[MissionStepContract]; result registered into a per-process MissionStepContractRepository shadow at run start. |
| src/specify_cli/mission_step_contracts/executor.py | _resolve_profile_hint already prefers context.profile_hint; the calling site in runtime_bridge._dispatch_via_composition learns to read the active step's agent_profile from the frozen template and pass it through. No change to _ACTION_PROFILE_DEFAULTS. |
| src/specify_cli/next/runtime_bridge.py | (a) _should_dispatch_via_composition widens to include any step whose template entry has agent_profile; (b) _dispatch_via_composition reads the step's agent_profile and forwards as profile_hint. |
| src/specify_cli/cli/commands/mission_type.py | Register @app.command("run") with args mission_key: str and option --mission <slug>, --json/--no-json. The handler delegates to mission_loader.command.run(...). |
| docs/reference/missions.md | Add author guide (custom mission YAML shape, retrospective marker rule, profile rules) + the closed error-code table. |
| Tracker spec_kitty_tracker | No tracker changes; events keep flowing through the existing snapshot. |
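A minimal sketch of the reserved-key check planned for discovery.py, assuming a simplified shape for the discovery result. RESERVED_BUILTIN_KEYS and the MISSION_KEY_RESERVED code come from this plan; `DiscoveredMission`, its fields, the `"builtin"` tier name, and `ReservedKeyError` are hypothetical stand-ins for whatever DiscoveryResult actually exposes.

```python
from dataclasses import dataclass, field

RESERVED_BUILTIN_KEYS = frozenset({"software-dev", "research", "documentation", "plan"})

# Hypothetical tier label for missions shipped with the CLI itself.
BUILTIN_TIER = "builtin"


@dataclass(frozen=True)
class DiscoveredMission:
    """Assumed record shape: one discovered mission and the tier it came from."""

    key: str
    tier: str
    path: str


@dataclass
class DiscoveryResult:
    """Assumed container; the real DiscoveryResult may differ."""

    missions: list[DiscoveredMission] = field(default_factory=list)


class ReservedKeyError(ValueError):
    """Raised when a non-built-in tier tries to claim a built-in key."""

    code = "MISSION_KEY_RESERVED"


def validate_no_reserved_shadow(result: DiscoveryResult) -> None:
    """Reject any project, user, or pack mission that reuses a built-in key."""
    for mission in result.missions:
        if mission.key in RESERVED_BUILTIN_KEYS and mission.tier != BUILTIN_TIER:
            raise ReservedKeyError(
                f"mission key '{mission.key}' at {mission.path} shadows a built-in mission"
            )
```

Raising at discovery time keeps the rejection ahead of template loading, so an operator who names a custom mission software-dev sees the stable code before any runtime state is created.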
Data flow
1. Operator runs spec-kitty mission run erp-integration --mission erp-q3-rollout.
2. mission_type.run_cmd resolves project root → builds DiscoveryContext → calls mission_loader.command.run.
3. mission_loader.command.run calls discovery.discover_missions_with_warnings, then loader.load_mission_template(mission_key, ctx).
4. validator.validate_custom_mission(template) runs:
   - schema check (MissionTemplate.model_validate)
   - reserved-key check
   - retrospective-marker check
   - per-step profile / contract-binding check
   - returns ValidationReport.
5. On error, render JSON or panel and exit 2.
6. On success, contract_synthesis.synthesize_contracts(template) registers in-process; the run starts via the existing runtime_bridge.get_or_start_run (extended to accept a custom mission template path).
7. decide_next_via_runtime advances. For composed steps, the dispatcher picks up agent_profile from the frozen template and forwards it. For decision_required steps, the existing planner takes over.
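The validation and error-rendering steps of this flow could look roughly like the sketch below. It uses stdlib dataclasses in place of the Pydantic models and skips the schema check. `TemplateStub`, `Issue`, `render_json`, the two error codes other than MISSION_KEY_RESERVED, and the success envelope are assumptions made for illustration; the error envelope fields and exit codes follow R-008. Whether decision gates or the retrospective step are exempt from the profile/contract rule is not stated here, so the sketch applies it to every step.

```python
import json
from dataclasses import dataclass, field
from typing import Any

RESERVED_BUILTIN_KEYS = frozenset({"software-dev", "research", "documentation", "plan"})


@dataclass(frozen=True)
class Issue:
    code: str
    message: str
    details: dict[str, Any] = field(default_factory=dict)


@dataclass
class TemplateStub:
    """Hypothetical stand-in for MissionTemplate: a key plus raw step dicts."""

    key: str
    steps: list[dict[str, Any]]


@dataclass
class ValidationReport:
    template: TemplateStub
    errors: list[Issue] = field(default_factory=list)
    warnings: list[Issue] = field(default_factory=list)


def validate_custom_mission(template: TemplateStub) -> ValidationReport:
    report = ValidationReport(template=template)
    # Reserved-key check (code name taken from R-002).
    if template.key in RESERVED_BUILTIN_KEYS:
        report.errors.append(Issue(
            "MISSION_KEY_RESERVED",
            f"'{template.key}' is a built-in mission key",
            {"mission_key": template.key},
        ))
    # Retrospective-marker check: final declared step must have id "retrospective".
    last_id = template.steps[-1].get("id") if template.steps else None
    if last_id != "retrospective":
        report.errors.append(Issue(
            "RETROSPECTIVE_MARKER_MISSING",  # hypothetical code name
            "final step must have id 'retrospective'",
            {"last_step_id": last_id},
        ))
    # Per-step profile / contract-binding check.
    for step in template.steps:
        if not (step.get("agent_profile") or step.get("contract_ref")):
            report.errors.append(Issue(
                "STEP_PROFILE_OR_CONTRACT_MISSING",  # hypothetical code name
                "step declares neither agent_profile nor contract_ref",
                {"step_id": step.get("id")},
            ))
    return report


def render_json(report: ValidationReport) -> tuple[str, int]:
    """Return (payload, exit_code): 2 on validation error, 0 on success."""
    if report.errors:
        first = report.errors[0]
        envelope = {
            "result": "error",
            "error_code": first.code,
            "message": first.message,
            "details": first.details,
        }
        return json.dumps(envelope), 2
    # Success envelope shape is an assumption; the plan only fixes the error shape.
    return json.dumps({"result": "success"}), 0
```

Collecting every error into the report while emitting only the first in the envelope is one possible design; the locked shape in contracts/mission-run-cli.md would decide whether the remaining errors travel in details.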
Requirements traceability
| FR / NFR / C | Verified by |
|---|---|
| FR-001 | tests/integration/test_mission_run_command.py::test_run_command_starts_runtime_with_json_output |
| FR-002 | tests/unit/mission_loader/test_loader_facade.py::test_precedence_explicit_over_env, ..._project_override_over_legacy, etc. |
| FR-003 | ..._loads_from_kittify_missions, ..._loads_from_overrides, ..._loads_from_mission_pack_manifest |
| FR-004 | tests/unit/mission_loader/test_validator_errors.py (one test per closed error code in contracts/validation-errors.md) |
| FR-005 | test_retrospective_marker.py::test_missing_marker_rejected_with_stable_code |
| FR-006 | tests/integration/test_custom_mission_runtime_walk.py::test_composed_step_pairs_invocation_records |
| FR-007 | ..._decision_required_step_pauses_runtime_and_resumes |
| FR-008 | test_validator_errors.py::test_step_without_profile_or_contract_rejected; test_contract_synthesis.py::test_synthesizes_one_contract_per_step |
| FR-009 | ERP fixture suite + test_custom_mission_runtime_walk.py::test_erp_full_walk |
| FR-010 | tests/specify_cli/next/test_runtime_bridge_composition.py (existing 21 tests stay green) |
| FR-011 | test_loader_facade.py::test_reserved_key_shadow_rejected_with_MISSION_KEY_RESERVED (resolves via R-002) |
| FR-012 | test_custom_mission_runtime_walk.py::test_next_advances_custom_mission_after_run_started |
| FR-013 | test_mission_run_command.py::test_validation_error_json_envelope_shape_locked |
| NFR-001 | tests/perf/test_loader_perf.py::test_load_p95_under_250ms (uses pytest --benchmark or wall-clock assertion) |
| NFR-002 | test_validator_errors.py parametrized over the closed error-code enum |
| NFR-003 | pytest --cov with fail-under 90 on new modules; CI guard added to .github/workflows/ci-quality.yml if not already enforced for the new package |
| NFR-004 | test_custom_mission_runtime_walk.py::test_erp_full_walk_completes_under_10s |
| NFR-005 | mypy --strict src/specify_cli/mission_loader src/specify_cli/next/_internal_runtime in CI |
| C-001 | No SaaS surfaces touched; reviewer asserts in mission-review |
| C-002 | tests/architectural/test_shared_package_boundary.py (already covers this) |
| C-003 | test_runtime_bridge_composition.py::test_composition_success_skips_legacy_dispatch (parametrized; existing) |
| C-004 | New code calls ProfileInvocationExecutor only via StepContractExecutor; reviewer asserts grep |
| C-005 | Validator accepts retrospective marker structurally only — no execution wiring |
| C-006 | New code routes; does not generate. Reviewer asserts |
| C-007 | One new subcommand on existing mission group; no new groups |
| C-008 | All tests use tmp_path filesystem fixtures |
Risks (premortem)
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Widening the composition gate breaks built-in dispatch path. | Med | High | Gate is conditional on agent_profile being set; built-in templates do not set it; existing 21-case parametrized test in test_runtime_bridge_composition.py stays green and is the regression trap. |
| Custom mission contract synthesis races with the on-disk repository. | Low | Med | Synthesized contracts live in a per-process registry that takes precedence within the run; on-disk repository unchanged. Lifetime ends when the run terminates. |
| requires_inputs semantics already used by built-ins; reusing it for decision_required gates in custom missions could collide. | Low | Med | Built-in templates already use it the same way; the engine treats it identically. Tests assert behavior parity. |
| Operators name a custom mission software-dev and discover surprising behavior. | Med | High | R-002 rejects with MISSION_KEY_RESERVED at load time. Test enforces. |
| agent_profile alias parsing: YAML uses kebab, Python uses snake. | Med | Low | Pydantic field alias agent-profile accepts both at parse; internal field is agent_profile. Documented in docs/reference/missions.md. |
| Validator p95 > 250ms when many packs declared. | Low | Low | Fixture sized at typical project; perf test asserts threshold; if violated, batch parsing optimization is a follow-on. |
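The alias risk in the table above comes down to accepting both spellings at parse time and normalizing to one internal name. The sketch below shows that normalization with stdlib dataclasses only, as a stand-in for the Pydantic model; in Pydantic v2 the same effect would typically come from `Field(alias="agent-profile")` together with `populate_by_name=True`. `PromptStepStub` and `parse_step` are hypothetical names, and rejecting a step that supplies both spellings is an assumption, not something this plan specifies.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class PromptStepStub:
    """Stand-in for PromptStep; the internal field is always snake_case."""

    id: str
    agent_profile: Optional[str] = None


def parse_step(raw: Mapping[str, Any]) -> PromptStepStub:
    """Accept agent-profile (YAML style) or agent_profile (Python style)."""
    has_kebab = "agent-profile" in raw
    has_snake = "agent_profile" in raw
    if has_kebab and has_snake:
        # Assumption: supplying both spellings is treated as an authoring error.
        raise ValueError(f"step '{raw.get('id')}' sets both agent-profile and agent_profile")
    profile = raw["agent-profile"] if has_kebab else raw.get("agent_profile")
    return PromptStepStub(id=str(raw["id"]), agent_profile=profile)
```

Both spellings produce the same object, so downstream code such as the composition dispatcher only ever reads agent_profile.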
Constitution Gate
This plan does not modify any constitutional artifact. PASS.
Open Items
None. All planning open items resolved (R-001 … R-008). Mission ready for /spec-kitty.tasks.
Branch contract (2nd statement): current=main, planning_base=main, merge_target=main, branch_matches_target=true. Next command: /spec-kitty.tasks.