Work Packages: Planning Pipeline Integrity and Runtime Reliability
Mission: 069-planning-pipeline-integrity
Inputs: kitty-specs/069-planning-pipeline-integrity/
Prerequisites: spec.md, plan.md, research.md, data-model.md
Subtask Index
| ID | Description | WP | Parallel |
|---|---|---|---|
| T001 | Change reduce() to derive materialized_at from last event timestamp | WP01 | |
| T002 | Add skip-write guard to materialize() — compare bytes, skip if identical | WP01 | |
| T003 | Fix materialize_if_stale() return to call reduce() not materialize() | WP01 | |
| T004 | Unit tests: reduce() determinism — same events → same materialized_at | WP01 | P |
| T005 | Unit tests: materialize() idempotency — two calls, one write | WP01 | P |
| T006 | Integration test: clean git tree after read-only status commands | WP01 | P |
| T007 | Define WorkPackageEntry and WpsManifest Pydantic models | WP02 | |
| T008 | Implement load_wps_manifest() — ruamel.yaml loader with presence tracking | WP02 | |
| T009 | Implement dependencies_are_explicit() helper — detect present-empty [] vs absent key | WP02 | |
| T010 | Implement generate_tasks_md_from_manifest() — human-readable tasks.md generator | WP02 | |
| T011 | Write src/specify_cli/schemas/wps.schema.json (JSON Schema Draft 2020-12) | WP02 | P |
| T012 | Unit tests: load, absent returns None, malformed raises ValidationError with field name | WP02 | P |
| T013 | Unit tests: dependencies_are_explicit for present-empty vs absent key | WP02 | P |
| T014 | Unit tests: generate_tasks_md_from_manifest preserves WP titles, deps, subtask counts | WP02 | P |
| T015 | Add wps.yaml detection at top of finalize_tasks() — load manifest, build wp_dependencies | WP03 | |
| T016 | Bypass prose parser when wps.yaml present (tier 0 takes precedence over tasks.md scanning) | WP03 | |
| T017 | Add tasks.md regeneration step when wps.yaml present | WP03 | |
| T018 | Integration test: wps.yaml present → manifest deps used, prose parser skipped | WP03 | P |
| T019 | Integration test: dependencies: [] in wps.yaml → WP has no deps after finalize | WP03 | P |
| T020 | Integration test: tasks.md overwritten with manifest-generated content | WP03 | P |
| T021 | Integration test: no wps.yaml → prose parser path unchanged (backward compat) | WP03 | P |
| T022 | Rewrite tasks-outline.md template — produce wps.yaml, not tasks.md | WP04 | |
| T023 | Rewrite tasks-packages.md template — read/update wps.yaml, generate WP files | WP04 | |
| T024 | Write m_3_2_0_update_planning_templates.py migration with detect + apply | WP04 | |
| T025 | Migration unit tests: detect returns True/False correctly | WP04 | P |
| T026 | Migration unit tests: apply overwrites stale files, respects agent config, is idempotent | WP04 | P |
| T027 | Add DecisionKind.query constant and is_query: bool = False to Decision dataclass | WP05 | |
| T028 | Implement query_current_state() in runtime_bridge.py — read state without advancing | WP05 | |
| T029 | Change result default to None in next_cmd.py; add query mode branch | WP05 | |
| T030 | Add explicit is_query branch to _print_human() — SC-003 verbatim label as first line | WP05 | |
| T031 | Unit test: bare spec-kitty next does not advance state machine | WP05 | P |
| T032 | Unit test: query output begins with [QUERY — no result provided, state not advanced] verbatim | WP05 | P |
| T033 | Unit test: --result success still advances (no regression) | WP05 | P |
| T034 | Unit test: JSON output includes "is_query": true | WP05 | P |
| T035 | Update KEBAB_CASE_PATTERN regex to accept digit-prefixed slugs | WP06 | |
| T036 | Update error message — add valid digit-prefix example, remove invalid example, add comment | WP06 | |
| T037 | Unit tests: digit-prefix slugs accepted; existing rejections unchanged | WP06 | P |
| T038 | Integration test: create "070-new-feature" passes slug validation | WP06 | P |
Work Package WP01: Fix status.json dirty-git (#524)
Goal: Make all status read paths idempotent — no write to disk unless the event log actually changed.
Priority: P0 (blocks CI pipelines and agent workflows)
Independent Test: After running `spec-kitty agent tasks status` against a clean git repo, `git status --porcelain` is empty.
Prompt: tasks/WP01-fix-status-json-dirty-git.md
Requirement Refs: FR-001, FR-002, FR-003, NFR-001
Included Subtasks
- □ T001 Change `reduce()` to derive `materialized_at` from last event timestamp (WP01)
- □ T002 Add skip-write guard to `materialize()` — compare bytes, skip if identical (WP01)
- □ T003 Fix `materialize_if_stale()` return to call `reduce()` not `materialize()` (WP01)
- □ T004 [P] Unit tests: `reduce()` determinism — same events → same `materialized_at` (WP01)
- □ T005 [P] Unit tests: `materialize()` idempotency — two calls, one write (WP01)
- □ T006 [P] Integration test: clean git tree after read-only status commands (WP01)
Implementation Notes
- `reducer.py:reduce()` — both the empty case (line ~127) and the normal case (line ~157) call `_now_utc()`. Fix: normal case uses `sorted_events[-1].at`; empty case uses `""`.
- `reducer.py:materialize()` — after computing `json_str`, read `out_path` if it exists and compare with `out_path.read_text()`. Skip `os.replace` if identical.
- `views.py:materialize_if_stale()` — final `return materialize(feature_dir)` (line ~154) must become `return reduce(read_events(feature_dir))`. NOTE: the `write_derived_views()` call earlier in this function also calls `materialize()` internally; this is handled by T002's skip-write guard. Both T002 and T003 are needed together for a fully clean fix.
- Existing `test_reducer.py` uses `pytest.mark.fast` — add new test cases to the same file following existing `_make_event()` helper.
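
The two fixes above can be sketched together as follows. This is a minimal illustration under stated assumptions, not the actual `reducer.py` code: the `Event` dataclass, the snapshot fields other than `materialized_at`, and the `materialize(out_path, events)` signature are hypothetical stand-ins.

```python
from __future__ import annotations

import json
import os
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Event:
    # Hypothetical event shape: only the timestamp matters for this sketch.
    at: str  # ISO-8601 text; sorts chronologically when all offsets match


def reduce(events: list[Event]) -> dict:
    """Fold events into a snapshot that depends only on the events themselves."""
    sorted_events = sorted(events, key=lambda e: e.at)
    return {
        "event_count": len(sorted_events),
        # Last event's timestamp, or "" for an empty log; never the wall clock.
        "materialized_at": sorted_events[-1].at if sorted_events else "",
    }


def materialize(out_path: Path, events: list[Event]) -> bool:
    """Write the snapshot only if it differs from the file on disk.

    Returns True when a write happened and False when it was skipped.
    """
    json_str = json.dumps(reduce(events), sort_keys=True, indent=2) + "\n"
    if out_path.exists() and out_path.read_text(encoding="utf-8") == json_str:
        return False  # identical content: leave the file (and git) untouched
    # Write to a temp file in the same directory, then swap it in atomically.
    fd, tmp_name = tempfile.mkstemp(dir=out_path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        handle.write(json_str)
    os.replace(tmp_name, out_path)
    return True
```

Because `reduce()` no longer reads the clock, two calls over the same events produce the same string, so the second `materialize()` call takes the skip branch.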
Dependencies
None (independent).
Risks
- Consumers that assumed `materialized_at` was a "freshness" wall-clock timestamp may behave differently. The field's semantics shift to "timestamp of last event." Any consumer that compared `materialized_at` to `datetime.now()` to detect stale cache should be updated or will incorrectly infer the cache is perpetually stale.
Work Package WP02: Add wps_manifest Module (#525 core)
Goal: New `wps_manifest.py` module providing the Pydantic data model, YAML loader, and tasks.md generator that WP03 and WP04 depend on.
Priority: P0 (WP03 and WP04 depend on this)
Independent Test: `from specify_cli.core.wps_manifest import load_wps_manifest` works; loading a valid wps.yaml returns a `WpsManifest`; loading an absent path returns `None`; loading malformed YAML raises `ValidationError` with the failing field name.
Prompt: tasks/WP02-add-wps-manifest-module.md
Requirement Refs: FR-004, FR-005, FR-007, NFR-002, NFR-003
Included Subtasks
- ✅ T007 Define `WorkPackageEntry` and `WpsManifest` Pydantic models (WP02)
- ✅ T008 Implement `load_wps_manifest()` — ruamel.yaml loader with presence tracking (WP02)
- ✅ T009 Implement `dependencies_are_explicit()` helper — detect present-empty `[]` vs absent key (WP02)
- ✅ T010 Implement `generate_tasks_md_from_manifest()` — human-readable tasks.md generator (WP02)
- ✅ T011 [P] Write `src/specify_cli/schemas/wps.schema.json` (JSON Schema Draft 2020-12) (WP02)
- ✅ T012 [P] Unit tests: load, absent returns None, malformed raises ValidationError with field name (WP02)
- ✅ T013 [P] Unit tests: `dependencies_are_explicit` for present-empty vs absent key (WP02)
- ✅ T014 [P] Unit tests: `generate_tasks_md_from_manifest` preserves WP titles, deps, subtask counts (WP02)
Implementation Notes
- Use `ruamel.yaml` (charter standard) with `YAML(typ="safe")` for loading.
- Track whether `dependencies` key was present: after `yaml.load()` returns a raw dict, check `"dependencies" in raw_wp_dict` before constructing the Pydantic model. Store this as `_dependencies_explicit: bool` on the model (private field, not part of schema) or pass as a separate set.
- `generate_tasks_md_from_manifest()` output must include WP titles, dependency lines, and subtask counts — see `data-model.md` for required fields.
- JSON Schema file at `src/specify_cli/schemas/` (create directory if absent). Use `$schema: https://json-schema.org/draft/2020-12/schema`.
- T011 and T012-T014 are parallel-safe — the schema file and the tests can be written simultaneously with the implementation.
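
The presence-tracking idea can be illustrated without the YAML or Pydantic layers. In this sketch a plain dataclass stands in for the Pydantic model, and `raw_wp` is the kind of dict a safe-mode YAML load is expected to return. The `title` field, the `entry_from_raw()` helper, and the public `dependencies_explicit` attribute are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class WorkPackageEntry:
    # Stand-in for the Pydantic model; field names here are illustrative.
    id: str
    title: str
    dependencies: list[str] = field(default_factory=list)
    # True only when the raw mapping actually contained a "dependencies" key.
    dependencies_explicit: bool = False


def entry_from_raw(raw_wp: dict) -> WorkPackageEntry:
    """Build an entry from one raw mapping, recording key presence first."""
    # Check before defaults are applied: afterwards [] and "absent" look alike.
    explicit = "dependencies" in raw_wp
    return WorkPackageEntry(
        id=raw_wp["id"],
        title=raw_wp["title"],
        dependencies=list(raw_wp.get("dependencies") or []),
        dependencies_explicit=explicit,
    )


def dependencies_are_explicit(entry: WorkPackageEntry) -> bool:
    """Distinguish a present-but-empty dependency list from an absent key."""
    return entry.dependencies_explicit
```

Both `{"id": "WP01", "title": "A", "dependencies": []}` and `{"id": "WP02", "title": "B"}` end up with an empty `dependencies` list, but only the first is reported as explicit.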
Dependencies
None (independent).
Risks
- `ruamel.yaml` `YAML(typ="safe")` mode doesn't expose the raw dict the same way as `dict`. Use `yaml.load()` which returns a Python dict for safe-mode, and inspect `"dependencies" in result["work_packages"][i]` before Pydantic validation.
Work Package WP03: Integrate wps.yaml into finalize-tasks (#525 integration)
Goal: Update `finalize_tasks()` to use wps.yaml as the authoritative dependency source when present; regenerate tasks.md from the manifest after processing.
Priority: P0
Independent Test: A wps.yaml with an explicit empty dep list for a work package, combined with misleading prose in tasks.md, leaves that work package with no deps after finalize-tasks runs.
Prompt: tasks/WP03-integrate-wps-yaml-into-finalize-tasks.md
Requirement Refs: FR-006, FR-007, FR-008, FR-011, FR-012, C-006
Included Subtasks
- □ T015 Add wps.yaml detection at top of `finalize_tasks()` — load manifest, build `wp_dependencies` (WP03)
- □ T016 Bypass prose parser when wps.yaml present (tier 0 takes precedence) (WP03)
- □ T017 Add tasks.md regeneration step when wps.yaml present (WP03)
- □ T018 [P] Integration test: wps.yaml present → manifest deps used, prose parser skipped (WP03)
- □ T019 [P] Integration test: explicit empty dep list in wps.yaml protects WP from prose-inferred deps (WP03)
- □ T020 [P] Integration test: tasks.md overwritten with manifest-generated content (WP03)
- □ T021 [P] Integration test: no wps.yaml → prose parser path unchanged (WP03)
Implementation Notes
- In `mission.py:finalize_tasks()`, before the existing `if tasks_md.exists():` block (around line 1278), add a wps.yaml check:

```python
wps_manifest = load_wps_manifest(feature_dir)  # returns None if absent
if wps_manifest is not None:
    wp_dependencies = {e.id: e.dependencies for e in wps_manifest.work_packages if dependencies_are_explicit(e)}
    # Also include WPs with no explicit deps as empty:
    for e in wps_manifest.work_packages:
        if e.id not in wp_dependencies:
            wp_dependencies[e.id] = []
```

- After the existing WP frontmatter writing step, if `wps_manifest is not None`, regenerate tasks.md:

```python
tasks_md.write_text(generate_tasks_md_from_manifest(wps_manifest, mission_slug), encoding="utf-8")
```

- The existing prose parser block (`if tasks_md.exists(): ... _shared_parse_deps(tasks_content)`) is skipped entirely when wps.yaml is present.
- Tests go in `tests/tasks/test_finalize_tasks_wps_yaml_unit.py` (new file), following the mock-patch pattern in `test_finalize_tasks_json_output_unit.py`.
Dependencies
- WP02 (requires load_wps_manifest, dependencies_are_explicit, generate_tasks_md_from_manifest)
Risks
- `finalize_tasks()` is a 700+ line function. Keep the wps.yaml addition minimal and well-isolated — a single block at the start of the dependency resolution section, with a clearly named variable (`wps_manifest`).
Work Package WP04: Update Planning Templates and Migration (#525 prompts)
Goal: Rewrite tasks-outline.md and tasks-packages.md source templates to produce/consume wps.yaml; write a new migration that propagates these changes to existing installations.
Priority: P1 (planning UX improvement; not a runtime bug)
Independent Test: After `spec-kitty upgrade` on a project with the old tasks-outline prompt, the deployed tasks-outline file instructs the LLM to write wps.yaml (not tasks.md). The `detect()` method of the new migration returns True for stale files and False after apply.
Prompt: tasks/WP04-update-planning-templates-and-migration.md
Requirement Refs: FR-009, FR-010
Included Subtasks
- □ T022 Rewrite `tasks-outline.md` template — produce wps.yaml, not tasks.md (WP04)
- □ T023 Rewrite `tasks-packages.md` template — read/update wps.yaml, generate WP files (WP04)
- □ T024 Write `m_3_2_0_update_planning_templates.py` migration with detect + apply (WP04)
- □ T025 [P] Migration unit tests: detect returns True/False correctly (WP04)
- □ T026 [P] Migration unit tests: apply overwrites stale files, respects agent config, idempotent (WP04)
Implementation Notes
- tasks-outline.md new Purpose: "Create `wps.yaml` — the structured WP manifest." Remove all instructions to write `tasks.md` prose. Add the wps.yaml YAML schema example (from data-model.md). State explicitly: "Do NOT write tasks.md — it is generated by finalize-tasks from wps.yaml."
- tasks-packages.md: Change Step 2 from "Read `feature_dir/tasks.md`" to "Read `feature_dir/wps.yaml` — the manifest written in the previous step." Update Step 4 (Dependencies) to "Update the `dependencies` and `owned_files` fields in `wps.yaml` for each WP as you process it." Keep all WP prompt file generation instructions unchanged.
- Migration `m_3_2_0_update_planning_templates.py`:
  - Pattern exactly: `m_2_1_3_restore_prompt_commands.py` (see lines 221–310)
  - ``_STALE_MARKER = "Create \`tasks.md\`"`` (note: backtick-escaped in Python string)
  - `detect()` — iterate configured agent dirs, check for stale marker in any `spec-kitty.tasks-outline.*` file
  - `apply()` — use `_get_runtime_command_templates_dir()` + `_render_full_prompt()` (import from m_2_1_3's helpers) to overwrite stale files
  - Register with `@MigrationRegistry.register`, `target_version = "3.2.0"`
- Tests go in `tests/upgrade/migrations/test_m_3_2_0_update_planning_templates.py`
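
The detect/apply contract described above might look roughly like this. It is a standalone sketch, not the migration class itself: registry wiring is omitted, `render` is a hypothetical callable standing in for `_render_full_prompt()`, the caller is assumed to pass in the configured agent directories, and the marker is assumed to be the literal text `Create` followed by tasks.md in backticks.

```python
from __future__ import annotations

from pathlib import Path
from typing import Callable

# Assumed marker text: literal backticks around tasks.md.
STALE_MARKER = "Create `tasks.md`"
PROMPT_GLOB = "spec-kitty.tasks-outline.*"


def _stale_prompts(agent_dirs: list[Path]) -> list[Path]:
    """Collect deployed tasks-outline prompts that still contain the marker."""
    stale = []
    for agent_dir in agent_dirs:
        if not agent_dir.is_dir():
            continue  # agent not configured in this project
        for prompt in sorted(agent_dir.glob(PROMPT_GLOB)):
            if STALE_MARKER in prompt.read_text(encoding="utf-8"):
                stale.append(prompt)
    return stale


def detect(agent_dirs: list[Path]) -> bool:
    """Return True when at least one deployed prompt is stale."""
    return bool(_stale_prompts(agent_dirs))


def apply(agent_dirs: list[Path], render: Callable[[Path], str]) -> list[Path]:
    """Overwrite stale prompts with rendered content; return the files changed."""
    changed = _stale_prompts(agent_dirs)
    for prompt in changed:
        prompt.write_text(render(prompt), encoding="utf-8")
    return changed
```

Since `apply()` only touches files that still carry the marker, a second run finds nothing to change, which is the idempotency property T026 is meant to test.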
Dependencies
- WP02 (migration references schema from wps_manifest module in its description/docs only; not a runtime import dependency)
Risks
- `_get_runtime_command_templates_dir()` and `_render_full_prompt()` are internal helpers in `m_2_1_3`. Import them or extract them to a shared `migration_helpers.py`. Do not duplicate the logic.
Work Package WP05: Implement Query Mode for spec-kitty next (#526)
Goal: Bare `spec-kitty next` calls (no `--result`) return current state without advancing the DAG; output begins with the SC-003 verbatim label.
Priority: P0 (ghost completions corrupt mission history)
Independent Test: Before running `spec-kitty next`, record the current step ID from the run snapshot. After running `spec-kitty next` without `--result`, the step ID is unchanged. The first line of stdout is `[QUERY — no result provided, state not advanced]`.
Prompt: tasks/WP05-implement-query-mode-for-next.md
Requirement Refs: FR-012, FR-013, FR-014, FR-015, FR-016, NFR-004, C-005
Included Subtasks
- □ T027 Add `DecisionKind.query` constant and `is_query: bool = False` to `Decision` dataclass (WP05)
- □ T028 Implement `query_current_state()` in `runtime_bridge.py` — read state without advancing (WP05)
- □ T029 Change `result` default to `None` in `next_cmd.py`; add query mode branch (WP05)
- □ T030 Add explicit `is_query` branch to `_print_human()` — SC-003 verbatim label as first line (WP05)
- □ T031 [P] Unit test: bare call does not advance state machine (WP05)
- □ T032 [P] Unit test: query output begins with `[QUERY — no result provided, state not advanced]` verbatim (WP05)
- □ T033 [P] Unit test: `--result success` still advances (no regression) (WP05)
- □ T034 [P] Unit test: JSON output includes `"is_query": true` (WP05)
Implementation Notes
- `decision.py`: Add `query = "query"` to `DecisionKind`. Add `is_query: bool = False` field to `Decision` dataclass. Update `to_dict()` to include `is_query`.
- `runtime_bridge.py:query_current_state()`:
  1. Call `get_or_start_run()` (idempotent — only starts a run if none exists)
  2. Call `_read_snapshot(Path(run_ref.run_dir))` from `spec_kitty_runtime.engine`
  3. Return `Decision(kind=DecisionKind.query, is_query=True, reason=None, mission_state=current_step_id, ...)`
  4. Do NOT call `next_step()` — this is the key invariant
- `next_cmd.py`: Change `result: str = "success"` to `result: str | None = None`. Add block immediately after `mission_slug` resolution: `if result is None: decision = query_current_state(...); print/return`.
- `_print_human()`: Add `if getattr(decision, "is_query", False):` block at the very top. Print `"[QUERY — no result provided, state not advanced]"` verbatim as line 1, then `" Mission: {mission} @ {step}"`, then progress. Return early. Do NOT enter the normal `[{kind.upper()}]` path.
- Add new test file `tests/next/test_query_mode_unit.py`. Existing `test_next_command_integration.py` can be extended for regression tests (T033).
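
As a rough sketch of the `decision.py` and `_print_human()` changes, the snippet below models the query branch in isolation. The `Decision` fields shown are a hypothetical subset, `DecisionKind` is assumed to hold plain string constants, and `render_human()` is an illustrative helper that returns the lines instead of printing them so they are easy to assert on.

```python
from __future__ import annotations

from dataclasses import asdict, dataclass

# Label text copied verbatim from the notes above.
QUERY_LABEL = "[QUERY — no result provided, state not advanced]"


class DecisionKind:
    # Assumed to be plain string constants; only two are needed here.
    step = "step"
    query = "query"


@dataclass
class Decision:
    # Hypothetical subset of the real dataclass fields.
    kind: str
    mission: str
    mission_state: str
    reason: str | None = None
    is_query: bool = False

    def to_dict(self) -> dict:
        return asdict(self)  # carries is_query through to JSON output


def render_human(decision: Decision) -> list[str]:
    """Return the lines to print, handling the query branch before anything else."""
    if getattr(decision, "is_query", False):
        # Early return: never fall through to the normal [KIND] header.
        return [
            QUERY_LABEL,
            f" Mission: {decision.mission} @ {decision.mission_state}",
        ]
    return [f"[{decision.kind.upper()}] {decision.mission} @ {decision.mission_state}"]
```

Keeping the label in a single constant lets the unit test for T032 compare the first output line against it exactly.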
Dependencies
None (independent).
Risks
- `_read_snapshot` is a private function in `spec_kitty_runtime.engine`. If it's renamed or moved, query mode breaks silently. Add a `# type: ignore` comment noting the private-API dependency, or wrap in a try/except that returns a `DecisionKind.blocked` result on ImportError.
Work Package WP06: Fix Slug Validator (#527)
Goal: Update `KEBAB_CASE_PATTERN` to accept digit-prefixed slugs; update error message and add tests.
Priority: P1
Independent Test: `spec-kitty agent mission create "070-test-slug" --json` (or the equivalent Python API call) does not raise `MissionCreationError` for the slug. `User-Auth` still raises an error.
Prompt: tasks/WP06-fix-slug-validator.md
Requirement Refs: FR-017, FR-018, FR-019
Included Subtasks
- □ T035 Update `KEBAB_CASE_PATTERN` regex to accept digit-prefixed slugs (WP06)
- □ T036 Update error message — add valid digit-prefix example, remove invalid example, add comment (WP06)
- □ T037 [P] Unit tests: digit-prefix slugs accepted; existing rejections unchanged (WP06)
- □ T038 [P] Integration test: `create "070-new-feature"` passes slug validation (WP06)
Implementation Notes
- `mission_creation.py` line 47: change `r"^[a-z][a-z0-9]*(-[a-z0-9]+)*$"` to `r"^[a-z0-9][a-z0-9]*(-[a-z0-9]+)*$"`.
- Update error message (lines 202–211): add `" - 068-feature-name"` to valid examples; remove `" - 123-fix (starts with number)"` from invalid examples. Add inline comment `# Intentionally permissive: bare-digit slugs like "069" are accepted; create() prefixes the mission number anyway.`
- Update docstring at line ~179 to include a digit-prefixed example slug.
- New test file `tests/core/test_slug_validator_unit.py` (alongside `test_dependency_parser.py`). Test cases: `"068-feature"` accepted, `"001-foo"` accepted, `"User-Auth"` rejected, `"user_auth"` rejected, `""` rejected, `"069"` accepted (bare digit — intentionally permissive).
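
The listed test cases can be checked against the updated pattern with a few lines of Python. This sketch assumes the intended pattern carries `*` quantifiers after the second character class and after the hyphen group, which is what the accepted and rejected cases require; `is_valid_slug()` is a hypothetical helper name.

```python
import re

# Assumed form of the updated pattern: first character may be a digit or a letter.
KEBAB_CASE_PATTERN = re.compile(r"^[a-z0-9][a-z0-9]*(-[a-z0-9]+)*$")


def is_valid_slug(slug: str) -> bool:
    """Return True when the slug is lowercase kebab-case, digit prefix allowed."""
    return KEBAB_CASE_PATTERN.match(slug) is not None


# Expected outcome for each test case named in the notes above.
CASES = {
    "068-feature": True,
    "001-foo": True,
    "069": True,  # bare digits: intentionally permissive
    "User-Auth": False,  # uppercase letters
    "user_auth": False,  # underscore instead of hyphen
    "": False,  # empty string
}

for slug, expected in CASES.items():
    assert is_valid_slug(slug) is expected, slug
```

A table of cases like `CASES` maps directly onto a `pytest.mark.parametrize` list in the new test file.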
Dependencies
None (independent).