Contracts
command-surface-generation.md
Contract: Command And Skill Surface Generation
Scope
This contract covers #968 retired checklist cleanup and #964 generated skill frontmatter.
Required Behavior
- Fresh command and skill generation must not produce active `spec-kitty.checklist*` commands.
- Active registry entries, packaged command templates, generated command counts, runtime doctor expectations, installer cleanup, and docs/comments that name active counts must agree.
- Upgrade/install cleanup must remove stale checklist files only when they are package-managed; unknown files must be deliberately ignored or preserved.
- Generated `SKILL.md` files must include required YAML frontmatter or use a documented host-accepted schema.
- The Codex/global `.agents/skills/spec-kitty.advise/SKILL.md` repro must be covered by automated generation tests.
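The cleanup rule above can be sketched as follows. This is a minimal illustration, not Spec Kitty's installer code: the function name, the `package_managed` set, and the example file names are all hypothetical stand-ins for whatever manifest the installer actually keeps.

```python
import tempfile
from pathlib import Path


def remove_stale_checklist_files(commands_dir: Path, package_managed: set[str]) -> list[str]:
    """Delete retired checklist files, but only those the package itself installed."""
    removed = []
    for path in sorted(commands_dir.glob("spec-kitty.checklist*")):
        # A name match alone is not enough: the file must also be listed
        # in the (hypothetical) set of package-managed file names.
        if path.is_file() and path.name in package_managed:
            path.unlink()
            removed.append(path.name)
    return removed


# Demo with made-up file names in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    commands_dir = Path(tmp)
    (commands_dir / "spec-kitty.checklist.md").write_text("packaged\n")
    (commands_dir / "spec-kitty.checklist-mine.md").write_text("user-authored\n")

    removed = remove_stale_checklist_files(commands_dir, {"spec-kitty.checklist.md"})
    remaining = sorted(p.name for p in commands_dir.iterdir())
```

In this sketch the user-authored file survives even though it matches the glob, which is the property the contract asks for.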
Acceptance Checks
- Fresh generation inventory contains zero active checklist commands.
- Registry and packaged template inventories match.
- Runtime diagnostics report the same command/skill counts that generation produces.
- Stale package-managed checklist files are cleaned up.
- Unknown user-authored files are not deleted by broad name matching.
- Generated `SKILL.md` files parse as frontmatter-bearing Markdown for the target host surface.
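A generation test for the frontmatter check could look roughly like the sketch below. It uses only the standard library and a deliberately naive `key: value` scan rather than a full YAML parser. The required keys `name` and `description` are an assumption made for illustration; the contract does not list which keys a host requires.

```python
def split_frontmatter(text: str):
    """Return (frontmatter_lines, body), or None when no frontmatter block exists."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for i in range(1, len(lines)):
        if lines[i].strip() == "---":
            return lines[1:i], "\n".join(lines[i + 1:])
    return None  # opening delimiter without a closing one


def missing_frontmatter_keys(text: str, required=("name", "description")) -> list[str]:
    """List required keys that are absent (key names are assumed, not from the contract)."""
    parts = split_frontmatter(text)
    if parts is None:
        return list(required)
    present = {line.split(":", 1)[0].strip() for line in parts[0] if ":" in line}
    return [key for key in required if key not in present]


good = "---\nname: spec-kitty.advise\ndescription: Example skill\n---\n# Body\n"
bad = "# Body only, no frontmatter\n"

good_missing = missing_frontmatter_keys(good)
bad_missing = missing_frontmatter_keys(bad)
```

A real test would presumably run the generator first and feed each produced `SKILL.md` through a check like this.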
Non-Goals
- Resurrecting `checklist`.
- Renaming active user-facing commands.
- Deleting user-owned custom commands or skills based only on filename pattern.
review-verdict-consistency.md
Contract: Review Verdict Consistency
Scope
This contract covers #904 review-cycle/WP state consistency during WP transitions, mission status, mission review, and merge preflight.
Required Behavior
- Before a WP moves to `approved` or `done`, Spec Kitty must inspect the latest applicable `review-cycle-N.md` artifact for that WP.
- If the latest applicable artifact has `verdict: rejected`, the transition must fail before mutating state unless an explicit override is supplied.
- The failure diagnostic must name:
  - the WP id,
  - the latest rejected review-cycle artifact,
  - the required repair or override action.
- Mission status, mission review, and merge preflight must not silently pass when a done or approved WP is contradicted by a latest rejected review-cycle artifact.
- Explicit overrides must be persisted as structured state in review-cycle metadata or a linked override artifact.
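The guard described above might be sketched as below. The per-WP directory layout, the function names, and the `TransitionBlocked` exception are hypothetical; only the `review-cycle-N.md` naming and the `verdict: rejected` field come from the contract. The sketch also takes the override as a plain flag and does not show how it would be persisted.

```python
import re
import tempfile
from pathlib import Path

CYCLE_RE = re.compile(r"^review-cycle-(\d+)\.md$")
VERDICT_RE = re.compile(r"^verdict:\s*(\S+)\s*$", re.MULTILINE)


class TransitionBlocked(Exception):
    """Raised before any state is mutated (hypothetical exception type)."""


def latest_review_artifact(wp_dir: Path):
    """Pick the highest-numbered review-cycle-N.md, comparing N numerically."""
    numbered = []
    for path in wp_dir.iterdir():
        match = CYCLE_RE.match(path.name)
        if match:
            numbered.append((int(match.group(1)), path))
    return max(numbered, key=lambda item: item[0])[1] if numbered else None


def check_transition(wp_id: str, wp_dir: Path, target: str, override: bool = False) -> None:
    """Raise if the latest review verdict contradicts the requested transition."""
    if target not in {"approved", "done"} or override:
        return
    latest = latest_review_artifact(wp_dir)
    if latest is None:
        return
    match = VERDICT_RE.search(latest.read_text(encoding="utf-8"))
    if match and match.group(1) == "rejected":
        raise TransitionBlocked(
            f"{wp_id}: latest review artifact {latest.name} has verdict: rejected; "
            "repair the WP and re-review, or supply an explicit override"
        )


with tempfile.TemporaryDirectory() as tmp:
    wp_dir = Path(tmp)
    (wp_dir / "review-cycle-1.md").write_text("verdict: approved\n")
    (wp_dir / "review-cycle-2.md").write_text("verdict: rejected\n")
    try:
        check_transition("WP01", wp_dir, "done")
        blocked_message = None
    except TransitionBlocked as exc:
        blocked_message = str(exc)

    # A later approved cycle supersedes the rejection (10 sorts after 2 numerically).
    (wp_dir / "review-cycle-10.md").write_text("verdict: approved\n")
    check_transition("WP01", wp_dir, "done")
    superseded_ok = True
```

Comparing cycle numbers as integers matters here: a plain string sort would rank `review-cycle-2.md` after `review-cycle-10.md` and report the wrong artifact as latest.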
Acceptance Checks
- A rejected latest review artifact blocks WP completion and leaves WP state unchanged.
- A later approved review artifact supersedes an earlier rejected artifact.
- An explicit override records durable evidence and permits the intended transition.
- Mission status, mission review, and merge preflight fail or report a blocking diagnostic when a done or approved WP is contradicted by a latest rejected review-cycle artifact.
- JSON-producing commands touched by this behavior keep parseable JSON on stdout.
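One common way to keep stdout parseable is to send human-readable diagnostics to stderr and write exactly one JSON document to stdout. The sketch below assumes that approach; the function name and payload fields are hypothetical and not taken from Spec Kitty.

```python
import io
import json
import sys


def emit_result(payload: dict, diagnostics: list[str], out=None, err=None) -> None:
    """Write diagnostics to stderr and a single JSON document to stdout."""
    out = sys.stdout if out is None else out
    err = sys.stderr if err is None else err
    for line in diagnostics:
        print(line, file=err)
    json.dump(payload, out)
    out.write("\n")


# Capture both streams to show that stdout stays machine-readable.
captured_out, captured_err = io.StringIO(), io.StringIO()
emit_result(
    {"wp": "WP01", "blocked": True, "artifact": "review-cycle-2.md"},
    ["WP01: latest review artifact is rejected"],
    out=captured_out,
    err=captured_err,
)
parsed = json.loads(captured_out.getvalue())
```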
Non-Goals
- Warning-only policy.
- Manual deletion of rejected artifacts as the normal repair path.
- Reimplementation of PR #959 or PR #969 work without current repro evidence.
status-test-boundedness.md
Contract: Status Test Boundedness
Scope
This contract covers #967 status bootstrap and emit behavior for local and CI validation.
Required Behavior
- Previously hanging status bootstrap and emit paths must complete or fail within 30 seconds when run with the mission validation timeout.
- Timeout failures must include enough diagnostics to identify the hanging test path.
- Fixture or adapter hardening must preserve the semantics of status events and materialized status.
- Default validation must not require hosted auth, tracker, SaaS sync, or network access.
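The bounded-completion requirement can be illustrated with a small wrapper around `subprocess.run` and its `timeout` argument. This is a sketch under assumptions: `run_bounded`, the labels, and the demo commands are hypothetical, and the short timeout in the demo stands in for the 30-second limit.

```python
import subprocess
import sys


def run_bounded(label: str, argv: list[str], timeout_s: float = 30.0):
    """Run a command with a hard deadline; return (ok, diagnostic)."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Name the path that hung so the failure is actionable.
        return False, f"{label}: no result after {timeout_s}s (command: {' '.join(argv)})"
    return proc.returncode == 0, f"{label}: exit code {proc.returncode}"


# A fast command completes; a sleeping one is cut off and reported by label.
fast = run_bounded("emit", [sys.executable, "-c", "print('ok')"], timeout_s=10.0)
hung = run_bounded("bootstrap", [sys.executable, "-c", "import time; time.sleep(5)"], timeout_s=0.5)
```

Either outcome returns within the deadline, and the timeout diagnostic carries the label and command of the path that did not finish.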
Acceptance Checks
uv run pytest tests/status -q --timeout=30
The implementation should add narrower checks if the root cause lands outside the current status test files.
Non-Goals
- Broad status store redesign.
- Hosted sync protocol changes.
- Weakening status assertions to make tests pass.