Contracts

command-surface-generation.md

Contract: Command And Skill Surface Generation

Scope

This contract covers #968 retired checklist cleanup and #964 generated skill frontmatter.

Required Behavior

  • Fresh command and skill generation must not produce active spec-kitty.checklist* commands.
  • Active registry entries, packaged command templates, generated command counts, runtime doctor expectations, installer cleanup, and docs/comments that name active counts must agree.
  • Upgrade/install cleanup must remove stale checklist files only when they are package-managed; unknown files must be deliberately ignored or preserved.
  • Generated SKILL.md files must include required YAML frontmatter or use a documented host-accepted schema.
  • The Codex/global .agents/skills/spec-kitty.advise/SKILL.md repro must be covered by automated generation tests.
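The cleanup rule above can be sketched as a manifest-driven removal. This is a minimal illustration, not the project's installer: the function name `remove_stale_managed_files`, the idea of a manifest listing package-managed relative paths, and the flat directory layout are all assumptions made for the example.

```python
from pathlib import Path
from typing import Iterable, List


def remove_stale_managed_files(
    root: Path,
    managed: Iterable[str],
    retired_prefix: str = "spec-kitty.checklist",
) -> List[Path]:
    """Delete retired command files only when the manifest marks them as package-managed.

    `managed` is a hypothetical manifest of relative paths the package installed.
    Files that merely look like checklist files but are absent from the manifest
    are never touched.
    """
    removed: List[Path] = []
    for rel in sorted(set(managed)):
        path = root / rel
        # Both conditions must hold: the file is retired AND package-managed.
        if path.name.startswith(retired_prefix) and path.is_file():
            path.unlink()
            removed.append(path)
    return removed
```

The loop iterates over the manifest rather than globbing the directory, so a user-authored file whose name happens to match the retired prefix is preserved by construction.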

Acceptance Checks

  • Fresh generation inventory contains zero active checklist commands.
  • Registry and packaged template inventories match.
  • Runtime diagnostics report the same command/skill counts that generation produces.
  • Stale package-managed checklist files are cleaned up.
  • Unknown user-authored files are not deleted by broad name matching.
  • Generated SKILL.md files parse as frontmatter-bearing Markdown for the target host surface.
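The frontmatter check in the last item could be automated roughly as follows. This is a sketch under stated assumptions: the required keys `name` and `description` are placeholders, since the contract does not list which keys a host requires, and the parser reads only flat top-level `key: value` pairs rather than full YAML.

```python
from typing import Dict, List, Optional, Sequence


def parse_frontmatter(text: str) -> Optional[Dict[str, str]]:
    """Return top-level 'key: value' pairs from a leading '---' block, or None."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # the file does not open with a frontmatter delimiter
    fields: Dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields  # closing delimiter found
        # Skip nested, indented, or comment lines; keep flat key/value pairs.
        if ":" in line and not line.startswith((" ", "\t", "#")):
            key, _, value = line.partition(":")
            fields[key.strip()] = value.strip()
    return None  # the block was never closed


def missing_frontmatter_keys(
    text: str,
    required: Sequence[str] = ("name", "description"),  # assumed key set
) -> List[str]:
    """List required keys that are absent or empty in the frontmatter."""
    fields = parse_frontmatter(text)
    if fields is None:
        return list(required)
    return [key for key in required if not fields.get(key)]
```

A generation test could call `missing_frontmatter_keys` on each generated SKILL.md and assert that the result is empty for the target host surface.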

Non-Goals

  • Resurrecting checklist.
  • Renaming active user-facing commands.
  • Deleting user-owned custom commands or skills based only on filename pattern.

review-verdict-consistency.md

Contract: Review Verdict Consistency

Scope

This contract covers #904 review-cycle/WP state consistency during WP transitions, mission status, mission review, and merge preflight.

Required Behavior

  • Before a WP moves to approved or done, Spec Kitty must inspect the latest applicable review-cycle-N.md artifact for that WP.
  • If the latest applicable artifact has verdict: rejected, the transition must fail before mutating state unless an explicit override is supplied.
  • The failure diagnostic must name:
      • the WP id,
      • the latest rejected review-cycle artifact,
      • the required repair or override action.
  • Mission status, mission review, and merge preflight must not silently pass when a done or approved WP is contradicted by a latest rejected review-cycle artifact.
  • Explicit overrides must be persisted as structured state in review-cycle metadata or a linked override artifact.
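The guard described above can be sketched as a pre-transition check. The names `latest_review_cycle`, `check_transition`, and `RejectedReviewError` are hypothetical, as is the assumption that all `review-cycle-N.md` files for a WP sit in one directory and carry a `verdict:` line. The sketch treats every artifact as applicable and takes the override as a plain flag; persisting the override as structured state is left out.

```python
import re
from pathlib import Path
from typing import Optional, Tuple

CYCLE_RE = re.compile(r"^review-cycle-(\d+)\.md$")
VERDICT_RE = re.compile(r"^verdict:[ \t]*(\w+)[ \t]*$", re.MULTILINE)


class RejectedReviewError(RuntimeError):
    """Raised before any state mutation when the latest review is rejected."""


def latest_review_cycle(wp_dir: Path) -> Optional[Tuple[Path, Optional[str]]]:
    """Return (artifact path, verdict) for the highest-numbered review cycle."""
    best: Optional[Tuple[int, Path]] = None
    for path in wp_dir.iterdir():
        match = CYCLE_RE.match(path.name)
        # Compare cycle numbers as integers so cycle 10 outranks cycle 9.
        if match and (best is None or int(match.group(1)) > best[0]):
            best = (int(match.group(1)), path)
    if best is None:
        return None
    found = VERDICT_RE.search(best[1].read_text(encoding="utf-8"))
    return best[1], (found.group(1).lower() if found else None)


def check_transition(wp_id: str, wp_dir: Path, target: str, override: bool = False) -> None:
    """Fail fast if moving to approved/done contradicts the latest review verdict."""
    if target not in ("approved", "done"):
        return
    latest = latest_review_cycle(wp_dir)
    if latest is None:
        return
    path, verdict = latest
    if verdict == "rejected" and not override:
        raise RejectedReviewError(
            f"{wp_id}: latest review artifact {path.name} has verdict: rejected; "
            "repair and re-review, or supply an explicit override"
        )
```

Because the check raises before the caller writes anything, WP state stays unchanged on failure, and a later approved artifact supersedes an earlier rejection simply by having a higher cycle number.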

Acceptance Checks

  • A rejected latest review artifact blocks WP completion and leaves WP state unchanged.
  • A later approved review artifact supersedes an earlier rejected artifact.
  • An explicit override records durable evidence and permits the intended transition.
  • Mission status, mission review, and merge preflight fail or report a blocking diagnostic when a done or approved WP is contradicted by a latest rejected review-cycle artifact.
  • JSON-producing commands touched by this behavior keep parseable JSON on stdout.

Non-Goals

  • Warning-only policy.
  • Manual deletion of rejected artifacts as the normal repair path.
  • Reimplementation of PR #959 or PR #969 work without current repro evidence.

status-test-boundedness.md

Contract: Status Test Boundedness

Scope

This contract covers #967 status bootstrap and emit behavior for local and CI validation.

Required Behavior

  • Previously hanging status bootstrap and emit paths must complete or fail within 30 seconds when run with the mission validation timeout.
  • Timeout failures must include enough diagnostics to identify the hanging test path.
  • Fixture or adapter hardening must preserve the semantics of status events and materialized status.
  • Default validation must not require hosted auth, tracker, SaaS sync, or network access.
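One way to picture the bounded-run requirement is a wrapper that enforces a wall-clock limit and names the command that hung. This is an illustrative sketch only: `run_bounded` and `STATUS_CHECK` are hypothetical names, and the contract's own check relies on the `--timeout=30` option passed to pytest rather than on a wrapper like this.

```python
import subprocess
from typing import List, Tuple

# The validation command from this contract, expressed as an argument list.
STATUS_CHECK = ["uv", "run", "pytest", "tests/status", "-q", "--timeout=30"]


def run_bounded(cmd: List[str], timeout_s: float = 30.0) -> Tuple[bool, str]:
    """Run a command with a hard time limit and return (ok, diagnostic)."""
    try:
        # subprocess.run kills the child and waits for it when the timeout expires.
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Name the command so the hanging path can be identified.
        return False, f"timed out after {timeout_s:g}s: {' '.join(cmd)}"
    if proc.returncode != 0:
        return False, f"exit {proc.returncode}: {' '.join(cmd)}"
    return True, "ok"
```

The diagnostic string carries the full command line, which is the minimum needed to identify which path exceeded the limit; a real harness would likely also surface the captured output.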

Acceptance Checks

uv run pytest tests/status -q --timeout=30

The implementation should add narrower checks if the root cause lands outside the current status test files.

Non-Goals

  • Broad status store redesign.
  • Hosted sync protocol changes.
  • Weakening status assertions to make tests pass.