Contracts
checklist_surface_removed.contract.md
Contract: /spec-kitty.checklist surface removed; requirements.md preserved
Traces to: FR-003 (#815), FR-004 (#635), NFR-003, NFR-009, C-003, C-008
Stimulus
A maintainer runs spec-kitty init <tmp> then /spec-kitty.specify (and later /spec-kitty.plan, /spec-kitty.tasks) against a fresh project, OR greps the repository for the deprecated command surface.
Required behavior
1. Slash-command surface removal: Across the repository AND across every per-agent rendered surface for every supported agent (per the CLAUDE.md "Supported AI Agents" table; currently 13 slash-command agents + 2 skills agents), zero remaining textual references to the literal token /spec-kitty.checklist (with or without leading slash) inside source templates, generated agent copies, registry/manifest entries, test fixtures, regression baselines, snapshots, README, or docs/ reference pages.
2. Deprecated artifact removal (allowlisted): Zero files named checklist.md, checklist.prompt.md, checklist.SKILL.md, or spec-kitty.checklist.md inside the scan boundary below. The scan boundary explicitly EXCLUDES paths that are reserved for legitimate "checklist" concepts:
   - kitty-specs/<mission>/checklists/: canonical mission-level checklists (requirements checklist, future task / review / release checklists scoped to a mission).
   - RELEASE_CHECKLIST.md (and any **/RELEASE_CHECKLIST.md): the project release process artifact.
   - */release_checklist and */review_checklist (case-insensitive): release-process and review-process artifacts unrelated to the retired slash command.
3. Artifact preservation: After running /spec-kitty.specify against a fresh project, the file kitty-specs/<mission_slug>/checklists/requirements.md MUST still be created with the same content shape as today (Specification Quality Checklist).
4. Bulk-edit gate: The diff produced by FR-003 work MUST exactly match the REMOVE/KEEP classification in occurrence_map.yaml. Anything extra triggers a DIRECTIVE_035 violation.
Forbidden behavior
- A no-op shim command that prints "deprecated, use X": explicitly rejected by this tranche (see research.md R3 alternative 1).
- Removing kitty-specs/<mission>/checklists/requirements.md (per C-003).
- Removing the standalone RELEASE_CHECKLIST.md file (unrelated process artifact).
- Adding /spec-kitty.checklist back into CANONICAL_COMMANDS or any registry.
Implementation hint (informative, not normative)
The bulk edit is mechanical given the occurrence map. After removing sources/templates/manifest entries, regenerate the per-agent baselines under tests/specify_cli/regression/_twelve_agent_baseline/<agent>/ so the snapshot comparison at test time reflects the new reduced surface for every agent. See research.md R3.
Verifying tests
- Update existing tests:
  - tests/specify_cli/skills/test_registry.py: drop checklist expectations.
  - tests/specify_cli/skills/test_command_renderer.py: assert checklist is NOT among rendered outputs.
  - tests/specify_cli/skills/test_installer.py: assert checklist is NOT installed.
  - tests/missions/test_command_templates_canonical_path.py: drop checklist from the canonical path enumeration.
- Update snapshots:
  - Delete tests/specify_cli/skills/__snapshots__/codex/checklist.SKILL.md and the vibe equivalent.
  - Delete every checklist.md / checklist.prompt.md under tests/specify_cli/regression/_twelve_agent_baseline/<agent>/.
- Add new aggregate regression test:
  - tests/specify_cli/test_no_checklist_surface.py: recursively walks src/specify_cli/missions/, every supported agent's command directory under the project root (sourced from the same constant the migrations use, e.g. AGENT_DIRS), tests/specify_cli/regression/, and docs/. Asserts zero filenames matching checklist* AND zero occurrences of the regex /?spec-kitty\.checklist\b in any text file.
- Add new artifact-preservation test:
  - tests/missions/test_specify_creates_requirements_checklist.py: drives mission create + reads the post-/spec-kitty.specify artifact set, asserts kitty-specs/<slug>/checklists/requirements.md exists and contains the canonical Specification Quality Checklist headers.
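The aggregate regression scan described above could be sketched roughly as follows. This is a minimal illustration, not the project's test: the function name find_checklist_surface and the idea of passing the roots in as a list are assumptions; only the regex and the checklist* filename pattern come from the contract. The allowlisted paths (kitty-specs/<mission>/checklists/, RELEASE_CHECKLIST.md) stay out of scope simply by not being among the scanned roots.

```python
import fnmatch
import os
import re
from pathlib import Path

# Regex and filename pattern quoted from the contract; all names are illustrative.
FORBIDDEN_TOKEN = re.compile(r"/?spec-kitty\.checklist\b")
FORBIDDEN_FILENAME = "checklist*"


def find_checklist_surface(roots):
    """Return (bad_filenames, bad_references) found under the given roots."""
    bad_files, bad_refs = [], []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = Path(dirpath) / name
                # Case-sensitive match, so RELEASE_CHECKLIST.md is not caught here.
                if fnmatch.fnmatchcase(name, FORBIDDEN_FILENAME):
                    bad_files.append(path)
                try:
                    text = path.read_text(encoding="utf-8")
                except (UnicodeDecodeError, OSError):
                    continue  # binary or unreadable: not a text surface
                if FORBIDDEN_TOKEN.search(text):
                    bad_refs.append(path)
    return sorted(bad_files), sorted(bad_refs)
```

A test built on this would assert that both returned lists are empty for the roots named in the contract.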
Out-of-scope
- Renaming requirements.md or moving the checklists/ directory.
- Introducing a new slash command in /spec-kitty.checklist's place.
decision_command_help.contract.md
Contract: spec-kitty agent decision command shape consistency
Traces to: FR-007 (#774), NFR-008
Stimulus
A user (or implementing agent) reads any of:
- --help output for spec-kitty agent decision … and sub-paths
- Any documentation page under docs/reference/, docs/explanation/, docs/migration/
- Any agent skill template snapshot under .agents/skills/ or tests/specify_cli/skills/__snapshots__/
- Any source command template under src/specify_cli/missions/*/command-templates/
Required behavior
Every reference to the decision command in the surfaces above MUST use exactly one canonical shape:
spec-kitty agent decision { open | resolve | defer | cancel | verify } …
Specifically:
- The subgroup name is decision (singular). No decisions.
- The subgroup is reachable via spec-kitty agent. No top-level spec-kitty decision … alias is introduced or referenced.
- The five subcommands are exactly: open, resolve, defer, cancel, verify. No additional names referenced.
- All flags follow the existing typer schema (--mission, --flow, --slot-key, --input-key, --question, --options, --final-answer, --rationale).
Forbidden behavior
- Any documented or rendered surface that names spec-kitty decision …, spec-kitty agent decisions …, or spec-kitty agent decision-….
- Any divergence between --help output and a doc page on subcommand list, flag names, or flag arity.
Implementation hint (informative, not normative)
Verified during planning that the actual subgroup (src/specify_cli/cli/commands/decision.py:1) and the only doc reference found (docs/reference/missions.md:268) already match the canonical shape. FR-007 collapses to a regression test that prevents future drift. See research.md R6.
Verifying tests
- New tests/specify_cli/cli/test_decision_command_shape_consistency.py:
  - Walks the typer app, asserts the agent decision subgroup exists with exactly the five subcommands above.
  - Recursively greps docs/, .agents/skills/, tests/specify_cli/skills/__snapshots__/, src/specify_cli/missions/*/command-templates/ for the regex spec-kitty\s+(?:agent\s+)?decision[s\-]\w and asserts every match falls inside the canonical shape.
  - Asserts --help output for spec-kitty agent decision lists exactly these five subcommands (in any order).
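The drift check over documentation and templates might look roughly like the sketch below. It is an illustration only: the pattern is a simplified one written for this example (not the contract's regex), and the function name non_canonical_decision_refs is hypothetical. It flags the three forbidden shapes: the top-level alias, the plural subgroup, and hyphenated variants.

```python
import re

# Simplified illustrative pattern: optional "agent", then any word starting with "decision".
_DECISION_REF = re.compile(r"spec-kitty\s+(agent\s+)?(decision[\w-]*)")


def non_canonical_decision_refs(text):
    """Return every decision-command reference that is not `spec-kitty agent decision`."""
    bad = []
    for match in _DECISION_REF.finditer(text):
        via_agent, group_name = match.group(1), match.group(2)
        # Canonical shape requires the `agent` parent and the exact singular name.
        if via_agent is None or group_name != "decision":
            bad.append(match.group(0))
    return bad
```

A regression test could run this over every file in the listed directories and assert that the combined result is empty.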
Out-of-scope
- Changing the subgroup name or adding new subcommands. (#774 asks to clarify, not to redesign.)
feature_alias_hidden.contract.md
Contract: legacy --feature alias hidden from CLI help
Traces to: FR-006 (#790), NFR-007, C-004
Stimulus
A user (or agent) runs --help against any spec-kitty command path that historically accepted a --feature flag. The flag was deprecated in favor of --mission but must remain accepted for backward compatibility.
Required behavior
1. For every command listed in the FR-006 inventory (28 declarations across 17 files; see research.md R5), the rendered --help output MUST contain zero literal occurrences of the token --feature.
2. For each such command, passing --feature <value> on the CLI MUST continue to behave exactly as it does today: route the value into --mission semantics with no behavioral difference, and (per the existing SPEC_KITTY_SUPPRESS_FEATURE_DEPRECATION env var) optionally emit a deprecation warning.
Forbidden behavior
- Any newly-introduced visible mention of --feature in --help.
- Any change that breaks an existing call site that passes --feature.
- Removing the alias altogether (out of scope for this tranche; C-004).
Implementation hint (informative, not normative)
Verified during planning that all 28 declarations already carry hidden=True. FR-006 collapses to a regression test that prevents future drift.
Verifying tests
- New tests/specify_cli/cli/test_no_visible_feature_alias.py:
  - Walks the typer app via Click introspection.
  - For every leaf command, invokes <command> --help via CliRunner, captures the rendered string, asserts "--feature" is not a substring.
  - Asserts (via direct typer Parameter inspection) that every parameter with name "feature" carries hidden=True.
- An existing call-site smoke test (extend tests/e2e/test_cli_smoke.py if necessary) that passes --feature to one of the historically-accepting commands and asserts the command runs to completion identically to passing --mission.
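The "hidden but still accepted" property can be illustrated with a small stand-in. The sketch below deliberately uses the standard-library argparse rather than typer, so that it runs without third-party packages; the real commands rely on typer's hidden=True, and the parser shown here (build_parser, prog name "demo") is purely hypothetical.

```python
import argparse


def build_parser():
    """Toy parser: --mission is documented, --feature is a hidden legacy alias."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--mission", dest="mission", help="Mission slug.")
    # Legacy alias: parsed into the same destination, but suppressed from --help.
    parser.add_argument("--feature", dest="mission", help=argparse.SUPPRESS)
    return parser


help_text = build_parser().format_help()
via_alias = build_parser().parse_args(["--feature", "demo-mission"])
via_canonical = build_parser().parse_args(["--mission", "demo-mission"])
```

The two assertions the contract asks for map onto this directly: the help text must not contain the alias, and both spellings must produce the same parsed value.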
Out-of-scope
- Changing the value of the SPEC_KITTY_SUPPRESS_FEATURE_DEPRECATION env var or its default.
- Removing the alias.
init_non_git_message.contract.md
Contract: spec-kitty init non-git target message
Traces to: FR-005 (#636)
Stimulus
A user runs spec-kitty init <target_dir> in a directory whose filesystem path is not inside a git work tree (i.e. neither <target_dir>/.git/ nor any ancestor .git/ exists).
Required behavior
Canonical invariant (Decision Moment 01KQ84P1AJ8H3FPJN9J5C12CBY): non-git init is allowed; silent non-git init is not.
1. The command MUST emit exactly one informational line, on stdout or stderr, that includes BOTH:
   - The phrase "not a git repository" (or substring "not.git.repo", case-insensitive).
   - The phrase "git init" as the suggested remediation.
2. The line MUST be styled at "info" level (not red, not bold-red). Yellow or cyan styling is acceptable.
3. The command MUST complete the scaffold successfully and exit with code 0 when no other failure occurs, populating .kittify/, .gitignore, agent directories, etc., as it does today.
4. The command MUST NOT auto-run git init on the target. The "git not initialized" condition is informational only.
5. The command MUST NOT bail out before writing files just because the target is not a git work tree. Fail-fast semantics are explicitly rejected (see Decision Moment record).
Forbidden behavior
- Multiple repetitions of the same message in one invocation.
- Hard-failing the command (any non-zero exit) solely because the target is not a git repository. Decision Moment 01KQ84P1AJ8H3FPJN9J5C12CBY rejected fail-fast semantics; the "scaffold then init later" workflow is the canonical path.
- Skipping any normal scaffold step because the target lacks .git/.
- Silently writing files into the target without any indication that the user must run git init before downstream agent commands will work (the "silent non-git init" case the canonical invariant forbids).
Implementation hint (informative, not normative)
The "git binary not detected" branch already exists at src/specify_cli/cli/commands/init.py:360. Add a sibling branch that checks whether the target dir is inside a git work tree (e.g. subprocess.run(["git", "rev-parse", "--is-inside-work-tree"], cwd=target, check=False)). When the answer is "no", print the new informational line. Also append a single-line "next: run git init" item to the post-init quick-start summary.
See research.md R4.
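The work-tree probe suggested in the hint could be wrapped along these lines. This is a sketch under assumptions: the helper name is_inside_git_work_tree and the injectable run parameter (added so the check can be exercised without a real git binary) are not from the source; only the git rev-parse invocation is.

```python
import subprocess


def is_inside_git_work_tree(target, run=subprocess.run):
    """Return True only when git reports that `target` is inside a work tree."""
    try:
        result = run(
            ["git", "rev-parse", "--is-inside-work-tree"],
            cwd=target,
            check=False,          # a non-zero exit is an answer, not an error
            capture_output=True,  # keep git's own "fatal:" text off the console
            text=True,
        )
    except FileNotFoundError:
        # git binary missing: that case belongs to the existing, separate branch.
        return False
    return result.returncode == 0 and result.stdout.strip() == "true"
```

When this returns False, init would print the informational line and carry on scaffolding rather than exiting.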
Verifying tests
- Unit: new tests/specify_cli/cli/commands/test_init_non_git_message.py drives init against a tmp dir without .git/, captures rich-rendered output (markup stripped), asserts the message appears exactly once and the regex not\s+a\s+git\s+repository matches at least once.
- E2E: extend tests/e2e/test_cli_smoke.py with the same assertion via CliRunner.
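The output assertion shared by both tests might be factored as below. This is a hedged sketch: the helper names are hypothetical, the regex is the one quoted in the contract, and the sample wording used in any test is a placeholder because the exact message text is decided during implementation.

```python
import re

# Regex quoted from the contract; matched case-insensitively per the required behavior.
NOT_A_REPO = re.compile(r"not\s+a\s+git\s+repository", re.IGNORECASE)


def notice_lines(output):
    """Return the lines of captured output that carry the non-git notice."""
    return [line for line in output.splitlines() if NOT_A_REPO.search(line)]


def has_single_non_git_notice(output):
    """True when exactly one line carries the notice and it also suggests `git init`."""
    lines = notice_lines(output)
    return len(lines) == 1 and "git init" in lines[0]
```

Both the silent case (no matching line) and the repeated case (more than one) fail this check, which matches the two forbidden behaviors.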
Out-of-scope
- The exact wording of the message (decided during implementation).
- Localization (the project ships English-only).
mission_create_clean_output.contract.md
Contract: spec-kitty agent mission create --json clean output
Traces to: FR-008 (#735), FR-009 (#717), NFR-004, NFR-005
Stimulus
A user (or agent) runs:
spec-kitty agent mission create "<slug>" \
--friendly-name "<title>" \
--purpose-tldr "<tldr>" \
--purpose-context "<context>" \
--json
in a spec-kitty-initialized project. The command succeeds (exit code 0).
Required behavior
After exit:
1. Stdout:
   - Contains exactly one valid JSON document (the mission-create payload).
   - The document's last character (modulo a trailing newline) is the closing } of the JSON object.
2. Stderr:
   - MAY contain at most ONE occurrence of any given diagnostic message family ("Not authenticated, skipping sync", "token refresh failed", equivalent variants).
   - MUST NOT contain red-styled error output (Rich [red]…[/red], [bold red]…[/bold red], or terminal escape sequences for red) AFTER the JSON payload has been written to stdout when the command's exit status is 0.
3. Exit code: 0.
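The stdout and stderr conditions above can be expressed as two small checks. This is an illustrative sketch only; the function names are hypothetical, and the diagnostic families passed in would be the message strings the contract quotes.

```python
import json


def is_clean_json_stdout(stdout):
    """True when stdout is exactly one JSON object, ending in '}' modulo a trailing newline."""
    try:
        payload = json.loads(stdout)  # raises on trailing non-JSON text ("Extra data")
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and stdout.rstrip("\n").endswith("}")


def repeated_diagnostics(stderr, families):
    """Return the diagnostic families that occur more than once in stderr."""
    return [family for family in families if stderr.count(family) > 1]
```

An end-to-end test would assert is_clean_json_stdout on the captured stdout and an empty list from repeated_diagnostics on the captured stderr.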
Forbidden behavior
- Multiple repetitions of "Not authenticated, skipping sync" within one invocation (FR-009 / #717).
- Red shutdown / "final sync" error lines after the JSON payload on success (FR-008 / #735).
- Any diagnostic that says "error" while exit code is 0.
Implementation hint (informative, not normative)
Two cooperating pieces:
1. In-process dedup: a new module src/specify_cli/diagnostics/dedup.py provides report_once(cause_key) backed by a contextvars.ContextVar. The two callsites in src/specify_cli/sync/background.py:270 and :325 consult it before logging.
2. Atexit success-flag: the agent mission create JSON-payload writer sets a process-state flag (mark_invocation_succeeded()) right after the final print(json.dumps(...)). The atexit handlers in src/specify_cli/sync/background.py:456 and src/specify_cli/sync/runtime.py:381 consult that flag and downgrade any warning to debug-level (or skip entirely) when it's True.
See research.md R7.
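A minimal sketch of the two pieces, assuming the names given in the hint (report_once, mark_invocation_succeeded). Everything else here, including invocation_succeeded, shutdown_log_level, and the choice of a module-level dict for the process-state flag, is an assumption made for illustration rather than the planned implementation.

```python
import contextvars

# Causes already reported in the current context; an immutable set keeps updates explicit.
_reported = contextvars.ContextVar("reported_causes", default=frozenset())

# Process-wide flag, visible to atexit handlers regardless of context.
_state = {"succeeded": False}


def report_once(cause_key):
    """Return True the first time a cause is seen, False on every repeat."""
    seen = _reported.get()
    if cause_key in seen:
        return False
    _reported.set(seen | {cause_key})
    return True


def mark_invocation_succeeded():
    """Called right after the final JSON payload has been written."""
    _state["succeeded"] = True


def invocation_succeeded():
    return _state["succeeded"]


def shutdown_log_level():
    """Level an atexit handler would use for late sync diagnostics."""
    return "debug" if invocation_succeeded() else "warning"
```

A call site would then guard its log statement with `if report_once("not-authenticated"): ...`, where the cause key shown is a made-up example.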
Verifying tests
- Unit: tests/sync/test_diagnostic_dedup.py: drive BackgroundSyncService directly with a mock unauthenticated session, invoke the noisy code path twice, assert the warning fires exactly once.
- E2E: tests/e2e/test_mission_create_clean_output.py: use Click's CliRunner (or subprocess) to invoke mission create against a tmp project; capture stdout/stderr; assert (a) JSON payload appears, (b) stderr contains zero "Not authenticated, skipping sync" repeats, (c) zero red-styled lines after the JSON.
Out-of-scope
- This contract does NOT mandate suppressing diagnostics on failure paths. When the command exits non-zero, all warnings remain at their normal log level.
status_event_reader_tolerates_decision_events.contract.md
Contract: read_events() tolerates non-lane-transition events
Traces to: FR-010, NFR-010, SC-008
Stimulus
Any caller of specify_cli.status.store.read_events(feature_dir) against a status.events.jsonl file that contains a mix of:
- Zero or more lane-transition events (StatusEvent-shaped, with wp_id, from_lane, to_lane, etc.).
- Zero or more mission-level events written by sibling subsystems: DecisionPointOpened, DecisionPointResolved, DecisionPointDeferred, DecisionPointCanceled, DecisionPointWidened, retrospective events (already handled), and any future event-type with no wp_id.
Required behavior
1. read_events() returns a list[StatusEvent] containing exactly the lane-transition events (in file order). Mission-level events are silently skipped.
2. read_events() MUST NOT raise KeyError for missing wp_id, from_lane, to_lane, actor, force, or execution_mode on any event that lacks the lane-transition shape.
3. Existing behavior preserved:
   - Empty files return [].
   - Blank lines are silently skipped.
   - Invalid JSON still raises StoreError("Invalid JSON on line N: …").
   - Lane-transition events with malformed lane fields still raise StoreError("Invalid event structure on line N: …").
Forbidden behavior
- Returning a list containing non-StatusEvent instances.
- Silently dropping a malformed lane-transition event (one that has wp_id but a bad from_lane).
- Logging a warning for every skipped mission-level event (would flood legitimate event logs).
Implementation hint (informative, not normative)
Add an event_type-presence guard at the top of read_events()'s per-line loop, right after the event_name.startswith("retrospective.") skip:
# Skip mission-level events (DecisionPointOpened, DecisionPointResolved,
# DecisionPointDeferred, DecisionPointCanceled, DecisionPointWidened, and
# any future event-type written by a non-status-emitter subsystem) that
# share status.events.jsonl with lane-transition events. Mission-level
# events carry a top-level `event_type` field; lane-transition events do
# not. Discriminating on `event_type` PRESENCE (not on a specific value
# allowlist) is future-proof against new mission-level event types AND
# preserves the existing "raise on malformed lane-transition event"
# contract: a corrupted lane event missing wp_id but also missing
# event_type still surfaces as `Invalid event structure on line N`.
# See FR-010.
if "event_type" in obj:
    continue
Why presence-of-event_type, not absence-of-wp_id: a duck-type "skip if no wp_id" guard would also silently swallow corrupted lane-transition events that happen to be missing wp_id, which violates the existing contract that malformed lane events still raise. The event_type discriminator only skips events whose wire format explicitly identifies them as non-lane-transition; a malformed lane event that lacks event_type will still hit the existing StatusEvent.from_dict() path and raise as today.
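To make the discriminator concrete, here is a self-contained toy version of the per-line loop. It is a sketch under assumptions: StatusEvent and StoreError below are simplified stand-ins with only three fields, read_events_from_lines is a hypothetical helper that takes lines instead of a feature_dir, and the existing retrospective skip is omitted for brevity.

```python
import json
from dataclasses import dataclass


class StoreError(Exception):
    """Stand-in for the store's error type."""


@dataclass
class StatusEvent:
    """Simplified stand-in: the real event carries more fields."""
    wp_id: str
    from_lane: str
    to_lane: str

    @classmethod
    def from_dict(cls, obj):
        return cls(wp_id=obj["wp_id"], from_lane=obj["from_lane"], to_lane=obj["to_lane"])


def read_events_from_lines(lines):
    events = []
    for line_number, raw in enumerate(lines, start=1):
        if not raw.strip():
            continue  # blank lines are silently skipped
        try:
            obj = json.loads(raw)
        except json.JSONDecodeError as exc:
            raise StoreError(f"Invalid JSON on line {line_number}: {exc}") from exc
        # Mission-level events declare themselves via a top-level event_type.
        if isinstance(obj, dict) and "event_type" in obj:
            continue
        try:
            events.append(StatusEvent.from_dict(obj))
        except (KeyError, TypeError) as exc:
            # Anything without event_type must be a well-formed lane event.
            raise StoreError(f"Invalid event structure on line {line_number}: {exc}") from exc
    return events
```

With this shape, a DecisionPointOpened line is skipped quietly, while a lane event that lost its wp_id still raises, which is the distinction the paragraph above argues for.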
Verifying tests
- New test in tests/status/test_store.py (or a sibling new file tests/status/test_read_events_tolerates_decision_events.py):
  1. Construct a tmp feature_dir with a status.events.jsonl containing, in order: a DecisionPointOpened event, a lane-transition event, a DecisionPointResolved event, a second lane-transition event.
  2. Call read_events(feature_dir).
  3. Assert the result has exactly 2 elements (both StatusEvent instances), in the same order they appeared in the file, with the correct wp_id values.
- Existing failure mode (regression):
  - The current tranche's own kitty-specs/release-3-2-0a5-tranche-1-01KQ7YXH/status.events.jsonl starts with a DecisionPointOpened event. Running spec-kitty agent mission finalize-tasks --mission release-3-2-0a5-tranche-1-01KQ7YXH --json fails on main and succeeds after the fix. Cite this in the PR description as the live evidence.
Out-of-scope
- Splitting status.events.jsonl into separate files for separate event schemas (rejected as too large a blast radius; see research.md R9 Alternative 1).
- Promoting wp_id to Optional[str] on StatusEvent (rejected; see research.md R9 Alternative 2).
- Adding a generic event-type registry (out of scope for this stabilization tranche).
upgrade_post_state.contract.md
Contract: spec-kitty upgrade post-state coherence
Traces to: FR-002 (#705), NFR-006
Stimulus
A user (or CI job) runs spec-kitty upgrade --yes in any spec-kitty-initialized project where the CLI binary version matches the target schema bound (currently MIN_SUPPORTED_SCHEMA == MAX_SUPPORTED_SCHEMA == 3).
Required behavior
After the command exits with status 0:
1. .kittify/metadata.yaml MUST contain:
   - spec_kitty.version equal to the CLI binary's version.
   - spec_kitty.schema_version equal to REQUIRED_SCHEMA_VERSION (an integer, currently 3).
   - spec_kitty.last_upgraded_at updated to within the last 5 seconds.
   - spec_kitty.initialized_at UNCHANGED.
2. .kittify/metadata.yaml MUST NOT contain:
   - A null or empty value for spec_kitty.schema_version.
   - Any duplicate top-level keys.
3. An immediately-following spec-kitty agent mission branch-context --json MUST:
   - Exit with status 0.
   - Print a JSON payload with "result": "success".
   - NOT trigger the PROJECT_MIGRATION_NEEDED gate.
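The metadata conditions in item 1 and the null check in item 2 could be verified with a helper along these lines. This is a sketch under assumptions: the function name post_state_violations is hypothetical, it operates on the already-parsed spec_kitty mapping rather than the YAML file, and it assumes timestamps are stored as ISO 8601 strings, which the contract does not state. Duplicate-key detection is out of its reach because a parsed mapping has already collapsed duplicates.

```python
from datetime import datetime, timedelta, timezone


def post_state_violations(before, after, cli_version, required_schema, now):
    """Compare the spec_kitty section before/after upgrade; return violated field names."""
    problems = []
    if after.get("version") != cli_version:
        problems.append("version")
    schema = after.get("schema_version")
    # bool is an int subclass in Python, so exclude it explicitly.
    if not isinstance(schema, int) or isinstance(schema, bool) or schema != required_schema:
        problems.append("schema_version")
    try:
        # Assumption: ISO 8601 timestamp strings.
        upgraded = datetime.fromisoformat(after["last_upgraded_at"])
        if not timedelta(0) <= now - upgraded <= timedelta(seconds=5):
            problems.append("last_upgraded_at")
    except (KeyError, TypeError, ValueError):
        problems.append("last_upgraded_at")
    if after.get("initialized_at") != before.get("initialized_at"):
        problems.append("initialized_at")
    return problems
```

An empty result means the four positive conditions hold; a null schema_version surfaces as a "schema_version" entry.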
Forbidden behavior
- The command MUST NOT print Upgrade complete! while leaving the project in a state where the next agent command is gated by PROJECT_MIGRATION_NEEDED.
- The command MUST NOT silently overwrite spec_kitty.schema_version during its post-migration metadata save.
Implementation hint (informative, not normative)
Confirmed root cause: _stamp_schema_version (raw YAML round-trip) followed by metadata.save() (dataclass-only serialization that drops unknown keys) at src/specify_cli/upgrade/runner.py:163–164. The smallest-blast-radius fix is to swap the call order so _stamp_schema_version runs after metadata.save(). See research.md R2.
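The ordering problem can be reproduced in miniature. The toy below is not the runner's code: it models the metadata file as a plain dict instead of YAML, and Metadata, save, stamp_schema_version, and the two upgrade functions are hypothetical stand-ins chosen to mirror the roles described above.

```python
from dataclasses import asdict, dataclass


@dataclass
class Metadata:
    """Stand-in dataclass that, like the real one, has no schema_version field."""
    version: str
    initialized_at: str


def save(store, metadata):
    """Dataclass-only serialization: rewrites the whole section, dropping unknown keys."""
    store["spec_kitty"] = asdict(metadata)


def stamp_schema_version(store, schema_version):
    """Raw round-trip: sets one key and leaves everything else in place."""
    store.setdefault("spec_kitty", {})["schema_version"] = schema_version


def upgrade_buggy(store, metadata):
    stamp_schema_version(store, 3)
    save(store, metadata)  # overwrites the section, losing the stamp


def upgrade_fixed(store, metadata):
    save(store, metadata)
    stamp_schema_version(store, 3)  # stamp last, so it survives
```

Running both against an empty store shows the difference: the buggy order ends with no schema_version key at all, while the swapped order ends with schema_version set to 3.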
Verifying tests
- Unit: extend tests/cross_cutting/versioning/test_upgrade_version_update.py with a case that runs UpgradeRunner.upgrade() against a fixture project and asserts yaml.safe_load(metadata.yaml)['spec_kitty']['schema_version'] == 3.
- E2E: new test tests/e2e/test_upgrade_post_state.py that drives the full CLI in a tmp dir: spec-kitty init <tmp> → spec-kitty upgrade --yes → spec-kitty agent mission branch-context --json. Asserts the follow-up branch-context command exits 0 and its JSON result == "success".