Spec Kitty

└─ kitty-specs
   └─ Spec Kitty Stability & Hygiene Hardening (April 2026)

Contracts

events-envelope.md

Contract — `spec-kitty-events` envelope

Owning WP: WP05
Backing FR: FR-022, FR-023, FR-024
Authority: spec_kitty_events.* public imports at the version resolved by pyproject.toml + uv.lock.

Required envelope fields (per resolved version)

The contract test reads the resolved version of spec-kitty-events and loads its public schema. The fields below are the stable minimum that the test asserts against; the snapshot file under tests/contract/snapshots/spec-kitty-events-<version>.json carries the full authoritative shape.

| Field | Type | Stability |
| --- | --- | --- |
| `event_id` | str (ULID) | stable across versions |
| `event_type` | str (kebab-case) | stable across versions |
| `event_version` | int | stable across versions |
| `emitted_at` | str (ISO 8601 UTC) | stable across versions |
| `actor` | str | stable across versions |
| `mission_id` | str (ULID) | stable across versions |
| `payload` | dict | shape governed by `event_type` × `event_version` |
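A minimal sketch of how the stable minimum could be checked against a plain dict. It does not use the spec_kitty_events API; the function names, the regular expressions for ULID and kebab-case, and the reading of "ISO 8601 UTC" as "parses with a zero offset" are assumptions made for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed formats: canonical 26-character Crockford base32 ULID, lowercase kebab-case.
ULID_RE = re.compile(r"^[0-7][0-9A-HJKMNP-TV-Z]{25}$")
KEBAB_RE = re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)*$")


def is_utc_iso8601(value: object) -> bool:
    """True when value parses as ISO 8601 and carries a zero UTC offset."""
    if not isinstance(value, str):
        return False
    try:
        parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    except ValueError:
        return False
    return parsed.utcoffset() == timedelta(0)


# One predicate per field in the table above.
CHECKS = {
    "event_id": lambda v: isinstance(v, str) and bool(ULID_RE.match(v)),
    "event_type": lambda v: isinstance(v, str) and bool(KEBAB_RE.match(v)),
    "event_version": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "emitted_at": is_utc_iso8601,
    "actor": lambda v: isinstance(v, str),
    "mission_id": lambda v: isinstance(v, str) and bool(ULID_RE.match(v)),
    "payload": lambda v: isinstance(v, dict),
}


def minimum_field_problems(envelope: dict) -> list[str]:
    """Return one message per missing or malformed field; empty means the minimum holds."""
    problems = []
    for name, ok in CHECKS.items():
        if name not in envelope:
            problems.append(f"{name}: missing")
        elif not ok(envelope[name]):
            problems.append(f"{name}: wrong type or format")
    return problems
```

The payload check stops at "is a dict" because its shape depends on event_type and event_version, which only the snapshot file describes.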

Test-suite pinning rule

Tests under spec-kitty/tests/contract/test_events_envelope_*.py MUST:

1. Resolve the actual spec-kitty-events version from uv.lock via tomllib. If the lockfile is missing, fall back to importlib.metadata.version("spec-kitty-events") and emit a warning.
2. Load the snapshot at tests/contract/snapshots/spec-kitty-events-<resolved-version>.json.
3. Assert that emitted envelopes match the snapshot field-by-field for every event_type currently produced by spec-kitty.

If spec-kitty-events is bumped in pyproject.toml/uv.lock and the snapshot is missing, the test fails with a structured message including:

The resolved version found.
The expected snapshot path.
A pointer to scripts/snapshot_events_envelope.py.

Public-import freeze

The following imports are the stable public surface. A breaking change requires a SemVer major bump and a written ADR:

from spec_kitty_events import EventEnvelope
from spec_kitty_events.types import EventType, EventVersion
from spec_kitty_events.validators import validate_envelope

tests/architectural/test_events_tracker_public_imports.py asserts no caller in spec-kitty reaches into spec_kitty_events._internal.*.
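One way such an architectural check might work is to parse each caller's source with the standard ast module and flag imports that reach into the private package. This is an illustrative sketch, not the contents of the test file named above; internal_imports is a hypothetical helper.

```python
import ast

FORBIDDEN_PREFIX = "spec_kitty_events._internal"


def internal_imports(source: str) -> list[str]:
    """Return the forbidden module names imported by the given source text."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Cover both "from pkg._internal import x" and "from pkg import _internal".
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue
        hits += [
            n for n in names
            if n == FORBIDDEN_PREFIX or n.startswith(FORBIDDEN_PREFIX + ".")
        ]
    return hits
```

A test would walk the spec-kitty source tree, call this on each file's text, and assert that every result is empty.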

Mission-review gate

/spec-kitty-mission-review MUST run pytest tests/contract/ -v and treat failures as hard blockers (FR-023). The mission cannot be accepted with red contract tests.

intake-source-provenance.md

Contract — Intake source provenance

Owning WP: WP02
Backing FR: FR-007, FR-008, FR-009, FR-010, FR-011, FR-012, NFR-003, NFR-004

Provenance line escape rules (FR-007)

Provenance lines MUST be written through specify_cli.intake.provenance.escape_for_comment(). The helper:

1. Strips ASCII control characters (0x00–0x1F, 0x7F) except \t.
2. Replaces comment-terminator-like sequences:
   --> → -->
   / → /
   leading # (line start) → \\# (when written into a Markdown context)
3. Clips the resulting string to 256 bytes (UTF-8 safe truncation).
4. Returns the cleaned string.

The helper has unit tests for each rule.
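The control-character and clipping rules can be illustrated in isolation. The sketch below covers only rules 1 and 3 and leaves the replacement table of rule 2 out; strip_and_clip is a hypothetical name and is not the project's escape_for_comment().

```python
import re

MAX_BYTES = 256
# ASCII control characters 0x00-0x1F and 0x7F, leaving tab (0x09) in place.
_CONTROL_RE = re.compile(r"[\x00-\x08\x0a-\x1f\x7f]")


def strip_and_clip(text: str, limit: int = MAX_BYTES) -> str:
    """Drop control characters, then clip to at most `limit` UTF-8 bytes."""
    cleaned = _CONTROL_RE.sub("", text)
    encoded = cleaned.encode("utf-8")
    if len(encoded) <= limit:
        return cleaned
    # errors="ignore" discards a multi-byte character that the slice cut in half,
    # so the result is always valid UTF-8 and never longer than the limit.
    return encoded[:limit].decode("utf-8", errors="ignore")
```

Clipping on the encoded bytes rather than on characters is what keeps the 256-byte bound honest for non-ASCII input.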

Path scanning (FR-008, FR-012)

Path resolution rules:

1. Compute intake_root_resolved = Path(intake_root).resolve(strict=True).
2. For every candidate path under the root, compute candidate_resolved = candidate.resolve(strict=True).
3. Assert candidate_resolved.is_relative_to(intake_root_resolved).
4. If the assertion fails, raise INTAKE_PATH_ESCAPE with both paths in the message.

The same intake_root_resolved is used for the brief write target (FR-012).

Size cap (FR-009, NFR-003)

Default cap: intake.max_brief_bytes = 5_242_880 (5 MB).
Configurable in .kittify/config.yaml.
Enforcement order:
   1. os.stat(path).st_size. If > cap, raise INTAKE_TOO_LARGE.
   2. If os.stat() is unavailable (e.g. STDIN-piped), use read1(cap + 1); if len(buf) > cap, raise INTAKE_TOO_LARGE.
Memory ceiling: cap + small overhead. NFR-003 asserts < 1.5 × cap in resident memory during a 50 MB rejection trial.
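A sketch of the two-stage enforcement, under stated assumptions: IntakeTooLarge stands in for the INTAKE_TOO_LARGE error code, the function names are hypothetical, and the stream branch loops because a single read1() call may return fewer bytes than requested.

```python
import os

DEFAULT_CAP = 5_242_880  # intake.max_brief_bytes default


class IntakeTooLarge(Exception):
    """Stand-in for the INTAKE_TOO_LARGE error code."""


def read_capped_stream(stream, cap: int = DEFAULT_CAP) -> bytes:
    """Read a binary stream, refusing as soon as more than cap bytes have arrived."""
    buf = bytearray()
    while len(buf) <= cap:
        # Never request more than what is needed to prove the cap was exceeded.
        chunk = stream.read1(cap + 1 - len(buf))
        if not chunk:
            return bytes(buf)  # end of stream reached within the cap
        buf += chunk
    raise IntakeTooLarge(f"input exceeds {cap} bytes")


def read_capped_file(path, cap: int = DEFAULT_CAP) -> bytes:
    """Reject by size on disk first, then read with the same cap as a second guard."""
    size = os.stat(path).st_size
    if size > cap:
        raise IntakeTooLarge(f"{path} is {size} bytes; cap is {cap}")
    with open(path, "rb") as fh:
        return read_capped_stream(fh, cap)
```

Checking st_size first means an oversized file is rejected without reading any of it; the capped read covers inputs such as pipes where no size is known up front.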

Atomic write (FR-010, NFR-004)

Brief and provenance writes use:

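import os

# target_tmp must sit in the same directory as target (see pre-conditions).
with open(target_tmp, "wb") as f:
    f.write(payload)
    f.flush()
    os.fsync(f.fileno())  # force the bytes to disk before the rename
os.replace(target_tmp, target)  # rename over the final path in one step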

Pre-conditions:

target_tmp is in the same directory as target (same filesystem).
If target exists on a different filesystem, the helper logs a structured warning and falls back to a non-atomic write only when explicitly forced by intake.allow_cross_fs=True. Default behavior is to fail loudly.

NFR-004 test: 100 simulated kill-9 mid-write trials → 0 partial files.

Missing vs corrupt distinction (FR-011)

intake.read_brief() raises:

INTAKE_FILE_MISSING (with path in detail) when os.stat() raises FileNotFoundError.
INTAKE_FILE_UNREADABLE (with path and underlying OSError chain) for any other read failure.

Callers MUST NOT collapse these into a single error type. The CLI surface distinguishes them in user-facing messages.
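The distinction might look like this in code. The two exception classes are stand-ins for the INTAKE_FILE_MISSING and INTAKE_FILE_UNREADABLE error codes, and this read_brief is an illustrative sketch rather than the actual intake.read_brief().

```python
import os


class IntakeFileMissing(Exception):
    """Stand-in for INTAKE_FILE_MISSING; carries the path."""


class IntakeFileUnreadable(Exception):
    """Stand-in for INTAKE_FILE_UNREADABLE; chained to the underlying OSError."""


def read_brief(path) -> bytes:
    """Read a brief, keeping 'not there' separate from 'there but unreadable'."""
    try:
        os.stat(path)
    except FileNotFoundError as exc:
        raise IntakeFileMissing(str(path)) from exc
    except OSError as exc:
        raise IntakeFileUnreadable(str(path)) from exc
    try:
        with open(path, "rb") as fh:
            return fh.read()
    except OSError as exc:
        # "raise ... from exc" preserves the OSError chain for diagnostics.
        raise IntakeFileUnreadable(str(path)) from exc
```

A caller can then catch the two types separately and print a different user-facing message for each, instead of a single generic "could not read brief".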

runtime-decision-output.md

Contract — `spec-kitty next` decision JSON output

Owning WP: WP04
Backing FR: FR-019, FR-020, FR-021

Invariants

The decision JSON returned by spec-kitty next --json MUST satisfy:

1. No implicit success. A bare call (no --result) MUST NOT set result=success server-side. The runtime treats result is None as query, not outcome. (FR-019)
2. No unknown mission state. For a valid mission run with persisted state, mission_state MUST be set to a real state from the mission's state machine (discovering, specifying, planning, tasking, implementing, reviewing, accepting, or done). It MUST NOT be the string "unknown". (FR-020)
3. No [QUERY - no result provided]. The decision JSON's prompt_file, reason, and question fields MUST NOT contain that placeholder. (FR-020)
4. Structured blocked decision on resolution failure. When the runtime cannot determine the next step (missing artifact, failing guard, ambiguous WP graph), it returns kind="blocked" with a concrete reason and a populated guard_failures list. (FR-020)
5. Mission YAML schema validates. The shipped plan mission's mission-runtime.yaml MUST validate against the runtime schema. The contract test loads the file and asserts. (FR-021)

JSON shape (informative)

{
  "kind": "step | decision_required | blocked | terminal",
  "agent": "claude | codex | gemini | ...",
  "mission_slug": "<slug>",
  "mission": "<mission-type-key>",
  "mission_state": "<state-from-mission-state-machine>",
  "action": "specify | plan | ... | implement | review | accept",
  "wp_id": "<str | null>",
  "workspace_path": "<path | null>",
  "prompt_file": "<absolute path | null>",
  "reason": "<str | null>",
  "guard_failures": [],
  "progress": { "...": "..." },
  "run_id": "<ulid>",
  "step_id": "<id>",
  "decision_id": "<ulid | null>",
  "question": "<str | null>",
  "options": null
}
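Invariants 2 to 4 can be checked on a parsed decision object without touching the runtime. The sketch below assumes the decision has already been loaded into a dict and that the run is a valid one with persisted state; invariant_violations is a hypothetical helper, not part of the contract tests.

```python
PLACEHOLDER = "[QUERY - no result provided]"
MISSION_STATES = {
    "discovering", "specifying", "planning", "tasking",
    "implementing", "reviewing", "accepting", "done",
}


def invariant_violations(decision: dict) -> list[str]:
    """Return one message per violated invariant; an empty list means all hold."""
    problems = []
    # Invariant 2: mission_state must be a real state, never "unknown".
    if decision.get("mission_state") not in MISSION_STATES:
        problems.append("mission_state is not a real mission state")
    # Invariant 3: no query placeholder in the human-facing fields.
    for field in ("prompt_file", "reason", "question"):
        value = decision.get(field)
        if isinstance(value, str) and PLACEHOLDER in value:
            problems.append(f"{field} contains the query placeholder")
    # Invariant 4: a blocked decision needs a reason and at least one guard failure.
    if decision.get("kind") == "blocked":
        if not decision.get("reason"):
            problems.append("blocked decision has no reason")
        if not decision.get("guard_failures"):
            problems.append("blocked decision has no guard_failures")
    return problems
```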

Tests

tests/contract/test_next_no_implicit_success.py: bare call does not advance state.
tests/contract/test_next_no_unknown_state.py: for a fixture mission, mission_state != "unknown".
tests/contract/test_plan_mission_yaml_validates.py: schema validation.

tracker-public-imports.md

Contract — `spec-kitty-tracker` public imports

Owning WP: WP05
Backing FR: FR-024, FR-031, C-008

Frozen public surface

The following imports are the stable public surface of spec-kitty-tracker. Breaking changes require a SemVer major bump and an ADR.

from spec_kitty_tracker import TrackerClient
from spec_kitty_tracker.models import (
    TrackerEvent,
    TrackerProject,
    TrackerWorkPackage,
)
from spec_kitty_tracker.errors import (
    TrackerAuthError,
    TrackerNotFound,
    TrackerSyncFailed,
)

Internal modules under spec_kitty_tracker._internal.* are NOT part of the public surface and may change without notice.

Bidirectional sync semantics (FR-031)

TrackerClient.bidirectional_sync() MUST:

1. Bound retries by tracker.sync_max_retries (default 5) with exponential backoff capped at tracker.sync_max_backoff_seconds (default 30).
2. On exhausted retries, raise TrackerSyncFailed with structured cause chain (HTTP status, body excerpt up to 2 KB, retry history).
3. NEVER block indefinitely. The total wall-clock cap is tracker.sync_total_timeout_seconds (default 300).
4. Emit a single user-facing failure line per invocation (paired with the token-refresh dedup behavior in FR-029).
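The retry bounds in rules 1 to 3 could be sketched as a small loop. Several details here are assumptions rather than the tracker's behavior: one initial attempt followed by up to max_retries retries, a 1-second base delay that doubles each time, and a check that skips a sleep that would overrun the deadline. SyncFailed stands in for TrackerSyncFailed, and sync_with_bounds is a hypothetical name.

```python
import time


class SyncFailed(Exception):
    """Stand-in for TrackerSyncFailed; keeps the retry history."""

    def __init__(self, history: list[str]):
        super().__init__(f"sync failed after {len(history)} attempt(s)")
        self.history = history


def sync_with_bounds(attempt, *, max_retries=5, max_backoff=30.0,
                     total_timeout=300.0, base_delay=1.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Call attempt() until it succeeds, retries run out, or the deadline would pass."""
    deadline = clock() + total_timeout
    history = []
    last_error = None
    for n in range(max_retries + 1):
        try:
            return attempt()
        except Exception as exc:  # a real client would catch narrower error types
            history.append(repr(exc))
            last_error = exc
        delay = min(base_delay * 2 ** n, max_backoff)  # exponential, capped
        if n == max_retries or clock() + delay > deadline:
            break
        sleep(delay)
    raise SyncFailed(history) from last_error
```

Passing sleep and clock as parameters keeps the loop testable without real waiting; with the defaults and a call that always fails, the delays are 1, 2, 4, 8, and 16 seconds before the final failure.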

Auth transport adoption (FR-030)

TrackerClient MUST acquire its HTTP transport from spec-kitty/src/specify_cli/auth/transport.py:AuthenticatedClient. It MUST NOT instantiate httpx.Client directly. The architectural test enforces this.

Downstream certification

A candidate release of spec-kitty-tracker cannot be promoted to stable until at least one downstream consumer (spec-kitty, spec-kitty-saas) runs its integration suite green against the candidate. This rule is encoded in the release.yml workflow.