Contracts

charter-io-chokepoint.md

Contract: charter/_io.py encoding chokepoint

WP: WP06 | FRs: FR-016 – FR-021 | Source bug: #644 | Diagnostic codes: CHARTER_ENCODING_*

Public API

def load_charter_file(path: Path, *, unsafe: bool = False) -> CharterContent: ...
def load_charter_bytes(data: bytes, *, origin: str, unsafe: bool = False) -> CharterContent: ...

CharterContent shape: see data-model.md §3.

Detection order

1. BOM sniff: if leading BOM bytes match a known encoding (UTF-8-SIG, UTF-16-LE, UTF-16-BE), use that encoding.
2. Strict UTF-8 decode: if data.decode("utf-8") succeeds, encoding is "utf-8" with confidence = 1.0.
3. charset-normalizer.from_bytes(data).best(): if a candidate clears confidence >= 0.85 (computed from 1.0 - match.chaos), use it.
4. Fail: emit CHARTER_ENCODING_AMBIGUOUS with the file path and detected candidates.
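
The detection order above can be sketched as follows. This is a minimal illustration, not the chokepoint's actual code: the `Detection` dataclass, the `AmbiguousEncoding` exception, and the injected `guess` callable (standing in for the charset-normalizer step) are hypothetical names, and assigning confidence 1.0 to a BOM match is an assumption the contract does not state.

```python
import codecs
from dataclasses import dataclass
from typing import Callable, List, Tuple

Candidate = Tuple[str, float]  # (encoding name, confidence in [0, 1])

# (BOM bytes, reported encoding, codec used after stripping the BOM)
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig", "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le", "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be", "utf-16-be"),
)


@dataclass(frozen=True)
class Detection:  # hypothetical result type, not CharterContent
    encoding: str
    confidence: float
    text: str


class AmbiguousEncoding(Exception):
    """Hypothetical stand-in for the CHARTER_ENCODING_AMBIGUOUS path."""


def detect(data: bytes, guess: Callable[[bytes], List[Candidate]],
           threshold: float = 0.85) -> Detection:
    # Step 1: BOM sniff (confidence 1.0 is an assumption).
    for bom, name, codec in _BOMS:
        if data.startswith(bom):
            return Detection(name, 1.0, data[len(bom):].decode(codec))
    # Step 2: strict UTF-8 decode.
    try:
        return Detection("utf-8", 1.0, data.decode("utf-8"))
    except UnicodeDecodeError:
        pass
    # Step 3: statistical detector; accept only if it clears the threshold.
    candidates = sorted(guess(data), key=lambda c: c[1], reverse=True)
    if candidates and candidates[0][1] >= threshold:
        name, confidence = candidates[0]
        return Detection(name, confidence, data.decode(name))
    # Step 4: fail, attaching the candidates for the diagnostic.
    raise AmbiguousEncoding(candidates)
```

In a real implementation, `guess` would presumably wrap the charset-normalizer call named in step 3 and derive each confidence as 1.0 - match.chaos; injecting it keeps this sketch free of third-party imports.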

unsafe bypass behavior

When unsafe=True and detection falls to step 4:

  • Use the highest-confidence candidate from charset-normalizer regardless of threshold (or fall back to cp1252 if absolutely no candidate exists).
  • Set CharterContent.normalization_applied = True.
  • Write the provenance record with bypass_used = True.
  • The function returns successfully — the operator is taking responsibility.

When unsafe=False (default) and detection falls to step 4:

  • Raise CharterEncodingError carrying the CharterEncodingDiagnostic.AMBIGUOUS code; caller is responsible for stdout/JSON emission.
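
A minimal sketch of the step-4 branch, assuming candidates arrive as (encoding, confidence) pairs. The function name `resolve_ambiguous` and the returned dict are hypothetical; `CharterEncodingError` is redefined here only as a stand-in so the snippet runs on its own, and the provenance write is omitted.

```python
from typing import List, Tuple

Candidate = Tuple[str, float]  # (encoding name, confidence)


class CharterEncodingError(Exception):
    """Stand-in for the contract's error type; the real class may differ."""


def resolve_ambiguous(candidates: List[Candidate], *, unsafe: bool = False) -> dict:
    if not unsafe:
        # Default path: raise; the caller handles stdout/JSON emission.
        raise CharterEncodingError("CHARTER_ENCODING_AMBIGUOUS")
    # Bypass path: best candidate regardless of threshold, else cp1252.
    if candidates:
        encoding = max(candidates, key=lambda c: c[1])[0]
    else:
        encoding = "cp1252"
    return {
        "source_encoding": encoding,
        "normalization_applied": True,
        "bypass_used": True,  # recorded in the provenance record
    }
```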

Diagnostic body shape

ERROR: CHARTER_ENCODING_AMBIGUOUS
  File: kitty-specs/<mission>/charter/charter.yaml
  Detected candidates:
    - cp1252 (confidence 0.62)
    - utf-8 with replacement (confidence 0.48)
  Mixed-content signal: bytes 0xE9 0x80 0xAE at offset 1247 form valid UTF-8
  '逮' but decode as 'é€®' in cp1252.

  Remediation options:
    1. Open the file in a UTF-8-aware editor and re-save.
    2. iconv -f cp1252 -t utf-8 <file> > <file>.utf8 && mv <file>.utf8 <file>.
    3. Re-run with --unsafe (logs bypass_used=true to provenance).

Provenance write

Every successful detection (including pure-UTF-8 with no normalization) writes a record to the provenance file per contracts/encoding-provenance-schema.md. Failure path (raise) does not write provenance — only successful resolutions are audit-worthy.

Retrofit sites (NFR-004 budget enforcement)

Exactly three modules in src/charter/ retrofit their existing read_text(encoding="utf-8") calls to load_charter_file(path):

1. compiler.py:594: yaml.load(path.read_text(encoding="utf-8")) → yaml.load(load_charter_file(path).text)
2. sync.py:151: charter_path.read_text("utf-8") → load_charter_file(charter_path).text
3. interview.py:283, 398: analogous wrap

Other charter modules (context.py, hasher.py, language_scope.py, compact.py, neutrality/lint.py) are intentionally NOT modified — they re-read already-normalized files and trust the chokepoint's contract.

Acceptance fixtures

  • cp1252-encoded charter file → ingest succeeds with source_encoding = "cp1252", normalization_applied = True, provenance recorded.
  • Pure-UTF-8 charter → ingest succeeds with source_encoding = "utf-8", confidence = 1.0, normalization_applied = False.
  • Mixed-content file → raises AMBIGUOUS; provenance NOT written.
  • Same mixed-content file with unsafe=True → succeeds with bypass; provenance has bypass_used = True.

Invariants

  • Module count touched by WP06: 4 (_io.py new + 3 retrofit sites). NFR-004 budget respected.
  • The 5 deferred re-read sites remain unchanged in this WP. WP08 covers legacy-file migration.

encoding-provenance-schema.md

Contract: Encoding provenance schema

WP: WP06 | FRs: FR-022 | HiC decision: dual-storage, prefer per-mission, centralize shared

Record schema (JSONL; one record per line)

{"event_id": "01HXYZ...",
 "at": "2026-05-12T18:30:00+00:00",
 "file_path": "kitty-specs/<mission>/charter/charter.yaml",
 "source_encoding": "cp1252",
 "confidence": 0.93,
 "normalization_applied": true,
 "bypass_used": false,
 "actor": "spec-kitty charter compile",
 "mission_id": "01KRC57CNW5JCVBRV8RAQ2ARXZ"}

Field semantics: see data-model.md §3 EncodingProvenanceRecord.

Routing rule

A record is appended to exactly one file. The chokepoint inspects file_path:

  • Per-mission: if file_path starts with kitty-specs/<mission-slug>/, append to kitty-specs/<mission-slug>/.encoding-provenance.jsonl. Set mission_id to the mission's ULID (resolved from kitty-specs/<mission-slug>/meta.json).
  • Centralized: otherwise, append to .kittify/encoding-provenance/global.jsonl. Set mission_id to null.

There is no third destination, and a single event is never appended to both files.
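
The routing rule can be illustrated with a small path function. This is a sketch under assumptions: `provenance_destination` is a hypothetical name, and file_path is taken to be a POSIX-style path relative to the repository root. Resolving mission_id from meta.json is left out.

```python
from pathlib import PurePosixPath


def provenance_destination(file_path: str) -> PurePosixPath:
    parts = PurePosixPath(file_path).parts
    # Per-mission: kitty-specs/<mission-slug>/<anything below it>
    if len(parts) >= 3 and parts[0] == "kitty-specs":
        return PurePosixPath("kitty-specs", parts[1], ".encoding-provenance.jsonl")
    # Centralized: everything else goes to the single global file.
    return PurePosixPath(".kittify", "encoding-provenance", "global.jsonl")
```

Because the function always returns exactly one path, a record cannot be routed to both files.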

File layout

kitty-specs/
├── <mission-slug-A>/
│   ├── charter/
│   └── .encoding-provenance.jsonl     # per-mission events for this mission
├── <mission-slug-B>/
│   ├── charter/
│   └── .encoding-provenance.jsonl     # per-mission events for this mission
└── ...

.kittify/
├── charter/                            # global charter (not mission-scoped)
└── encoding-provenance/
    └── global.jsonl                    # events for charter content outside any kitty-specs/<mission>/ tree

Append semantics

  • Append is open(..., "a", encoding="utf-8") + f.write(json.dumps(record, sort_keys=True) + "\n") — same pattern as the existing status.events.jsonl writer (see src/specify_cli/status/store.py).
  • event_id is a fresh ULID per record (uses the existing ULID utility from specify_cli.id_gen or equivalent).
  • at is ISO-8601 UTC with offset; same format as status events.
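
A minimal sketch of the append pattern described above. The helper names `utc_timestamp` and `append_record` are hypothetical, creating the parent directory on first write is an assumption, and ULID generation is left to the caller because the contract defers it to an existing utility.

```python
import json
from datetime import datetime, timezone
from pathlib import Path


def utc_timestamp() -> str:
    # ISO-8601 UTC with explicit offset, e.g. 2026-05-12T18:30:00+00:00
    return datetime.now(timezone.utc).isoformat(timespec="seconds")


def append_record(dest: Path, record: dict) -> None:
    # Assumption: the destination directory is created on first write.
    dest.parent.mkdir(parents=True, exist_ok=True)
    with open(dest, "a", encoding="utf-8") as f:
        # One JSON object per line, keys sorted for stable output.
        f.write(json.dumps(record, sort_keys=True) + "\n")
```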

Read semantics

Consumers (audit tooling, hypothetical future dashboard) may cat per-mission + centralized files together without coalescing logic. Each record is self-describing via mission_id.

Acceptance fixtures

  • Ingest a file under kitty-specs/foo-01KQ.../charter/x.yaml → record appears in kitty-specs/foo-01KQ.../.encoding-provenance.jsonl, NOT in .kittify/encoding-provenance/global.jsonl.
  • Ingest a file under .kittify/charter/y.yaml → record appears in .kittify/encoding-provenance/global.jsonl, NOT in any mission file.
  • Concurrent appenders (two spec-kitty charter compile invocations in parallel) → all records survive; no overwrite.

Invariants

  • Record schema is identical across both files (same keys, same types).
  • No record is duplicated across files.
  • The schema is JSON-stable per NFR-001; new keys may be added but existing key names and types never change without a deprecation cycle.

issue-matrix-schema.md

Contract: issue-matrix.md validator schema

WP: WP03 | FRs: FR-006, FR-028 – FR-032 | Diagnostic codes: MISSION_REVIEW_ISSUE_MATRIX_*

Audit-derived vocabulary (closed sets)

Mandatory columns

Exact order, case-insensitive on input, normalized to lowercase internally:

1. issue
2. verdict
3. evidence_ref

Named-optional columns (closed set)

May appear in any order after the mandatory three:

  • title
  • scope (alias: theme)
  • wp (alias: wp_id)
  • fr (alias: fr(s))
  • nfr (alias: nfr(s))
  • sc
  • repo

Verdict allow-list (closed set)

fixed
verified-already-fixed
deferred-with-followup

Validator rules

| Rule | Diagnostic on violation |
| --- | --- |
| All mandatory columns present, in order | MISSION_REVIEW_ISSUE_MATRIX_SCHEMA_DRIFT |
| Every column is either mandatory or named-optional | MISSION_REVIEW_ISSUE_MATRIX_SCHEMA_DRIFT (names unknown column) |
| Verdict cell value is in the allow-list | MISSION_REVIEW_ISSUE_MATRIX_VERDICT_UNKNOWN |
| Exactly one Markdown table at top level (additional prose allowed; additional tables NOT allowed) | MISSION_REVIEW_ISSUE_MATRIX_MULTI_TABLE |
| evidence_ref cell non-empty | MISSION_REVIEW_ISSUE_MATRIX_EVIDENCE_REF_EMPTY |
| When verdict == deferred-with-followup, evidence_ref contains a follow-up handle (regex matches #\d+ OR contains Follow-up: substring) | MISSION_REVIEW_ISSUE_MATRIX_DEFERRED_WITHOUT_HANDLE |
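
The header and per-row rules can be sketched as two small checks. This is an illustration only: `check_header` and `check_row` are hypothetical names, the message text after each code is invented for the example, and the multi-table rule is not covered.

```python
import re
from typing import List, Sequence

PREFIX = "MISSION_REVIEW_ISSUE_MATRIX_"
MANDATORY = ("issue", "verdict", "evidence_ref")
OPTIONAL = {"title", "scope", "wp", "fr", "nfr", "sc", "repo"}
ALIASES = {"theme": "scope", "wp_id": "wp", "fr(s)": "fr", "nfr(s)": "nfr"}
VERDICTS = {"fixed", "verified-already-fixed", "deferred-with-followup"}


def _canon(name: str) -> str:
    # Fold case, then resolve aliases to the canonical column name.
    key = name.strip().lower()
    return ALIASES.get(key, key)


def check_header(columns: Sequence[str]) -> List[str]:
    canon = [_canon(c) for c in columns]
    problems = []
    if tuple(canon[:3]) != MANDATORY:
        problems.append(PREFIX + "SCHEMA_DRIFT: mandatory columns missing or out of order")
    for raw, name in zip(columns[3:], canon[3:]):
        if name not in OPTIONAL:
            problems.append(f"{PREFIX}SCHEMA_DRIFT: unknown column {raw.strip()!r}")
    return problems


def check_row(verdict: str, evidence_ref: str) -> List[str]:
    verdict = verdict.strip().strip("`").strip()  # backticks are tolerated
    evidence_ref = evidence_ref.strip()
    problems = []
    if verdict not in VERDICTS:
        problems.append(PREFIX + "VERDICT_UNKNOWN")
    if not evidence_ref:
        problems.append(PREFIX + "EVIDENCE_REF_EMPTY")
    elif verdict == "deferred-with-followup" and not (
        re.search(r"#\d+", evidence_ref) or "Follow-up:" in evidence_ref
    ):
        problems.append(PREFIX + "DEFERRED_WITHOUT_HANDLE")
    return problems
```

One design choice in this sketch: an empty evidence_ref on a deferred row reports only EVIDENCE_REF_EMPTY, since the missing handle follows from the empty cell.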

Remediation pass over existing matrices (FR-032)

When the validator runs in remediation mode over the 6 existing matrices on main:

  • Auto-normalize: capitalization drift (Issue → issue), alias drift (Evidence ref → evidence_ref, wp_id → wp, theme → scope). Writes a one-line provenance note inside the file: <!-- normalized YYYY-MM-DD: header case folded; aliases resolved -->.
  • Surface, do not auto-fix: structural drift (multi-table layout in charter-golden-path-e2e-tranche-1-01KQ806X; any unknown columns like Surface or Where surfaced in code). Operator gets a diagnostic with repair guidance and must commit the fix manually.

Parsing contract

  • The parser is a line-oriented Markdown table parser; it tolerates leading prose.
  • Leading/trailing whitespace in cells is stripped.
  • Backticked verdict values (`fixed`) are accepted; backticks are stripped during normalization.
  • Linkified issue values (#123) are accepted; the #NNN form is canonical for the parsed IssueMatrixRow.issue value.
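
A line-oriented sketch of the cell handling above. All function names are hypothetical. Two assumptions are baked in: table rows start with a pipe character, and the first #NNN found in an issue cell is the canonical value regardless of the link markup around it. Escaped pipes inside cells are not handled.

```python
import re
from typing import List, Optional


def split_row(line: str) -> Optional[List[str]]:
    """Return stripped cells for a Markdown table row, or None for prose."""
    line = line.strip()
    if not line.startswith("|"):
        return None  # leading prose is tolerated by skipping it
    body = line[1:]
    if body.endswith("|"):
        body = body[:-1]  # drop exactly one trailing pipe
    return [cell.strip() for cell in body.split("|")]


def is_separator(cells: List[str]) -> bool:
    # The dashed line under the header row carries no data.
    return all(re.fullmatch(r":?-+:?", c) for c in cells)


def normalize_verdict(cell: str) -> str:
    # Strip surrounding backticks and spaces in any combination.
    return cell.strip("` ")


def canonical_issue(cell: str) -> str:
    # Assumption: the first #NNN in the cell is the issue reference.
    match = re.search(r"#\d+", cell)
    return match.group(0) if match else cell
```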

Output

  • On success: parsed list[IssueMatrixRow].
  • On failure: non-zero exit + JSON diagnostic on stdout.

Acceptance fixtures

  • 6 existing matrices on main — each passes either after auto-normalize or surfaces a specific diagnostic per the rules above.
  • A synthetic matrix with an unknown column Severity — fails SCHEMA_DRIFT naming Severity.
  • A synthetic matrix with verdict deferred (no -with-followup) — fails VERDICT_UNKNOWN.
  • A synthetic matrix with deferred-with-followup verdict but evidence_ref of TBD — fails DEFERRED_WITHOUT_HANDLE.

merge-state-idempotency.md

Contract: Mission-number assignment idempotency

WP: WP04 | FRs: FR-010, FR-011, FR-012 | Source bug: #983

Pre-condition

spec-kitty merge is mid-flow with merge-state file .kittify/merge-state.json present and consistent.

Idempotency rule

The mission-number-assignment step performs one read and one computation before deciding:

1. Read meta.json.mission_number in the mission feature directory.
2. Compute the expected mission number via the canonical strategy (max(existing) + 1 inside the merge-state lock).

If meta.json.mission_number == expected, the step is a no-op: no rewrite of meta.json, no commit, no state mutation. Else, the assignment proceeds as today.

After successful execution (whether by no-op or assignment), MergeState.mission_number_baked = True is set and persisted.

Resume semantics

On spec-kitty merge --resume:

  • Read MergeState from disk via load_state().
  • If mission_number_baked == True, skip the assignment step entirely (no read of meta.json, no compute, no commit).
  • Else, proceed to the idempotency check above.
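
The idempotency rule and the resume shortcut combine into one decision function. This is a sketch under stated assumptions: `run_assignment_step` is a hypothetical name, `MergeState` is reduced to the single flag, `existing` is taken to hold the numbers already assigned to other missions, and an empty set defaulting to 0 is an assumption. Locking, commits, and persistence are left to the caller.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class MergeState:  # only the flag relevant here; the real class has more fields
    mission_number_baked: bool = False


def run_assignment_step(meta: dict, existing: Iterable[int], state: MergeState) -> str:
    """Return "skipped", "no-op", or "assigned"."""
    if state.mission_number_baked:
        return "skipped"  # resume path: no read of meta, no compute
    expected = max(existing, default=0) + 1
    if meta.get("mission_number") == expected:
        outcome = "no-op"  # value already matches: no rewrite, no commit
    else:
        meta["mission_number"] = expected
        outcome = "assigned"
    state.mission_number_baked = True  # the real flow also persists the state
    return outcome
```

For the partial-merge fixture (mission_number=115 already in meta.json, flag still False), the first call returns "no-op" and sets the flag, and a later resume returns "skipped".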

Concurrency

The mission-number-assignment step continues to run inside the existing merge-state lock (max(existing) + 1 requires it). The idempotency check reads meta.json while holding the lock; release follows the existing flow.

Atomicity (opportunistic)

If the WP04 implementation also replaces the non-atomic meta.json write (current pattern: Path.write_text(json.dumps(...))) with temp-file + rename, that is bonus scope and lands in the same WP. Required scope is only the idempotency check and the flag.
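
For reference, the temp-file + rename pattern looks roughly like this in standard-library Python. The helper name `write_json_atomic` is hypothetical and the fsync call is an optional durability choice, not something the contract requires.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    # Create the temp file in the same directory so the rename stays
    # on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f, indent=2)
            f.flush()
            os.fsync(f.fileno())
        # os.replace swaps the file in one step; readers see old or new,
        # never a partial write.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise
```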

Acceptance fixtures

  • Simulate partial merge: write mission_number=115 to meta.json, write MergeState with mission_number_baked=False, fail the merge mid-step, rerun with --resume. Expected: no empty mission-number commit; merge completes; mission_number_baked becomes True.
  • Fresh merge (no prior assignment): expected: assignment runs as today; flag set to True after success.
  • --resume on a state where mission_number_baked == True: expected: step is skipped without meta.json read.

Invariants

  • The mission-number value itself, once written, is never overwritten — even if the computed value changes (e.g., concurrent merges). The lock guarantees serialization; idempotency guarantees no rewrite after success.

review-mode-resolution.md

Contract: spec-kitty review mode resolution

WP: WP03 | FRs: FR-005, FR-006, FR-023 | Diagnostic codes: MISSION_REVIEW_MODE_MISMATCH

Inputs

  • meta.json.baseline_merge_commit: str | None — present iff the mission has been merged via spec-kitty merge.
  • CLI argument --mode {lightweight | post-merge} (optional).

Resolution rule (precedence order)

1. CLI flag override. If --mode <m> is present on the command line, the mode is <m>.
2. Auto-detect: post-merge. Else, if meta.json.baseline_merge_commit is set, the mode is POST_MERGE.
3. Auto-detect: lightweight. Else, the mode is LIGHTWEIGHT.

Mode-mismatch detection

When step 1 sets mode to POST_MERGE and step 2's signal is absent (baseline_merge_commit not in meta.json), the command exits non-zero with MISSION_REVIEW_MODE_MISMATCH. The diagnostic body MUST contain:

1. A "What this means" paragraph naming the missing signal.
2. Three remediation options (run spec-kitty merge, re-run with --mode lightweight, or run identity backfill for pre-083 missions).

The reverse case (--mode lightweight with baseline_merge_commit present) is not a mismatch — operators may legitimately want a quick consistency check on an already-merged mission.
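
The precedence order and the mismatch check fit in one function. This is a sketch only: `resolve_mode` and `ModeMismatch` are hypothetical names, and the returned dict simply mirrors the JSON shape listed under Output.

```python
from typing import Optional


class ModeMismatch(Exception):
    """Hypothetical stand-in for the MISSION_REVIEW_MODE_MISMATCH exit."""


def resolve_mode(cli_mode: Optional[str], baseline_merge_commit: Optional[str]) -> dict:
    if cli_mode is not None:
        # Step 1: the CLI flag wins, but post-merge needs the merge signal.
        if cli_mode == "post-merge" and not baseline_merge_commit:
            raise ModeMismatch("MISSION_REVIEW_MODE_MISMATCH")
        mode, auto_detected = cli_mode, False
    elif baseline_merge_commit:
        # Step 2: a recorded baseline commit implies post-merge.
        mode, auto_detected = "post-merge", True
    else:
        # Step 3: default to lightweight.
        mode, auto_detected = "lightweight", True
    return {
        "mode": mode,
        "auto_detected": auto_detected,
        "baseline_merge_commit": baseline_merge_commit,
    }
```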

Output

  • Stdout (JSON): {"mode": "lightweight" | "post-merge", "auto_detected": bool, "baseline_merge_commit": "<sha>" | null}
  • Persisted in mission-review-report.md frontmatter under the mode key.

Acceptance fixtures

  • Pre-merge mission, bare spec-kitty review: mode is lightweight, exits 0.
  • Pre-merge mission, --mode post-merge: mode-mismatch diagnostic; exits non-zero.
  • Post-merge mission, bare invocation: mode is post-merge; required artifacts validated.
  • Post-merge mission, --mode lightweight: mode is lightweight; report explicitly says so; exits 0.

Invariants

  • The mode is recorded in the report; consumers downstream (cross-surface harness #992 Phase 0, dashboard) must read mode from the report, not infer it.
  • The auto-detect default never changes within a release minor without a deprecation cycle.

status-read-worktree-resolution.md

Contract: Status-read worktree resolution

WP: WP05 | FRs: FR-013, FR-014, FR-015 | Source bug: #984

Surface in scope

Read-only status commands and their JSON outputs:

  • spec-kitty agent tasks status --json
  • (spec-kitty next --json discovery — audit; likely already correct after mission 068)

Resolution rule

Read-only status commands resolve their data source via get_status_read_root() (new helper), which returns:

1. The current worktree root if invoked from inside a git worktree (including detached worktrees).
2. get_main_repo_root() as the fallback only when no current worktree can be determined.

Write paths (move-task, finalize-tasks, merge, sync emit) are not changed; they continue to resolve via get_main_repo_root() so canonical serialization remains pinned to the main checkout.
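
One way the read-root helper could be structured, offered as a sketch rather than the project's implementation. `current_worktree_root` is a hypothetical helper that shells out to `git rev-parse --show-toplevel`, and the main-repo resolver is passed in as a callable because the signature of get_main_repo_root() is not given here.

```python
import subprocess
from pathlib import Path
from typing import Callable, Optional


def current_worktree_root(cwd: Path) -> Optional[Path]:
    """Ask git for the top level of the worktree containing cwd, if any."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
    except (subprocess.CalledProcessError, OSError):
        return None  # not inside a worktree, or git unavailable
    return Path(result.stdout.strip())


def get_status_read_root(
    cwd: Path,
    main_repo_root: Callable[[], Path],
    worktree_root: Callable[[Path], Optional[Path]] = current_worktree_root,
) -> Path:
    # Prefer the worktree the command was invoked from; fall back to the
    # main checkout only when no worktree can be determined.
    root = worktree_root(cwd)
    return root if root is not None else main_repo_root()
```

Keeping the worktree probe injectable lets the two-worktree and detached-worktree fixtures be exercised without a real repository.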

Fail-loud cases

If a read-only command is invoked in a context where worktree resolution legitimately cannot apply (e.g., command requires comparison across worktrees), it MUST fail with a diagnostic naming the constraint and the operator's options — never silently fall back to the main repo root in a way that produces stale state.

Acceptance fixtures

  • Two-worktree fixture with divergent status.events.jsonl: from each worktree, agent tasks status --json reflects the local event log.
  • Detached worktree at a verification SHA: agent tasks status --json matches a direct reducer pass over the worktree's events.
  • Invocation from the main checkout: behavior unchanged from today.
  • Write path invoked from a detached worktree: still resolves to main checkout (regression guard).

Invariants

  • get_main_repo_root() and get_status_read_root() are distinct, single-purpose helpers.
  • Audit of all callers of get_main_repo_root() in read-only paths is part of WP05 done criteria.