Contracts
charter-io-chokepoint.md
Contract: charter/_io.py encoding chokepoint
WP: WP06 | FRs: FR-016 – FR-021 | Source bug: #644 | Diagnostic codes: CHARTER_ENCODING_*
Public API
def load_charter_file(path: Path, *, unsafe: bool = False) -> CharterContent: ...
def load_charter_bytes(data: bytes, *, origin: str, unsafe: bool = False) -> CharterContent: ...
CharterContent shape: see data-model.md §3.
Detection order
1. BOM sniff: if leading BOM bytes match a known encoding (UTF-8-SIG, UTF-16-LE, UTF-16-BE), use that encoding.
2. Strict UTF-8 decode: if data.decode("utf-8") succeeds, encoding is "utf-8" with confidence = 1.0.
3. charset_normalizer.from_bytes(data).best(): if a candidate clears confidence >= 0.85 (computed from 1.0 - match.chaos), use it.
4. Fail: emit CHARTER_ENCODING_AMBIGUOUS with the file path and detected candidates.
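The detection order above can be sketched as follows. This is a minimal illustration, not the charter/_io.py implementation: `detect_encoding`, `AmbiguousEncoding`, and the injected `guess` callable are hypothetical names, and the confidence of 1.0 for a BOM match is an assumption the contract does not state. Step 3 is injected so the sketch runs without the charset-normalizer dependency.

```python
import codecs
from typing import Callable, Optional, Tuple

# BOMs named in the contract, paired with the encoding each one selects.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)
THRESHOLD = 0.85  # from the contract: confidence >= 0.85

# Hypothetical hook for step 3. With charset-normalizer installed it could be:
#   best = charset_normalizer.from_bytes(data).best()
#   return None if best is None else (best.encoding, 1.0 - best.chaos)
Guess = Callable[[bytes], Optional[Tuple[str, float]]]


class AmbiguousEncoding(Exception):
    """Stand-in for the CHARTER_ENCODING_AMBIGUOUS failure (step 4)."""


def detect_encoding(data: bytes, guess: Guess) -> Tuple[str, float]:
    # Step 1: BOM sniff (confidence 1.0 here is an assumption).
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name, 1.0
    # Step 2: strict UTF-8 decode.
    try:
        data.decode("utf-8")
        return "utf-8", 1.0
    except UnicodeDecodeError:
        pass
    # Step 3: statistical guess, accepted only above the threshold.
    candidate = guess(data)
    if candidate is not None and candidate[1] >= THRESHOLD:
        return candidate
    # Step 4: fail.
    raise AmbiguousEncoding("CHARTER_ENCODING_AMBIGUOUS")
```

Each step returns as soon as it succeeds, so a file with a BOM never reaches the statistical guess.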
unsafe bypass behavior
When unsafe=True and detection falls to step 4:
- Use the highest-confidence candidate from charset-normalizer regardless of threshold (or fall back to cp1252 if absolutely no candidate exists).
- Set CharterContent.normalization_applied = True.
- Write the provenance record with bypass_used = True.
- The function returns successfully; the operator is taking responsibility.
When unsafe=False (default) and detection falls to step 4:
- Raise CharterEncodingError carrying the CharterEncodingDiagnostic.AMBIGUOUS code; the caller is responsible for stdout/JSON emission.
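A rough sketch of the step-4 branch, assuming the candidates arrive as (encoding, confidence) pairs. `resolve_step4` and the returned dict are illustrative only, and the `CharterEncodingError` class below is a local stand-in for the real exception, whose constructor the contract does not specify.

```python
from typing import Dict, List, Tuple


class CharterEncodingError(Exception):
    """Local stand-in; the real class carries CharterEncodingDiagnostic.AMBIGUOUS."""


def resolve_step4(candidates: List[Tuple[str, float]], *, unsafe: bool = False) -> Dict[str, object]:
    if not unsafe:
        # Default path: raise and let the caller emit stdout/JSON.
        raise CharterEncodingError("CHARTER_ENCODING_AMBIGUOUS")
    if candidates:
        # Bypass path: best candidate wins regardless of the threshold.
        encoding = max(candidates, key=lambda c: c[1])[0]
    else:
        # No candidate at all: fall back to cp1252, per the contract.
        encoding = "cp1252"
    return {
        "source_encoding": encoding,
        "normalization_applied": True,
        "bypass_used": True,  # goes into the provenance record
    }
```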
Diagnostic body shape
ERROR: CHARTER_ENCODING_AMBIGUOUS
File: kitty-specs/<mission>/charter/charter.yaml
Detected candidates:
- cp1252 (confidence 0.62)
- utf-8 with replacement (confidence 0.48)
Mixed-content signal: bytes 0xE9 0x80 0xAE at offset 1247 decode as
'逮' in UTF-8 but as 'é€®' in cp1252.
Remediation options:
1. Open the file in a UTF-8-aware editor and re-save.
2. iconv -f cp1252 -t utf-8 <file> > <file>.utf8 && mv <file>.utf8 <file>.
3. Re-run with --unsafe (logs bypass_used=true to provenance).
Provenance write
Every successful detection (including pure-UTF-8 with no normalization) writes a record to the provenance file per contracts/encoding-provenance-schema.md. Failure path (raise) does not write provenance — only successful resolutions are audit-worthy.
Retrofit sites (NFR-004 budget enforcement)
Exactly three modules in src/charter/ retrofit their existing read_text(encoding="utf-8") calls to load_charter_file(path):
1. compiler.py:594: yaml.load(path.read_text(encoding="utf-8")) → yaml.load(load_charter_file(path).text)
2. sync.py:151: charter_path.read_text("utf-8") → load_charter_file(charter_path).text
3. interview.py:283, 398: analogous wrap
Other charter modules (context.py, hasher.py, language_scope.py, compact.py, neutrality/lint.py) are intentionally NOT modified — they re-read already-normalized files and trust the chokepoint's contract.
Acceptance fixtures
- cp1252-encoded charter file → ingest succeeds with source_encoding = "cp1252", normalization_applied = True, provenance recorded.
- Pure-UTF-8 charter → ingest succeeds with source_encoding = "utf-8", confidence = 1.0, normalization_applied = False.
- Mixed-content file → raises AMBIGUOUS; provenance NOT written.
- Same mixed-content file with unsafe=True → succeeds with bypass; provenance has bypass_used = True.
Invariants
- Module count touched by WP06: 4 (_io.py new + 3 retrofit sites). NFR-004 budget respected.
- The 5 deferred re-read sites remain unchanged in this WP. WP08 covers legacy-file migration.
encoding-provenance-schema.md
Contract: Encoding provenance schema
WP: WP06 | FRs: FR-022 | HiC decision: dual-storage, prefer per-mission, centralize shared
Record schema (JSONL; one record per line)
{"event_id": "01HXYZ...",
"at": "2026-05-12T18:30:00+00:00",
"file_path": "kitty-specs/<mission>/charter/charter.yaml",
"source_encoding": "cp1252",
"confidence": 0.93,
"normalization_applied": true,
"bypass_used": false,
"actor": "spec-kitty charter compile",
"mission_id": "01KRC57CNW5JCVBRV8RAQ2ARXZ"}
Field semantics: see data-model.md §3 EncodingProvenanceRecord.
Routing rule
A record is appended to exactly one file. The chokepoint inspects file_path:
- Per-mission: if file_path starts with kitty-specs/<mission-slug>/, append to kitty-specs/<mission-slug>/.encoding-provenance.jsonl. Set mission_id to the mission's ULID (resolved from kitty-specs/<mission-slug>/meta.json).
- Centralized: otherwise, append to .kittify/encoding-provenance/global.jsonl. Set mission_id to null.
There is no third destination, and a single event is never appended to both files.
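The routing rule can be illustrated with a small pure function. `route_provenance` is a hypothetical name, and the sketch returns the mission slug rather than the ULID, since resolving mission_id from meta.json is left out here.

```python
from pathlib import PurePosixPath
from typing import Optional, Tuple

GLOBAL_LOG = ".kittify/encoding-provenance/global.jsonl"


def route_provenance(file_path: str) -> Tuple[str, Optional[str]]:
    """Return (log file to append to, mission slug or None)."""
    parts = PurePosixPath(file_path).parts
    # Per-mission: the path sits under kitty-specs/<mission-slug>/...
    if len(parts) >= 3 and parts[0] == "kitty-specs":
        slug = parts[1]
        return f"kitty-specs/{slug}/.encoding-provenance.jsonl", slug
    # Centralized: everything else, with mission_id set to null.
    return GLOBAL_LOG, None
```

Because the function returns exactly one destination, a single event cannot land in both files.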
File layout
kitty-specs/
├── <mission-slug-A>/
│ ├── charter/
│ └── .encoding-provenance.jsonl # per-mission events for this mission
├── <mission-slug-B>/
│ ├── charter/
│ └── .encoding-provenance.jsonl # per-mission events for this mission
└── ...
.kittify/
├── charter/ # global charter (not mission-scoped)
└── encoding-provenance/
└── global.jsonl # events for charter content outside any kitty-specs/<mission>/ tree
Append semantics
- Append is open(..., "a", encoding="utf-8") + f.write(json.dumps(record, sort_keys=True) + "\n"), the same pattern as the existing status.events.jsonl writer (see src/specify_cli/status/store.py).
- event_id is a fresh ULID per record (uses the existing ULID utility from specify_cli.id_gen or equivalent).
- at is ISO-8601 UTC with offset; same format as status events.
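A minimal sketch of the append path under these semantics. `append_record` is a hypothetical helper, and because the ULID utility's API is not shown in this contract, a uuid4 hex string stands in for the ULID; a real implementation would use the existing generator.

```python
import json
import uuid
from datetime import datetime, timezone
from pathlib import Path


def append_record(log_path: Path, record: dict) -> dict:
    full = {
        **record,
        # Stand-in only: the contract requires a fresh ULID here.
        "event_id": uuid.uuid4().hex,
        # ISO-8601 UTC with explicit offset, e.g. 2026-05-12T18:30:00+00:00.
        "at": datetime.now(timezone.utc).isoformat(timespec="seconds"),
    }
    log_path.parent.mkdir(parents=True, exist_ok=True)
    # Append mode plus one write per record keeps the file valid JSONL.
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(full, sort_keys=True) + "\n")
    return full
```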
Read semantics
Consumers (audit tooling, hypothetical future dashboard) may cat per-mission + centralized files together without coalescing logic. Each record is self-describing via mission_id.
Acceptance fixtures
- Ingest a file under kitty-specs/foo-01KQ.../charter/x.yaml → record appears in kitty-specs/foo-01KQ.../.encoding-provenance.jsonl, NOT in .kittify/encoding-provenance/global.jsonl.
- Ingest a file under .kittify/charter/y.yaml → record appears in .kittify/encoding-provenance/global.jsonl, NOT in any mission file.
- Concurrent appenders (two spec-kitty charter compile invocations in parallel) → all records survive; no overwrite.
Invariants
- Record schema is identical across both files (same keys, same types).
- No record is duplicated across files.
- The schema is JSON-stable per NFR-001; new keys may be added but existing key names and types never change without a deprecation cycle.
issue-matrix-schema.md
Contract: issue-matrix.md validator schema
WP: WP03 | FRs: FR-006, FR-028 – FR-032 | Diagnostic codes: MISSION_REVIEW_ISSUE_MATRIX_*
Audit-derived vocabulary (closed sets)
Mandatory columns
Exact order, case-insensitive on input, normalized to lowercase internally:
1. issue
2. verdict
3. evidence_ref
Named-optional columns (closed set)
May appear in any order after the mandatory three:
- title
- scope (alias: theme)
- wp (alias: wp_id)
- fr (alias: fr(s))
- nfr (alias: nfr(s))
- sc
- repo
Verdict allow-list (closed set)
fixed
verified-already-fixed
deferred-with-followup
Validator rules
| Rule | Diagnostic on violation |
|---|---|
| All mandatory columns present, in order | MISSION_REVIEW_ISSUE_MATRIX_SCHEMA_DRIFT |
| Every column is either mandatory or named-optional | MISSION_REVIEW_ISSUE_MATRIX_SCHEMA_DRIFT (names unknown column) |
| Verdict cell value is in the allow-list | MISSION_REVIEW_ISSUE_MATRIX_VERDICT_UNKNOWN |
| Exactly one Markdown table at top level (additional prose allowed; additional tables NOT allowed) | MISSION_REVIEW_ISSUE_MATRIX_MULTI_TABLE |
| evidence_ref cell non-empty | MISSION_REVIEW_ISSUE_MATRIX_EVIDENCE_REF_EMPTY |
| When verdict == deferred-with-followup, evidence_ref contains a follow-up handle (regex matches #\d+ OR contains Follow-up: substring) | MISSION_REVIEW_ISSUE_MATRIX_DEFERRED_WITHOUT_HANDLE |
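The column and row rules in the table could be checked roughly as below. This is a sketch under assumptions: `validate` is a hypothetical function taking an already-parsed header and rows, aliases are resolved during validation, and the multi-table rule is omitted because it needs the raw Markdown.

```python
import re
from typing import Dict, List

PREFIX = "MISSION_REVIEW_ISSUE_MATRIX_"
MANDATORY = ["issue", "verdict", "evidence_ref"]
OPTIONAL = {"title", "scope", "wp", "fr", "nfr", "sc", "repo"}
ALIASES = {"theme": "scope", "wp_id": "wp", "fr(s)": "fr", "nfr(s)": "nfr"}
VERDICTS = {"fixed", "verified-already-fixed", "deferred-with-followup"}
HANDLE = re.compile(r"#\d+")


def validate(header: List[str], rows: List[Dict[str, str]]) -> List[str]:
    codes: List[str] = []
    # Case-fold the header, then resolve aliases to canonical names.
    cols = [ALIASES.get(h.strip().lower(), h.strip().lower()) for h in header]
    if cols[:3] != MANDATORY or any(c not in OPTIONAL for c in cols[3:]):
        codes.append(PREFIX + "SCHEMA_DRIFT")
    for row in rows:
        verdict = row.get("verdict", "")
        ref = row.get("evidence_ref", "")
        if verdict not in VERDICTS:
            codes.append(PREFIX + "VERDICT_UNKNOWN")
        if not ref.strip():
            codes.append(PREFIX + "EVIDENCE_REF_EMPTY")
        elif verdict == "deferred-with-followup" and not (
            HANDLE.search(ref) or "Follow-up:" in ref
        ):
            codes.append(PREFIX + "DEFERRED_WITHOUT_HANDLE")
    return codes
```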
Remediation pass over existing matrices (FR-032)
When the validator runs in remediation mode over the 6 existing matrices on main:
- Auto-normalize: capitalization drift (Issue → issue), alias drift (Evidence ref → evidence_ref, wp_id → wp, theme → scope). Writes a one-line provenance note inside the file: <!-- normalized YYYY-MM-DD: header case folded; aliases resolved -->.
- Surface, do not auto-fix: structural drift (multi-table layout in charter-golden-path-e2e-tranche-1-01KQ806X; any unknown columns like Surface or Where surfaced in code). Operator gets a diagnostic with repair guidance and must commit the fix manually.
Parsing contract
- Parser is a line-oriented Markdown table parser; it tolerates leading prose.
- Leading/trailing whitespace in cells is stripped.
- Backticked verdict values (`fixed`) are accepted; backticks are stripped during normalization.
- Linkified issue values (#123) are accepted; the #NNN form is canonical for the parsed IssueMatrixRow.issue value.
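The cell-level normalization described above might look like this. All three helper names are hypothetical, the link URL in the usage line is a placeholder, and escaped pipes inside cells are not handled in this sketch.

```python
import re
from typing import List


def split_row(line: str) -> List[str]:
    # Drop the outer pipes, split on the inner ones, strip each cell.
    return [cell.strip() for cell in line.strip().strip("|").split("|")]


def normalize_verdict(cell: str) -> str:
    # Backticked verdicts are accepted; the backticks are removed.
    return cell.strip().strip("`").strip()


def canonical_issue(cell: str) -> str:
    # Reduce a linkified issue reference to its #NNN form when one is present.
    match = re.search(r"#\d+", cell)
    return match.group(0) if match else cell.strip()
```

For example, `split_row("| #12 | `fixed` | PR #40 |")` yields three cells, and `canonical_issue("[#123](https://example.invalid/issues/123)")` reduces to `#123`.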
Output
- On success: parsed list[IssueMatrixRow].
- On failure: non-zero exit + JSON diagnostic on stdout.
Acceptance fixtures
- 6 existing matrices on main: each passes either after auto-normalize or surfaces a specific diagnostic per the rules above.
- A synthetic matrix with an unknown column Severity: fails SCHEMA_DRIFT naming Severity.
- A synthetic matrix with verdict deferred (no -with-followup suffix): fails VERDICT_UNKNOWN.
- A synthetic matrix with deferred-with-followup verdict but evidence_ref of TBD: fails DEFERRED_WITHOUT_HANDLE.
merge-state-idempotency.md
Contract: Mission-number assignment idempotency
WP: WP04 | FRs: FR-010, FR-011, FR-012 | Source bug: #983
Pre-condition
spec-kitty merge is mid-flow with merge-state file .kittify/merge-state.json present and consistent.
Idempotency rule
The mission-number-assignment step performs two reads before deciding:
1. Read meta.json.mission_number in the mission feature directory.
2. Compute the expected mission number via the canonical strategy (max(existing) + 1 inside the merge-state lock).
If meta.json.mission_number == expected, the step is a no-op: no rewrite of meta.json, no commit, no state mutation. Else, the assignment proceeds as today.
After successful execution (whether by no-op or assignment), MergeState.mission_number_baked = True is set and persisted.
Resume semantics
On spec-kitty merge --resume:
- Read MergeState from disk via load_state().
- If mission_number_baked == True, skip the assignment step entirely (no read of meta.json, no compute, no commit).
- Else, proceed to the idempotency check above.
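The idempotency rule and the resume short-circuit combine into one small decision function. The sketch below is illustrative: `assign_mission_number` and its injected callables are hypothetical, the `MergeState` dataclass models only the one flag, and locking and persistence of the state are left out.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MergeState:
    # Only the flag relevant to this contract is modeled.
    mission_number_baked: bool = False


def assign_mission_number(
    state: MergeState,
    read_meta: Callable[[], dict],
    compute_expected: Callable[[], int],
    write_meta: Callable[[dict], None],
) -> str:
    # Resume path: flag already set, so nothing is read, computed, or committed.
    if state.mission_number_baked:
        return "skipped"
    meta = read_meta()
    expected = compute_expected()
    if meta.get("mission_number") == expected:
        outcome = "noop"  # no rewrite of meta.json, no commit
    else:
        write_meta({**meta, "mission_number": expected})
        outcome = "assigned"
    # Set after success on both branches; the caller persists the state.
    state.mission_number_baked = True
    return outcome
```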
Concurrency
The mission-number-assignment step continues to run inside the existing merge-state lock (max(existing) + 1 requires it). The idempotency check reads meta.json while holding the lock; release follows the existing flow.
Atomicity (opportunistic)
meta.json write: if the WP04 implementation also fixes the non-atomic write (current pattern: Path.write_text(json.dumps(...))) by switching to temp-file + rename, that is bonus scope and lands in the same WP. Required scope is only the idempotency check + flag.
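For reference, the temp-file + rename pattern mentioned here usually looks like the following. `write_json_atomic` is a hypothetical helper name; the sketch relies on os.replace being atomic when source and destination are on the same filesystem, which is why the temp file is created next to the target.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    # Create the temp file in the target directory so the rename stays on one filesystem.
    fd, tmp = tempfile.mkstemp(dir=path.parent, prefix=path.name + ".", suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as f:
            json.dump(payload, f)
            f.flush()
            os.fsync(f.fileno())  # push the bytes to disk before the rename
        os.replace(tmp, path)  # readers see the old file or the new one, never a partial write
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```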
Acceptance fixtures
- Simulate partial merge: write mission_number=115 to meta.json, write MergeState with mission_number_baked=False, fail the merge mid-step, rerun with --resume. Expected: no empty mission-number commit; merge completes; mission_number_baked becomes True.
- Fresh merge (no prior assignment). Expected: assignment runs as today; flag set to True after success.
- --resume on a state where mission_number_baked == True. Expected: step is skipped without meta.json read.
Invariants
- The mission-number value itself, once written, is never overwritten — even if the computed value changes (e.g., concurrent merges). The lock guarantees serialization; idempotency guarantees no rewrite after success.
review-mode-resolution.md
Contract: spec-kitty review mode resolution
WP: WP03 | FRs: FR-005, FR-006, FR-023 | Diagnostic codes: MISSION_REVIEW_MODE_MISMATCH
Inputs
- meta.json.baseline_merge_commit: str | None (present iff the mission has been merged via spec-kitty merge).
- CLI argument --mode {lightweight | post-merge} (optional).
Resolution rule (precedence order)
1. CLI flag override. If --mode <m> is present on the command line, the mode is <m>.
2. Auto-detect: post-merge. Else, if meta.json.baseline_merge_commit is set, the mode is POST_MERGE.
3. Auto-detect: lightweight. Else, the mode is LIGHTWEIGHT.
Mode-mismatch detection
When step 1 sets mode to POST_MERGE and step 2's signal is absent (baseline_merge_commit not in meta.json), the command exits non-zero with MISSION_REVIEW_MODE_MISMATCH. The diagnostic body MUST contain:
1. A "What this means" paragraph naming the missing signal.
2. Three remediation options (run spec-kitty merge, re-run with --mode lightweight, or run identity backfill for pre-083 missions).
The reverse case (--mode lightweight with baseline_merge_commit present) is not a mismatch — operators may legitimately want a quick consistency check on an already-merged mission.
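The precedence order and the one-directional mismatch check can be summarized in a short function. `resolve_mode` and `ModeMismatch` are hypothetical names for this sketch; the returned dict mirrors the JSON keys this contract lists for stdout.

```python
from typing import Optional


class ModeMismatch(Exception):
    """Stand-in for the MISSION_REVIEW_MODE_MISMATCH diagnostic."""


def resolve_mode(cli_mode: Optional[str], baseline_merge_commit: Optional[str]) -> dict:
    if cli_mode is not None:
        # Step 1: the CLI flag wins, but post-merge needs the merge signal.
        if cli_mode == "post-merge" and not baseline_merge_commit:
            raise ModeMismatch("MISSION_REVIEW_MODE_MISMATCH")
        mode, auto = cli_mode, False
    elif baseline_merge_commit:
        # Step 2: auto-detect post-merge.
        mode, auto = "post-merge", True
    else:
        # Step 3: auto-detect lightweight.
        mode, auto = "lightweight", True
    return {
        "mode": mode,
        "auto_detected": auto,
        "baseline_merge_commit": baseline_merge_commit,
    }
```

Note that `--mode lightweight` on a merged mission passes through step 1 without raising, matching the reverse case described above.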
Output
- Stdout (JSON): {"mode": "lightweight" | "post-merge", "auto_detected": bool, "baseline_merge_commit": "<sha>" | null}
- Persisted in mission-review-report.md frontmatter under the mode key.
Acceptance fixtures
- Pre-merge mission, bare spec-kitty review: mode is lightweight, exits 0.
- Pre-merge mission, --mode post-merge: mode-mismatch diagnostic; exits non-zero.
- Post-merge mission, bare invocation: mode is post-merge; required artifacts validated.
- Post-merge mission, --mode lightweight: mode is lightweight; report explicitly says so; exits 0.
Invariants
- The mode is recorded in the report; consumers downstream (cross-surface harness #992 Phase 0, dashboard) must read mode from the report, not infer it.
- The auto-detect default never changes within a release minor without a deprecation cycle.
status-read-worktree-resolution.md
Contract: Status-read worktree resolution
WP: WP05 | FRs: FR-013, FR-014, FR-015 | Source bug: #984
Surface in scope
Read-only status commands and their JSON outputs:
- spec-kitty agent tasks status --json
- (spec-kitty next --json discovery: audit; likely already correct after mission 068)
Resolution rule
Read-only status commands resolve their data source via get_status_read_root() (new helper), which returns:
1. The current worktree root if invoked from inside a git worktree (including detached worktrees).
2. get_main_repo_root() as the fallback only when no current worktree can be determined.
Write paths (move-task, finalize-tasks, merge, sync emit) are not changed; they continue to resolve via get_main_repo_root() so canonical serialization remains pinned to the main checkout.
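One possible shape for the read-side resolution, sketched with hypothetical names (`probe_worktree_root`, `choose_status_read_root`); the real get_status_read_root() takes no arguments and may work differently. The probe assumes `git rev-parse --show-toplevel`, which prints the root of the working tree that contains the current directory, including a linked worktree.

```python
import subprocess
from pathlib import Path
from typing import Optional


def probe_worktree_root(cwd: Path) -> Optional[Path]:
    """Return the root of the worktree containing cwd, or None if undeterminable."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "--show-toplevel"],
            cwd=cwd, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, or cwd is not inside a working tree
    return Path(out.stdout.strip())


def choose_status_read_root(current_worktree: Optional[Path], main_repo_root: Path) -> Path:
    # Rule 1: prefer the current worktree. Rule 2: main repo root only as fallback.
    return current_worktree if current_worktree is not None else main_repo_root
```

Keeping the probe separate from the choice makes the fallback rule testable without a git repository.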
Fail-loud cases
If a read-only command is invoked in a context where worktree resolution legitimately cannot apply (e.g., command requires comparison across worktrees), it MUST fail with a diagnostic naming the constraint and the operator's options — never silently fall back to the main repo root in a way that produces stale state.
Acceptance fixtures
- Two-worktree fixture with divergent status.events.jsonl: from each worktree, agent tasks status --json reflects the local event log.
- Detached worktree at a verification SHA: agent tasks status --json matches a direct reducer pass over the worktree's events.
- Invocation from the main checkout: behavior unchanged from today.
- Write path invoked from a detached worktree: still resolves to main checkout (regression guard).
Invariants
- get_main_repo_root() and get_status_read_root() are distinct, single-purpose helpers.
- Audit of all callers of get_main_repo_root() in read-only paths is part of WP05 done criteria.