Contracts
catalog-miss-cli-visibility.md
Contract — Catalog-Miss CLI Visibility
> Mission: slice-f-multi-context-extensibility-01KRX5C8
> Closes: FR-130, FR-131, FR-132 + NFR-006 | Companions: charter-scope-resolution.md
> Data model: ../data-model.md §8
The catalog-miss CLI visibility contract closes RISK-3 (Mission B): operator-visible warning surface for catalog misses caused by typo'd or missing charter selections. Today the _LOGGER.warning(...) path in charter._catalog_miss is silently dropped because the spec-kitty CLI installs no log handler.
Input Contract
Bootstrap requirement (FR-130)
The spec-kitty CLI entry point (src/specify_cli/__main__.py or the typer app startup hook) MUST call:
import logging
logging.captureWarnings(True)
This routes warnings.warn(...) through the Python logging subsystem so the FR-131 handler can format and emit it.
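The routing can be observed in isolation with the standard library alone. The sketch below is illustrative and not part of the spec-kitty codebase: `ListHandler` and `capture_demo` are hypothetical names, and the in-memory handler stands in for the FR-131 Rich handler. It relies on the documented behaviour that `logging.captureWarnings(True)` redirects `warnings.warn(...)` to the logger named `py.warnings`.

```python
import logging
import warnings


class ListHandler(logging.Handler):
    """Hypothetical in-memory handler used only to inspect what gets routed."""

    def __init__(self) -> None:
        super().__init__(level=logging.WARNING)
        self.messages: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.messages.append(record.getMessage())


def capture_demo() -> list[str]:
    """Emit one warning and return what reached the 'py.warnings' logger."""
    handler = ListHandler()
    py_warnings = logging.getLogger("py.warnings")
    py_warnings.addHandler(handler)
    logging.captureWarnings(True)  # warnings.warn(...) now goes through logging
    try:
        with warnings.catch_warnings():
            # Bypass per-location deduplication so the demo is repeatable.
            warnings.simplefilter("always")
            warnings.warn("catalog miss: styleguide=does-not-exist", UserWarning)
    finally:
        logging.captureWarnings(False)
        py_warnings.removeHandler(handler)
    return handler.messages
```

Calling `capture_demo()` returns a single formatted warning string that includes the warning category and the original message text.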
Handler installation (FR-131)
The same entry point MUST install a Rich-aware logging.Handler that routes WARNING+ records through the existing Rich Console instance to the operator's stderr.
Reference implementation shape:
import logging
from rich.console import Console
from rich.logging import RichHandler
# IMPORTANT (RR-6): defer to the existing Console instance; do NOT instantiate a new one.
from specify_cli.console import get_stderr_console # whatever the existing accessor is
_handler = RichHandler(
    console=get_stderr_console(),
    show_path=False,
    show_time=False,
    markup=False,
    rich_tracebacks=False,
)
_handler.setLevel(logging.WARNING)
logging.basicConfig(
    level=logging.WARNING,
    format="%(message)s",
    datefmt="[%X]",
    handlers=[_handler],
)
logging.captureWarnings(True)
The handler MUST be installed at process startup so subprocess invocations (FR-132 test) see warnings on stderr.
Catalog-miss emission contract — src/charter/_catalog_miss.py
The existing _LOGGER.warning(message, extra=extra) call site MUST emit the following fields in extra= (FR-131):
CatalogMissEvent extra fields
extra = {
    "kind": "styleguide",  # str: the artifact kind that missed
    "id": "caveman-comemnts",  # str: the artifact ID that didn't resolve
    "cause": "typo",  # Literal["typo","missing","schema_validation_suspected"]
    "suggestion": "caveman-comments",  # str | None: closest match (None if unavailable)
    "mission_id": "01KRX5C8MQ...",  # str | None: the mission ULID, if known
    "scope": None,  # str | None: CharterScope.name if monorepo; else None
}
The cause classifier is heuristic:
- typo: at least one allowlisted artifact ID has a Levenshtein distance ≤ 2 from the missed ID; suggestion is the closest match.
- missing: no close match; suggestion is None.
- schema_validation_suspected: the missed ID parsed cleanly but the artifact body failed schema validation; suggestion may be None or point to the failing-validation file.
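The typo and missing branches of the heuristic can be sketched as follows. This is a minimal illustration, not the code in src/charter/_catalog_miss.py: `levenshtein` and `classify_miss` are hypothetical names, ties are broken alphabetically as an assumption, and the schema_validation_suspected branch is left out because it depends on validation state that is not described here.

```python
from typing import Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(
                min(previous[j] + 1, current[j - 1] + 1, previous[j - 1] + cost)
            )
        previous = current
    return previous[-1]


def classify_miss(
    missed_id: str, allowlisted_ids: list[str], max_distance: int = 2
) -> tuple[str, Optional[str]]:
    """Return (cause, suggestion) for the typo / missing cases only."""
    best: Optional[str] = None
    best_distance = max_distance + 1
    # Sorting makes the tie-break deterministic (an assumption of this sketch).
    for candidate in sorted(allowlisted_ids):
        distance = levenshtein(missed_id, candidate)
        if distance < best_distance:
            best, best_distance = candidate, distance
    if best is not None:
        return "typo", best
    return "missing", None
```

With the identifiers used in this contract, `classify_miss("caveman-comemnts", ["caveman-comments"])` falls in the typo branch because the two swapped letters count as two substitutions.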
Output Contract
Operator-visible stderr line
When the handler fires, the operator sees on stderr (Rich-formatted):
WARNING Catalog miss: styleguide=caveman-comemnts (cause=typo). Did you mean: caveman-comments? [mission=01KRX5C8MQ..., scope=None]
Multiple-miss aggregation
Each miss produces one log line. The handler does NOT deduplicate within a process (Python's default warnings filter handles per-location deduplication; the handler preserves that semantic).
Programmatic API
The extra= payload IS the structured surface — downstream tooling (e.g. CI log scrapers, JSON-log mode in a follow-up mission) can consume it directly via a custom handler.
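A custom handler can read the structured fields because keys passed through `extra=` become attributes on the `LogRecord`. The sketch below shows one possible consumer that writes JSON lines; `JsonLinesHandler`, `emit_sample_miss` and the logger name are hypothetical, and `mission_id` is set to None rather than a real ULID.

```python
import io
import json
import logging

# Field names taken from the CatalogMissEvent extra fields above.
CATALOG_MISS_FIELDS = ("kind", "id", "cause", "suggestion", "mission_id", "scope")


class JsonLinesHandler(logging.Handler):
    """Hypothetical handler: one JSON object per WARNING+ record."""

    def __init__(self, stream: io.TextIOBase) -> None:
        super().__init__(level=logging.WARNING)
        self.stream = stream

    def emit(self, record: logging.LogRecord) -> None:
        # Missing keys degrade to None instead of raising (best-effort contract).
        payload = {name: getattr(record, name, None) for name in CATALOG_MISS_FIELDS}
        payload["message"] = record.getMessage()
        self.stream.write(json.dumps(payload, sort_keys=True) + "\n")


def emit_sample_miss() -> dict:
    """Log one catalog miss and return the parsed JSON line."""
    stream = io.StringIO()
    logger = logging.getLogger("catalog_miss_json_sketch")  # illustrative name
    logger.setLevel(logging.WARNING)
    logger.propagate = False
    handler = JsonLinesHandler(stream)
    logger.addHandler(handler)
    try:
        logger.warning(
            "Catalog miss: styleguide=caveman-comemnts",
            extra={
                "kind": "styleguide",
                "id": "caveman-comemnts",
                "cause": "typo",
                "suggestion": "caveman-comments",
                "mission_id": None,
                "scope": None,
            },
        )
    finally:
        logger.removeHandler(handler)
    return json.loads(stream.getvalue())
```

A CI log scraper could then filter on `cause` or `kind` without parsing the human-readable message.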
Failure modes
| Trigger | Behaviour | Operator message |
|---|---|---|
| The handler is not installed (regression) | warnings.warn(...) produces a Python WARNING line (the default filter), but no Rich formatting. The structured extra= dict is lost | The FR-132 subprocess test FAILS, signaling the regression |
| The handler is installed but Rich Console is unavailable (e.g. non-tty) | Rich falls back to plain text on stderr; the message is still visible | None — operator sees a plain-text WARNING line |
| The extra= dict is missing required keys | Handler logs the raw message without the structured suffix | None; soft degradation, the structured-log contract is best-effort |
| Subprocess test runs but the catalog-miss code path never fires | FR-132 test FAILS with "no catalog-miss warning observed" | Test investigates the test fixture's charter |
FR-132 subprocess test contract
tests/integration/test_catalog_miss_cli_visibility.py:
import subprocess
import sys
from pathlib import Path
import pytest
@pytest.mark.integration
@pytest.mark.git_repo
def test_typoed_styleguide_produces_visible_stderr_warning(tmp_repo):
    """A typo'd charter selection produces an operator-visible warning on stderr.
    Pinned: FR-130, FR-131, FR-132, NFR-006, AC-9, Scenario 5.
    """
    # tmp_repo fixture scaffolds a charter with selected_styleguides: [does-not-exist]
    result = subprocess.run(
        [sys.executable, "-m", "specify_cli", "agent", "action", "implement", "WP01"],
        cwd=tmp_repo,
        capture_output=True,
        text=True,
        check=False,
    )
    stderr = result.stderr
    assert "Catalog miss" in stderr, f"Expected catalog-miss warning on stderr; got:\n{stderr}"
    assert "does-not-exist" in stderr
    assert "styleguide" in stderr
Why subprocess (NFR-006 binding): pytest's in-process warning capture would mask the real-world problem. The test MUST prove the warning is visible to a real CLI invocation under operator conditions.
Backward compatibility guarantee
- The bootstrap addition (logging.captureWarnings(True) + handler install) is additive. No existing CLI invocation changes behaviour except for previously-silent warnings becoming visible.
- The extra= dict extension is additive. Existing callers passing fewer fields continue to work; missing fields produce a soft-degradation message.
- The Rich Console deferral (RR-6) ensures no double-init; existing Rich output (progress bars, tables, etc.) is unchanged.
Glossary terms (canonicalised in WP12 per FR-302)
- Catalog miss: renderer state when a charter-selected artifact ID does not resolve to a loaded artifact in any layer. Already in glossary/contexts/doctrine.md; this mission promotes it to canonical.
ATDD anchors
- tests/integration/test_catalog_miss_cli_visibility.py (FR-132; NFR-006; AC-9; Scenario 5)
- tests/unit/test_catalog_miss_event_extra_fields.py (unit; asserts the extra= dict carries the FR-131 fields)
- tests/unit/test_rich_log_handler_install.py (unit; asserts logging.captureWarnings(True) + a RichHandler is installed at module import)
charter-scope-resolution.md
Contract — Charter Scope Resolution
> Mission: slice-f-multi-context-extensibility-01KRX5C8
> Closes: FR-008, FR-009, FR-010, FR-011 | Companions: org-drg-schema.md
> Data model: ../data-model.md §4
CharterScope is the runtime resolver for "which charter applies to this filesystem path" in optional monorepo configurations. Single-project repositories behave identically to today.
Input Contract
Operator-facing surface — .kittify/config.yaml (optional)
For monorepos that want per-package charter scoping:
# pydantic_model: charter.scope.CharterScopeConfig
# expect: valid
charter_scopes:
  - root: packages/auth
    name: auth
  - root: packages/web
    name: web
Single-project repositories OMIT this key entirely. The CharterScope resolver defaults to repo-root (FR-011, NFR-001).
API surface — charter/scope.py
from pathlib import Path
from charter.scope import CharterScope
# Default (single-project) — behaviour byte-identical to today
scope = CharterScope.default(repo_root)
# Resolve from a feature directory (monorepo-aware)
scope = CharterScope.resolve(repo_root, feature_dir)
API surface — charter/context.py
from charter.context import build_charter_context
# Single-project (no scope passed) — byte-identical to today's call site
result = build_charter_context(repo_root, action="implement")
# Monorepo (scope passed explicitly)
scope = CharterScope.resolve(repo_root, feature_dir)
result = build_charter_context(repo_root, action="implement", scope=scope)
When scope=None (the default), build_charter_context internally constructs CharterScope.default(repo_root). No behaviour change for the 23 existing governance-contract fixtures (NFR-001 binding).
Output Contract
Resolution algorithm
CharterScope.resolve(repo_root, feature_dir):
1. Read .kittify/config.yaml's optional charter_scopes list. If absent, return CharterScope.default(repo_root).
2. Compute the absolute path of feature_dir.
3. For each configured scope, compute the absolute path of repo_root / scope.root.
4. Find the configured scope whose root is the nearest enclosing ancestor of feature_dir. Tie-breaking: deepest match wins.
5. If no scope encloses feature_dir, raise CharterScopeNotFound.
6. If two scopes have incompatible nesting depths (e.g. packages/auth and packages/auth/inner both configured, and feature_dir is inside packages/auth/inner/sub), raise CharterScopeConflict naming both paths.
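Steps 1 to 5 can be sketched with pathlib alone. The code below is an illustration under stated assumptions, not the charter/scope.py implementation: `ScopeSketch` and `resolve_scope` are hypothetical names, the configured scopes are passed in as plain dicts instead of being read from .kittify/config.yaml, and the step 6 conflict check is omitted because its exact criterion is defined by the contract rather than by this sketch.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


class CharterScopeNotFound(Exception):
    """Stand-in for the contract's exception of the same name."""


@dataclass(frozen=True)
class ScopeSketch:
    """Hypothetical value object mirroring the returned CharterScope fields."""

    root: Path
    name: Optional[str]
    config_source: str


def resolve_scope(
    repo_root: Path, feature_dir: Path, configured: Optional[list[dict]]
) -> ScopeSketch:
    repo_root = repo_root.resolve()
    if configured is None:  # step 1: key absent, fall back to the repo root
        return ScopeSketch(repo_root, None, "repo_root_default")
    target = feature_dir.resolve()  # step 2
    matches = []
    for entry in configured:
        root = (repo_root / entry["root"]).resolve()  # step 3
        if target == root or root in target.parents:  # step 4: root encloses target
            matches.append(ScopeSketch(root, entry.get("name"), "monorepo_config"))
    if not matches:  # step 5
        raise CharterScopeNotFound(f"No charter scope encloses {target}")
    # Deepest enclosing root wins, i.e. the one with the most path components.
    return max(matches, key=lambda scope: len(scope.root.parts))
```

For the sample configuration above, a feature directory under packages/auth resolves to the scope named auth, while a directory outside both packages raises CharterScopeNotFound.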
Returned CharterScope fields
| Field | Default-case value | Monorepo-case value |
|---|---|---|
| root | repo_root (absolute) | repo_root / scope.root (absolute) |
| name | None | The configured name string |
| config_source | "repo_root_default" | "monorepo_config" |
Threading into build_charter_context
When a non-default scope is active, build_charter_context reads the charter from scope.root / .kittify/charter/charter.md and threads the scope name into the rendered prompt's provenance metadata. Catalog-miss warnings include the scope name in their extra= dict (see catalog-miss-cli-visibility.md §scope field).
Failure modes
| Trigger | Exception | Operator message |
|---|---|---|
| charter_scopes: configured but feature_dir is not under any scope's root | CharterScopeNotFound | "No charter scope encloses <feature_dir>. Configured scopes: <list>. Either run from inside one of the configured scopes or add an entry to .kittify/config.yaml." |
| Two nested configured scopes claim the same feature_dir ambiguously | CharterScopeConflict | "Charter scope configuration is malformed: <path_a> and <path_b> both claim <feature_dir>. Reorganise the configuration so each path belongs to exactly one scope." (Scenario 2 exception path) |
| Configured root does not exist on disk | CharterScopeConflict | "Charter scope <name> configured at <root> does not exist. Remove the entry or create the directory." |
| The configured scope's root does not contain .kittify/charter/ | CharterScopeNotFound | "Charter scope <name> at <root> does not contain .kittify/charter/. Run spec-kitty charter scaffold inside that directory or remove the scope entry." |
Backward compatibility guarantee
- NFR-001 binding: repositories without charter_scopes: configured behave identically to today. The 23 test_wp_prompt_governance_contract.py fixtures pass unchanged.
- build_charter_context(repo_root, action=...) (no scope= keyword) constructs CharterScope.default(repo_root) internally; output is byte-identical to today.
- CharterScope.default(repo_root) is the only constructor used internally by Mission A / Mission B test fixtures and CLI flows; no migration is required for any historical mission.
Sample malformed configuration — round-trip frontmatter
# pydantic_model: charter.scope.CharterScopeConfig
# expect: invalid
charter_scopes:
  - root: packages/auth
    # name field missing? Not required (Optional); this is still valid.
  # The invalid case below is empty root:
  - root: ""
Empty root is rejected by the validator.
ATDD anchors
- tests/integration/test_monorepo_charter_scope.py (Scenario 2 happy + exception paths; AC-3)
- tests/charter/test_charter_scope.py (unit; default + resolve + conflict + not-found)
- tests/specify_cli/next/test_wp_prompt_governance_contract.py (regression; 23/23 unchanged; NFR-001)
contract-round-trip-frontmatter.md
Contract — Contract Round-Trip Frontmatter
> Mission: slice-f-multi-context-extensibility-01KRX5C8
> Closes: FR-140, FR-141 | Companions: ratchet-baseline-format.md, org-drg-schema.md, workflow-sequence-schema.md
The contract round-trip backstop closes Process Gap 1 at the architectural-test level. Today, Step 3.5 of the runtime-review skill (the Contract Round-Trip Check) is a human-only checklist item — a reviewer who skips it is not challenged. This contract turns that checklist into a CI gate.
The mechanism is YAML codeblock frontmatter on every example in kitty-specs/<mission>/contracts/*.md. The frontmatter declares the Pydantic model the codeblock should parse against AND the expected outcome (valid or invalid). A walker (tests/contract/test_example_round_trip.py) exercises every tagged codeblock and asserts the outcome matches.
Input Contract
Frontmatter convention on YAML codeblocks
Every YAML codeblock in kitty-specs/<mission>/contracts/*.md that documents a parseable contract example MUST be preceded by a frontmatter comment of the shape:
# pydantic_model: <module.dotted.path.ClassName>
# expect: valid | invalid
Example:
# pydantic_model: charter.drg.OrgDRGFragment
# expect: valid
pack_name: acme-compliance
source_kind: local_path
...
Recognised frontmatter keys
| Key | Type | Required | Purpose |
|---|---|---|---|
| pydantic_model | str (dotted import path) | yes | The Pydantic model to instantiate via model_validate(yaml.safe_load(...)). MUST be importable from the running test process |
| expect | Literal["valid", "invalid"] | yes | The expected outcome. valid ⇒ model_validate MUST succeed; invalid ⇒ MUST raise pydantic.ValidationError |
| expect_message | str (substring match) | no | When expect: invalid, optionally pin a substring that MUST appear in the raised exception's message |
Codeblocks NOT subject to round-trip
YAML codeblocks WITHOUT the pydantic_model: frontmatter line are skipped — they are documentation prose or shape sketches, not contract examples. The walker counts them in a skipped: summary but does not fail.
Legacy contract allowlist (FR-141)
Contracts from missions predating this convention live under an allowlist tracked in tests/architectural/_baselines.yaml:
test_example_round_trip:
  legacy_contract_allowlist: <N>
Files in this allowlist warn rather than fail when their codeblocks lack frontmatter or when an example's expect: claim cannot be verified. The allowlist participates in the FR-110 baseline — it shrinks over time as legacy missions backfill frontmatter (or get tickets opened to do so).
Output Contract
Walker behaviour — tests/contract/test_example_round_trip.py
from pathlib import Path
import importlib
import re
import pydantic
import pytest
import yaml
FRONTMATTER_RE = re.compile(r"^# pydantic_model: (?P<model>[\w\.]+)\s*\n# expect: (?P<expect>valid|invalid)", re.MULTILINE)
def _discover_examples():
    """Walk kitty-specs/*/contracts/*.md and yield (file, model, expect, payload)."""
    for contract_md in Path("kitty-specs").glob("*/contracts/*.md"):
        text = contract_md.read_text()
        # Extract every fenced ```yaml ...``` block; for each, look at the first two non-blank lines for frontmatter
        for codeblock in _iter_yaml_codeblocks(text):
            frontmatter = FRONTMATTER_RE.search(codeblock)
            if not frontmatter:
                continue
            model_path = frontmatter.group("model")
            expect = frontmatter.group("expect")
            payload = _strip_frontmatter(codeblock)
            yield contract_md, model_path, expect, payload
@pytest.mark.parametrize("contract_md,model_path,expect,payload", list(_discover_examples()))
def test_contract_example_round_trip(contract_md, model_path, expect, payload):
    # Split "pkg.module.ClassName" into the module path and the class name.
    module_name, _, class_name = model_path.rpartition(".")
    module = importlib.import_module(module_name)
    model = getattr(module, class_name)
    parsed = yaml.safe_load(payload)
    if expect == "valid":
        model.model_validate(parsed)  # MUST succeed
    else:
        with pytest.raises(pydantic.ValidationError):
            model.model_validate(parsed)
Failure shape
When an expect: valid codeblock fails to parse:
> FAIL: kitty-specs/<mission>/contracts/<file>.md (codeblock #N) declared pydantic_model: <Model>, expect: valid but model_validate raised: <exception text>.
When an expect: invalid codeblock parses cleanly:
> FAIL: kitty-specs/<mission>/contracts/<file>.md (codeblock #N) declared pydantic_model: <Model>, expect: invalid but model_validate succeeded.
When a pydantic_model: references a non-importable model:
> FAIL: kitty-specs/<mission>/contracts/<file>.md (codeblock #N) declared pydantic_model: <bad.path.Model> but the module is not importable: <ImportError text>.
Legacy allowlist behaviour
For files in the legacy allowlist, FAIL conditions become WARN conditions and the test passes — but the legacy file's path is reported in a pytest warning so the operator sees the unwound work.
Failure modes
| Trigger | Reporter | Operator message |
|---|---|---|
| A new contract's expect: valid example doesn't actually parse | test_example_round_trip FAIL | "Contract <file> codeblock #N declares expect: valid but model_validate raised: <exc>. Fix the example OR the model" |
| A new contract's expect: invalid example DOES parse | test_example_round_trip FAIL | "Contract <file> codeblock #N declares expect: invalid but model_validate succeeded. Either the example was meant to be valid OR the model lost a validator" |
| A contract file in kitty-specs/<mission>/contracts/ has YAML codeblocks but none carry frontmatter | If the contract is post-Slice-F (not in legacy allowlist): test_example_round_trip FAIL | "Contract <file> has unfronted YAML codeblocks. Add # pydantic_model: and # expect: frontmatter or move the file to the legacy allowlist (_baselines.yaml:test_example_round_trip.legacy_contract_allowlist)" |
| A contract is in the legacy allowlist but no longer exists | test_ratchet_baselines FAIL with stale-allowlist message | "Stale legacy contract <file> in allowlist. Remove from _baselines.yaml" |
Backward compatibility guarantee
- Pre-Slice-F contract files (every contract under kitty-specs/<mission>/contracts/ predating this mission) participate via the legacy allowlist (FR-141). The allowlist is initially sized by WP03's discovery sweep (RR-7 mitigation).
- Slice F's own contracts (the 6 contracts in this directory) DOGFOOD the convention: every expect: valid and expect: invalid example above is exercised at WP03 acceptance.
- The walker does NOT crash on contracts with NO YAML codeblocks (e.g. prose-only contracts); they are simply skipped.
Example use of expect: invalid for negative testing
# pydantic_model: charter.drg.OrgDRGFragment
# expect: invalid
# expect_message: "unknown kind"
pack_name: acme-compliance
source_kind: local_path
source_ref: ../acme-org-doctrine
layer_index: 1
provenance_marker: org
nodes:
  - id: bogus
    kind: not-a-real-kind
    title: "Bogus"
edges: []
This codeblock asserts that the org-DRG schema correctly REJECTS unknown kinds (C-009 enforcement). The walker:
1. Imports charter.drg.OrgDRGFragment.
2. Parses the YAML payload.
3. Calls model.model_validate(payload).
4. Asserts that the call raises pydantic.ValidationError AND the error message contains "unknown kind".
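The outcome check in steps 3 and 4, including the optional expect_message substring, can be sketched independently of Pydantic. Everything named below is hypothetical: `ValidationFailure` stands in for pydantic.ValidationError so the example has no third-party imports, `toy_validate` stands in for `model.model_validate`, and `KNOWN_KINDS` is a placeholder set rather than the real 8-kind universe.

```python
from typing import Any, Callable, Optional


class ValidationFailure(Exception):
    """Stand-in for pydantic.ValidationError (assumption of this sketch)."""


KNOWN_KINDS = {"directives"}  # placeholder; the real universe has 8 kinds


def toy_validate(payload: dict) -> None:
    """Toy validator that rejects node kinds outside KNOWN_KINDS."""
    for node in payload.get("nodes", []):
        if node["kind"] not in KNOWN_KINDS:
            raise ValidationFailure(f"unknown kind: {node['kind']}")


def check_example(
    validate: Callable[[Any], Any],
    payload: Any,
    expect: str,
    expect_message: Optional[str] = None,
) -> Optional[str]:
    """Return None when the outcome matches `expect`, else a failure description."""
    try:
        validate(payload)
    except ValidationFailure as exc:
        if expect == "valid":
            return f"declared expect: valid but validation raised: {exc}"
        if expect_message is not None and expect_message not in str(exc):
            return f"expect_message {expect_message!r} not found in: {exc}"
        return None
    if expect == "invalid":
        return "declared expect: invalid but validation succeeded"
    return None
```

Returning a description instead of asserting directly would let a caller downgrade the result to a warning for files in the legacy allowlist.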
Charter pinning (optional, FR-303 derivative)
The frontmatter convention itself is documented in src/specify_cli/upgrade/migrations/README.md (per Q7 resolution) so new contributors authoring contracts see it before they author. The convention does NOT become a charter rule in this mission; only the ATDD-first discipline (C-011) and burn-down policy (C-004) are charter-pinned.
ATDD anchors
- tests/contract/test_example_round_trip.py (FR-140, FR-141; AC-10)
- All 6 Slice F contracts (this directory): each contains at least one expect: valid example, and contracts/org-drg-schema.md + contracts/workflow-sequence-schema.md each contain at least one expect: invalid example for negative testing
- tests/architectural/test_ratchet_baselines.py (the legacy-allowlist baseline participates per FR-141)
org-drg-schema.md
Contract — Organisation-Tier DRG Fragment Schema
> Mission: slice-f-multi-context-extensibility-01KRX5C8
> Closes: FR-001, FR-003, FR-004, FR-005 | Companions: charter-scope-resolution.md, catalog-miss-cli-visibility.md, contract-round-trip-frontmatter.md
> Data model: ../data-model.md §2, ../data-model.md §3
The organisation-tier DRG fragment is one configured layer of doctrine-reference-graph state between shipped (built-in) and project (.kittify/doctrine/graph.yaml) layers. Slice F adds this tier so organisations can ship proprietary governance artefacts without forking the shipped graph.
Input Contract
Operator-facing surface — .kittify/config.yaml
The operator configures one or more org packs:
organisation_packs:
  - name: acme-compliance
    source: local_path
    path: ../acme-org-doctrine
  - name: acme-engineering
    source: local_path
    path: ../acme-engineering-doctrine
This mission ships source: local_path only (NEW-1 resolution). url and package sources are reserved and produce NotImplementedError with a descriptive message that links to the follow-up tracker.
Pack-side layout
Each org pack on disk:
<pack-path>/
├── org-charter.yaml          # required (already supported by Mission B)
├── drg/
│   └── fragment.yaml         # NEW (this mission)
└── <kind>s/<id>.<kind>.yaml  # any artefacts the fragment references
fragment.yaml shape
# pydantic_model: charter.drg.OrgDRGFragment
# expect: valid
pack_name: acme-compliance
source_kind: local_path
source_ref: ../acme-org-doctrine
layer_index: 1
provenance_marker: org
nodes:
  - id: sox-controls
    kind: directives
    title: "SOX Control Framework"
    body_path: directives/sox-controls.directive.yaml
edges:
  - source: sox-controls
    target: caveman-comments
    relation: refines
The nodes and edges shapes mirror doctrine.drg.models.DRGNode and DRGEdge.
Invalid example — kind not in the 8-kind universe
# pydantic_model: charter.drg.OrgDRGFragment
# expect: invalid
pack_name: acme-compliance
source_kind: local_path
source_ref: ../acme-org-doctrine
layer_index: 1
provenance_marker: org
nodes:
  - id: foo
    kind: not-a-real-kind # ← C-009 violation
    title: "Bogus"
edges: []
Per C-009 the schema reuses Mission B's 8-kind plural-naming union semantics; unknown kinds raise pydantic.ValidationError. The FR-140 round-trip gate exercises this example via the frontmatter walker.
Output Contract
Loader output
charter.drg.load_org_drg(repo_root: Path) -> list[OrgDRGFragment] returns one fragment per configured pack in .kittify/config.yaml declaration order. Layer indices are assigned 1..N matching declaration order.
Merge output
charter.drg.merge_three_layers(shipped, org_fragments, project) -> DRGGraph produces a merged graph where every node and edge carries source: built-in | org:<pack_name> | project. The merge is order-stable (deterministic).
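The provenance labelling and the order-stability claim can be illustrated on bare node IDs. This is a deliberately reduced sketch, not `merge_three_layers` itself: `tag_sources` is a hypothetical name, nodes are plain strings, and conflict detection and duplicate handling are not modelled. The ordering (shipped, then org packs in declaration order, then project) is an assumption made for the example.

```python
from typing import Iterable


def tag_sources(
    shipped: Iterable[str],
    org_packs: Iterable[tuple[str, Iterable[str]]],
    project: Iterable[str],
) -> list[tuple[str, str]]:
    """Attach a source label to each node ID, preserving input order."""
    tagged = [(node_id, "built-in") for node_id in shipped]
    # Packs are visited in declaration order, matching layer indices 1..N.
    for pack_name, node_ids in org_packs:
        tagged.extend((node_id, f"org:{pack_name}") for node_id in node_ids)
    tagged.extend((node_id, "project") for node_id in project)
    return tagged
```

Because the function only iterates its inputs in order, the same inputs always produce the same output, and passing no org packs reduces it to a two-layer result.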
Validator output
spec-kitty charter lint reports per-layer findings prefixed with the source name:
[built-in] OK — 87 nodes, 142 edges
[org:acme-compliance] OK — 12 nodes, 4 edges
[project] warn: directive 'caveman-comments' selected but no body found
Conflict output
OrgDRGConflictError carries one or more OrgDRGConflict records. The error message is operator-actionable and lists:
- Each conflict's kind, target_id, and conflicting_layers.
- The resolution_applied.
- The remediation hint (e.g. "remove the override from the org pack, OR escalate the shipped invariant change via a spec-kitty governance proposal").
Failure modes
| Trigger | Exception | Operator message |
|---|---|---|
| Configured local_path does not exist on disk | OrgPackMissingError | "Org pack <name> configured at <path> not found. Either fetch the pack (spec-kitty doctrine fetch --pack <name>) or remove the entry from .kittify/config.yaml." (FR-004) |
| Pack's drg/fragment.yaml declares a node kind not in the 8-kind universe | pydantic.ValidationError | Per pydantic; names the field and the rejected value (C-009 binding) |
| Pack's fragment overrides a shipped invariant edge or node | OrgDRGConflictError with resolution_applied="hard_fail" | "Org pack <name> attempts to override shipped invariant <target_id>. Layer rule (Mission A): shipped invariants cannot be overridden by org packs. Remove the override or escalate the change upstream." (FR-005) |
| Pack's fragment imports across the layer boundary (e.g. body_path references src/specify_cli/...) | OrgDRGConflictError with kind="layer_rule_violation" | "Org pack <name> violates the layer rule. Doctrine artefacts cannot reference src/specify_cli/." (FR-005, C-001) |
| Pack uses source_kind: url or package in this mission | NotImplementedError | "Org pack source url/package is not yet implemented (tracker: <ticket>). Use source: local_path for now." (NEW-1) |
Backward compatibility guarantee
- Repositories with no organisation_packs: configuration behave identically to today (NFR-001 binding). load_org_drg(repo_root) returns []; merge_three_layers collapses to the existing two-layer merge.
- The 23 test_wp_prompt_governance_contract.py fixtures pass unchanged because none of them configure an org pack.
- The shipped-DRG layer's contents are not altered by this mission; only the loader is extended to thread an org layer between shipped and project.
ATDD anchors
- tests/integration/test_three_layer_drg_end_to_end.py (Scenario 1 happy path; AC-1)
- tests/charter/test_org_drg_loader.py (unit; loader + merge + provenance)
- tests/integration/test_org_pack_missing_path_hard_fails.py (FR-004)
- tests/charter/test_org_drg_cannot_override_shipped_invariants.py (FR-005)
- tests/contract/test_example_round_trip.py (the expect: valid and expect: invalid examples above, exercised by the FR-140 walker)
ratchet-baseline-format.md
Contract — Ratchet Baseline Format
> Mission: slice-f-multi-context-extensibility-01KRX5C8
> Closes: FR-110, FR-111, FR-112, FR-141 | Companions: contract-round-trip-frontmatter.md
> Data model: ../data-model.md §7
The ratchet baseline file (tests/architectural/_baselines.yaml) is the canonical statement of allowlist-size intent for every mutable architectural ratchet in the test suite. The companion meta-test (test_ratchet_baselines.py) FAILS on growth above baseline and WARNS (informationally) on shrinkage so the baseline gets edited downward in the same PR.
Input Contract
File location
tests/architectural/_baselines.yaml. Checked in. Per C-004 (binding) the file participates in the project charter's burn-down policy (FR-303(a)).
Schema
Real-values example (the gate's positive case)
This block is what an actual _baselines.yaml looks like at HEAD after WP01 + WP03 land. The contract round-trip gate parses this block and asserts it validates cleanly against BaselinesFile.
# pydantic_model: tests.architectural.test_ratchet_baselines.BaselinesFile
# expect: valid
test_no_dead_modules:
  category_1_auto_discovered: 70
  category_2_schema_generators: 4
  category_3_external_entry_points: 4
  category_4_compat_shims: 8
  category_5_slot_holders: 3
  category_6_internal_runtime: 3
  category_7_grandfathered: 7 # MUST SHRINK -- C-006 target 0 by 4.0
test_migration_chain_integrity:
  known_line_jumps: 4
  known_patch_skips: 9 # NEW with Gap-A8
test_runtime_charter_doctrine_boundary:
  baseline_allowlist: 0
test_auth_transport_singleton:
  allowed_direct_httpx_files: 2 # NO CHANGE this mission (C-005)
test_compat_shims:
  pure_shim_files: 3 # MUST SHRINK -- C-006 target 0 by 4.0
test_example_round_trip:
  legacy_contract_allowlist: 151 # WP03 discovery sweep; shrinks as legacy contracts gain frontmatter
test_all_declarations_required:
  charter_without_all: 0 # all migrated at WP02
  kernel_without_all: 0 # all migrated at WP02
Schema-shape example with placeholders (the gate's negative case)
This block illustrates the schema shape for documentation purposes with placeholder values where the live count is operator-dependent. The contract round-trip gate parses this block and asserts it correctly FAILS validation (placeholders are strings, not the int the schema demands). This pattern doubles as a regression-test for the validator: if the validator silently coerced strings to ints (a bug class), this block would parse cleanly and the gate would flag it.
# pydantic_model: tests.architectural.test_ratchet_baselines.BaselinesFile
# expect: invalid
# expect_message: Input should be a valid integer
test_no_dead_modules:
  category_1_auto_discovered: <count-at-HEAD>
  category_2_schema_generators: <count-at-HEAD>
  category_3_external_entry_points: <count-at-HEAD>
  category_4_compat_shims: <count-at-HEAD>
  category_5_slot_holders: <count-at-HEAD>
  category_6_internal_runtime: <count-at-HEAD>
  category_7_grandfathered: <count-at-HEAD>
test_migration_chain_integrity:
  known_line_jumps: <count-at-HEAD>
  known_patch_skips: <count-at-HEAD>
test_runtime_charter_doctrine_boundary:
  baseline_allowlist: <count-at-HEAD>
test_auth_transport_singleton:
  allowed_direct_httpx_files: <count-at-HEAD>
test_compat_shims:
  pure_shim_files: <count-at-HEAD>
test_example_round_trip:
  legacy_contract_allowlist: <discovered-at-WP03>
test_all_declarations_required:
  charter_without_all: 0
  kernel_without_all: 0
(Initial values for this mission: WP01 implementer reads HEAD-of-mission-branch sizes and pins them. The Cat-7 value 7 reflects the FR-113 same-PR shrinkage from 10.)
Per-test, per-category interpretation
- A test with a single mutable allowlist (e.g. test_runtime_charter_doctrine_boundary) maps to a single integer baseline.
- A test with multiple categorised allowlists (e.g. test_no_dead_modules after FR-112 refactor) maps to a sub-dict of per-category integers.
Per-PR baseline edit policy
Per C-004 binding:
- Growing a baseline requires a one-line YAML diff in the same PR.
- The PR description MUST include a rationale: line naming why growth is justified.
- A PR that grows Cat-7 specifically MUST link a follow-up tracker ticket per FR-303's burn-down policy.
- Shrinkage requires no ceremony.
Output Contract
Meta-test API — tests/architectural/test_ratchet_baselines.py
The meta-test imports each gated module dynamically and inspects the size of its frozenset / dict. Pseudocode:
import importlib
import warnings
from pathlib import Path
import pytest
import yaml
BASELINES = yaml.safe_load((Path(__file__).parent / "_baselines.yaml").read_text())
def test_no_dead_modules_per_category():
    from tests.architectural.test_no_dead_modules import (
        _CATEGORY_1, _CATEGORY_2, ..., _CATEGORY_7,
    )
    for name, current in {
        "category_1_auto_discovered": len(_CATEGORY_1),
        ...
        "category_7_grandfathered": len(_CATEGORY_7),
    }.items():
        baseline = BASELINES["test_no_dead_modules"][name]
        if current > baseline:
            # Growth above the pinned baseline is a hard failure.
            pytest.fail(
                f"Allowlist `{name}` grew from baseline {baseline} to {current}. "
                f"Either remove the new entry OR edit _baselines.yaml from {baseline} "
                f"to {current} with a justification comment in the PR."
            )
        elif current < baseline:
            # Shrinkage is informational: prompt the author to lower the baseline.
            warnings.warn(
                f"Allowlist `{name}` shrunk from baseline {baseline} to {current}. "
                f"Edit _baselines.yaml in this PR to lock in the shrinkage."
            )
Failure shape
When the meta-test FAILS (growth above baseline), the message names:
1. The test file / category (e.g. test_no_dead_modules.category_7_grandfathered).
2. The baseline value.
3. The current value.
4. The remediation hint (remove the entry OR edit the baseline with justification).
When the meta-test WARNS (shrinkage below baseline), the message names:
1. The category.
2. The new lower bound (current value).
3. The instruction to edit _baselines.yaml in this PR.
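The fail-on-growth, warn-on-shrinkage rule reduces to a three-way comparison. The function below is a runnable sketch of that decision only; `compare_to_baseline` is a hypothetical name, it returns the outcome instead of calling pytest.fail or warnings.warn, and the message wording is copied from the pseudocode above.

```python
from typing import Optional


def compare_to_baseline(
    name: str, current: int, baseline: int
) -> tuple[str, Optional[str]]:
    """Return (outcome, message): 'fail' on growth, 'warn' on shrinkage, else 'ok'."""
    if current > baseline:
        return (
            "fail",
            f"Allowlist `{name}` grew from baseline {baseline} to {current}. "
            f"Either remove the new entry OR edit _baselines.yaml from {baseline} "
            f"to {current} with a justification comment in the PR.",
        )
    if current < baseline:
        return (
            "warn",
            f"Allowlist `{name}` shrunk from baseline {baseline} to {current}. "
            f"Edit _baselines.yaml in this PR to lock in the shrinkage.",
        )
    return "ok", None
```

Separating the decision from the reporting keeps the comparison easy to unit-test for every gated allowlist, not just the dead-modules categories.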
Per-test invariants
| Test | Invariant |
|---|---|
| test_no_dead_modules | Per-category baselines; Cat-7 MUST shrink each major release per C-006 (≥ 2 entries per major; target 0 by 4.0) |
| test_migration_chain_integrity.known_line_jumps | Cap on intentional line jumps; grows only with HiC-approved exception |
| test_runtime_charter_doctrine_boundary.baseline_allowlist | Cap at 2 documented exceptions per Mission B C-004; this mission keeps it at 0 |
| test_auth_transport_singleton.allowed_direct_httpx_files | NO CHANGE this mission per C-005 |
| test_compat_shims.pure_shim_files | Per C-006, target 0 by 4.0 |
| test_example_round_trip.legacy_contract_allowlist | Per FR-141; shrinks over time as legacy contracts gain frontmatter |
| test_all_declarations_required.{charter,kernel}_without_all | After WP02 lands, MUST stay at 0 |
Failure modes
| Trigger | Reporter | Operator message |
|---|---|---|
| New entry added to _ALLOWLIST without baseline edit | test_ratchet_baselines.py::test_no_dead_modules_per_category FAIL | "Allowlist category_<n>_<name> grew from baseline <b> to <c>. Either remove the new entry OR edit _baselines.yaml from <b> to <c> with a justification comment in the PR." |
| Entry removed from _ALLOWLIST without baseline edit | test_ratchet_baselines.py WARN | "Allowlist category_<n>_<name> shrunk from baseline <b> to <c>. Edit _baselines.yaml in this PR to lock in the shrinkage." |
| _baselines.yaml missing | test_ratchet_baselines.py collection error | "tests/architectural/_baselines.yaml is missing. This file is a binding ratchet artefact per C-004; restore it from the previous commit OR run the WP01 bootstrap script." |
| _baselines.yaml malformed (e.g. wrong key for test_no_dead_modules) | pydantic.ValidationError via the BaselinesFile model | Per pydantic; names the offending key |
Backward compatibility guarantee
- _baselines.yaml is additive: introducing it does not modify any existing allowlist nor any pre-existing test's pass/fail behaviour.
- The meta-test test_ratchet_baselines.py is a separate test file; existing CI does not regress because the gated tests themselves are unchanged in semantics (only categorised in FR-112's refactor).
- For tests not yet listed in _baselines.yaml (e.g. a future gate added in a follow-up mission), the meta-test treats them as "no baseline pinned" and skips them with a pytest.skip reason; they participate in the burn-down model only when an explicit baseline entry is added.
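
The four outcomes described in this contract (pass, fail on growth, warn on shrinkage, skip when no baseline is pinned) can be summarised as one pure decision function. The following is a minimal sketch, not the meta-test itself: the names RatchetOutcome and classify are hypothetical, and the real test reports through pytest and warnings.warn rather than returning a value.

```python
from enum import Enum
from typing import Optional


class RatchetOutcome(Enum):
    """Hypothetical outcome labels; not names used by the real meta-test."""
    OK = "ok"      # allowlist size equals its baseline
    FAIL = "fail"  # allowlist grew above its baseline
    WARN = "warn"  # allowlist shrank; the baseline should be lowered in the same PR
    SKIP = "skip"  # no baseline pinned for this gate yet


def classify(baseline: Optional[int], current: int) -> RatchetOutcome:
    """Compare a current allowlist size against its pinned baseline."""
    if baseline is None:
        # Gate not yet listed in _baselines.yaml: outside the burn-down model.
        return RatchetOutcome.SKIP
    if current > baseline:
        return RatchetOutcome.FAIL
    if current < baseline:
        return RatchetOutcome.WARN
    return RatchetOutcome.OK
```

Keeping the comparison free of side effects would make the skip path easy to unit-test separately from the YAML loading.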
Charter pinning (per FR-303(a) / C-004 binding)
After Mission C ships, the project charter (.kittify/charter/charter.md) carries:
> Burn-down policy. Per-category allowlist sizes recorded in tests/architectural/_baselines.yaml may shrink between releases but never grow except via documented exception (rationale comment + tracker reference). Cat-7 (grandfathered orphans) MUST shrink by ≥2 entries per major release; target = 0 by 4.0. Pure-shim files in compat/_adapters/ MUST go to 0 by 4.0.
ATDD anchors
- tests/architectural/test_ratchet_baselines.py (the meta-test itself; FR-111)
- tests/architectural/test_no_dead_modules.py (refactored per FR-112; the gate the meta-test guards)
- tests/contract/test_example_round_trip.py (legacy-contract-allowlist shrinkage participates per FR-141)
workflow-sequence-schema.md
Contract — Workflow Sequence Schema
> Mission: slice-f-multi-context-extensibility-01KRX5C8 > Closes: FR-012, FR-013, FR-014, FR-015 | Companions: contract-round-trip-frontmatter.md > Data model: ../data-model.md §5, ../data-model.md §6
Workflow sequence is the first-class artifact representing a mission's action sequence (specify → plan → tasks → implement → review → merge today). Slice F promotes it from a hardcoded constant to a doctrine-side YAML.
Input Contract
Operator-facing surface — workflow YAML
A workflow lives at src/doctrine/workflows/<workflow_id>.workflow.yaml. The default (byte-stable with today) is shipped at src/doctrine/workflows/software-dev-default.workflow.yaml.
Example: default workflow (byte-stable, C-008)
# pydantic_model: specify_cli.next._internal_runtime.workflow_schema.WorkflowSequence
# expect: valid
workflow_id: software-dev-default
description: |
  The default Spec Kitty action sequence: specify -> plan -> tasks ->
  implement -> review -> merge. Byte-stable with the pre-Slice-F
  hardcoded behaviour (Mission C C-008).
version: 1
initial: specify
actions:
  - action_name: specify
    next: [plan]
    description: "Author the mission specification."
  - action_name: plan
    next: [tasks]
    description: "Author the implementation plan."
  - action_name: tasks
    next: [implement]
    description: "Decompose the plan into work packages."
  - action_name: implement
    next: [review]
    description: "Execute the next ready work package."
  - action_name: review
    next: [merge]
    description: "Review the implemented work package."
  - action_name: merge
    next: []
    description: "Merge approved work packages."
    terminal: true
Example: team workflow with an extra design-review step (fixture for AC-4)
# pydantic_model: specify_cli.next._internal_runtime.workflow_schema.WorkflowSequence
# expect: valid
workflow_id: our-team-design-first
description: "Team workflow with mandatory design-review between plan and tasks."
version: 1
initial: specify
actions:
  - action_name: specify
    next: [plan]
    description: "Author the mission specification."
  - action_name: plan
    next: [design-review]
    description: "Author the implementation plan."
  - action_name: design-review
    next: [tasks]
    description: "Design lead reviews the plan before task decomposition."
  - action_name: tasks
    next: [implement]
    description: "Decompose into work packages."
  - action_name: implement
    next: [review]
    description: "Execute the next ready work package."
  - action_name: review
    next: [merge]
    description: "Review the implemented work package."
  - action_name: merge
    next: []
    description: "Merge."
    terminal: true
Invalid example: dangling next reference
# pydantic_model: specify_cli.next._internal_runtime.workflow_schema.WorkflowSequence
# expect: invalid
workflow_id: bogus
description: "Has a dangling next reference."
version: 1
initial: specify
actions:
  - action_name: specify
    next: [does-not-exist]  # ← FR-012 invariant violated
    description: "Bogus."
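
The two FR-012 invariants named in this contract (every next entry must name a declared action, and a terminal action must not declare a non-empty next) can be checked over the parsed YAML. The sketch below is illustrative only: it works on plain dicts rather than the real WorkflowSequence pydantic model, and find_schema_violations is a hypothetical helper name, not part of the documented API.

```python
from typing import Any, Dict, List


def find_schema_violations(workflow: Dict[str, Any]) -> List[str]:
    """Return human-readable violations; an empty list means both checks passed."""
    actions = workflow.get("actions", [])
    known = {action["action_name"] for action in actions}
    problems: List[str] = []
    for action in actions:
        name = action["action_name"]
        # Invariant 1: every next target must be a declared action.
        for target in action.get("next", []):
            if target not in known:
                problems.append(
                    f"action {name!r} has dangling next reference {target!r}"
                )
        # Invariant 2: a terminal action must have an empty next list.
        if action.get("terminal", False) and action.get("next"):
            problems.append(
                f"action {name!r} is terminal but declares a non-empty next"
            )
    return problems
```

Applied to the bogus workflow above, this would report the single dangling reference from specify to does-not-exist.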
Operator-facing surface — meta.json.workflow_id
A mission's meta.json carries an optional workflow_id field (FR-013):
{
  "mission_id": "01KRX5C8MQRGG7WJW1YK53DTF5",
  "mission_slug": "...",
  "workflow_id": "our-team-design-first"
}
Absent or null ⇒ resolves to software-dev-default (NEW-2: permanent default).
Output Contract
Registry API
from specify_cli.next._internal_runtime.workflow_registry import get_workflow
workflow = get_workflow("software-dev-default")
# workflow.actions[0].action_name == "specify"
# workflow.actions[0].next == ["plan"]
Resolution at spec-kitty next time
1. Read kitty-specs/<mission>/meta.json and extract workflow_id (default None).
2. If workflow_id is None, resolve to software-dev-default.
3. Look up the workflow via get_workflow(workflow_id).
4. Determine the current action from the mission's lane state (existing logic, unchanged).
5. Compute the next action from the workflow's action graph (the next list of the current action's ActionStep; first element for linear interpretation).
6. Return the next action via the existing NextDecision / prompt-builder pipeline.
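
Steps 1 to 3 and 5 can be sketched as follows. This is a simplified illustration under stated assumptions, not the runtime code: WORKFLOWS is a hypothetical in-memory stand-in for the registry behind get_workflow, the current action is passed in directly instead of being derived from lane state (step 4), and the result is a plain string rather than a NextDecision (step 6).

```python
import json
from typing import Dict, List, Optional

DEFAULT_WORKFLOW_ID = "software-dev-default"

# Hypothetical stand-in for the registry: workflow id -> {action name: next list}.
WORKFLOWS: Dict[str, Dict[str, List[str]]] = {
    "software-dev-default": {
        "specify": ["plan"],
        "plan": ["tasks"],
        "tasks": ["implement"],
        "implement": ["review"],
        "review": ["merge"],
        "merge": [],
    },
}


class UnknownWorkflowError(LookupError):
    """Raised when meta.json names a workflow id that is not registered."""


def resolve_workflow_id(meta_json_text: str) -> str:
    """Steps 1-2: an absent or null workflow_id resolves to the default."""
    workflow_id = json.loads(meta_json_text).get("workflow_id")
    return DEFAULT_WORKFLOW_ID if workflow_id is None else workflow_id


def next_action(meta_json_text: str, current_action: str) -> Optional[str]:
    """Steps 3 and 5: look up the workflow, then take the first successor."""
    workflow_id = resolve_workflow_id(meta_json_text)
    if workflow_id not in WORKFLOWS:
        # Hard failure with the available ids listed; no silent fallback.
        available = ", ".join(sorted(WORKFLOWS))
        raise UnknownWorkflowError(
            f"Unknown workflow id {workflow_id}. Available workflows: {available}."
        )
    successors = WORKFLOWS[workflow_id][current_action]
    # Linear interpretation: first element of next; None for a terminal action.
    return successors[0] if successors else None
```

Note that only a missing or null workflow_id falls back to the default; an unrecognised id raises instead of falling back.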
Byte-stability guarantee (C-008, FR-014)
For every (current_action, next_action) transition the pre-Slice-F hardcoded sequence produced, the software-dev-default workflow MUST produce the same transition. Pinned by tests/specify_cli/next/test_workflow_software_dev_default_is_byte_stable.py:
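from specify_cli.next._internal_runtime.workflow_registry import get_workflow

# Every transition the pre-Slice-F hardcoded sequence produced, in order.
HARDCODED_TRANSITIONS = [
    ("specify", "plan"),
    ("plan", "tasks"),
    ("tasks", "implement"),
    ("implement", "review"),
    ("review", "merge"),
]


def test_default_workflow_byte_stable():
    workflow = get_workflow("software-dev-default")
    # Linear interpretation: first successor only; terminal actions (empty next) are skipped.
    transitions = [(a.action_name, a.next[0]) for a in workflow.actions if a.next]
    assert transitions == HARDCODED_TRANSITIONS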
Failure modes
| Trigger | Exception | Operator message |
|---|---|---|
| meta.json.workflow_id references an unknown id | UnknownWorkflowError | "Unknown workflow id <id>. Available workflows: <list-from-src/doctrine/workflows/>." (FR-015: no silent fallback) |
| Workflow YAML has a dangling next reference | pydantic.ValidationError at load time | Per pydantic; names the offending action and the dangling reference |
| Workflow YAML has a cyclic action graph | WorkflowCycleError | "Workflow <id> action graph contains a cycle: <cycle>. Action graphs MUST be acyclic." |
| software-dev-default.workflow.yaml is missing from src/doctrine/workflows/ | WorkflowRegistryError | "Default workflow software-dev-default is missing from src/doctrine/workflows/. This is a Spec Kitty installation defect; reinstall the package." (regression-catching only; shipped files are expected present) |
| Workflow YAML declares terminal: true AND a non-empty next | pydantic.ValidationError | Per FR-012 invariant; names the action |
| version is not 1 | WorkflowVersionUnsupportedError | "Workflow <id> declares schema version <n>; this Spec Kitty release supports version 1 only." (forward-compat hook for future schema extensions, RR-9) |
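
The acyclicity requirement behind WorkflowCycleError can be checked with a depth-first search over the action graph. The sketch below shows one way to find a cycle to report. It is an assumption about how such a check could look, not the shipped implementation; find_cycle and the dict-of-lists graph shape are hypothetical.

```python
from typing import Dict, List, Optional


def find_cycle(graph: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one cycle as a list of action names (first == last), or None."""
    unvisited, in_progress, done = 0, 1, 2
    state = {node: unvisited for node in graph}
    path: List[str] = []  # actions on the current DFS path, in visit order

    def visit(node: str) -> Optional[List[str]]:
        state[node] = in_progress
        path.append(node)
        for target in graph[node]:
            if target not in graph:
                continue  # dangling references are a separate failure mode
            if state[target] == in_progress:
                # Back edge: the cycle runs from target's position to here.
                return path[path.index(target):] + [target]
            if state[target] == unvisited:
                found = visit(target)
                if found:
                    return found
        path.pop()
        state[node] = done
        return None

    for node in graph:
        if state[node] == unvisited:
            found = visit(node)
            if found:
                return found
    return None
```

For a graph where review points back to plan, the returned list would be plan, review, plan, which could be rendered into the `<cycle>` placeholder of the operator message.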
Backward compatibility guarantee
- Pre-Slice-F missions (every mission with meta.json lacking workflow_id) continue to work unchanged; they implicitly resolve to software-dev-default (NEW-2 binding).
- No silent semantic drift between the hardcoded path and the default-via-YAML path (C-008 byte-stability).
- The Mission B test surfaces in tests/specify_cli/next/test_wp_prompt_governance_contract.py pass unchanged (NFR-001).
- No retroactive migration is run on historical missions (C-002 forward-only).
ATDD anchors
- tests/specify_cli/next/test_workflow_registry.py (unit; load + cache + unknown-id hard-fail; FR-012, FR-015)
- tests/specify_cli/next/test_workflow_software_dev_default_is_byte_stable.py (C-008, FR-014)
- tests/integration/test_workflow_sequence_runtime.py (Scenario 3; AC-4; uses the our-team-design-first fixture)
- tests/contract/test_example_round_trip.py (exercises the expect: valid and expect: invalid examples above via FR-140)