Contracts
host-surface-inventory.md
Contract: Host-Surface Inventory Matrix
Mission: phase-4-closeout-host-surfaces-and-trail-01KPWA5X
Covers: FR-001 (inventory), FR-002 (parity), FR-006 (guidance-gap coverage), NFR-003 (100% surface coverage)
Living location: kitty-specs/phase-4-closeout-host-surfaces-and-trail-01KPWA5X/artifacts/host-surface-inventory.md
Promoted location: docs/host-surface-parity.md (created by WP05 on mission close)
Purpose
One authoritative matrix, per mission, that lists every supported host surface and its parity status against the advise/ask/do governance-injection contract. Drives WP02–WP04 scope during Tranche A; becomes the durable operator-facing reference at closeout.
Row schema
Columns in this exact order:
| # | Column | Type | Allowed values | Notes |
|---|---|---|---|---|
| 1 | surface_key | str | One of the keys in AGENT_DIRS from src/specify_cli/upgrade/migrations/m_0_9_1_complete_lane_migration.py | Canonical host key. |
| 2 | directory | str | e.g. .claude/commands/, .agents/skills/spec-kitty.advise/ | Relative to repo root. |
| 3 | kind | str | slash_command or agent_skill | Derived from the surface category. |
| 4 | has_advise_guidance | bool | yes / no | Does the surface teach when to call advise/ask/do? |
| 5 | has_governance_injection | bool | yes / no | Does the surface teach how to inject governance_context_text? |
| 6 | has_completion_guidance | bool | yes / no | Does the surface teach how to call profile-invocation complete? |
| 7 | guidance_style | str | inline or pointer | inline hosts the content; pointer links to the canonical skill pack. |
| 8 | parity_status | str | at_parity, partial, or missing | Composite judgement from columns 4–7. |
| 9 | notes | str | free text | Captures per-surface rationale — especially required for pointer style per FR-006. |
Canonical host surface list
The 15 supported surfaces are:
Slash-command surfaces (13)
| surface_key | directory | subdir |
|---|---|---|
| claude | .claude/ | commands/ |
| copilot | .github/ | prompts/ |
| gemini | .gemini/ | commands/ |
| cursor | .cursor/ | commands/ |
| qwen | .qwen/ | commands/ |
| opencode | .opencode/ | command/ |
| windsurf | .windsurf/ | workflows/ |
| kilocode | .kilocode/ | workflows/ |
| auggie | .augment/ | commands/ |
| roo | .roo/ | commands/ |
| q | .amazonq/ | prompts/ |
| kiro | .kiro/ | prompts/ |
| agent | .agent/ | workflows/ |
Agent Skills surfaces (2)
| surface_key | directory |
|---|---|
| codex | .agents/skills/ (reads from tree directly) |
| vibe | .agents/skills/ (via .vibe/config.toml::skill_paths) |
Parity judgement rubric
| parity_status | Condition |
|---|---|
| at_parity | All three guidance flags yes and the surface matches the content shape shipped in .agents/skills/spec-kitty.advise/SKILL.md and src/doctrine/skills/spec-kitty-runtime-next/SKILL.md. |
| partial | Some guidance flags yes, some no, or guidance is present but not aligned with the reference content shape. |
| missing | All three guidance flags no. |
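The rubric above reduces to a small decision function. The sketch below is illustrative only: `derive_parity_status` and its `matches_reference_shape` argument are hypothetical names, and how a surface is judged against the reference content shape is assumed to be decided elsewhere (for example, by a reviewer).

```python
def derive_parity_status(
    has_advise_guidance: bool,
    has_governance_injection: bool,
    has_completion_guidance: bool,
    matches_reference_shape: bool,  # hypothetical input; the contract does not say how it is computed
) -> str:
    """Map the three guidance flags plus the shape check to a parity_status value."""
    flags = (has_advise_guidance, has_governance_injection, has_completion_guidance)
    if not any(flags):
        # All three guidance flags are "no".
        return "missing"
    if all(flags) and matches_reference_shape:
        return "at_parity"
    # Mixed flags, or guidance present but not aligned with the reference shape.
    return "partial"
```

Under these assumptions, a surface with all three flags set but a divergent content shape is classified as partial rather than at_parity.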
Example rows (worked)
| surface_key | directory | kind | has_advise_guidance | has_governance_injection | has_completion_guidance | guidance_style | parity_status | notes |
|---|---|---|---|---|---|---|---|---|
| claude | .claude/commands/ | slash_command | yes | yes | yes | inline | at_parity | Priority slice shipped in 3.2.0a5. Uses src/doctrine/skills/spec-kitty-runtime-next/SKILL.md content. |
| copilot | .github/prompts/ | slash_command | no | no | no | pointer | missing | No governance-injection block present. WP04 will add a pointer to the canonical skill pack; .github/prompts/ is read into Copilot context via workspace-level prompts only. |
Promotion rules (WP05)
When Tranche A closes:
1. Copy the matrix verbatim to docs/host-surface-parity.md.
2. Add a short preamble to the promoted doc explaining what the matrix is and how it is kept up to date (any new host integration MUST add a row).
3. Link the promoted doc from:
   - docs/trail-model.md (under the "Host surfaces that teach the trail" subsection).
   - README governance section (one-line pointer).
Acceptance
- Mechanical: tests/specify_cli/docs/test_host_surface_inventory.py parses docs/host-surface-parity.md after WP05, asserts every surface_key from AGENT_DIRS has exactly one row, and asserts each row's parity_status is one of the three allowed values. Covers FR-001 / NFR-003.
- Textual: Each row with parity_status != "at_parity" must have a non-empty notes column explaining the gap and the remediation plan. Covers FR-006.
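The two acceptance checks above can be sketched as a parser plus a validator. This is a minimal illustration, not the contents of the real test module: `parse_matrix_rows` and `check_matrix` are hypothetical helpers, and the expected surface keys are passed in as a plain set rather than imported from AGENT_DIRS.

```python
from collections import Counter

COLUMNS = [
    "surface_key", "directory", "kind", "has_advise_guidance",
    "has_governance_injection", "has_completion_guidance",
    "guidance_style", "parity_status", "notes",
]
ALLOWED_STATUSES = {"at_parity", "partial", "missing"}


def parse_matrix_rows(markdown: str) -> list:
    """Collect pipe-delimited data rows that have exactly the nine schema columns."""
    rows = []
    for line in markdown.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        inner = line[1:-1] if line.endswith("|") else line[1:]
        cells = [cell.strip() for cell in inner.split("|")]
        if len(cells) != len(COLUMNS):
            continue
        if cells[0] == "surface_key" or all(set(c) <= set("-:") for c in cells):
            continue  # header row or separator row
        rows.append(dict(zip(COLUMNS, cells)))
    return rows


def check_matrix(markdown: str, expected_keys: set) -> list:
    """Return a list of human-readable problems; an empty list means the matrix passes."""
    rows = parse_matrix_rows(markdown)
    counts = Counter(row["surface_key"] for row in rows)
    problems = []
    for key in sorted(expected_keys):
        if counts[key] != 1:
            problems.append(f"{key}: expected exactly one row, found {counts[key]}")
    for row in rows:
        status = row["parity_status"]
        if status not in ALLOWED_STATUSES:
            problems.append(f"{row['surface_key']}: invalid parity_status {status!r}")
        elif status != "at_parity" and not row["notes"]:
            problems.append(f"{row['surface_key']}: notes required when not at_parity")
    return problems
```

Returning a list of problems rather than raising on the first one is a design choice made for this sketch, so a single run reports every gap in the matrix.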
profile-invocation-complete.md
Contract: spec-kitty profile-invocation complete (extended)
Mission: phase-4-closeout-host-surfaces-and-trail-01KPWA5X
Covers: FR-007 (correlation), FR-009 (mode enforcement at promotion), FR-012 (local-first invariant)
Target file: src/specify_cli/cli/commands/profile_invocation.py (CLI) + src/specify_cli/invocation/executor.py::complete_invocation (runtime)
Purpose
Close an open profile invocation record and, optionally:
1. Promote its output to a Tier 2 evidence artifact (existing behaviour; now mode-gated).
2. Attach one or more correlation links to the invocation JSONL: artifact references and/or a single commit SHA (new).
Command Shape
spec-kitty profile-invocation complete \
  --invocation-id <id> \
  [--outcome done|failed|abandoned] \
  [--evidence <path>] \
  [--artifact <path>]... \
  [--commit <sha>] \
  [--json]
Flag semantics
| Flag | Cardinality | Type | Required | Default | Description |
|---|---|---|---|---|---|
| --invocation-id | 1 | str (ULID) | yes | - | Target invocation file. |
| --outcome | 1 | enum done / failed / abandoned | no | None | Recorded on the completed event. |
| --evidence | 1 | path-or-string | no | None | Tier 2 promotion trigger. Mode-gated: rejected for advisory / query. |
| --artifact | ≥ 0 | path-or-string | no | - | Repeatable. Each value appends one artifact_link event. |
| --commit | 0 or 1 | str (SHA) | no | None | Singular. Appends one commit_link event. |
| --json | flag | bool | no | false | Existing behaviour: JSON output. |
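To make the cardinalities in the table concrete, here is a minimal sketch using the standard library's argparse. It is an assumption-laden illustration: the real command in profile_invocation.py may be built on a different CLI framework, and `build_parser` is a hypothetical name.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Illustrative parser mirroring the flag table; not the project's actual CLI wiring."""
    parser = argparse.ArgumentParser(prog="spec-kitty profile-invocation complete")
    parser.add_argument("--invocation-id", required=True)
    parser.add_argument("--outcome", choices=["done", "failed", "abandoned"], default=None)
    parser.add_argument("--evidence", default=None)
    # Repeatable: each occurrence appends one value, preserving input order.
    parser.add_argument("--artifact", action="append", default=[])
    # argparse keeps the last value if --commit is repeated; enforcing
    # "at most one" would need an extra check on top of this sketch.
    parser.add_argument("--commit", default=None)
    parser.add_argument("--json", action="store_true")
    return parser
```

For example, parsing `--invocation-id 01ABC --artifact a --artifact b` with this sketch yields `artifact == ["a", "b"]` and leaves `commit`, `evidence`, and `outcome` as `None`.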
Execution order
On a successful invocation with all flags present, the runtime performs these steps in order:
1. Read started event (first line of the invocation JSONL).
2. Mode enforcement (FR-009): if --evidence is set and the derived mode_of_work ∈ {advisory, query}, raise InvalidModeForEvidenceError and exit without appending any new lines.
3. Append completed event (existing write_completed).
4. If --evidence is set (and mode check passed): resolve + normalise ref, then call existing promote_to_evidence().
5. For each --artifact <path>: resolve + normalise, append artifact_link event.
6. If --commit <sha>: append commit_link event.
7. Submit completed to SaaS propagator (existing behaviour). Correlation events are also submitted to the propagator, but projection is subject to POLICY_TABLE (see projection-policy.md).
Why steps 5 and 6 run after step 3
The append-only invariant holds in both orderings, but closing the invocation first lets readers distinguish completed from the correlation tail. Running correlation writes after the completed event also means a filesystem failure on a correlation write leaves the invocation in a fully-closed state — a retry of complete with the remaining correlation flags is a clean append, not a recovery.
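The ordering described above can be sketched with an in-memory stand-in for the JSONL appends. Everything here is a simplified assumption: `plan_complete_events` is a hypothetical function, the `ref`, `sha`, and `invocation_id` keys are illustrative, and evidence promotion and SaaS submission are left out.

```python
class InvalidModeForEvidenceError(Exception):
    """Stand-in for the error class named in this contract."""


def plan_complete_events(started, *, outcome=None, evidence=None, artifacts=(), commit=None):
    """Return the events that would be appended, in order, or raise before producing any."""
    mode = started.get("mode_of_work")  # None for pre-mission records, which skips enforcement
    if evidence is not None and mode in {"advisory", "query"}:
        # Mode enforcement happens before the first append, so nothing is written.
        raise InvalidModeForEvidenceError(
            f"Cannot promote evidence on invocation {started.get('invocation_id')}: "
            f"mode is {mode}; Tier 2 evidence is only allowed on "
            "task_execution or mission_step invocations."
        )
    appended = [{"event": "completed", "outcome": outcome}]
    # The real runtime would promote evidence here, after the completed event.
    for ref in artifacts:
        appended.append({"event": "artifact_link", "ref": ref})
    if commit is not None:
        appended.append({"event": "commit_link", "sha": commit})
    return appended
```

Because the completed event is always first in the returned list, a failure while writing any later entry would leave an already-closed invocation, which matches the rationale given above.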
Error shapes
| Condition | Error class | Exit code | Message guidance |
|---|---|---|---|
| --invocation-id points to missing file | InvocationError | 2 | "Invocation record not found: <id>." |
| Already-completed invocation | AlreadyClosedError | 2 | Existing. |
| --evidence supplied on advisory or query | InvalidModeForEvidenceError (new) | 2 | "Cannot promote evidence on invocation <id>: mode is <mode>; Tier 2 evidence is only allowed on task_execution or mission_step invocations." |
| Filesystem write fails for any event | InvocationWriteError | 2 | Existing. |
Ref normalisation (for --artifact and --evidence)
Per data-model.md §6:
- Resolve the input path. If resolution succeeds and the resolved path is under repo_root, persist the repo-relative string.
- Otherwise persist the absolute resolved path.
- If resolution raises (malformed input), persist the input verbatim. This is the same fallback the existing executor.complete_invocation already uses for unreadable evidence refs.
This normalisation rule applies uniformly to both --artifact and --evidence, so correlation refs and evidence refs read the same.
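A minimal sketch of the three-branch rule, using pathlib. The function name `normalise_ref` and the exact exception types caught are assumptions; the authoritative definition is in data-model.md §6.

```python
from pathlib import Path


def normalise_ref(raw: str, repo_root: Path) -> str:
    """Return a repo-relative string, an absolute path, or the raw input as a last resort."""
    try:
        resolved = Path(raw).resolve()
        root = repo_root.resolve()
    except (OSError, ValueError, RuntimeError):
        # Assumed set of "malformed input" failures; persist the input verbatim.
        return raw
    try:
        # Under the checkout: persist the repo-relative form with forward slashes.
        return resolved.relative_to(root).as_posix()
    except ValueError:
        # Outside the checkout: persist the absolute resolved path.
        return str(resolved)
```

Note that a relative input is resolved against the current working directory, so an input such as ./build/out.log only normalises to build/out.log when the command runs from the repository root.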
JSON output (when --json is set)
Response shape extended to report appended correlation events:
{
  "result": "success",
  "invocation_id": "01HXYZ...",
  "outcome": "done",
  "evidence_ref": "kitty-specs/042-foo/evidence/snapshot.md",
  "artifact_links": ["kitty-specs/042-foo/tasks/WP03.md", "build/report.html"],
  "commit_link": "a1b2c3d4e5f67890..."
}
- evidence_ref is present only when --evidence was supplied and promotion succeeded.
- artifact_links is an array (possibly empty). Order matches the input order of --artifact flags.
- commit_link is a string or null.
- Existing fields (result, invocation_id, outcome) are unchanged.
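The field rules above can be sketched as a small builder. `build_complete_response` is a hypothetical helper written for illustration; it is not taken from the CLI source.

```python
import json


def build_complete_response(invocation_id, outcome, artifact_links, commit_link=None, evidence_ref=None):
    """Assemble the success payload following the field rules in this section."""
    response = {
        "result": "success",
        "invocation_id": invocation_id,
        "outcome": outcome,
    }
    if evidence_ref is not None:
        # Present only when evidence was supplied and promotion succeeded.
        response["evidence_ref"] = evidence_ref
    response["artifact_links"] = list(artifact_links)  # always an array, possibly empty
    response["commit_link"] = commit_link  # string, or None which serialises to null
    return response


if __name__ == "__main__":
    print(json.dumps(build_complete_response("01HXYZ...", "done", []), indent=2))
```

With no evidence, artifacts, or commit, the sketch emits an empty artifact_links array and a null commit_link, and omits evidence_ref entirely.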
On error, the existing JSON error envelope shape is used.
Invariants
- No existing JSONL line is mutated. All new events are append-only (C-004).
- Tier 1 unconditional. The completed event is written before any correlation append; a filesystem failure on correlation does not leave the invocation half-closed. Tier 1 must keep working with SaaS disabled, unauthenticated, or network-unreachable (C-002, FR-012).
- No new top-level command. This is a flag extension on an existing subcommand (C-008).
- Backwards-compatible. Omitting --artifact and --commit yields identical behaviour to 3.2.0a5. Pre-mission invocations (no mode_of_work field on started) accept --evidence: a None mode skips enforcement.
Acceptance tests (selected)
These tests live in tests/specify_cli/invocation/test_correlation.py (new) and tests/specify_cli/invocation/test_invocation_e2e.py (extended):
1. complete with two --artifact values appends two artifact_link events in order.
2. complete with --commit abc123 appends exactly one commit_link event.
3. complete with --artifact kitty-specs/042/spec.md (under checkout) persists "kitty-specs/042/spec.md".
4. complete with --artifact /tmp/report.log (outside checkout) persists "/tmp/report.log".
5. complete with --artifact ./build/out.log persists "build/out.log" (repo-relative).
6. complete on an advisory invocation with --evidence path raises InvalidModeForEvidenceError; no completed event, no evidence artifact, no correlation events written.
7. complete on a task_execution invocation with --evidence path --artifact other --commit sha writes (in order) completed → artifact_link → commit_link, and promotes evidence to .kittify/evidence/<id>/.
8. complete with sync disabled writes all events locally; .kittify/events/propagation-errors.jsonl remains empty.
9. Second call to complete for the same invocation_id raises AlreadyClosedError before any mutation (existing behaviour preserved).
projection-policy.md
Contract: src/specify_cli/invocation/projection_policy.py
Mission: phase-4-closeout-host-surfaces-and-trail-01KPWA5X
Covers: FR-010 (SaaS read-model policy), FR-012 (local-first invariant), NFR-005 (typed + mypy-strict), NFR-007 (propagation-error quiet invariant)
Purpose
Define the single source of truth for how each (mode_of_work, event) pair projects to the SaaS timeline. Consumed by src/specify_cli/invocation/propagator.py::_propagate_one after the existing sync-gate. Documented for operators in docs/trail-model.md.
Public API
from dataclasses import dataclass
from enum import Enum

# Re-exported from projection_policy.py for caller convenience.
from specify_cli.invocation.modes import ModeOfWork

class EventKind(str, Enum):
    STARTED = "started"
    COMPLETED = "completed"
    ARTIFACT_LINK = "artifact_link"
    COMMIT_LINK = "commit_link"

@dataclass(frozen=True)
class ProjectionRule:
    project: bool
    include_request_text: bool
    include_evidence_ref: bool

POLICY_TABLE: dict[tuple[ModeOfWork, EventKind], ProjectionRule]

def resolve_projection(mode: ModeOfWork | None, event: EventKind) -> ProjectionRule: ...
The module exports exactly these symbols. Nothing else is public.
Table authority
POLICY_TABLE is the complete enumeration of 4 modes × 4 events = 16 entries. See data-model.md §5 for the full table.
Golden-path invariants (contract tests):
| Row | Rule |
|---|---|
| (TASK_EXECUTION, STARTED) | ProjectionRule(True, True, False) |
| (TASK_EXECUTION, COMPLETED) | ProjectionRule(True, True, True) |
| (MISSION_STEP, STARTED) | ProjectionRule(True, True, False) |
| (MISSION_STEP, COMPLETED) | ProjectionRule(True, True, True) |
Any change to these four rows requires an ADR and a migration note — they govern existing dashboard behaviour for active missions.
Expected zero-projection rows:
| Row | Rule |
|---|---|
| any (QUERY, *) | project=False |
| (ADVISORY, ARTIFACT_LINK) | project=False |
| (ADVISORY, COMMIT_LINK) | project=False |
Query invocations and advisory correlation events produce no SaaS timeline traffic.
resolve_projection() semantics
def resolve_projection(mode: ModeOfWork | None, event: EventKind) -> ProjectionRule:
    effective_mode = mode if mode is not None else ModeOfWork.TASK_EXECUTION
    return POLICY_TABLE.get((effective_mode, event), _DEFAULT_RULE)
- mode is None → treated as TASK_EXECUTION. Rationale: pre-mission records projected under the old unconditional behaviour, which was effectively (TASK_EXECUTION, event) projection. Preserving that behaviour on upgrade means no dashboard regression.
- Unknown (mode, event) pair → falls back to _DEFAULT_RULE = ProjectionRule(True, True, True). In practice the table is exhaustive for the enums as defined, so this path is only hit if a future EventKind value is added without the policy table being extended.
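The lookup semantics can be exercised with a self-contained sketch. Several things here are assumptions: `ModeOfWork` is a local stand-in for the enum in specify_cli.invocation.modes (its string values are guessed from the mode names used in this contract), `PARTIAL_TABLE` contains only the rows this contract states explicitly rather than the full 16 entries from data-model.md §5, and the include flags on the zero-projection rows are set to False for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ModeOfWork(str, Enum):  # stand-in; string values are assumptions
    ADVISORY = "advisory"
    QUERY = "query"
    TASK_EXECUTION = "task_execution"
    MISSION_STEP = "mission_step"


class EventKind(str, Enum):
    STARTED = "started"
    COMPLETED = "completed"
    ARTIFACT_LINK = "artifact_link"
    COMMIT_LINK = "commit_link"


@dataclass(frozen=True)
class ProjectionRule:
    project: bool
    include_request_text: bool
    include_evidence_ref: bool


_DEFAULT_RULE = ProjectionRule(True, True, True)
_NO_PROJECTION = ProjectionRule(False, False, False)  # include flags assumed

# Only the rows named in this contract; rows not listed here fall through to
# _DEFAULT_RULE in this sketch, which may differ from the real table.
PARTIAL_TABLE = {
    (ModeOfWork.TASK_EXECUTION, EventKind.STARTED): ProjectionRule(True, True, False),
    (ModeOfWork.TASK_EXECUTION, EventKind.COMPLETED): ProjectionRule(True, True, True),
    (ModeOfWork.MISSION_STEP, EventKind.STARTED): ProjectionRule(True, True, False),
    (ModeOfWork.MISSION_STEP, EventKind.COMPLETED): ProjectionRule(True, True, True),
    **{(ModeOfWork.QUERY, kind): _NO_PROJECTION for kind in EventKind},
    (ModeOfWork.ADVISORY, EventKind.ARTIFACT_LINK): _NO_PROJECTION,
    (ModeOfWork.ADVISORY, EventKind.COMMIT_LINK): _NO_PROJECTION,
}


def resolve_projection(mode: Optional[ModeOfWork], event: EventKind) -> ProjectionRule:
    """None is treated as TASK_EXECUTION; unknown pairs fall back to the default rule."""
    effective_mode = mode if mode is not None else ModeOfWork.TASK_EXECUTION
    return PARTIAL_TABLE.get((effective_mode, event), _DEFAULT_RULE)
```

In this sketch a pre-mission record (mode None) resolves to the same rule as (TASK_EXECUTION, STARTED), and every QUERY event resolves to a rule with project set to False.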
Consumer contract — _propagate_one
Modified sequence (diff from 3.2.0a5):
def _propagate_one(record: InvocationRecord_or_EventDict, repo_root: Path) -> None:
    # 1. Existing sync-gate — unchanged. Short-circuit on sync disabled.
    routing = resolve_checkout_sync_routing(repo_root)
    if routing is not None and not routing.effective_sync_enabled:
        return
    # 2. Existing auth/client lookup — unchanged.
    client = _get_saas_client(repo_root)
    if client is None:
        return
    # 3. NEW: consult policy.
    mode = _extract_mode(record)  # returns ModeOfWork | None
    event = _extract_event(record)  # returns EventKind
    rule = resolve_projection(mode, event)
    if not rule.project:
        return
    # 4. Existing envelope build — now respects include_request_text / include_evidence_ref.
    ...
The helper _extract_mode reads record.mode_of_work for InvocationRecord inputs and the mode_of_work key from the stored started event when the input is a correlation event dict. _extract_event maps the event field to EventKind.
Envelope field gating
When rule.include_request_text is False, the envelope for started events omits the request_text key entirely. Omission, not empty string — this keeps dashboard consumers able to distinguish "advisory started" (no body) from "task_execution started with empty request" (present body, empty).
Same rule for rule.include_evidence_ref: omit on False, include on True (only relevant for completed events where evidence_ref is present).
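The omit-versus-empty distinction can be illustrated with a short sketch. The envelope layout is an assumption: `build_envelope` and the base keys `event` and `invocation_id` are hypothetical, and only the gating of `request_text` and `evidence_ref` reflects this contract.

```python
def build_envelope(event: dict, include_request_text: bool, include_evidence_ref: bool) -> dict:
    """Copy gated fields only when the rule allows them; otherwise omit the key entirely."""
    envelope = {
        "event": event["event"],
        "invocation_id": event["invocation_id"],
    }
    if include_request_text and "request_text" in event:
        # An empty string is preserved as-is: present body, empty.
        envelope["request_text"] = event["request_text"]
    if include_evidence_ref and event.get("evidence_ref") is not None:
        envelope["evidence_ref"] = event["evidence_ref"]
    return envelope
```

A consumer can then test `"request_text" in envelope` to tell a gated-out body apart from a body that was sent but happened to be empty.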
Invariants
- Policy evaluation is read-only. It never writes to disk, never raises an uncaught exception, and never blocks.
- Policy evaluation runs after the sync-gate. If sync is disabled for the checkout (effective_sync_enabled=False), resolve_projection is never called; the short-circuit still owns the gate (C-002, FR-012).
- Policy evaluation runs after authentication. Unauthenticated checkouts never reach policy evaluation.
- Type exhaustiveness. mypy --strict passes; ModeOfWork and EventKind are closed sets.
- Frozen dataclass. ProjectionRule is frozen so policy rows are shareable and immutable.
- No operator-configurable override. This mission does not introduce YAML or env-var overrides (C-009, D4).
Acceptance tests (selected)
These tests live in tests/specify_cli/invocation/test_projection_policy.py (new) and extensions to test_invocation_e2e.py:
1. Every (ModeOfWork, EventKind) pair has an entry in POLICY_TABLE.
2. Golden-path rows match the expected rules (see table above).
3. resolve_projection(None, EventKind.STARTED) returns the TASK_EXECUTION / STARTED rule (null-tolerance).
4. _propagate_one with a mocked connected WebSocket client:
   - Drops advisory artifact_link events (no send_event call).
   - Emits task_execution started events (one send_event call, envelope includes request_text).
   - Emits task_execution completed events (envelope includes evidence_ref when supplied).
5. With sync disabled (effective_sync_enabled=False), resolve_projection is not called and no envelope is built; verified by mock assertion.
6. With user unauthenticated, resolve_projection is not called; verified by mock assertion.
7. propagation-errors.jsonl remains empty across 100 invocations under all four modes with sync disabled (NFR-007, SC-008).