Charter Contract Cleanup Tranche 1
Mission ID: 01KQATS4HJS0HY02VF05FCP57T Mission Slug: charter-contract-cleanup-tranche-1-01KQATS4 Mission Type: software-dev Target Branch: main Created: 2026-04-28
Purpose
TLDR
Make the landed Charter product gate truthful and CI-green via JSON synthesis contract fixes, golden-path E2E prompt-file hardening, and CI mypy hygiene — one product-repo PR.
Context
Release PR #864 shipped the Charter epic, but current main still has Charter CLI --json envelope contract gaps (warning leakage on stdout, missing result / adapter / written_artifacts fields, dry-run paths derived from placeholder PROJECT_000), a weak prompt-file assertion in the golden-path E2E (#844), and a CI Quality e2e-cross-cutting failure caused by missing mypy in that job.
This mission produces one PR against Priivacy-ai/spec-kitty:main from branch fix/charter-827-contract-cleanup that closes those contract truths, hardens the E2E, restores green CI, and verifies (does not rewrite) the retrospective and path-traversal regression guards.
External end-to-end coverage (#827 Tranche 3), plain-English acceptance scenarios, end-user documentation (#828), and Phase 7 schema/provenance hardening (#469) are explicit later tranches and are out of scope for this mission.
Domain Language
| Canonical term | Meaning | Avoid these synonyms |
|---|---|---|
| Charter synthesize CLI | The spec-kitty charter synthesize command surface, including --json and --adapter fixture flags | "synth", "charter compile" |
| Synthesis envelope | The JSON object emitted to stdout when --json is set | "synth output", "synth result blob" |
| Written artifacts | The list of doctrine files actually staged or promoted by a synthesize run | "outputs", "files", "produced docs" |
| Issued action | A kind=step envelope produced by the runtime that requires the agent to perform a prompted action | "step", "task envelope" |
| Blocked decision | A runtime envelope that halts a step with a non-empty human-readable reason | "halt", "stopped step" |
| Prompt file | The on-disk prompt artifact that an issued action points to | "prompt", "instruction file" |
| Dry-run staged path | The artifact path that the synthesize pipeline reports during --dry-run and that must match the path used by a real run | "preview path", "planned path" |
| Regression guard | A previously-fixed bug whose fix must remain intact (verify, do not rewrite) | "old fix", "legacy patch" |
User Scenarios & Testing
Primary Scenario: Charter synthesize emits a strict, contracted JSON envelope
Actor: A developer or automation that runs spec-kitty charter synthesize --adapter fixture --json against a fresh project to seed Charter doctrine.
Trigger: The user invokes the synthesize command with --json set.
Happy path:
1. The command runs synthesis (including evidence gathering). 2. The command emits exactly one JSON document to stdout — nothing else, even if evidence warnings were produced. 3. The JSON document contains, at minimum: result: "success", adapter with id and version, written_artifacts (the real list of staged/promoted files), and warnings (possibly empty). 4. The exit code is 0. 5. The caller can pass the stdout into json.loads(...) without preprocessing.
Acceptance:
- AC-001 The full stdout parses as a single JSON document via
json.loads(stdout)even when evidence warnings exist. - AC-002 The success envelope contains the four contracted fields named above; missing any one of them is a hard failure.
- AC-003
written_artifactslists files that the run actually staged or promoted; no entry is fabricated fromkind:slugselectors.
Scenario: Dry-run paths match real-run paths
Actor: A developer running spec-kitty charter synthesize --dry-run --json to preview what would be written.
Trigger: The user wants to know the exact target paths before allowing a write.
Happy path:
1. The dry-run reports a list of staged target paths. 2. A subsequent non-dry-run with the same inputs writes to exactly those paths. 3. No user-visible string contains the placeholder PROJECT_000.
Acceptance:
- AC-004 For an input where provenance yields a non-placeholder artifact ID (for example
PROJECT_001), the dry-run envelope reports the same target path the non-dry-run would write. - AC-005 No
--jsonenvelope, log message, or error message visible to the user containsPROJECT_000.
Scenario: Golden-path E2E rejects empty prompt files
Actor: CI / a developer running the Charter epic golden-path E2E.
Trigger: tests/e2e/test_charter_epic_golden_path.py runs to completion against the real synthesizer.
Happy path:
1. The E2E inspects every issued-action (kind=step) envelope. 2. For each issued action, the test asserts a prompt_file (or documented public equivalent) field exists, is not None, is not empty, and resolves to a real prompt file on disk. 3. For every blocked decision, the test asserts a non-empty reason is present and does not require a prompt file. 4. If any issued action lacks a resolvable prompt file, the test fails with a clear message.
Acceptance:
- AC-006 An issued action whose prompt path is missing,
None, empty, or unresolvable causes the E2E to fail, not silently pass. - AC-007 A blocked decision with a non-empty
reasonand no prompt file passes that check.
Scenario: CI e2e-cross-cutting is honest about mypy
Actor: CI on a PR.
Trigger: The e2e-cross-cutting job runs tests/cross_cutting/test_mypy_strict_mission_step_contracts.py.
Happy path:
1. Either mypy is available in the job and the test runs and passes against the real strict configuration; or 2. mypy is intentionally absent from the job, in which case the test reports a clear, actionable skip or failure that names the missing extra and how to install it. Silent passes are not acceptable.
Acceptance:
- AC-008 The
e2e-cross-cuttingjob exits cleanly when run against current main with this mission's fixes applied; nopython -m mypy: not foundfailures remain unaddressed.
Scenario: Regression guards verified intact
Actor: The implementing agent during this mission.
Trigger: The mission begins.
Happy path:
1. Before touching any product code, the agent runs the regression-guard test suite listed in NFR-001 against current main. 2. All listed tests pass; no edits are made to those code paths. 3. If any regression guard fails, that finding is treated as a new in-scope item and a fix is added to this mission.
Acceptance:
- AC-009
tests/next/test_retrospective_terminus_wiring.py,tests/retrospective/test_gate_decision.py, andtests/doctrine_synthesizer/test_path_traversal_rejection.pyall pass on the feature branch with no rewrites of their underlying production code unless a regression was actually observed.
Edge Cases & Exception Paths
- A synthesize run that produces zero written artifacts must still emit a contracted envelope with
written_artifacts: [], not omit the field. - A synthesize run with multiple warnings must place every warning inside the JSON envelope's
warningsarray (or send all of them to stderr) — partial leakage is a contract failure. - A blocked decision that also carries a prompt path is acceptable; a blocked decision without a
reasonis not. - The
Protect Main Branchfailure observed on the release merge is treated as release-process hygiene; if confirmed not to be product-code-fixable, it is filed or updated as a separate GitHub issue rather than addressed in this PR. - Any command path that touches hosted auth, tracker, SaaS sync, or sync behavior must be invoked with
SPEC_KITTY_ENABLE_SAAS_SYNC=1set in the environment.
Functional Requirements
| ID | Requirement | Status |
|---|---|---|
| FR-001 | charter synthesize --json emits exactly one JSON document to stdout when --json is set, even when evidence warnings are produced. Warnings are placed inside the envelope (preferred) or routed to stderr; nothing non-JSON is written to stdout before, after, or interleaved with the envelope. | Pending |
| FR-002 | charter synthesize --adapter fixture --json success envelope includes the contracted fields result, adapter (containing id and version), written_artifacts, and warnings. Useful legacy fields (for example target_kind, target_slug, inputs_hash, adapter_id, adapter_version) may remain for compatibility, but the four contracted fields are mandatory. | Pending |
| FR-003 | written_artifacts is sourced from the actual staged-and-promoted artifact list returned by the write pipeline (typed staged-artifact entries or the real stage paths). It is never derived from kind:slug selectors or any other lossy reconstruction. For fresh-project seed mode, the already-known written file list is used. | Pending |
| FR-004 | The --dry-run --json envelope reports target paths identical to the paths a subsequent non-dry-run with the same inputs would write. Dry-run output is driven by typed staged-artifact entries, not from placeholder identifiers. | Pending |
| FR-005 | No user-visible CLI surface (stdout, stderr, JSON envelope, log line, error message) contains the placeholder string PROJECT_000. The placeholder remains an internal-only token. | Pending |
| FR-006 | The Charter epic golden-path E2E (tests/e2e/test_charter_epic_golden_path.py) asserts that every kind=step / issued-action envelope carries a prompt_file field (or documented public equivalent) with a value that is present, non-null, non-empty, and resolves to an existing prompt file on disk (test-project-relative path or absolute path that exists, or a documented shipped prompt artifact). | Pending |
| FR-007 | The same E2E permits a blocked decision to carry a non-empty reason instead of a prompt file. A blocked decision without a non-empty reason is a test failure. | Pending |
| FR-008 | tests/cross_cutting/test_mypy_strict_mission_step_contracts.py runs cleanly in the e2e-cross-cutting CI job: either mypy is available in that job environment and the test passes, or the test deterministically skips/fails with an actionable message that names the missing lint extra. The test no longer fails because python -m mypy is silently absent. | Pending |
| FR-009 | The retrospective and path-traversal regression guards remain intact: runtime_bridge.py gates retrospective emission before terminal completion (or uses a buffered emitter), blocked terminal completion does not emit MissionRunCompleted to the real sync emitter, retrospective/schema.py accepts uppercase/mixed-case contract IDs (for example DIRECTIVE_NEW_EXAMPLE, PROJECT_001) while still rejecting traversal, and tests/next/test_retrospective_terminus_wiring.py exercises the real opt-in runtime bridge with a recording emitter and asserts no completion event was emitted. Verification only — no rewrite unless a regression is observed. | Pending |
| FR-010 | The golden-path E2E continues to call charter synthesize --json against the real synthesizer (no hand-seeded .kittify/doctrine), _parse_first_json_object parses the full stdout via json.loads(...), and _run_next_and_assert_lifecycle hard-fails when the lifecycle log is absent. Verification only — no rewrite unless a regression is observed. | Pending |
| FR-011 | After product fixes land and CI is green, GitHub issue hygiene is applied: #844 is closed or commented with evidence only after FR-006 and FR-007 are merged; #827 receives a comment naming exactly what this tranche closed and what remains; #848 is updated if the mypy / uv-lock / environment issue is resolved or reclassified. #827 and #828 are not closed by this mission. | Pending |
| FR-012 | The mission produces exactly one PR against Priivacy-ai/spec-kitty:main from feature branch fix/charter-827-contract-cleanup, scoped to the spec-kitty repository only. | Pending |
| FR-013 | If the Protect Main Branch failure observed on the prior release merge is determined to be release-process hygiene rather than product-code-fixable, a GitHub issue is filed or updated to track it; otherwise it is fixed in this PR. The mission does not silently ignore the failure. | Pending |
Non-Functional Requirements
| ID | Requirement | Threshold | Status |
|---|---|---|---|
| NFR-001 | The required local test gate passes on the feature branch before PR open | All five commands listed in start-here.md §"Required Product-Repo Test Gate" exit 0: uv run pytest tests/e2e/test_charter_epic_golden_path.py -q; uv run pytest tests/agent/cli/commands/test_charter_synthesize_cli.py tests/integration/test_json_envelope_strict.py tests/integration/test_charter_synthesize_fresh.py -q; uv run pytest tests/next/test_retrospective_terminus_wiring.py tests/retrospective/test_gate_decision.py tests/doctrine_synthesizer/test_path_traversal_rejection.py -q; uv run pytest tests/cross_cutting/test_mypy_strict_mission_step_contracts.py -q; uv run ruff check src tests | Pending |
| NFR-002 | No regressions in CI checks that pass on current main | Every CI check that is green on origin/main at branch-cut time remains green on the PR head; net new failures are zero | Pending |
| NFR-003 | Test coverage for new code | Code added or modified in this mission carries automated test coverage at the repository's existing standard (90%+ line coverage for new/changed code, per the project policy summary) | Pending |
| NFR-004 | Strict type-check coverage for new and modified runtime code | uv run mypy --strict passes for all modified files in src/specify_cli/cli/commands/charter.py and src/charter/synthesizer/write_pipeline.py (if touched), and for src/specify_cli/mission_step_contracts/executor.py per the existing cross-cutting rule | Pending |
| NFR-005 | Hosted-surface command rule | 100% of commands in this mission's tests, fixtures, and examples that touch hosted auth, tracker, SaaS sync, or sync behavior are invoked with SPEC_KITTY_ENABLE_SAAS_SYNC=1 | Pending |
| NFR-006 | Single-PR scope discipline | The diff for this mission lives in exactly one PR; the PR touches files only inside the spec-kitty repository | Pending |
Constraints
| ID | Constraint | Status |
|---|---|---|
| C-001 | Single product-repo PR. Only the spec-kitty repository is modified in this mission. The sibling repos spec-kitty-end-to-end-testing and spec-kitty-plain-english-tests are not touched. | Active |
| C-002 | Out of scope: external E2E canaries (#827 Tranche 3 in spec-kitty-end-to-end-testing), plain-English acceptance scenarios (in spec-kitty-plain-english-tests), end-user documentation parity (#828), and Phase 7 schema versioning / provenance hardening (#469). These are explicitly later tranches. | Active |
| C-003 | Verify-do-not-rewrite. Items in start-here.md §"Already Fixed, But Must Stay Covered" are validated by running their tests; their underlying production code is not rewritten unless the verification step actually fails. | Active |
| C-004 | The Protect Main Branch failure on the release merge may be release-process hygiene rather than product-code-fixable. If so, it is captured as a GitHub issue rather than fixed inside this PR. | Active |
| C-005 | Machine rule: any test or command path that touches hosted auth, tracker, SaaS sync, or sync behavior runs with SPEC_KITTY_ENABLE_SAAS_SYNC=1 set. | Active |
| C-006 | Issue closure discipline. #827 and #828 are not closed by this mission. #844 is only closed after FR-006 and FR-007 land. #848 is updated only after the mypy/uv-lock/environment situation is resolved or reclassified. | Active |
| C-007 | Authoritative briefing source: /Users/robert/spec-kitty-dev/spec-kitty-20260428-193814-MFDsf5/start-here.md. Where this spec and start-here.md disagree on intent, start-here.md wins until the spec is updated through a documented decision. | Active |
Success Criteria
- SC-001 A caller can pipe
spec-kitty charter synthesize --jsonstdout straight into a JSON parser in 100% of fresh-project synthesize runs that previously emitted warnings, with zero parse errors caused by warning leakage. - SC-002 An automated assertion proves that every
--adapter fixture --jsonsuccess envelope containsresult,adapter,written_artifacts, andwarnings; that assertion fails the build if any field is missing. - SC-003 For every input where provenance yields a non-placeholder artifact ID, the dry-run reported target paths are byte-identical to the paths the non-dry-run writes; the placeholder
PROJECT_000appears in zero user-visible outputs. - SC-004 The Charter epic golden-path E2E fails in fewer than 30 seconds when an issued action lacks a resolvable prompt file, with a message that names the offending action and the missing/empty prompt path.
- SC-005 The
e2e-cross-cuttingCI job exits with status 0 on a PR branch built from currentmainplus this mission's changes, with nopython -m mypy: not founderrors recorded. - SC-006 The required local test gate (NFR-001) passes locally on the feature branch before PR submission, captured as terminal evidence in the PR description.
- SC-007 GitHub issues
#844,#827, and#848reflect post-mission status with evidence comments referencing this PR;#827and#828remain open. - SC-008 The PR merges into
Priivacy-ai/spec-kitty:mainwith all required CI checks green and zero net-new failing checks compared to currentmain.
Key Entities
- Synthesis envelope — the strict JSON object emitted by
spec-kitty charter synthesize --json. Containsresult,adapter,written_artifacts,warnings, and may carry legacy compatibility fields. Authored insrc/specify_cli/cli/commands/charter.py. - Staged artifact — a typed entry returned by the synthesizer's write pipeline (
src/charter/synthesizer/write_pipeline.py) describing the kind, identifier, and target path of a doctrine artifact that was staged or promoted. The envelope'swritten_artifactslist is derived from these entries. - Issued action envelope — a runtime-emitted envelope of
kind=stepinstructing the agent to perform a prompted action. Carries a prompt path field; consumed by the golden-path E2E. - Blocked decision envelope — a runtime-emitted envelope that halts a step. Must carry a non-empty
reason; may omit a prompt path. - Prompt file — an on-disk markdown/text artifact pointed at by an issued action envelope. May be test-project-relative, absolute, or a documented shipped prompt artifact path.
- Regression-guard test set —
tests/next/test_retrospective_terminus_wiring.py,tests/retrospective/test_gate_decision.py, andtests/doctrine_synthesizer/test_path_traversal_rejection.py. Verified, not rewritten.
Assumptions
- A1 The
_parse_first_json_objecthelper in the golden-path E2E currently usesjson.loads(stdout)over the full stdout (per start-here.md verification list). This mission preserves that, but FR-001 also requires the producer side to emit strict JSON regardless. - A2 The Charter synthesizer's write pipeline already returns a typed staged-artifact list (or can be extended to do so without changing public types). If the staged-artifact return shape needs to grow, edits to
src/charter/synthesizer/write_pipeline.pyare in scope. - A3 The golden-path E2E builds a real test project where prompt files have predictable, resolvable paths; the assertion in FR-006 can therefore distinguish "prompt path is wrong" from "test project setup is broken."
- A4 The
Protect Main Branchfailure described in start-here.md is not caused by code insrc/. If diagnosis proves otherwise, it is fixed in this PR (FR-013). - A5 The agent has authority to file or update GitHub issues under the rules in C-006 and FR-013, including unsetting
GITHUB_TOKENto use keyring scopes when needed (per CLAUDE.md guidance).
Out of Scope
- External end-to-end canaries in
spec-kitty-end-to-end-testing(local-only and SaaS-enabled trusted-runner canaries) —#827Tranche 3. - Plain-English acceptance scenarios in
spec-kitty-plain-english-tests—#827Tranche 3. - End-user documentation parity for the Charter epic —
#828. - Phase 7 schema versioning and provenance hardening —
#469. - Any change to repos other than
spec-kitty. - Any rewrite of regression-guard production code that is currently passing.
Dependencies
ghCLI authenticated forPriivacy-ai/spec-kitty(issue hygiene in FR-011, FR-013).uvavailable locally to run the test gate in NFR-001.- The current
origin/mainHEAD as the baseline against which "no regressions" (NFR-002) is measured. - The brief at
/Users/robert/spec-kitty-dev/spec-kitty-20260428-193814-MFDsf5/start-here.md(per C-007).
Decision Log
No deferred decisions at this time. No [NEEDS CLARIFICATION] markers were required: the user-supplied brief was comprehensive, and reasonable defaults were applied where the brief allowed (for example, choosing "warnings inside envelope" as the preferred FR-001 placement, with stderr permitted as an alternative).