- architecture/2.x/adr/2026-04-25-1-shared-package-boundary.md (the shared package boundary that this gate defends downstream of)
- architecture/2.x/adr/2026-04-26-1-contract-pinning-resolved-version.md (paired with this gate for the in-repo contract layer)
- architecture/2.x/adr/2026-04-26-2-auth-transport-boundary.md (a downstream consumer of this gate's protection)
Context
Spec Kitty is a multi-repo product. The CLI (spec-kitty) imports
spec-kitty-events and spec-kitty-tracker as external PyPI
dependencies. The SaaS surface (spec-kitty-saas) consumes the same
contracts. The end-to-end repo (spec-kitty-end-to-end-testing) is the
only place where a real cross-repo workflow runs against the resolved
package versions in a clean venv.
Pre-mission, the release gating model was:
- Each package has its own CI suite (pytest + lint + types).
- The CLI has tests/contract/ that asserts the expected shape of events/tracker payloads.
- There is no required check that the resolved package versions in the CLI's lockfile actually behave the way tests/contract/ claims they do, end to end, against a real downstream consumer.
This produced two recurring failures the maintainers paid for:
- Silent contract drift. A spec-kitty-events minor bump changed envelope semantics (omitted a previously-required field, widened a type). Package-local CI passed because the contract was never cross-checked. The CLI shipped with the new version pinned in uv.lock. Downstream consumers (spec-kitty-saas ingestion, the e2e harness) failed in production with a missing-field exception.
- Uninitialized-repo regressions. spec-kitty specify (and friends) silently fell back to a sibling initialized repo when invoked in a non-Spec-Kitty directory, writing artifacts into the wrong tree. The bug was real but only reproduced cleanly in a multi-directory layout the CLI's own tests did not exercise.
Package-local CI cannot catch either class. They are
cross-repo, real-process, real-filesystem failure modes. The
end-to-end repo exists precisely to catch them, but it was treated as an
optional integration check. Two recent missions
(shared-package-boundary-cutover-01KQ22DS and the present
stability-and-hygiene-hardening mission) paid for that softness in
rework.
The mission spec (FR-038, FR-039, FR-040, FR-041) and the constraint C-010 require that the e2e suite be a hard gate at mission review, with a documented operator-exception path for cases where the dev SaaS endpoint is unreachable on the reviewer's machine. This ADR locks that gate.
Decision
The cross-repo end-to-end suite at
spec-kitty-end-to-end-testing/scenarios/ is a hard pass/fail gate
during spec-kitty-mission-review, alongside tests/contract/ in the
CLI repo and the populated traceability matrix at
kitty-specs/<slug>/issue-matrix.md.
Three concrete obligations:
The mission-review skill MUST run, in order:
- pytest spec-kitty/tests/contract/ -v (non-zero exit ⇒ FAIL)
- pytest spec-kitty/tests/architectural/ -v (non-zero exit ⇒ FAIL)
- pytest spec-kitty-end-to-end-testing/scenarios/ -v (non-zero exit ⇒ FAIL unless an operator exception artifact exists at kitty-specs/<slug>/mission-exception.md)
- Read kitty-specs/<slug>/issue-matrix.md and assert every row has a non-empty verdict that is one of fixed, verified-already-fixed, deferred-with-followup. Any other value, including the literal unknown or an empty cell, is FAIL.
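The ordering and pass/fail rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the skill's actual implementation: the names `evaluate_gate`, `gate_passes`, and `run_pytest` are hypothetical, the verdicts are assumed to be already extracted from the matrix, and the sketch only checks that an exception artifact exists, not that it is well-formed or covers each failing scenario. Treating a matrix with zero rows as FAIL is an assumption drawn from the word "populated".

```python
import subprocess
from typing import Callable, Sequence

ALLOWED_VERDICTS = {"fixed", "verified-already-fixed", "deferred-with-followup"}

# Suites in the order the ADR lists them. The bool marks whether an operator
# exception artifact may excuse a non-zero exit (only the e2e suite).
SUITES = [
    ("spec-kitty/tests/contract/", False),
    ("spec-kitty/tests/architectural/", False),
    ("spec-kitty-end-to-end-testing/scenarios/", True),
]


def run_pytest(path: str) -> int:
    """Run pytest on one path and return its exit code."""
    return subprocess.run(["pytest", path, "-v"]).returncode


def matrix_verdicts_ok(verdicts: Sequence[str]) -> bool:
    """Every row needs an allowed verdict; empty cells and 'unknown' fail.

    Assumption: a matrix with no rows at all is not 'populated' and fails.
    """
    return bool(verdicts) and all(v.strip() in ALLOWED_VERDICTS for v in verdicts)


def evaluate_gate(
    verdicts: Sequence[str],
    exception_exists: bool,
    runner: Callable[[str], int] = run_pytest,
) -> list[tuple[str, str]]:
    """Run each suite separately and report one status per check."""
    results = []
    for path, excusable in SUITES:
        code = runner(path)
        if code == 0:
            status = "PASS"
        elif excusable and exception_exists:
            status = "EXCEPTION"
        else:
            status = "FAIL"
        results.append((path, status))
    matrix_status = "PASS" if matrix_verdicts_ok(verdicts) else "FAIL"
    results.append(("issue-matrix", matrix_status))
    return results


def gate_passes(results: Sequence[tuple[str, str]]) -> bool:
    """The gate passes only when no individual check is FAIL."""
    return all(status != "FAIL" for _, status in results)
```

The `runner` parameter exists only so the decision logic can be exercised without invoking pytest; in real use the default `run_pytest` would be left in place.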
The exception artifact has a fixed shape. A mission-exception.md file under kitty-specs/<slug>/ MUST:
- name the operator who granted the exception;
- identify the failing scenario by file path and assertion;
- explain why the failure is environmental (e.g., dev SaaS endpoint unreachable on reviewer's machine) and not a code defect;
- record the command the operator ran to reproduce the failure;
- include a follow-up issue link or a written commitment to retry the scenario in a documented future window.
The mission-review skill rejects an exception artifact that is missing any of those fields.
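A minimal sketch of the missing-field check follows. The ADR fixes which facts the artifact must carry but not its on-disk syntax, so everything concrete here is an assumption: the `key: value` line format, the six field names in `REQUIRED_FIELDS`, and the helper names `parse_fields` and `missing_fields` are all hypothetical.

```python
# Hypothetical field names, one per obligation in the list above
# (scenario path and assertion are tracked as two separate fields).
REQUIRED_FIELDS = (
    "operator",
    "scenario",
    "assertion",
    "reason",
    "command",
    "follow-up",
)


def parse_fields(text: str) -> dict[str, str]:
    """Collect 'key: value' lines, tolerating a leading '- ' bullet marker."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip().lstrip("- ").lower()] = value.strip()
    return fields


def missing_fields(text: str) -> list[str]:
    """Return the required fields that are absent or empty, in a fixed order."""
    fields = parse_fields(text)
    return [name for name in REQUIRED_FIELDS if not fields.get(name)]
```

Under these assumptions, an artifact is rejected whenever `missing_fields` returns a non-empty list, which also gives the operator a concrete list of what to fill in.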
The exception path is bounded. An operator exception covers a specific scenario, not the whole e2e suite. A mission whose e2e suite has more than one failing scenario MUST get a separate exception entry per scenario. A mission cannot ship with a blanket "e2e is unreachable, skip the whole suite" waiver.
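The per-scenario rule can be illustrated with a small sketch. It assumes the failing scenarios and the scenarios named by exception entries are both available as file paths; how they are collected is not specified here, and `uncovered_failures` is a hypothetical helper name.

```python
from typing import Iterable


def uncovered_failures(failing: Iterable[str], excepted: Iterable[str]) -> list[str]:
    """Return failing scenario paths that no exception entry names exactly.

    Matching is by exact path, so an entry naming a directory or the whole
    suite covers nothing, which rules out a blanket waiver.
    """
    covered = set(excepted)
    return sorted(path for path in set(failing) if path not in covered)
```

With two failing scenarios and one exception entry, the result names the scenario still lacking an entry, and the gate would stay FAIL until that list is empty.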
The four scenarios this mission ships are the floor, not the ceiling:
- dependent_wp_planning_lane.py: FR-001 + FR-005 + FR-038
- uninitialized_repo_fail_loud.py: FR-032 + FR-039
- saas_sync_enabled.py: FR-040
- contract_drift_caught.py: FR-041
Future missions touching cross-repo behavior MUST add scenarios that prove their behavior.
Consequences
Positive:
- A spec-kitty-events candidate cannot be promoted to stable while the e2e harness fails on a real consumer. The four stability and contract failure modes that drove this mission can no longer ship past mission review without a documented operator exception.
- The exception path is explicit and audited. "The maintainer decided" without a mission-exception.md artifact is no longer sufficient.
- The contract layer and the e2e layer are mutually reinforcing: contract tests pin the expected shape against the resolved version, e2e tests pin the actual shape against a running downstream consumer, and the matrix pins issue-level closure to a specific test or commit.
Negative / costs:
- Mission review is slower. The e2e suite has a wall-clock floor of several minutes against a real dev SaaS endpoint. NFR-006 caps this at 20 minutes.
- The dev SaaS endpoint becomes a hard dependency for full mission acceptance. When the endpoint is unreachable on a reviewer's machine, the operator must produce an exception artifact instead of silently approving. This is documented in docs/migrations/cross-repo-e2e-gate.md.
- The e2e repo is now a release-blocking dependency for the CLI and the SaaS. Its own maintenance burden goes up; it cannot bit-rot without breaking unrelated mission reviews.
- Cross-repo workflow: implementers and reviewers must commit to two repos for missions that touch cross-repo behavior. The cross-repo-e2e-gate migration doc walks operators through the workflow.
Operational consequences:
- The spec-kitty-mission-review skill (source at src/doctrine/skills/spec-kitty-mission-review/SKILL.md) is updated by T050 to enforce the gate.
- The migration doc at docs/migrations/cross-repo-e2e-gate.md is the operator runbook.
- The four shipped e2e scenarios live in the e2e repo's scenarios/ directory and are committed there as a separate commit.
Alternatives Considered
Alternative 1: Keep e2e as advisory, lean harder on
tests/contract/. Rejected. Contract tests pin the expected shape
against an in-repo fixture. They do not run the resolved package's
real code path against a real downstream consumer in a real venv.
The two failure modes that drove this mission both passed contract
tests and failed in real usage.
Alternative 2: Run e2e in CI on every PR, not at mission review. Considered. Worth doing; not a substitute for the mission-review gate. A PR-level e2e run does not catch a multi-WP mission that introduces cross-WP contract drift, because the drift only appears when all WPs land together. The mission-review gate is the post-merge "all WPs landed" check the per-PR check cannot do. We will likely add per-PR e2e in a future ops-CI mission, on top of this gate, not instead of it.
Alternative 3: A single combined "mission acceptance" pytest suite that runs contract + architectural + e2e in one command. Rejected for now. The three suites have different dependency profiles (contract is fast, architectural is fast, e2e needs a dev SaaS endpoint). Treating them uniformly would impose e2e's slow external dependency on every contract run. The mission-review skill orchestrates the three calls separately and reports them separately.
Alternative 4: Allow a blanket e2e waiver under
SPEC_KITTY_E2E_OPTIONAL=1. Rejected. C-010 explicitly forbids
this: "the mission MUST NOT be marked complete without either
executed e2e evidence or an explicit operator-approved exception."
A blanket env-var override would re-introduce the silent-skip mode
the gate exists to prevent.
References
- Spec: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/spec.md (FR-038, FR-039, FR-040, FR-041, NFR-006, C-010)
- Research: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/research.md (D1 issue traceability shape; this ADR formalizes the gate that reads D1's matrix)
- Migration: docs/migrations/cross-repo-e2e-gate.md
- Skill source: src/doctrine/skills/spec-kitty-mission-review/SKILL.md
- Issue matrix: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/issue-matrix.md