  • architecture/2.x/adr/2026-04-25-1-shared-package-boundary.md (the shared package boundary that this gate defends downstream of)
  • architecture/2.x/adr/2026-04-26-1-contract-pinning-resolved-version.md (paired with this gate for the in-repo contract layer)
  • architecture/2.x/adr/2026-04-26-2-auth-transport-boundary.md (a downstream consumer of this gate's protection)

Context

Spec Kitty is a multi-repo product. The CLI (spec-kitty) imports spec-kitty-events and spec-kitty-tracker as external PyPI dependencies. The SaaS surface (spec-kitty-saas) consumes the same contracts. The end-to-end repo (spec-kitty-end-to-end-testing) is the only place where a real cross-repo workflow runs against the resolved package versions in a clean venv.

Pre-mission, the release gating model was:

  1. Each package has its own CI suite (pytest + lint + types).
  2. The CLI has tests/contract/ that asserts the expected shape of events/tracker payloads.
  3. There is no required check that the resolved package versions in the CLI's lockfile actually behave the way tests/contract/ claims they do, end to end, against a real downstream consumer.

This produced two recurring failures the maintainers paid for:

  • Silent contract drift. A spec-kitty-events minor bump changed envelope semantics (omitted a previously-required field, widened a type). Package-local CI passed because the contract was never cross-checked. The CLI shipped with the new version pinned in uv.lock. Downstream consumers (spec-kitty-saas ingestion, the e2e harness) failed in production with a missing-field exception.
  • Uninitialized-repo regressions. spec-kitty specify (and friends) silently fell back to a sibling initialized repo when invoked in a non-Spec-Kitty directory, writing artifacts into the wrong tree. The bug was real but only reproduced cleanly in a multi-directory layout the CLI's own tests did not exercise.

Package-local CI cannot catch either class. Both are cross-repo, real-process, real-filesystem failure modes. The end-to-end repo exists precisely to catch them, but it was treated as an optional integration check. Two recent missions (shared-package-boundary-cutover-01KQ22DS and the present stability-and-hygiene-hardening mission) paid for that softness in rework.

The mission spec (FR-038, FR-039, FR-040, FR-041) and the constraint C-010 require that the e2e suite be a hard gate at mission review, with a documented operator-exception path for cases where the dev SaaS endpoint is unreachable on the reviewer's machine. This ADR locks that gate.

Decision

The cross-repo end-to-end suite at spec-kitty-end-to-end-testing/scenarios/ is a hard pass/fail gate during spec-kitty-mission-review, alongside tests/contract/ in the CLI repo and the populated traceability matrix at kitty-specs/<slug>/issue-matrix.md.

Three concrete obligations:

  1. The mission-review skill MUST run, in order:

    • pytest spec-kitty/tests/contract/ -v — non-zero exit ⇒ FAIL
    • pytest spec-kitty/tests/architectural/ -v — non-zero exit ⇒ FAIL
    • pytest spec-kitty-end-to-end-testing/scenarios/ -v — non-zero exit ⇒ FAIL unless an operator exception artifact exists at kitty-specs/<slug>/mission-exception.md.
    • Read kitty-specs/<slug>/issue-matrix.md and assert every row has a non-empty verdict that is one of fixed, verified-already-fixed, deferred-with-followup. Any other value, including the literal unknown or an empty cell, is FAIL.
  2. The exception artifact has a fixed shape. A mission-exception.md file under kitty-specs/<slug>/ MUST:

    • name the operator who granted the exception;
    • identify the failing scenario by file path and assertion;
    • explain why the failure is environmental (e.g., dev SaaS endpoint unreachable on reviewer's machine) and not a code defect;
    • record the command the operator ran to reproduce the failure;
    • include a follow-up issue link or a written commitment to retry the scenario in a documented future window.

    The mission-review skill rejects an exception artifact that is missing any of those fields.
  3. The exception path is bounded. An operator exception covers a specific scenario, not the whole e2e suite. A mission whose e2e suite has more than one failing scenario MUST get a separate exception entry per scenario. A mission cannot ship with a blanket "e2e is unreachable, skip the whole suite" waiver.
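
The matrix and exception-artifact checks above can be sketched as two small validators. This is a minimal illustration, not the skill's implementation: it assumes the matrix and the exception file have already been parsed into dictionaries, and the key names (issue, verdict, operator, scenario_path, and so on) are hypothetical, since this ADR specifies the required content but not a file schema.

```python
from __future__ import annotations

from typing import Mapping, Sequence

# Verdict values the gate accepts, as listed in obligation 1.
ALLOWED_VERDICTS = frozenset(
    {"fixed", "verified-already-fixed", "deferred-with-followup"}
)

# Hypothetical key names: the ADR lists what an exception must contain,
# not how the fields are spelled in mission-exception.md.
REQUIRED_EXCEPTION_FIELDS = (
    "operator",
    "scenario_path",
    "assertion",
    "environmental_reason",
    "repro_command",
    "follow_up",
)


def invalid_matrix_rows(rows: Sequence[Mapping[str, str]]) -> list[str]:
    """Return the issue ids whose verdict is empty or not an allowed value."""
    bad = []
    for row in rows:
        verdict = row.get("verdict", "").strip()
        if verdict not in ALLOWED_VERDICTS:
            bad.append(row.get("issue", "<unnamed>"))
    return bad


def missing_exception_fields(entry: Mapping[str, str]) -> list[str]:
    """Return required fields that are absent or blank in one exception entry."""
    return [f for f in REQUIRED_EXCEPTION_FIELDS if not entry.get(f, "").strip()]
```

Under these assumptions, a row with verdict unknown or an empty cell is reported by invalid_matrix_rows, and an exception entry that only names the operator is reported as missing the other five fields. Either non-empty result would map to FAIL.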

The four scenarios this mission ships are the floor, not the ceiling. Future missions touching cross-repo behavior MUST add scenarios that prove their behavior:

  • dependent_wp_planning_lane.py — FR-001 + FR-005 + FR-038
  • uninitialized_repo_fail_loud.py — FR-032 + FR-039
  • saas_sync_enabled.py — FR-040
  • contract_drift_caught.py — FR-041

Consequences

Positive:

  • A spec-kitty-events candidate cannot be promoted to stable while the e2e harness fails on a real consumer. The stability and contract failure modes that drove this mission can no longer ship past mission review without a documented operator exception.
  • The exception path is explicit and audited. "The maintainer decided" without a mission-exception.md artifact is no longer sufficient.
  • The contract layer and the e2e layer are mutually reinforcing: contract tests pin the expected shape against the resolved version, e2e tests pin the actual shape against a running downstream consumer, and the matrix pins issue-level closure to a specific test or commit.

Negative / costs:

  • Mission review is slower. The e2e suite has a wall-clock floor of several minutes against a real dev SaaS endpoint. NFR-006 caps this at 20 minutes.
  • The dev SaaS endpoint becomes a hard dependency for full mission acceptance. When the endpoint is unreachable on a reviewer's machine, the operator must produce an exception artifact instead of silently approving. This is documented in docs/migrations/cross-repo-e2e-gate.md.
  • The e2e repo is now a release-blocking dependency for the CLI and the SaaS. Its own maintenance burden goes up; it cannot bit-rot without breaking unrelated mission reviews.
  • Cross-repo workflow: implementers and reviewers must commit to two repos for missions that touch cross-repo behavior. The cross-repo-e2e-gate migration doc walks operators through the workflow.

Operational consequences:

  • The spec-kitty-mission-review skill (source at src/doctrine/skills/spec-kitty-mission-review/SKILL.md) is updated by T050 to enforce the gate.
  • The migration doc at docs/migrations/cross-repo-e2e-gate.md is the operator runbook.
  • The four shipped e2e scenarios live in the e2e repo's scenarios/ directory and are committed there as a separate commit.

Alternatives Considered

Alternative 1: Keep e2e as advisory, lean harder on tests/contract/. Rejected. Contract tests pin the expected shape against an in-repo fixture. They do not run the resolved package's real code path against a real downstream consumer in a real venv. The two failure modes that drove this mission both passed contract tests and failed in real usage.

Alternative 2: Run e2e in CI on every PR, not at mission review. Considered. Worth doing; not a substitute for the mission-review gate. A PR-level e2e run does not catch a multi-WP mission that introduces cross-WP contract drift, because the drift only appears when all WPs land together. The mission-review gate is the post-merge "all WPs landed" check the per-PR check cannot do. We will likely add per-PR e2e in a future ops-CI mission, on top of this gate, not instead of it.

Alternative 3: A single combined "mission acceptance" pytest suite that runs contract + architectural + e2e in one command. Rejected for now. The three suites have different dependency profiles (contract is fast, architectural is fast, e2e needs a dev SaaS endpoint). Treating them uniformly would impose e2e's slow external dependency on every contract run. The mission-review skill orchestrates the three calls separately and reports them separately.
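
Separate orchestration with separate reporting might look like the sketch below. The three commands are the ones listed in the Decision section; the function names and the injectable runner are assumptions made for illustration, and the actual skill is a SKILL.md document rather than this code.

```python
from __future__ import annotations

import subprocess
from typing import Callable, Sequence

# One command per suite, in the order the Decision section requires.
SUITES: dict[str, list[str]] = {
    "contract": ["pytest", "spec-kitty/tests/contract/", "-v"],
    "architectural": ["pytest", "spec-kitty/tests/architectural/", "-v"],
    "e2e": ["pytest", "spec-kitty-end-to-end-testing/scenarios/", "-v"],
}


def run_command(cmd: Sequence[str]) -> int:
    """Run one command and return its exit code without raising on failure."""
    return subprocess.run(list(cmd), check=False).returncode


def run_suites(
    runner: Callable[[Sequence[str]], int] = run_command,
) -> dict[str, int]:
    """Run every suite in order and return a per-suite exit code report."""
    # Assumption: all three suites run even if an earlier one fails,
    # so the report always has one entry per suite.
    return {name: runner(cmd) for name, cmd in SUITES.items()}
```

Keeping the runner injectable means the contract and architectural calls never inherit the e2e suite's dependency on the dev SaaS endpoint, and each exit code stays attributable to exactly one suite.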

Alternative 4: Allow a blanket e2e waiver under SPEC_KITTY_E2E_OPTIONAL=1. Rejected. C-010 explicitly forbids this: "the mission MUST NOT be marked complete without either executed e2e evidence or an explicit operator-approved exception." A blanket env-var override would re-introduce the silent-skip mode the gate exists to prevent.
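
By contrast, the bounded exception path from the Decision section can be sketched as a per-scenario coverage check. This is a hedged illustration: the function names are hypothetical, and scenarios are identified by plain path strings for simplicity. Note that nothing in it consults an environment variable.

```python
from __future__ import annotations

from typing import Iterable


def uncovered_failures(failing: Iterable[str], exempted: Iterable[str]) -> list[str]:
    """Return failing scenarios that have no exception entry of their own."""
    covered = set(exempted)
    return sorted(s for s in failing if s not in covered)


def e2e_gate_passes(failing: Iterable[str], exempted: Iterable[str]) -> bool:
    """Pass only when every failing scenario is individually covered."""
    return not uncovered_failures(failing, exempted)
```

With two failing scenarios and a single exception entry, uncovered_failures returns the scenario that still lacks an entry and the gate fails, which is the behavior a blanket override would bypass.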

References

  • Spec: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/spec.md (FR-038, FR-039, FR-040, FR-041, NFR-006, C-010)
  • Research: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/research.md (D1 issue traceability shape; this ADR formalizes the gate that reads D1's matrix)
  • Migration: docs/migrations/cross-repo-e2e-gate.md
  • Skill source: src/doctrine/skills/spec-kitty-mission-review/SKILL.md
  • Issue matrix: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/issue-matrix.md