tests/contract/test_cross_repo_consumers.py
Context
After the shared-package-boundary cutover
(shared-package-boundary-cutover-01KQ22DS, 2026-04-25),
spec-kitty-events and spec-kitty-tracker are external PyPI
dependencies consumed via their public surfaces. Compatibility ranges
live in pyproject.toml; exact pins live in uv.lock. Contract tests
under tests/contract/ are the load-bearing assertions that protect the
CLI from upstream-shape regressions.
Prior to this ADR, a key contract test
(tests/contract/test_cross_repo_consumers.py) hard-coded the expected
spec-kitty-events version as "3.2.0". When the cutover bumped the
package to 4.0.0, that test went red and stayed red. The WP04 reviewer
of mission stability-and-hygiene-hardening-2026-04-01KQ4ARB flagged it
as the canonical example of contract-test drift behind real package
state. A hard-coded version string in a contract test is structurally
guaranteed to drift the moment uv sync resolves a different version.
Since the goal of tests/contract/ (per FR-023) is to be a hard
mission-review gate, that pattern silently inverts the gate's purpose:
instead of catching drift, it produces drift.
Decision
Every contract test that depends on a specific external-package version MUST resolve that version dynamically, rather than embedding it as a literal. The canonical resolution path is:
- Read uv.lock via tomllib and look up the package in the [[package]] array.
- If uv.lock is unavailable (e.g. clean-install CI run), fall back to importlib.metadata.version("<package-name>") and emit a RuntimeWarning.
For envelope-shape pinning we add a snapshot file at
tests/contract/snapshots/spec-kitty-events-<resolved-version>.json
generated by scripts/snapshot_events_envelope.py. The contract test
loads the snapshot keyed by the resolved version. Bumping the package
without regenerating the snapshot is, by design, a hard contract
failure with a structured diagnostic that points at the snapshot
script.
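As a rough sketch of how a test might load the snapshot keyed by the resolved version: the helper name load_envelope_snapshot, the snapshot_dir parameter, and the exact diagnostic wording are assumptions made for illustration. Only the snapshot path pattern and the script name come from this ADR.

```python
import json
from pathlib import Path
from typing import Any

SNAPSHOT_DIR = Path("tests/contract/snapshots")


def load_envelope_snapshot(version: str, snapshot_dir: Path = SNAPSHOT_DIR) -> Any:
    """Load the committed envelope snapshot for one resolved version.

    Hypothetical helper: a missing snapshot is a hard failure whose message
    points at the regeneration script.
    """
    path = snapshot_dir / f"spec-kitty-events-{version}.json"
    if not path.is_file():
        raise AssertionError(
            f"No envelope snapshot for spec-kitty-events {version} "
            f"(expected {path}). Regenerate it with: "
            "python scripts/snapshot_events_envelope.py --force"
        )
    return json.loads(path.read_text(encoding="utf-8"))
```

Because the file name embeds the version, bumping the package without committing a new snapshot fails at the lookup itself, before any shape comparison runs.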
This pattern is owned by WP05 of mission
stability-and-hygiene-hardening-2026-04-01KQ4ARB (see
kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/contracts/events-envelope.md
and the WP05 task file).
Consequences
Positive
- Drift becomes loud, not silent. A bump to spec-kitty-events without a paired snapshot regeneration fails the contract gate (FR-022 + FR-023) immediately, instead of leaving the test pinned to a stale version forever.
- Two-step bump workflow is explicit. Operators run uv sync (or edit the version range), then python scripts/snapshot_events_envelope.py --force, then pytest tests/contract/. This is documented in docs/guides/contract-pinning.md.
- The mission-review gate has teeth. FR-023 declares pytest tests/contract/ a hard blocker; by removing hard-coded versions we eliminate the failure mode where the gate is "always red" and therefore ignored.
Negative / costs
- Contributors must run the snapshot script after a package bump; forgetting to do so produces an obvious red test, but adds a friction step.
- Snapshot files accumulate one-per-version. They are small JSON files under tests/contract/snapshots/, intentionally version-controlled to anchor regressions historically.
Operationally
- Bumping spec-kitty-events is a 2-step PR:
  1. Edit pyproject.toml range and run uv lock.
  2. Run python scripts/snapshot_events_envelope.py --force and commit the new snapshot.
- The dev workflow is documented in docs/guides/contract-pinning.md.
Alternatives considered
Alternative A — Hard-coded version literals in contract tests
This is the pre-2026-04-26 status quo. Rejected because:
- It guarantees drift the moment uv.lock changes, with no signal at the bump site.
- It conflicts directly with FR-023 (hard mission-review gate): a permanently red contract test makes the gate informational, not blocking.
- The WP04 reviewer of this mission flagged it as the load-bearing example of why the contract surface needs resolved-version pinning.
Alternative B — Pin contract tests to pyproject.toml ranges (e.g. ">=4.0,<5")
Rejected because compatibility ranges describe what the CLI is allowed to install, not what it did install. A range-only contract test cannot distinguish 4.0.0 from 4.7.3 envelopes; the snapshot would have to be a union of all possible shapes within the range, which is the opposite of what a contract test should pin.
Alternative C — Generate the snapshot at test time
Rejected because the snapshot would then trivially match itself; there would be no shipped artifact to compare against. The whole point of the snapshot is that it is a reviewed, version-controlled representation of the upstream contract at a known point in time.
References
- Mission: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/spec.md (FR-022, FR-023, FR-024, FR-025, FR-026)
- Research: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/research.md D8
- Contract: kitty-specs/stability-and-hygiene-hardening-2026-04-01KQ4ARB/contracts/events-envelope.md
- Companion ADR: architecture/2.x/adr/2026-04-25-1-shared-package-boundary.md
- Dev workflow: docs/guides/contract-pinning.md
- Implementation:
  - scripts/snapshot_events_envelope.py
  - tests/contract/test_events_envelope_matches_resolved_version.py
  - tests/contract/test_cross_repo_consumers.py (rewritten as part of WP05)
  - tests/contract/snapshots/spec-kitty-events-<version>.json