Phase 0 Research: Shared Package Boundary Cutover
Mission: shared-package-boundary-cutover-01KQ22DS
Date: 2026-04-25
Branch contract: planning/base main → merge target main
This research log resolves the three open architectural questions identified in plan.md before implementation begins.
R0-1: Where exactly should the internalized runtime live, and what is its import surface?
Decision
The internalized runtime lives at src/specify_cli/next/_internal_runtime/, colocated with the existing runtime_bridge.py, decision.py, and prompt_builder.py modules.
The public surface (what runtime_bridge.py and next_cmd.py import) mirrors exactly what they import from spec_kitty_runtime today:
| Symbol | From spec_kitty_runtime (pre-cutover) | New location (post-cutover) |
|---|---|---|
| DiscoveryContext | spec_kitty_runtime | specify_cli.next._internal_runtime |
| MissionPolicySnapshot | spec_kitty_runtime | specify_cli.next._internal_runtime |
| MissionRunRef | spec_kitty_runtime | specify_cli.next._internal_runtime |
| NextDecision | spec_kitty_runtime | specify_cli.next._internal_runtime |
| NullEmitter | spec_kitty_runtime | specify_cli.next._internal_runtime |
| next_step (aliased runtime_next_step) | spec_kitty_runtime | specify_cli.next._internal_runtime |
| provide_decision_answer | spec_kitty_runtime | specify_cli.next._internal_runtime |
| start_mission_run | spec_kitty_runtime | specify_cli.next._internal_runtime |
| ActorIdentity | spec_kitty_runtime.schema | specify_cli.next._internal_runtime.schema |
| load_mission_template_file | spec_kitty_runtime.schema | specify_cli.next._internal_runtime.schema |
| MissionRuntimeError | spec_kitty_runtime.schema | specify_cli.next._internal_runtime.schema |
| _read_snapshot | spec_kitty_runtime.engine | specify_cli.next._internal_runtime.engine |
| engine (module reference) | spec_kitty_runtime.engine | specify_cli.next._internal_runtime.engine |
| plan_next | spec_kitty_runtime.planner | specify_cli.next._internal_runtime.planner |
Internal sub-module layout:
specify_cli/next/_internal_runtime/
├── __init__.py # re-exports the public surface above
├── models.py # DiscoveryContext, MissionPolicySnapshot, MissionRunRef, NextDecision
├── emitter.py # NullEmitter + emitter Protocol
├── lifecycle.py # next_step, provide_decision_answer, start_mission_run
├── engine.py # _read_snapshot, snapshot persistence
├── planner.py # plan_next, DAG planner
└── schema.py # ActorIdentity, load_mission_template_file, MissionRuntimeError
Rationale
- Colocation with runtime_bridge.py keeps the import paths short and the diff surface auditable. runtime_bridge.py is the only module that consumes the full surface; its re-import lines change from from spec_kitty_runtime import (...) to from specify_cli.next._internal_runtime import (...). Every other file in the call graph imports from runtime_bridge (which already abstracts the runtime), not from spec_kitty_runtime directly.
- Underscore prefix marks _internal_runtime as not part of the public CLI Python import surface, even though it is shipped in the wheel. External Python importers must continue to use specify_cli.next.* public symbols, not reach into _internal_runtime.
- Existing layer rules in tests/architectural/test_layer_rules.py already scope specify_cli.* as the top layer; placing the new module under specify_cli leaves those rules undisturbed and adds only one new package-boundary rule (the C-001 enforcement) on top.
- No top-level package: spec C-009 explicitly forbids introducing a new top-level package. The repo already has four (kernel, doctrine, charter, specify_cli); a fifth top-level package for runtime would explode the review surface and conflict with the in-flight runtime-mission-execution-extraction-01KPDYGW mission's own canonical-runtime location decision.
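The package-boundary rule mentioned above could be checked with a small import scan. The following is a minimal sketch, not the contents of tests/architectural/test_layer_rules.py; the function name find_forbidden_imports and the sample source strings are hypothetical.

```python
import ast

# Package that must no longer be imported after the cutover.
FORBIDDEN_ROOT = "spec_kitty_runtime"


def find_forbidden_imports(source: str, forbidden_root: str = FORBIDDEN_ROOT) -> list[str]:
    """Return the dotted module names in `source` that live under forbidden_root."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # Absolute `from X import ...`; relative imports cannot reach the old package.
            names = [node.module]
        else:
            continue
        for name in names:
            if name == forbidden_root or name.startswith(forbidden_root + "."):
                hits.append(name)
    return hits


# Hypothetical before/after import lines for runtime_bridge.py.
before = "from spec_kitty_runtime import NextDecision\nimport spec_kitty_runtime.engine\n"
after = "from specify_cli.next._internal_runtime import NextDecision\n"
print(find_forbidden_imports(before))
print(find_forbidden_imports(after))
```

Matching on the dotted prefix rather than a substring avoids false positives on unrelated packages whose names merely start with the same characters.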
Alternatives considered
- src/runtime/ as a new top-level package: rejected. The runtime-mission-execution-extraction-01KPDYGW mission already proposes src/runtime/ for a different extraction (the CLI's internal "next-step decisioning + agent dispatch" layer, not the standalone PyPI runtime package). Stomping on that namespace creates a merge conflict and confuses reviewers about which runtime is which.
- src/specify_cli/runtime/: rejected. That directory already exists and owns CLI asset / home-directory bootstrap (different concern). Reusing the name would collide.
- Inlining the runtime code into runtime_bridge.py: rejected. The pre-cutover code is ~3 sub-modules (engine, planner, schema) plus public models. Inlining ~3kLoC into a single file destroys reviewability and breaks the existing function-decomposition style.
Behavior-equivalence strategy
WP01 captures behavior parity by:

1. Running spec-kitty next against a checked-in fixture mission with the currently-installed spec-kitty-runtime 0.4.3 and recording golden JSON snapshots in tests/fixtures/runtime_parity/.
2. Running the same fixture against the new internalized runtime and asserting byte-equal snapshots (modulo timestamp / path normalization, identical to how mission runtime-mission-execution-extraction-01KPDYGW does it).
3. Any delta forces an iteration on _internal_runtime before WP02 begins.
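The "byte-equal modulo timestamp / path normalization" comparison could look roughly like the sketch below. The snapshot schema is not shown in this log, so the key names in VOLATILE_KEYS, the placeholder strings, and the sample snapshots are all assumptions for illustration.

```python
import json

# Hypothetical key names; the real snapshot schema is not shown in this log.
VOLATILE_KEYS = {"timestamp", "created_at", "updated_at"}


def normalize(value, repo_root: str):
    """Recursively mask volatile keys and rewrite absolute paths under repo_root."""
    if isinstance(value, dict):
        return {
            key: "<TIMESTAMP>" if key in VOLATILE_KEYS else normalize(item, repo_root)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [normalize(item, repo_root) for item in value]
    if isinstance(value, str):
        return value.replace(repo_root, "<ROOT>")
    return value


def canonical_bytes(snapshot: dict, repo_root: str) -> bytes:
    """Serialize a normalized snapshot deterministically so bytes can be compared."""
    text = json.dumps(normalize(snapshot, repo_root), sort_keys=True, indent=2)
    return text.encode("utf-8")


# Two runs that differ only in wall-clock time and checkout location.
golden = {"step": "plan", "timestamp": "2026-04-25T10:00:00Z", "path": "/ci/repo/kitty-specs/x"}
actual = {"step": "plan", "timestamp": "2026-04-25T11:30:00Z", "path": "/home/dev/repo/kitty-specs/x"}
assert canonical_bytes(golden, "/ci/repo") == canonical_bytes(actual, "/home/dev/repo")
```

Sorting keys before serializing means dictionary ordering differences between the two runtimes do not register as a parity delta; only real content changes do.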
R0-2: What is the compatibility range strategy for events and tracker, and what happens to [tool.uv.sources] and constraints.txt?
Decision
| Item | Pre-cutover state | Post-cutover state |
|---|---|---|
| pyproject.toml events dep | spec-kitty-events==4.0.0 (exact pin) | spec-kitty-events>=4.0.0,<5.0.0 |
| pyproject.toml tracker dep | spec-kitty-tracker==0.4.2 (exact pin) | spec-kitty-tracker>=0.4,<0.5 |
| pyproject.toml runtime dep | Absent (intentionally) | Absent (locked in by FR-006 + arch test) |
| [tool.uv.sources] for events | spec-kitty-events = { path = "../spec-kitty-events", editable = true } | Empty / absent |
| [tool.uv.sources] for tracker | Absent | Absent |
| constraints.txt | Pin spec-kitty-events==4.0.0 to paper over runtime's transitive <4.0 pin | Removed (no longer needed) |
| uv.lock | Pinned to dev path | Regenerated; pins to PyPI versions |
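The post-cutover column can be read as a set of checkable invariants on pyproject.toml. The sketch below assumes the file has already been parsed into a mapping (for example with tomllib.load on Python 3.11+); the function name boundary_violations and the sample mappings are hypothetical and do not reproduce the repo's actual architectural test.

```python
def boundary_violations(pyproject: dict) -> list[str]:
    """List the ways a parsed pyproject.toml mapping departs from the post-cutover state."""
    problems: list[str] = []
    deps = pyproject.get("project", {}).get("dependencies", [])
    for dep in deps:
        # Drop any environment marker, then cut at the first specifier/extras character.
        name = dep.split(";")[0].strip()
        for sep in "<>=!~[ ":
            name = name.split(sep)[0]
        normalized = name.lower().replace("_", "-")
        if normalized == "spec-kitty-runtime":
            problems.append("runtime dependency present: " + dep)
        if normalized in {"spec-kitty-events", "spec-kitty-tracker"} and "==" in dep:
            problems.append("exact pin: " + dep)
    sources = pyproject.get("tool", {}).get("uv", {}).get("sources", {})
    for pkg in sources:
        problems.append("committed [tool.uv.sources] override: " + pkg)
    return problems


# Mappings mirroring the pre- and post-cutover columns of the table.
pre_cutover = {
    "project": {"dependencies": ["spec-kitty-events==4.0.0", "spec-kitty-tracker==0.4.2"]},
    "tool": {"uv": {"sources": {"spec-kitty-events": {"path": "../spec-kitty-events", "editable": True}}}},
}
post_cutover = {
    "project": {"dependencies": ["spec-kitty-events>=4.0.0,<5.0.0", "spec-kitty-tracker>=0.4,<0.5"]},
}
print(boundary_violations(pre_cutover))
print(boundary_violations(post_cutover))
```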
Rationale
- The events public-surface contract from events-pypi-contract-hardening-01KQ1ZK7 documents a SemVer policy: minor / patch within 4.x are non-breaking; major bumps (5.x) are breaking. The compatibility range >=4.0.0,<5.0.0 exactly matches that policy.
- The tracker mission tracker-pypi-sdk-independence-hardening-01KQ1ZKK is still in implement-review at the time of this plan. The conservative range >=0.4,<0.5 matches the currently-published 0.4.2 line and reserves a tightening pass for when the upstream mission lands. The consumer test contract (FR-009) is the real safety belt; a too-loose range can only break CI explicitly via that contract, never silently in production.
- Editable / path overrides in [tool.uv.sources] were the direct cause of PR #779's CI failure: they masked the missing spec-kitty-runtime dependency during local dev because the editable events install pulled the runtime transitively, while CI installed from PyPI and exploded. Removing them is non-negotiable.
- Developer overrides for cross-package work (events / tracker) are documented separately in docs/development/local-overrides.md (created by WP10), using uv workspace patterns or explicit --with-editable flags that do not touch the committed pyproject.toml.
- constraints.txt exists today exclusively to paper over the spec-kitty-runtime-transitive spec-kitty-events<4.0 pin conflict (see DRIFT-1 in mission review 01KPWT8P). With spec-kitty-runtime no longer a dependency, the constraint has no purpose and is removed in WP08.
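To make the two compatibility ranges concrete, here is a deliberately simplified range check. It handles only plain numeric versions and the >= and < operators used in this plan; real tooling would rely on a full PEP 440 implementation. The helper names are hypothetical.

```python
def parse_version(text: str) -> tuple[int, ...]:
    """Turn '4.0.0' into (4, 0, 0). Pre-release and local segments are not handled."""
    return tuple(int(part) for part in text.strip().split("."))


def _pad(a: tuple[int, ...], b: tuple[int, ...]):
    """Right-pad the shorter tuple with zeros so '0.4' compares equal to '0.4.0'."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def in_range(version: str, spec: str) -> bool:
    """Check a version against a comma-separated spec that uses only >= and <."""
    for clause in spec.split(","):
        clause = clause.strip()
        if clause.startswith(">="):
            have, bound = _pad(parse_version(version), parse_version(clause[2:]))
            ok = have >= bound
        elif clause.startswith("<") and not clause.startswith("<="):
            have, bound = _pad(parse_version(version), parse_version(clause[1:]))
            ok = have < bound
        else:
            raise ValueError(f"unsupported clause: {clause!r}")
        if not ok:
            return False
    return True


EVENTS_RANGE = ">=4.0.0,<5.0.0"
TRACKER_RANGE = ">=0.4,<0.5"
assert in_range("4.3.1", EVENTS_RANGE)      # minor/patch within 4.x is accepted
assert not in_range("5.0.0", EVENTS_RANGE)  # major bump is excluded
assert in_range("0.4.2", TRACKER_RANGE)     # currently published tracker line
```

The upper bounds are what separate these ranges from the rejected open-ended form: a 5.x events release or a 0.5 tracker release is excluded until the CLI explicitly widens the range.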
Alternatives considered
- Keep exact pins (==4.0.0, ==0.4.2): rejected. Exact pins recreate the cross-package release lockstep the cutover exists to dissolve. Spec acceptance criterion A7 forbids this.
- Open-ended ranges (>=4): rejected. SemVer minor windows are the right contract; a 5.x major bump is breaking and must require an explicit CLI version bump.
- Keep [tool.uv.sources] for dev convenience: rejected. The dev workflow is fully servable by a separate developer-only override file. Committing editable overrides reintroduces the failure mode that got PR #779 rejected.
- Convert constraints.txt into a generic dev constraint file: rejected. It would become a magnet for similar paper-over hacks. Better to delete it cleanly and document developer override patterns in docs/development/local-overrides.md.
Cross-repo handshake
- Events mission events-pypi-contract-hardening-01KQ1ZK7 is merged at sha 81d5ccd4. The compatibility range chosen here is committed against that sha.
- Tracker mission tracker-pypi-sdk-independence-hardening-01KQ1ZKK is in implement-review. WP07 (consumer-test contract) and WP08 (pyproject.toml range) explicitly note that the tracker range may need to be tightened on rebase if the upstream mission lands a contract change before this mission's closing PR is merged. The orchestrator picks up that delta automatically when the WP07 work resumes.
R0-3: How does the clean-install CI job structurally guarantee absence of spec-kitty-runtime?
Decision
WP09 adds a new job named clean-install-verification to .github/workflows/ci-quality.yml. The job:
1. Checks out the repo at the PR head.
2. Builds the CLI wheel with python -m build --wheel.
3. Spins up a fresh container (e.g. python:3.12-slim) with no pre-installed packages.
4. Inside the container, runs:

   ```bash
   pip install dist/spec_kitty_cli-*.whl
   pip list | grep -i spec-kitty-runtime && exit 1 || true  # must NOT be installed
   spec-kitty --version
   ```

5. Checks out a known fixture mission committed at tests/fixtures/clean_install_fixture_mission/ and runs:

   ```bash
   spec-kitty agent mission setup-plan --mission <fixture-handle> --json
   spec-kitty next --agent claude --mission <fixture-handle> --json
   ```

6. Asserts the JSON output contains "result": "success" and that the loop advanced at least one step (event log gained at least one StatusEvent row).
7. Asserts wall-clock runtime ≤ 5 minutes (NFR-004).
The local-runnable counterpart is tests/integration/test_clean_install_next.py, which uses subprocess and a tmp_path venv to perform the same flow on a dev machine. It is gated behind a marker (@pytest.mark.distribution) so the main test suite stays fast.
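For the local counterpart, the pip list | grep step could be expressed as a parser over captured pip list output rather than a shell pipeline. This is a sketch under assumptions: the helper names are hypothetical, the sample output and its version numbers are invented for illustration, and only the default columnar pip list format is handled.

```python
import re


def canonical_name(name: str) -> str:
    """Normalize a distribution name as PEP 503 describes (case and separator runs)."""
    return re.sub(r"[-_.]+", "-", name).lower()


def installed_names(pip_list_output: str) -> set[str]:
    """Parse the default columnar `pip list` output into a set of normalized names."""
    names: set[str] = set()
    for line in pip_list_output.splitlines():
        parts = line.split()
        # Skip blank lines, the header row, and the dashed separator row.
        if len(parts) < 2 or parts[0] == "Package" or set(parts[0]) == {"-"}:
            continue
        names.add(canonical_name(parts[0]))
    return names


# Invented sample output; real package versions will differ.
sample = """\
Package            Version
------------------ -------
pip                24.0
spec-kitty-cli     1.0.0
spec-kitty-events  4.0.0
spec-kitty-tracker 0.4.2
"""
assert "spec-kitty-runtime" not in installed_names(sample)
assert "spec-kitty-events" in installed_names(sample)
```

Comparing normalized names means a listing spelled spec_kitty_runtime or Spec.Kitty.Runtime would still be caught, which a literal grep for the hyphenated form would miss.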
Rationale
- Container isolation is the strongest available guarantee: a fresh python:3.12-slim image has no leftover state from the workspace's editable installs, no cached spec-kitty-runtime, and no [tool.uv.sources] precedence. If the CLI works there, it works for end users.
- Wheel install (not pip install -e .) mirrors the user experience.
- Explicit pip list | grep assertion turns "spec-kitty-runtime is not a dep" into a CI-enforced observation.
- JSON-asserted advancement is the same contract the existing tests/next/test_next_command_integration.py uses; reusing the assertion pattern keeps reviewer cognitive load low.
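The JSON-asserted advancement check could be sketched as follows. The "result": "success" field comes from the job description above; the event log format is not specified in this log, so the JSON Lines layout and the "type" field used to recognize StatusEvent rows are assumptions, as are the function names.

```python
import json


def step_succeeded(stdout: str) -> bool:
    """True when the command's JSON payload reports "result": "success"."""
    return json.loads(stdout).get("result") == "success"


def new_status_events(log_before: str, log_after: str) -> int:
    """Count StatusEvent rows appended between two reads of the event log.

    Assumes a JSON Lines log whose rows carry a "type" field (hypothetical).
    """
    def count(text: str) -> int:
        rows = (json.loads(line) for line in text.splitlines() if line.strip())
        return sum(1 for row in rows if row.get("type") == "StatusEvent")

    return count(log_after) - count(log_before)


# Invented payloads standing in for captured command output and log contents.
stdout = '{"result": "success", "mission": "fixture"}'
log_before = '{"type": "StatusEvent", "to": "planned"}\n'
log_after = log_before + '{"type": "StatusEvent", "to": "doing"}\n'
assert step_succeeded(stdout)
assert new_status_events(log_before, log_after) >= 1
```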
Alternatives considered
- Run the existing test suite in a clean venv: rejected. The existing suite has hundreds of unit tests that don't need an isolated environment; running all of them in a clean venv would blow the 5-minute budget.
- Mock the absence of spec-kitty-runtime: rejected. Mocking does not prove what users experience.
- Use tox: rejected. The repo standardized on uv / pytest; introducing tox for one job is an unnecessary new tool.
CI surface details
- The job key in ci-quality.yml is clean-install-verification.
- It is added to the protect-main.yml required-check set so a green run is required to merge.
- Fixture mission lives under tests/fixtures/clean_install_fixture_mission/; it is the smallest possible mission scaffold that exercises one spec-kitty next step (a single planned WP, no dependencies).
Documentation handoff
- WP10 updates docs/development/local-overrides.md (new) and CHANGELOG.md to point at this job for operators who want to verify the cutover locally.
Resolved clarifications
No [NEEDS CLARIFICATION] markers existed in spec.md. None added by this research log.
Inputs for Phase 1
The following Phase 1 artifacts use this research as ground truth:
- data-model.md: uses R0-1's import-surface table to model the structural before / after.
- contracts/internal_runtime_surface.md: copies the symbol table from R0-1.
- contracts/events_consumer_surface.md: derives from the public-surface doc in spec-kitty-events at sha 81d5ccd4, scoped to the subset CLI uses.
- contracts/tracker_consumer_surface.md: derives from the published spec-kitty-tracker 0.4.2 SDK, scoped to the subset CLI uses.
- quickstart.md: uses R0-3's clean-install recipe.