Mission Specification: Centralize installed CLI runtime + remediation planning

Mission Branch: feat/installed-runtime-domain Mission ID: 01KW0NR6E9XCH0QAREQWQ5ZDPB Created: 2026-06-26 Status: Draft Source: GitHub issue #1358

Context

Spec-kitty has no single domain model for the installed CLI runtime. Receipt-parsing logic is copy-pasted across three modules and remediation-command construction is scattered across five independent sites. This mission introduces a unified InstalledCliRuntime domain type, a RemediationCommand type whose structured argv plus render() satisfies both subprocess and copy-paste consumers, a UvReceiptReader adapter collapsing the three parsers, and a plan_remediation() pure function routing all five construction sites. It also adds durable upgrade-attempt event recording for retry/backoff analysis (B2 scope).


User Scenarios & Testing (mandatory)

User Story 1 — Single runtime-detection call (Priority: P1)

A developer extending or debugging spec-kitty today must call detect_install_method() (which discards receipt/tool-dir/bin-dir context it already probed), then independently re-parse the uv receipt in each of review, upgrade_ux, and install_method.py to recover the fields the first call threw away. After this mission, a single call to detect_runtime() returns the complete installed-CLI snapshot — method, executable, receipt path, tool dir, bin dir, python override, injected requirements, derived provenance, platform, and safe-for-auto-upgrade flag — with nothing discarded.

Why this priority: This is the foundational type that all other work in this mission builds on. Subsystems cannot consolidate remediation logic until there is a single authoritative runtime record to draw from.

Independent Test: Fully testable by calling detect_runtime() in a simulated uv-tool environment and asserting all fields (receipt_path, tool_dir, bin_dir, python, requirements, safe_for_auto_upgrade) are populated from a single probe without any module reading the receipt a second time.

Acceptance Scenarios:

1. Given spec-kitty is installed via uv-tool with a non-default tool dir and a pinned Python override in the receipt, When detect_runtime() is called, Then the returned record carries tool_dir (custom), is_default_tool_dir=False, python (the overridden interpreter), and requirements (the receipt's requirement list) — all populated from the same receipt read. 2. Given spec-kitty is installed via pipx, When detect_runtime() is called, Then install_method is PIPX, receipt_path is None, is_default_tool_dir is None, and safe_for_auto_upgrade is True. 3. Given a malformed or missing receipt file, When detect_runtime() is called, Then it returns a valid record with receipt_path=None and requirements=() rather than raising.


User Story 2 — Consistent remediation across all upgrade surfaces (Priority: P1)

A spec-kitty user who runs spec-kitty review and sees an upgrade hint, then runs spec-kitty upgrade, should see the same upgrade command in both places for their install method. Today, review reconstructs the command by shlex-quoting argv built from the receipt; upgrade_ux builds argv from a method-keyed dict; upgrade_hint uses a static table; and version_checker/schema_version contain hardcoded pipx strings. After this mission, all five surfaces call plan_remediation() and RemediationCommand.render() so the output is structurally identical regardless of which surface the user encounters.

Why this priority: Divergence between surfaces confuses users about the right upgrade command for their environment. The review surface carries the richest receipt logic but that logic is not available to upgrade_ux or the static hint table.

Independent Test: Testable by asserting that plan_remediation(runtime, UPGRADE, None).render("posix") produces the same string as the equivalent call site in each of the five legacy sites, for each install method, with no surface re-implementing receipt-reading or env-prefix logic.

Acceptance Scenarios:

1. Given a uv-tool install with a custom tool dir and python-version override, When plan_remediation() is called with intent UPGRADE, Then the returned RemediationCommand.render("posix") includes the UV_TOOL_DIR env prefix and the --python flag reconstructed from the receipt — matching what the legacy review surface produced — and RemediationCommand.argv carries the same args as a usable subprocess tuple. 2. Given a pipx install, When plan_remediation() is called with intent UPGRADE, Then render() returns "pipx upgrade spec-kitty-cli" (CHK028-validated) and argv is ("pipx", "upgrade", "spec-kitty-cli"). 3. Given any install method, When UpgradeHint is constructed via the post-migration build_upgrade_hint(), Then UpgradeHint.command equals RemediationCommand.render(platform) for the same install method — the Plan.upgrade_hint JSON contract is identical to its pre-migration value. 4. Given a Windows platform and a uv-tool install, When RemediationCommand.render("windows") is called, Then the output uses PowerShell-safe quoting (matching the pre-migration behavior from review/__init__.py's PowerShell branch).


User Story 3 — Durable upgrade-attempt history for retry/backoff analysis (Priority: P2)

A spec-kitty operator or developer investigating why the auto-upgrade loop re-ran after a previous success wants to consult an upgrade-attempt history without digging through logs. After this mission, each upgrade attempt (initiated by _default_upgrade_runner) appends a structured record carrying timestamp, install method, intent, outcome, and exit code to a dedicated history store. The store supports idempotency queries ("was this exact upgrade already completed successfully?") and retry-eligibility checks ("how many consecutive failures in the last N minutes?").

Why this priority: Without attempt history, the NagCache only records preference state (snooze, always_upgrade, never_ask) and the current remote version. There is no way to distinguish a first failure from a repeated transient error, or to detect that an identical upgrade already succeeded earlier in the same session.

Independent Test: Testable by injecting a fake upgrade runner, triggering two successive upgrade calls with the same install method and target version (first call succeeds, second call is a no-op idempotency hit), and asserting the history store contains exactly one success record and one idempotency record — plus a UvToolInstallationVerified event with confidence=HIGH after the first call.

Acceptance Scenarios:

1. Given a uv-tool install and a successful auto-upgrade, When the upgrade runner completes with exit code 0, Then a UvToolInstallationVerified event is emitted carrying the receipt path, entrypoint_match=True, and confidence=HIGH, AND an attempt record (outcome=success, exit_code=0) is appended to the history store. 2. Given two consecutive upgrade attempts with identical install_method and target_version, When the history store is queried, Then is_idempotent(attempt) returns True for the second attempt if the first was a success. 3. Given three consecutive failed attempts in a 5-minute window, When the history store is queried for retry eligibility, Then the store reports consecutive_failures=3 and the caller can apply a backoff policy without re-reading the NagCache. 4. Given a non-uv-tool install (e.g. pipx), When a successful upgrade completes, Then an attempt record is appended to the history store but no UvToolInstallationVerified event is emitted (the event is uv-tool specific).


User Story 4 — Step-by-step migration with no regressions (Priority: P2)

A developer landing the migration in review wants to merge each step independently — types only, then adapter, then planner, then each deletion — with the full test suite green at every merge point. No big-bang rewrite. After this mission, the 7-step strangler plan (see FR-016 through FR-022) defines discrete, independently shippable units where each step builds on the last and existing call sites are never broken mid-migration.

Why this priority: The duplication spans files with high test coverage and active call sites. A single-step rewrite carries substantial regression risk and makes reviews harder. The strangler plan ensures reviewers see exactly the scope of each deletion.

Independent Test: Testable by creating a branch per step, running the full test suite (including the architectural terminology guard), and verifying each branch is independently green before the next step begins.

Acceptance Scenarios:

1. Given step 1 is merged (types introduced, no behavior change), When the existing test suite runs, Then all tests pass including tests for review/__init__.py, upgrade_ux.py, and upgrade_hint.py call paths. 2. Given step 2 is merged (receipt adapter and detect_runtime() live, shim in place), When any of the 7 existing detect_install_method() call sites runs, Then it returns the same InstallMethod value as before. 3. Given step 4 is merged (review migration), When review is called on a uv-tool install, Then the reinstall command emitted matches the pre-migration output byte-for-byte.


Edge Cases

  • What happens when the uv receipt exists but lists no requirements matching spec-kitty-cli? UvReceiptReader returns requirements=() and receipt_path=<path> (receipt found but no spec-kitty entry).
  • What happens when UV_TOOL_DIR is set to a non-default path but the receipt is missing? detect_runtime() records tool_dir from the env var and receipt_path=None — no attempt to read a non-existent receipt.
  • What happens when RemediationCommand.render() generates a string that fails CHK028 validation (e.g., a path containing shell metacharacters)? Construction raises ValueError at render time with a CHK028 violation message; the caller falls back to a MANUAL_GUIDANCE command with a safe note.
  • What happens when the history store file is corrupted or unreadable? Reads return an empty history (fail-open for queries); writes are best-effort with silent swallow (fail-safe for appends), matching the NagCache pattern.
  • What happens when plan_remediation() is called with intent=REINSTALL_WITH_TEST for a non-uv-tool install? It returns a RemediationCommand with intent=REINSTALL_WITH_TEST, argv=None, and note describing the appropriate manual steps for that install method.
  • What happens on Windows when the history store path uses a separator incompatible with the POSIX path construction? All history store paths are constructed via pathlib.Path, not string concatenation.

Requirements (mandatory)

Functional Requirements

IDTitleUser StoryPriorityStatus
FR-001InstalledCliRuntime domain typeAs a developer, I want a single frozen dataclass representing the full installed-CLI snapshot (install_method, executable, receipt_path, tool_dir, bin_dir, is_default_tool_dir, is_default_bin_dir, python, requirements, package_source, platform, safe_for_auto_upgrade) so no caller reconstructs these fields independently.HighOpen
FR-002detect_runtime() public functionAs a developer, I want detect_runtime() to return an InstalledCliRuntime by enriching the existing detection chain, so all receipt/dir/python fields are populated in one call.HighOpen
FR-003detect_install_method() backward-compatible shimAs a developer, I want detect_install_method() to remain callable and return runtime.install_method so all 7 existing call sites stay green without changes until step 7.HighOpen
FR-004RemediationCommand domain typeAs a developer, I want a frozen dataclass RemediationCommand with intent (UPGRADE / REINSTALL_WITH_TEST / MANUAL_GUIDANCE), argv (tuple or None), env (Mapping), and note (str or None) so subprocess and display consumers share one type.HighOpen
FR-005RemediationCommand.render(platform)As a developer, I want render(platform) to emit an env-prefixed, CHK028-validated, platform-appropriate quoted string (shlex on POSIX, PowerShell-safe on Windows) so the copy-paste string and the subprocess argv are derived from the same source.HighOpen
FR-006UpgradeHint thin façadeAs a developer, I want UpgradeHint.command to delegate to RemediationCommand.render() so the Plan.upgrade_hint JSON contract and all UpgradeHint consumers receive the same rendered string as before migration.HighOpen
FR-007UvReceiptReader adapterAs a developer, I want a single UvReceiptReader in compat/_adapters/uv_receipt.py that reads and parses uv-receipt.toml, returning all fields needed by InstalledCliRuntime (receipt_path, tool_dir, bin_dir, python, requirements), so the three independent receipt-reading implementations are eliminated.HighOpen
FR-008Receipt adapter is sole receipt-reading pathAs a developer, I want the three legacy local _active_uv_tool_receipt / _active_uv_tool_dir / _same_path / _uv_tool_python_args implementations deleted after migration, so no duplicate receipt logic remains.HighOpen
FR-009plan_remediation() pure functionAs a developer, I want plan_remediation(runtime, intent, target_version) -> RemediationCommand as a pure function (no I/O, deterministic) so all five construction sites route through one place.HighOpen
FR-010All five remediation sites routed through plannerAs a user, I want review/__init__.py, upgrade_ux.py, upgrade_hint.py, version_checker.py, and schema_version.py to all call plan_remediation() so the output is structurally identical for the same install method and intent.HighOpen
FR-011build_upgrade_hint() reimplemented on plannerAs a developer, I want build_upgrade_hint(install_method, *, package, target_version) to be reimplemented on top of plan_remediation() while preserving its public signature and UpgradeHint output type.HighOpen
FR-012UvToolInstallationVerified eventAs an operator, I want a UvToolInstallationVerified event emitted after each uv-tool upgrade attempt, carrying receipt_path, entrypoint_match, package_binding, and confidence (LOW / MEDIUM / HIGH), so post-upgrade verification is auditable.MediumOpen
FR-013Upgrade-attempt history storeAs an operator, I want a dedicated history store (separate from NagCache) that records each upgrade attempt with timestamp, install_method, intent, outcome, exit_code, and attempt_id, so idempotency and retry-eligibility queries are possible without parsing logs.MediumOpen
FR-014Upgrade runner emits event and history recordAs an operator, I want _default_upgrade_runner (in upgrade_ux.py) to emit a UvToolInstallationVerified event (for uv-tool installs) and append an UpgradeAttemptRecord to the history store on every completion (success or failure).MediumOpen
FR-015History store query interfaceAs an operator, I want the history store to expose queries for idempotency (was this exact upgrade already completed?), consecutive-failure count, and timestamp of last success, without reading the NagCache.MediumOpen
FR-016Migration step 1 — introduce typesAs a developer, I want InstalledCliRuntime, RemediationCommand, and UvToolInstallationVerified introduced with no behavior change so the new types can be reviewed in isolation.HighOpen
FR-017Migration step 2 — receipt adapter and detect_runtime()As a developer, I want UvReceiptReader and detect_runtime() introduced and the detect_install_method() shim in place so all 7 existing call sites remain green.HighOpen
FR-018Migration step 3 — route planner, preserve hint contractAs a developer, I want plan_remediation() introduced and build_upgrade_hint() reimplemented on it while UpgradeHint, _HINT_TABLE, CHK028, and Plan.upgrade_hint JSON contract remain unchanged.HighOpen
FR-019Migration step 4 — migrate review/__init__.pyAs a developer, I want the ~120 LOC of duplicate helpers (_active_uv_tool_receipt, _active_uv_tool_dir, _same_path, _uv_tool_python_args, _uv_tool_reinstall_command, _uv_tool_env_prefix, _uv_tool_env_values, _same_path, _powershell_quote) in review/__init__.py replaced by calls to UvReceiptReader and plan_remediation().HighOpen
FR-020Migration step 5 — migrate upgrade_ux.pyAs a developer, I want the duplicate helpers in upgrade_ux.py deleted and _default_upgrade_runner updated to consume RemediationCommand.argv and .env directly, with event emission and history record appended.HighOpen
FR-021Migration step 6 — fold hardcoded strings (optional)As a developer, I want the hardcoded "pipx upgrade spec-kitty-cli" strings in version_checker.py (line 218) and schema_version.py (line 182) routed through plan_remediation() so no install-method strings are hardcoded outside the planner.LowOpen
FR-022Migration step 7 — retire detect_install_method() shimAs a developer, I want the detect_install_method() shim retired and all remaining call sites updated to detect_runtime().install_method so the legacy bare-enum API is removed.LowOpen

Non-Functional Requirements

IDTitleRequirementCategoryPriorityStatus
NFR-001detect_runtime() never-raise contractdetect_runtime() must never raise; every probe wrapped in try/except … # noqa: BLE001 with silent fall-through to defaults, identical to the existing CHK032 guarantee on detect_install_method().ReliabilityHighOpen
NFR-002CHK028 validation on renderRemediationCommand.render() must raise ValueError at render time if the composed string does not match ^[A-Za-z0-9 .\-+_/=:]{1,128}$ (CHK028), preventing shell metacharacters from reaching display surfaces.SecurityHighOpen
NFR-003UvReceiptReader fail-softUvReceiptReader must return None (or empty fields) rather than raising on any filesystem error, TOML parse error, or schema mismatch.ReliabilityHighOpen
NFR-004plan_remediation() is pureplan_remediation() must be a pure function — no I/O, no side effects, deterministic output for the same inputs — so it is fully unit-testable without filesystem or subprocess stubs.TestabilityHighOpen
NFR-005History store idempotencyTwo history records with identical attempt_id must be deduplicated on read; the store must not grow unboundedly from retried attempts.ReliabilityMediumOpen
NFR-006History store concurrent-write safetyThe history store must tolerate concurrent invocations without corruption; atomic-replace or append-only write semantics required.ReliabilityMediumOpen
NFR-007No PII in events or history recordsUvToolInstallationVerified events and UpgradeAttemptRecord entries must contain no user paths, project slugs, hostnames, or machine IDs — extending CHK007/CHK048/CHK050.SecurityHighOpen
NFR-008Each migration step independently test-gatedEach of FR-016 through FR-022 must pass the full test suite (including tests/architectural/test_no_legacy_terminology.py) before the next step begins; no step introduces a regression.QualityHighOpen
NFR-009History store separate from NagCacheThe history store must use a separate file from upgrade-nag.json; NagCache schema must not be extended for attempt history.ArchitectureHighOpen
NFR-010Zero ruff/mypy issues in new codeAll new modules must pass ruff check and mypy with zero issues and zero warnings; no blanket # noqa or # type: ignore suppressions permitted (per project code-style policy).QualityHighOpen

Constraints

IDTitleConstraintCategoryPriorityStatus
C-001detect_install_method() shim preserved through step 6detect_install_method() must remain callable and return the correct InstallMethod enum from step 1 through step 6 (FR-016 through FR-021); it is removed only in step 7 (FR-022). No breaking API change before step 7.CompatibilityHighOpen
C-002UpgradeHint / _HINT_TABLE / CHK028 / Plan.upgrade_hint contract unchangedThe UpgradeHint type, _HINT_TABLE dict, CHK028 validation regex, and the Plan.upgrade_hint rendered-JSON contract must be functionally identical throughout all migration steps; no consumer of build_upgrade_hint() or Plan.upgrade_hint sees a behavior change.CompatibilityHighOpen
C-003No structural Protocol interfaces without a second implementationStructural Protocol classes must not be introduced unless a second concrete implementation exists at time of introduction, consistent with the existing compat/provider.py convention (LatestVersionProvider has PyPIProvider, NoNetworkProvider, FakeLatestVersionProvider).ArchitectureHighOpen
C-004New modules use frozen dataclasses + from __future__ import annotations + deferred importsAll new modules must follow the conventions established in the compat/ layer: frozen dataclasses, from __future__ import annotations at the top, deferred imports inside functions where needed to avoid circular-import issues.ArchitectureHighOpen
C-005PowerShell rendering branch preservedRemediationCommand.render("windows") must emit PowerShell-safe quoting for all commands that currently use _powershell_quote() in review/__init__.py; the Windows rendering path must not regress.CompatibilityHighOpen
C-006safe_for_auto_upgrade encodes the existing 5-method whitelistInstalledCliRuntime.safe_for_auto_upgrade must encode the same membership as _SAFE_AUTO_UPGRADE_METHODS in install_method.py (PIPX, UV_TOOL, BREW, PIP_USER, PIP_SYSTEM); no change to whitelist membership.CompatibilityHighOpen
C-007Receipt adapter targets the pre-staged directoryThe UvReceiptReader adapter must be placed in compat/_adapters/uv_receipt.py; the pre-staged compat/_adapters/__init__.py is the designated home and must not be relocated.ArchitectureMediumOpen
C-008History store schema design is a gated deliverableThe exact history store schema (fields, file format, retention policy) must be finalized and reviewed before FR-013 implementation begins; the schema design and blast-radius assessment are required inputs to the step-2 work package. See open question in Assumptions.RiskHighOpen

Key Entities (domain data)

  • InstalledCliRuntime: Immutable snapshot of a spec-kitty-cli installation. Carries: install_method (reuses InstallMethod enum), executable path, receipt_path, tool_dir, bin_dir, is_default_tool_dir, is_default_bin_dir, python override, requirements (tuple of UvRequirement), package_source (derived: pypi-specifier | git | url | directory | editable | path), platform (posix | windows), and safe_for_auto_upgrade flag.
  • UvRequirement: A single requirement entry from a uv receipt. Fields mirror the uv receipt TOML schema: name, specifier, directory, editable, path, git, url (all optional except name).
  • RemediationCommand: A fully specified remediation action. Fields: intent (UPGRADE | REINSTALL_WITH_TEST | MANUAL_GUIDANCE), argv (tuple of str or None), env (Mapping[str, str]), note (str or None). The render(platform) method emits a CHK028-validated, env-prefixed, platform-quoted string for display consumers.
  • UvReceiptReader: An adapter that reads and parses a uv-receipt.toml file given a tool-env directory, returning all InstalledCliRuntime fields derivable from the receipt. The single authoritative uv-receipt parser; replaces the three independent implementations.
  • UvToolInstallationVerified: An event (frozen dataclass) emitted after a uv-tool upgrade attempt. Carries: receipt_path (Path or None), entrypoint_match (bool), package_binding (str), confidence (LOW | MEDIUM | HIGH).
  • UpgradeAttemptRecord: A single history entry. Fields: attempt_id (str), timestamp (ISO datetime), install_method (InstallMethod), intent (str), outcome (success | failure | aborted), exit_code (int or None).
  • UpgradeHint: Existing frozen dataclass (unchanged). After migration it becomes a thin façade: command delegates to RemediationCommand.render(). The _HINT_TABLE and CHK028 validation are preserved.

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: A single call to detect_runtime() returns all runtime context needed by any of the five legacy remediation sites; zero modules read the uv receipt more than once per invocation after migration.
  • SC-002: The three independent receipt-parsing helper sets (_active_uv_tool_receipt, _active_uv_tool_dir, _same_path, _uv_tool_python_args and their variants) are fully deleted after migration steps 4 and 5; a codebase search for these function names returns zero results outside the new UvReceiptReader implementation.
  • SC-003: For every install method and intent combination, plan_remediation().render() produces byte-for-byte identical output to what the corresponding legacy site produced before migration (verified by snapshot tests committed in step 3).
  • SC-004: The upgrade-attempt history store supports idempotency and retry-eligibility queries with O(n) scan on at most the last 100 records; no external database is required.
  • SC-005: All existing tests remain green after each individual migration step (FR-016 through FR-022); no step introduces a red-test window exceeding a single commit.
  • SC-006: UpgradeHint consumers (Plan.upgrade_hint JSON, upgrade command display, review command display) continue to receive the same rendered command string before and after migration for every install method in _HINT_TABLE.

Assumptions

1. The history store can be implemented as an append-only JSONL file (or SQLite sibling table — see mission 047 OfflineQueue precedent) under the same ~/.spec-kitty/ XDG directory as upgrade-nag.json, with no schema migration of existing NagCache data. 2. The UvReceiptReader output fields are stable for the versions of uv currently in the integration test matrix; no uv-receipt schema migration is in scope. 3. FR-021 (routing the hardcoded strings in version_checker.py and schema_version.py through plan_remediation()) is independently shippable and may be deferred to a follow-up mission if it increases step 6 scope materially. 4. The detect_runtime() detection chain for non-uv-tool install methods (pipx, brew, pip-user, pip-system, source, unknown) returns InstalledCliRuntime with receipt_path=None and requirements=(); no additional probing is required for those methods beyond what detect_install_method() already performs. 5. RemediationCommand.render() is responsible for CHK028 validation of the final string; individual argv components do not need to be pre-validated before assembly.

[NEEDS CLARIFICATION: History store persistence substrate — SQLite sibling table (consistent with OfflineQueue precedent from mission 047) vs. JSONL append-only file. Determination affects schema design, retention policy, concurrent-write strategy, and whether a migration is needed for existing installations. Must be resolved before FR-013 implementation begins.] <!-- decision_id: 01KW0NR6_HISTORY_STORE_SUBSTRATE -->


Out of Scope

  • Unrelated install-method detection changes beyond enriching detect_runtime() with fields already inspected by the current detection chain.
  • Rewriting or changing the Plan planner decision logic (compat/planner.py) other than the upgrade_hint field.
  • Changing the NagCache schema or its upgrade-nag.json file format.
  • Introducing new install methods (e.g., snap, conda) — InstallMethod enum membership is unchanged.
  • Changing the public build_upgrade_hint() signature or the UpgradeHint type's public fields.
  • Migrating version_checker.py and schema_version.py hardcoded strings (FR-021) if the scope proves excessive — this is explicitly optional.