Research: Phase 6 Composition Stabilization
Mission: phase6-composition-stabilization-01KQ2JAS Created: 2026-04-25 Method: Read the relevant source and tests under /Users/robert/spec-kitty-dev/786-793-794-phase6-stabilization/spec-kitty/. Treat existing behavior as ground truth. No new external dependencies.
R1 — Where does the legacy fall-through happen? (#786)
Decision
The fall-through is in src/specify_cli/next/runtime_bridge.py::decide_next_via_runtime(...). After _dispatch_via_composition(...) returns None on the success path, control falls through to runtime_next_step(...), which both advances run state AND executes the legacy DAG action handler.
Evidence
_dispatch_via_composition(...)atruntime_bridge.py:393–486:- calls
StepContractExecutor(repo_root).execute(context)at line 441; - calls
_check_composed_action_guard(...)at lines 480–482; - returns
Noneon success at line 485. decide_next_via_runtime(...)atruntime_bridge.py:981:- the call to
_dispatch_via_compositionhappens above; - on
None, control falls through toruntime_next_step(...)at lines 986–991.
Rationale for the fix
Make _dispatch_via_composition(...) return a Decision (not None) on success, constructed from a new helper that advances run-state without running the legacy DAG action handler. Treat success as a terminating dispatch in decide_next_via_runtime(...). This is the option the brief recommends ("add a composition-specific advancement path"), and it minimizes changes to the legacy non-composed code path.
Alternatives considered
- Always-return-tuple: change
_dispatch_via_composition(...)totuple[bool, Decision | None]. Rejected because it changes a function-private contract that other callers may depend on (more blast radius for the same outcome). - Edit
mission-runtime.yamlto remove the legacy step. Rejected by FR-004 unless tests prove there is no other correct implementation; tests will pass with the helper approach, so this option is not exercised.
R2 — Where is the tasks guard? (#786 sub-question)
Decision
The fixed tasks guard is keyed by legacy_step_id and is invoked from _check_composed_action_guard(...) AFTER composition runs (already on main). Keep it exactly where it is.
Evidence
runtime_bridge.py_check_composed_action_guard(...)is invoked at lines 480–482, AFTER the composer runs. The branching bylegacy_step_idfortasks_outline/tasks_packages/tasks_finalize/tasksis the P0 fix already onmain(perstart-here.md).- The guard tests already cover the three terminal states:
tasks_outline→tasks.md;tasks_packages→tasks.md+ ≥1tasks/WP*.md;tasks_finalize/tasks→ terminaldependencies:block.
Rationale
The advancement helper introduced for FR-001 must run AFTER _check_composed_action_guard(...). We do not move or modify the guard.
R3 — complete_invocation(...) outcome values (#793)
Decision
Use outcome="done" on success, outcome="failed" on exception. Do not introduce a new outcome value.
Evidence
src/specify_cli/invocation/record.py:34:outcome: Literal["done", "failed", "abandoned"] | None.src/specify_cli/invocation/executor.py:218–225:complete_invocation(...)accepts that same set."abandoned"is reserved for user-initiated cancellation; this mission does not generate that case.
Rationale
start-here.mdexplicitly says: "Use an outcome value that matches current trail semantics, and document the reasoning in tests or comments if needed."- Adding a new outcome value would be unprincipled scope creep and would break the JSONL contract for downstream consumers.
- A one-line code comment at the close site captures the "completion does not imply the host LLM did the requested generation" semantic.
Alternatives considered
- Invent a new outcome
"composed": rejected (scope creep, breaksLiteralcontract). - Skip closing on success: rejected (re-introduces the bug).
- Always
"abandoned"for composed steps: rejected (semantic mismatch — abandonment is not what happened).
R4 — invoke(...) extension shape (#794)
Decision
Add *, action_hint: str | None = None (keyword-only). Use it inside the if profile_hint is not None: branch; if truthy, set action = action_hint; otherwise call _derive_action_from_request(...) as today. The else (router-backed) branch is untouched.
Evidence
``python def invoke( self, request_text: str, profile_hint: str | None = None, actor: str = "unknown", mode_of_work: ModeOfWork | None = None, ) -> InvocationPayload: ``
- Current signature at
src/specify_cli/invocation/executor.py:113–119: - The action derivation at line 132 only happens inside
if profile_hint is not None:. The router branch at lines 134–139 already setsaction = result.actionfrom the router decision and does not call the deriver — so the new kwarg has no semantic interaction with the router branch. - The action is stored in
InvocationRecord.action(line 185) and written to the started JSONL viawrite_started(record)(line 194). Governance context assembly reads from the same record.
Rationale
- Keyword-only avoids any positional drift in existing callers.
- Truthiness check (
if action_hint:) treats empty string as legacy-fallback (EDGE-005 in spec). - Limiting the new behavior to the
profile_hint-branch ensures the router-backed callers (advise/ask/do) cannot regress (FR-012).
Alternatives considered
- Make
action_hintpositional afterprofile_hint: rejected (positional drift risk). - Apply
action_hintin the router branch too: rejected (out of spec; router decisions own their action key by design). - Replace
_derive_action_from_request(...)entirely: rejected (legacy behavior is required for non-composition callers).
R5 — Where to close the invocation lifecycle (#793 sub-question)
Decision
Close at the call site in StepContractExecutor.execute(...). Wrap the per-step invoke(...) + composed-step body in a try/except/else; call complete_invocation(payload.invocation_id, outcome=...) in both branches.
Evidence
src/specify_cli/mission_step_contracts/executor.py:135–192(StepContractExecutor.execute(...)):invoke(...)is called at line 159; payloads are collected intostep_results.- No call to
complete_invocation(...)follows. No try/except around the call. complete_invocation(...)is the public method onProfileInvocationExecutorand writes the second JSONL line via the existing writer; no new writer is needed.
Rationale
- The composer is the closest scope that knows when a composed step has finished or raised. Closing here keeps lifecycle ownership at the same level as the start (FR-008).
- Using
complete_invocation(...)keeps all writes inside the executor/writer boundary (C-006/C-007). - Multi-step composed actions naturally pair each invocation independently (EDGE-004).
Alternatives considered
- Close in
runtime_bridge.py: rejected — the bridge does not know about per-step invocations and it would couple unrelated layers. - Close in
ProfileInvocationExecutor.invoke(...)itself via context manager: rejected —invoke(...)cannot know when the host-facing step is done; the composer is the right scope.
R6 — Test surface (FR-015 / FR-016 / FR-017)
Decision
Use the existing four test files; add new test functions; do not rename or split them.
Evidence and Rationale
tests/specify_cli/next/test_runtime_bridge_composition.pyalready covers_dispatch_via_composition,_should_dispatch_via_composition,_check_composed_action_guard. Adding negative-condition tests is a natural extension of existing scaffolding.tests/specify_cli/mission_step_contracts/test_software_dev_composition.pyalready loads the shippedtasksstep contract and resolves the default profile; it is the right home foraction_hintpass-through tests and lifecycle close tests on real composition flows.tests/specify_cli/invocation/test_invocation_e2e.pyalready verifiesstarted+complete_invocationJSONL pairing for non-composed invocations; it is the right home for the newaction_hintparametrized tests and for the success/failure pairing assertions seen by the executor.tests/specify_cli/invocation/test_writer.pyis unit-level for the writer; it should remain untouched in this tranche unless the writer surface is changed (it is not).
R7 — spec-kitty-runtime environment (per start-here.md)
Decision
Confirm spec-kitty-runtime is installable from the sibling checkout if missing in the env. Do NOT patch around environment-only import failures in source.
Rationale
start-here.md explicitly: "If spec-kitty-runtime is missing in the environment, install it from the sibling checkout or the expected package source before running runtime bridge tests. Do not patch around an environment-only import failure." The plan respects this; if a test session cannot import spec_kitty_runtime, fix the environment, not the code.
Open Questions
None. All clarification points are resolved by the source inspection above plus the explicit guidance in start-here.md.