Spec Kitty

└─ kitty-specs
   └─ Phase 6 Composition Stabilization

Contracts

invocation_executor_invoke.md

Contract: `ProfileInvocationExecutor.invoke(...)`

Source file: src/specify_cli/invocation/executor.py
Spec coverage: FR-009, FR-010, FR-011, FR-012, FR-013, EDGE-005

Signature (post-change)

def invoke(
    self,
    request_text: str,
    profile_hint: str | None = None,
    actor: str = "unknown",
    mode_of_work: ModeOfWork | None = None,
    *,
    action_hint: str | None = None,
) -> InvocationPayload:
    ...

The * separator makes action_hint keyword-only. It defaults to None, so every existing call site that omits it behaves exactly as before.
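As a minimal sketch of what the keyword-only marker means for callers, the stand-in function below copies the parameter list from the signature above. The body, the return value, and the use of a plain string in place of ModeOfWork are assumptions made only so the snippet runs on its own.

```python
from __future__ import annotations


def invoke(
    request_text: str,
    profile_hint: str | None = None,
    actor: str = "unknown",
    mode_of_work: str | None = None,  # stand-in for ModeOfWork (assumption)
    *,
    action_hint: str | None = None,
) -> dict[str, str | None]:
    """Hypothetical stand-in: echoes the two hints instead of invoking anything."""
    return {"profile_hint": profile_hint, "action_hint": action_hint}


# An existing positional call site is unaffected: action_hint stays None.
legacy = invoke("write the spec", "architect", "cli", None)

# A new call site has to name the argument.
hinted = invoke("write the spec", "architect", action_hint="specify")

# A fifth positional argument is rejected rather than silently bound.
try:
    invoke("write the spec", "architect", "cli", None, "specify")
    positional_rejected = False
except TypeError:
    positional_rejected = True
```

Because the new parameter cannot be reached positionally, no existing call can bind a value to it by accident.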

Behavioral Matrix

Inputs	Branch entered	Action source
`profile_hint` set, `action_hint` is a non-empty string	`profile_hint`-branch	`action_hint` verbatim
`profile_hint` set, `action_hint` is `None`	`profile_hint`-branch	`_derive_action_from_request(request_text, profile.role)` (legacy)
`profile_hint` set, `action_hint == ""`	`profile_hint`-branch	`_derive_action_from_request(...)` (legacy fallback per EDGE-005)
`profile_hint` not set	router-backed branch	`result.action` from router decision (legacy, unchanged)
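The matrix can be read as a three-way decision. The sketch below is illustrative only: choose_action and its two callback parameters are hypothetical names, and treating "not set" as `profile_hint is None` is an assumption taken from invariant 3 rather than from the real executor code.

```python
from __future__ import annotations

from typing import Callable


def choose_action(
    profile_hint: str | None,
    action_hint: str | None,
    derive_from_request: Callable[[], str],
    route: Callable[[], str],
) -> str:
    """Return the action key according to the behavioral matrix."""
    if profile_hint is None:
        # Router-backed branch: action_hint is ignored entirely.
        return route()
    if action_hint:
        # Non-empty hint is passed through verbatim.
        return action_hint
    # None and "" both fall back to the legacy derivation (EDGE-005).
    return derive_from_request()


def derived() -> str:
    return "derived-verb"


def routed() -> str:
    return "routed-verb"


row_1 = choose_action("architect", "specify", derived, routed)
row_2 = choose_action("architect", None, derived, routed)
row_3 = choose_action("architect", "", derived, routed)
row_4 = choose_action(None, "specify", derived, routed)
```

Each row_n variable corresponds to the matching row of the table, top to bottom.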

Invariants

1. Backwards compatibility: any call site that does not pass action_hint produces byte-identical output to the pre-change code (FR-011, FR-012).
2. Hint truthiness: empty-string action_hint is treated as if the kwarg were not supplied (EDGE-005). Non-empty strings are passed through verbatim.
3. Router branch is untouched: action_hint has no effect when profile_hint is None (FR-012).
4. InvocationRecord.action is set to the chosen action key in all branches (already true; this contract just changes which value is chosen).
5. Governance context assembly reads action from the record; therefore FR-013 follows automatically when FR-010 holds.

Test Surface

Test name	File	Asserts
`test_invoke_with_action_hint_and_profile_hint_records_hint[specify|plan|tasks|…`	…	…
`test_invoke_profile_hint_only_falls_back_to_derived_action`	`test_invocation_e2e.py`	`invoke(profile_hint=...)` without `action_hint` records the role-default verb
`test_invoke_empty_action_hint_falls_back`	`test_invocation_e2e.py`	`action_hint=""` is treated as legacy fallback
`test_invoke_router_branch_unchanged_with_action_hint`	`test_invocation_e2e.py`	When `profile_hint` is `None`, `action_hint` is ignored
existing advise/ask/do tests	`test_invocation_e2e.py` and friends	Continue to pass without modification (FR-012)

Failure Modes

Type mismatch on action_hint: callers passing a non-string non-None value will fail mypy --strict; runtime TypeError is acceptable for clearly-buggy callers (no runtime validation needed beyond what mypy enforces).
Bogus action string: this contract does not validate the value of action_hint against any allow-list; downstream consumers (governance context, trail format) accept any string. Validation is the caller's responsibility.

Non-Goals

Validating action_hint against a fixed enum of contract actions.
Changing the InvocationRecord shape, the JSONL format, or the writer surface.
Affecting the router-backed branch.

runtime_bridge_dispatch.md

Contract: Runtime Bridge Dispatch (composition vs. legacy)

Source file: src/specify_cli/next/runtime_bridge.py
Spec coverage: FR-001, FR-002, FR-003, FR-004, FR-005, FR-015, EDGE-002, EDGE-003

Public Surface

def decide_next_via_runtime(...) -> Decision:
    ...

The Decision shape (defined in src/specify_cli/next/decision.py) is unchanged by this mission. No new fields. No removed fields.

Single-Dispatch Invariant

For every call to decide_next_via_runtime(...):

> Exactly one of {composition path, legacy DAG path} executes per action attempt.

Concretely, when _should_dispatch_via_composition(action) == True:
1. _dispatch_via_composition(...) runs (composer + guard + advancement helper).
2. The returned Decision is yielded immediately.
3. runtime_next_step(...) is NOT called.

When _should_dispatch_via_composition(action) == False:
1. runtime_next_step(...) runs (legacy path) exactly as today.
2. _dispatch_via_composition(...) is NOT entered.
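The either/or rule can be sketched with two recording stubs. Everything here is a stand-in: the COMPOSED set, the string return values, and the action name some_legacy_action are hypothetical, and the real functions take more arguments than a single action string.

```python
from __future__ import annotations

calls: list[str] = []

# Hypothetical membership test; the real predicate is
# _should_dispatch_via_composition(action).
COMPOSED = frozenset({"specify"})


def dispatch_via_composition(action: str) -> str:
    calls.append(f"composition:{action}")
    return f"decision-for-{action}"


def runtime_next_step(action: str) -> str:
    calls.append(f"legacy:{action}")
    return f"decision-for-{action}"


def decide_next(action: str) -> str:
    """Exactly one of the two paths runs per action attempt."""
    if action in COMPOSED:
        # Return immediately; the legacy path is never reached.
        return dispatch_via_composition(action)
    return runtime_next_step(action)


decide_next("specify")
decide_next("some_legacy_action")
```

After both calls, the log holds one entry per action, which is the property the single-dispatch tests assert.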

State Diagram

decide_next_via_runtime(action)
        |
        v
   _should_dispatch_via_composition(action) ?
        |
   yes  v                              no  v
   _dispatch_via_composition(action)        runtime_next_step(action)
        |                                     |
        v                                     v
   StepContractExecutor.execute(...)          (unchanged)
   _check_composed_action_guard(...)          |
   _advance_run_state_after_composition(...)  |
        |                                     |
        v                                     v
        Decision  (composition)               Decision  (legacy)

New Helper

def _advance_run_state_after_composition(
    repo_root: Path,
    action: str,
    ...,
) -> Decision:
    """Advance run state, lane events, and prompt progression for a composition-backed action.

    Reuses the same primitives as runtime_next_step(...) for state/lane/prompt
    progression but does NOT invoke the legacy DAG action handler — that is the
    point of single-dispatch (FR-001).
    """

The helper:

Emits lane status events via the same SyncRuntimeEventEmitter.
Records mission-state advancement via the same primitive that runtime_next_step(...) uses.
Computes the next public step.
Returns a Decision whose shape is identical to what runtime_next_step(...) would produce for the same advance — but with no legacy action dispatch performed.

Ordering Within `_dispatch_via_composition(...)`

The order within the composition path is fixed:

1. StepContractExecutor.execute(context): the composer runs the contract steps via ProfileInvocationExecutor.invoke(...).
2. _check_composed_action_guard(...): the fixed tasks guard (keyed by legacy_step_id) runs after composition. Unchanged.
3. _advance_run_state_after_composition(...): new step.
4. Return Decision.

Step 2 must remain before step 3. Step 3 must run only on a successful step 1 + step 2.
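A rough sketch of that ordering, under stated assumptions: the three callbacks stand in for the real functions, the Decision is modeled as a plain string, and a guard failure is modeled as a returned message (the source only says the guard returns a structured failure, not what it looks like).

```python
from __future__ import annotations

from typing import Callable, Optional


def dispatch_via_composition(
    execute: Callable[[], None],
    check_guard: Callable[[], Optional[str]],
    advance: Callable[[], str],
) -> str:
    execute()                  # step 1; an exception here skips steps 2 and 3
    failure = check_guard()    # step 2 always precedes step 3
    if failure is not None:
        return f"blocked: {failure}"
    return advance()           # step 3 runs only after steps 1 and 2 succeed


log: list[str] = []


def execute() -> None:
    log.append("execute")


def guard_ok() -> Optional[str]:
    log.append("guard")
    return None


def guard_fail() -> Optional[str]:
    log.append("guard")
    return "tasks.md missing"


def advance() -> str:
    log.append("advance")
    return "next-step"


result_ok = dispatch_via_composition(execute, guard_ok, advance)
ok_log = list(log)

log.clear()
result_blocked = dispatch_via_composition(execute, guard_fail, advance)
blocked_log = list(log)
```

In the blocked run the advancement stub is never called, which mirrors the requirement that step 3 depends on a successful step 2.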

`tasks` Guard Semantics (FR-003)

The fixed tasks guard is keyed by legacy_step_id:

`legacy_step_id`	Required state
`tasks_outline`	`tasks.md` exists
`tasks_packages`	`tasks.md` exists AND ≥1 `tasks/WP*.md` exists
`tasks_finalize` (and the public `tasks` step)	terminal state including a `dependencies:` block

This contract does not change. Existing tests for these branches must continue to pass without modification.
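For orientation, the first two rows of the table could be checked roughly as follows. This is not the project's guard: tasks_guard_satisfied and feature_dir are hypothetical names, and the sketch assumes both tasks.md and tasks/ sit directly under one directory. The tasks_finalize row is deliberately left out because the table does not say where the dependencies: block lives.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def tasks_guard_satisfied(feature_dir: Path, legacy_step_id: str) -> bool:
    """Check the file-existence rows of the guard table (sketch only)."""
    has_tasks_md = (feature_dir / "tasks.md").is_file()
    if legacy_step_id == "tasks_outline":
        return has_tasks_md
    if legacy_step_id == "tasks_packages":
        tasks_dir = feature_dir / "tasks"
        has_work_package = tasks_dir.is_dir() and any(tasks_dir.glob("WP*.md"))
        return has_tasks_md and has_work_package
    # tasks_finalize is not sketched: its required state is not fully specified here.
    raise NotImplementedError(legacy_step_id)


with tempfile.TemporaryDirectory() as tmp:
    feature_dir = Path(tmp)
    outline_before = tasks_guard_satisfied(feature_dir, "tasks_outline")

    (feature_dir / "tasks.md").write_text("# Tasks\n")
    outline_after = tasks_guard_satisfied(feature_dir, "tasks_outline")
    packages_before = tasks_guard_satisfied(feature_dir, "tasks_packages")

    (feature_dir / "tasks").mkdir()
    (feature_dir / "tasks" / "WP01.md").write_text("# WP01\n")
    packages_after = tasks_guard_satisfied(feature_dir, "tasks_packages")
```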

Test Surface

Test name	File	Asserts
`test_composition_success_skips_legacy_dispatch[<action>]`	`test_runtime_bridge_composition.py`	parametrized over the 5 composed actions; legacy dispatch entry point is not called after composition success (FR-001/FR-015)
`test_composition_success_advances_run_state_and_lane_events`	`test_runtime_bridge_composition.py`	lane events emitted; `Decision` reflects progression to next public step (FR-002)
`test_decision_shape_unchanged_for_composed_action`	`test_runtime_bridge_composition.py`	`Decision` field set is identical to legacy-path baseline (FR-005)
existing `test_tasks_*_guard_*` tests	`test_runtime_bridge_composition.py`	continue to pass unchanged (FR-003)
`test_non_composed_action_uses_legacy_runtime_next_step`	`test_runtime_bridge_composition.py`	EDGE-002 — `runtime_next_step(...)` still runs for non-composed actions
`test_advancement_helper_failure_propagates_no_legacy_fallback`	`test_runtime_bridge_composition.py`	EDGE-003 — when the helper raises, the error surfaces through the existing `Decision` error shape; legacy dispatch is not entered as a fallback

Failure Modes

Helper raises: surfaced through the existing Decision error shape; no fallback to legacy dispatch (EDGE-003).
StepContractExecutor.execute(...) raises: existing behavior — error is propagated/wrapped through the existing Decision error shape (no behavioral change vs. current main).
Guard fails (tasks semantics): existing behavior — guard returns a structured failure; this is unchanged.

Non-Goals

Editing mission-runtime.yaml (FR-004).
Adding a new mission runner or mission step type (NFR-005).
Changing the public Decision shape (FR-005).
Affecting non-composed actions (EDGE-002).

step_contract_executor_lifecycle.md

Contract: `StepContractExecutor.execute(...)` Invocation Lifecycle

Source file: src/specify_cli/mission_step_contracts/executor.py
Spec coverage: FR-006, FR-007, FR-008, FR-014, EDGE-001, EDGE-004

Lifecycle Invariant

For every composed step s executed inside StepContractExecutor.execute(...):

> The invocation file produced by ProfileInvocationExecutor.invoke(...) for s MUST contain exactly one started record AND exactly one closing record.

The closing record is:

completed with outcome="done" if the per-step body returns normally.
failed with outcome="failed" if the per-step body raises.

No other outcome values are produced by this path.
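A test helper for this invariant might look like the sketch below. The record layout is an assumption: the source names started, completed, and failed records and an outcome value, but not the JSON field that carries the record kind, so the "event" key here is hypothetical.

```python
import json

# Closing record kind -> the only outcome this path may pair with it.
CLOSING_OUTCOMES = {"completed": "done", "failed": "failed"}


def is_properly_paired(jsonl_text: str) -> bool:
    """True if the text holds exactly one started and one closing record."""
    records = [json.loads(line) for line in jsonl_text.splitlines() if line.strip()]
    started = [r for r in records if r.get("event") == "started"]
    closing = [r for r in records if r.get("event") in CLOSING_OUTCOMES]
    if len(started) != 1 or len(closing) != 1:
        return False
    close = closing[0]
    return close.get("outcome") == CLOSING_OUTCOMES[close["event"]]


good = '{"event": "started"}\n{"event": "completed", "outcome": "done"}\n'
failed = '{"event": "started"}\n{"event": "failed", "outcome": "failed"}\n'
unclosed = '{"event": "started"}\n'
double_close = good + '{"event": "completed", "outcome": "done"}\n'
```

Only the first two samples satisfy the invariant; the unclosed and doubly closed files are the cases the lifecycle tests are meant to rule out.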

Pseudocode (post-change)

for selected_contract in contracts_to_run:
    payload = self._invocation_executor.invoke(
        request_text=...,
        profile_hint=...,
        actor=...,
        mode_of_work=...,
        action_hint=selected_contract.action,   # FR-014
    )
    try:
        # existing per-step body — unchanged
        ...
    except Exception:
        # NOTE: outcome="done" or "failed" reflects whether the
        # *composition* step ran cleanly; it does NOT imply the host LLM
        # finished generation. The composition is a governance/trail unit.
        self._invocation_executor.complete_invocation(
            payload.invocation_id,
            outcome="failed",
        )
        raise
    else:
        self._invocation_executor.complete_invocation(
            payload.invocation_id,
            outcome="done",
        )

Closure API

Use ProfileInvocationExecutor.complete_invocation(invocation_id, outcome=...).
Do not import InvocationWriter from this module.
Do not write to .kittify/events/profile-invocations/*.jsonl directly.
These constraints satisfy FR-008, C-006, and C-007.

Outcome Semantics

Path	Outcome value	Why
Per-step body returns normally	`"done"`	Existing literal; composition step completed without error
Per-step body raises	`"failed"`	Existing literal; composition step raised; original exception is re-raised after close
User cancellation (out of scope here)	`"abandoned"`	Reserved for user-initiated cancellation flows; not produced by this path

The "completion does not imply host LLM finished generation" semantic is documented:

as a one-line code comment at the close site;
as an explicit test name (test_composed_action_outcome_is_done_even_though_composition_does_not_run_llm).

Multi-Step Pairing (EDGE-004)

Each composed step has its own try/except/else block. Pairing is per-step:

if step 1 succeeds and step 2 raises, step 1 is closed with done, step 2 with failed.
if all steps succeed, every step is closed with done.
the executor never re-uses a single try/except across multiple steps.
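The per-step pairing can be exercised against a recording fake. FakeInvocationExecutor, run_steps, and the invocation id format are hypothetical; only the try/except/else shape and the outcome literals come from the pseudocode above.

```python
from __future__ import annotations

from typing import Callable


class FakeInvocationExecutor:
    """Records (invocation_id, event) pairs instead of writing a trail."""

    def __init__(self) -> None:
        self.trail: list[tuple[str, str]] = []
        self._count = 0

    def invoke(self, *, action_hint: str) -> str:
        self._count += 1
        invocation_id = f"inv-{self._count}"
        self.trail.append((invocation_id, "started"))
        return invocation_id

    def complete_invocation(self, invocation_id: str, outcome: str) -> None:
        self.trail.append((invocation_id, outcome))


def run_steps(
    executor: FakeInvocationExecutor,
    steps: list[tuple[str, Callable[[], None]]],
) -> None:
    for action, body in steps:
        invocation_id = executor.invoke(action_hint=action)
        try:  # one try/except/else per step, never shared across steps
            body()
        except Exception:
            executor.complete_invocation(invocation_id, outcome="failed")
            raise
        else:
            executor.complete_invocation(invocation_id, outcome="done")


def ok() -> None:
    pass


def boom() -> None:
    raise RuntimeError("step failed")


executor = FakeInvocationExecutor()
try:
    run_steps(executor, [("specify", ok), ("plan", boom), ("tasks", ok)])
    propagated = False
except RuntimeError:
    propagated = True
```

The first step closes with done, the second with failed, and the third is never started because the original exception propagates out of the loop.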

Test Surface

Test name	File	Asserts
`test_composed_action_pairs_started_with_completed`	`test_invocation_e2e.py` and `test_software_dev_composition.py`	every JSONL produced by a composed action has exactly one `started` and one `completed` line (FR-006, Scenario B)
`test_composed_step_failure_writes_failed_completion`	`test_invocation_e2e.py`	per-step body patched to raise; JSONL has `started`+`failed`; original exception still propagates (FR-007, Scenario C, EDGE-001)
`test_executor_uses_complete_invocation_api_only`	`test_software_dev_composition.py`	monkey-patches verify `complete_invocation` is called from the executor; `InvocationWriter.write_*` is never called from this module (FR-008)
`test_step_contract_executor_passes_action_hint`	`test_software_dev_composition.py`	each `invoke(...)` call from the executor passes `action_hint=selected_contract.action` (FR-014)
`test_governance_context_uses_contract_action_when_hint_supplied`	`test_software_dev_composition.py`	governance context for a composed `software-dev/specify` reads `action="specify"` (FR-013)
`test_composed_action_outcome_is_done_even_though_composition_does_not_run_llm`	`test_invocation_e2e.py`	naming-as-documentation: outcome is `"done"` for composition success regardless of host execution
`test_composed_action_multistep_pairing`	`test_software_dev_composition.py`	composed action with ≥2 invocations pairs each independently (EDGE-004)

Failure Modes

complete_invocation(...) raises in the else branch (e.g. writer IO failure): existing failure mode for the writer; the per-step body has already returned normally so no original exception is masked. Acceptable; not a new failure mode introduced by this contract.
Per-step body raises a BaseException (e.g. KeyboardInterrupt): the except Exception clause does NOT catch it; the invocation is left unclosed. This matches existing semantics for catastrophic interruption and is consistent with charter expectations (don't swallow BaseException).
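The second failure mode follows from Python's exception hierarchy: KeyboardInterrupt derives from BaseException, not Exception. A small sketch, with run_step and the closed list as hypothetical stand-ins for the close site:

```python
from __future__ import annotations

from typing import Callable

closed: list[str] = []


def run_step(body: Callable[[], None]) -> None:
    try:
        body()
    except Exception:
        closed.append("failed")
        raise
    else:
        closed.append("done")


def interrupted() -> None:
    raise KeyboardInterrupt


def broken() -> None:
    raise ValueError("ordinary failure")


# KeyboardInterrupt bypasses `except Exception`, so nothing is closed.
try:
    run_step(interrupted)
except KeyboardInterrupt:
    pass
closed_after_interrupt = list(closed)

# An ordinary exception is caught, closed as failed, and re-raised.
try:
    run_step(broken)
except ValueError:
    pass
closed_after_error = list(closed)
```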

Non-Goals

Inventing a new outcome value ("composed", "governance_only", etc.).
Adding a JSONL field describing host LLM execution status.
Adding retry / backoff around the close call.
Changing the complete_invocation(...) signature.
