Contracts

invocation_executor_invoke.md

Contract: ProfileInvocationExecutor.invoke(...)

Source file: src/specify_cli/invocation/executor.py
Spec coverage: FR-009, FR-010, FR-011, FR-012, FR-013, EDGE-005

Signature (post-change)

def invoke(
    self,
    request_text: str,
    profile_hint: str | None = None,
    actor: str = "unknown",
    mode_of_work: ModeOfWork | None = None,
    *,
    action_hint: str | None = None,
) -> InvocationPayload:
    ...

The * separator makes action_hint keyword-only. Its default of None means every existing call site keeps working without modification.

Behavioral Matrix

| Inputs | Branch entered | Action source |
| --- | --- | --- |
| profile_hint set, action_hint is a non-empty string | profile_hint-branch | action_hint verbatim |
| profile_hint set, action_hint is None | profile_hint-branch | _derive_action_from_request(request_text, profile.role) (legacy) |
| profile_hint set, action_hint == "" | profile_hint-branch | _derive_action_from_request(...) (legacy fallback per EDGE-005) |
| profile_hint not set | router-backed branch | result.action from router decision (legacy, unchanged) |
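
The four rows of the matrix can be sketched as a small selection function. This is an illustration only, not the code in executor.py: choose_action, derived, and routed are hypothetical names, the two callables stand in for _derive_action_from_request(...) and the router decision, and "profile_hint set" is assumed to mean "is not None".

```python
from typing import Callable, Optional


def choose_action(
    profile_hint: Optional[str],
    action_hint: Optional[str],
    derive_action: Callable[[], str],
    route_action: Callable[[], str],
) -> str:
    """Return the action key for one row of the behavioral matrix."""
    if profile_hint is None:
        # Router-backed branch: action_hint has no effect here.
        return route_action()
    if action_hint:
        # A non-empty hint is passed through verbatim.
        return action_hint
    # None and "" both take the legacy derivation (EDGE-005).
    return derive_action()


def derived() -> str:
    # Stand-in for _derive_action_from_request(...).
    return "derived-action"


def routed() -> str:
    # Stand-in for result.action from the router decision.
    return "routed-action"


rows = [
    choose_action("some-profile", "specify", derived, routed),
    choose_action("some-profile", None, derived, routed),
    choose_action("some-profile", "", derived, routed),
    choose_action(None, "specify", derived, routed),
]
print(rows)
```

Using plain truthiness for the hint check is what makes the empty string and None collapse into the same fallback row.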

Invariants

1. Backwards compatibility: any call site that does not pass action_hint produces byte-identical output to the pre-change code (FR-011, FR-012).
2. Hint truthiness: empty-string action_hint is treated as if the kwarg were not supplied (EDGE-005). Non-empty strings are passed through verbatim.
3. Router branch is untouched: action_hint has no effect when profile_hint is None (FR-012).
4. InvocationRecord.action is set to the chosen action key in all branches (already true; this contract just changes which value is chosen).
5. Governance context assembly reads action from the record; therefore FR-013 follows automatically when FR-010 holds.

Test Surface

| Test name | File | Asserts |
| --- | --- | --- |
| `test_invoke_with_action_hint_and_profile_hint_records_hint[specify\|plan\|tasks\|…]` | … | … |
| test_invoke_profile_hint_only_falls_back_to_derived_action | test_invocation_e2e.py | invoke(profile_hint=...) without action_hint records the role-default verb |
| test_invoke_empty_action_hint_falls_back | test_invocation_e2e.py | action_hint="" is treated as legacy fallback |
| test_invoke_router_branch_unchanged_with_action_hint | test_invocation_e2e.py | When profile_hint is None, action_hint is ignored |
| existing advise/ask/do tests | test_invocation_e2e.py and friends | Continue to pass without modification (FR-012) |

Failure Modes

  • Type mismatch on action_hint: callers passing a non-string non-None value will fail mypy --strict; runtime TypeError is acceptable for clearly-buggy callers (no runtime validation needed beyond what mypy enforces).
  • Bogus action string: this contract does not validate the value of action_hint against any allow-list; downstream consumers (governance context, trail format) accept any string. Validation is the caller's responsibility.

Non-Goals

  • Validating action_hint against a fixed enum of contract actions.
  • Changing the InvocationRecord shape, the JSONL format, or the writer surface.
  • Affecting the router-backed branch.

runtime_bridge_dispatch.md

Contract: Runtime Bridge Dispatch (composition vs. legacy)

Source file: src/specify_cli/next/runtime_bridge.py
Spec coverage: FR-001, FR-002, FR-003, FR-004, FR-005, FR-015, EDGE-002, EDGE-003

Public Surface

def decide_next_via_runtime(...) -> Decision:
    ...

The Decision shape (defined in src/specify_cli/next/decision.py) is unchanged by this mission. No new fields. No removed fields.

Single-Dispatch Invariant

For every call to decide_next_via_runtime(...):

> Exactly one of {composition path, legacy DAG path} executes per action attempt.

Concretely, when _should_dispatch_via_composition(action) == True:

1. _dispatch_via_composition(...) runs (composer + guard + advancement helper).
2. The returned Decision is yielded immediately.
3. runtime_next_step(...) is NOT called.

When _should_dispatch_via_composition(action) == False:

1. runtime_next_step(...) runs (legacy path) exactly as today.
2. _dispatch_via_composition(...) is NOT entered.
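
A minimal sketch of the single-dispatch invariant, using stand-in functions that record which path ran. Everything here is hypothetical scaffolding: the real functions take more arguments and return a Decision, the strings below only stand in for one, and the membership of COMPOSED_ACTIONS is an assumption made for the example.

```python
from typing import List

calls: List[str] = []  # records which path actually ran

# Hypothetical membership; the real predicate lives in runtime_bridge.py.
COMPOSED_ACTIONS = {"specify", "plan", "tasks"}


def should_dispatch_via_composition(action: str) -> bool:
    return action in COMPOSED_ACTIONS


def dispatch_via_composition(action: str) -> str:
    calls.append("composition")
    return f"decision(composition, {action})"


def runtime_next_step(action: str) -> str:
    calls.append("legacy")
    return f"decision(legacy, {action})"


def decide_next(action: str) -> str:
    """Run exactly one of the two paths and return its result."""
    if should_dispatch_via_composition(action):
        # Return immediately: the legacy path is never reached.
        return dispatch_via_composition(action)
    # Non-composed action: the composition path is never entered.
    return runtime_next_step(action)


composed = decide_next("specify")
calls_after_composed = list(calls)
legacy = decide_next("hypothetical-legacy-action")
print(calls)
```

The early return is the whole mechanism: there is no code path on which both stand-ins append to calls for a single action attempt.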

State Diagram

decide_next_via_runtime(action)
        |
        v
   _should_dispatch_via_composition(action) ?
        |
   yes  v                              no  v
   _dispatch_via_composition(action)        runtime_next_step(action)
        |                                     |
        v                                     v
   StepContractExecutor.execute(...)          (unchanged)
   _check_composed_action_guard(...)          |
   _advance_run_state_after_composition(...)  |
        |                                     |
        v                                     v
        Decision  (composition)               Decision  (legacy)

New Helper

def _advance_run_state_after_composition(
    repo_root: Path,
    action: str,
    ...,
) -> Decision:
    """Advance run state, lane events, and prompt progression for a composition-backed action.

    Reuses the same primitives as runtime_next_step(...) for state/lane/prompt
    progression but does NOT invoke the legacy DAG action handler — that is the
    point of single-dispatch (FR-001).
    """

The helper:

  • Emits lane status events via the same SyncRuntimeEventEmitter.
  • Records mission-state advancement via the same primitive that runtime_next_step(...) uses.
  • Computes the next public step.
  • Returns a Decision whose shape is identical to what runtime_next_step(...) would produce for the same advance — but with no legacy action dispatch performed.

Ordering Within _dispatch_via_composition(...)

The order within the composition path is fixed:

1. StepContractExecutor.execute(context): the composer runs the contract steps via ProfileInvocationExecutor.invoke(...).
2. _check_composed_action_guard(...): the fixed tasks guard (keyed by legacy_step_id) runs after composition. Unchanged.
3. _advance_run_state_after_composition(...): new step.
4. Return Decision.

Step 2 must remain before step 3. Step 3 must run only on a successful step 1 + step 2.
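
The ordering rule can be illustrated with three injected callables. This is a sketch under simplifying assumptions: the guard is modeled as returning None on success or a reason string on failure, results are plain strings rather than Decision objects, and an exception from step 1 simply propagates instead of being wrapped in the Decision error shape.

```python
from typing import Callable, List, Optional


def dispatch_via_composition(
    execute: Callable[[], None],
    check_guard: Callable[[], Optional[str]],
    advance: Callable[[], str],
) -> str:
    # Step 1: if this raises, steps 2 and 3 never run.
    execute()
    # Step 2: None means the guard passed.
    failure = check_guard()
    if failure is not None:
        # Guard failed, so advancement (step 3) is skipped.
        return f"blocked: {failure}"
    # Step 3 runs only after steps 1 and 2 succeeded; step 4 returns it.
    return advance()


trace: List[str] = []


def ok_execute() -> None:
    trace.append("execute")


def passing_guard() -> Optional[str]:
    trace.append("guard")
    return None


def failing_guard() -> Optional[str]:
    trace.append("guard")
    return "tasks.md missing"


def advance() -> str:
    trace.append("advance")
    return "advanced"


happy = dispatch_via_composition(ok_execute, passing_guard, advance)
happy_trace = list(trace)
trace.clear()
blocked = dispatch_via_composition(ok_execute, failing_guard, advance)
blocked_trace = list(trace)
print(happy_trace, blocked_trace)
```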

tasks Guard Semantics (FR-003)

The fixed tasks guard is keyed by legacy_step_id:

| legacy_step_id | Required state |
| --- | --- |
| tasks_outline | tasks.md exists |
| tasks_packages | tasks.md exists AND ≥1 tasks/WP*.md exists |
| tasks_finalize (and the public tasks step) | terminal state including a dependencies: block |

This contract does not change. Existing tests for these branches must continue to pass without modification.
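
For orientation, the first two rows of the table can be sketched as a filesystem check. This is not the existing guard: check_tasks_guard and feature_dir are hypothetical names, the return convention (None for pass, a reason string for fail) is an assumption, and the tasks_finalize row is left out because this contract does not say where the dependencies: block lives.

```python
import tempfile
from pathlib import Path
from typing import Optional


def check_tasks_guard(feature_dir: Path, legacy_step_id: str) -> Optional[str]:
    """Return None when the required state is present, else a reason."""
    if legacy_step_id not in ("tasks_outline", "tasks_packages"):
        raise ValueError(f"step not covered by this sketch: {legacy_step_id}")
    if not (feature_dir / "tasks.md").is_file():
        return "tasks.md does not exist"
    if legacy_step_id == "tasks_packages":
        tasks_dir = feature_dir / "tasks"
        # At least one work-package file matching WP*.md is required.
        if not tasks_dir.is_dir() or not any(tasks_dir.glob("WP*.md")):
            return "no tasks/WP*.md file exists"
    return None


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    before = check_tasks_guard(root, "tasks_outline")
    (root / "tasks.md").write_text("# Tasks\n")
    outline_ok = check_tasks_guard(root, "tasks_outline")
    packages_missing = check_tasks_guard(root, "tasks_packages")
    (root / "tasks").mkdir()
    (root / "tasks" / "WP01.md").write_text("# WP01\n")  # illustrative file name
    packages_ok = check_tasks_guard(root, "tasks_packages")

print(before, outline_ok, packages_missing, packages_ok)
```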

Test Surface

| Test name | File | Asserts |
| --- | --- | --- |
| test_composition_success_skips_legacy_dispatch[<action>] | test_runtime_bridge_composition.py | parametrized over the 5 composed actions; legacy dispatch entry point is not called after composition success (FR-001/FR-015) |
| test_composition_success_advances_run_state_and_lane_events | test_runtime_bridge_composition.py | lane events emitted; Decision reflects progression to next public step (FR-002) |
| test_decision_shape_unchanged_for_composed_action | test_runtime_bridge_composition.py | Decision field set is identical to legacy-path baseline (FR-005) |
| existing `test_tasks_*_guard_*` | test_runtime_bridge_composition.py | continue to pass unchanged (FR-003) |
| test_non_composed_action_uses_legacy_runtime_next_step | test_runtime_bridge_composition.py | EDGE-002: runtime_next_step(...) still runs for non-composed actions |
| test_advancement_helper_failure_propagates_no_legacy_fallback | test_runtime_bridge_composition.py | EDGE-003: when the helper raises, the error surfaces through the existing Decision error shape; legacy dispatch is not entered as a fallback |

Failure Modes

  • Helper raises: surfaced through the existing Decision error shape; no fallback to legacy dispatch (EDGE-003).
  • StepContractExecutor.execute(...) raises: existing behavior — error is propagated/wrapped through the existing Decision error shape (no behavioral change vs. current main).
  • Guard fails (tasks semantics): existing behavior — guard returns a structured failure; this is unchanged.

Non-Goals

  • Editing mission-runtime.yaml (FR-004).
  • Adding a new mission runner or mission step type (NFR-005).
  • Changing the public Decision shape (FR-005).
  • Affecting non-composed actions (EDGE-002).

step_contract_executor_lifecycle.md

Contract: StepContractExecutor.execute(...) Invocation Lifecycle

Source file: src/specify_cli/mission_step_contracts/executor.py
Spec coverage: FR-006, FR-007, FR-008, FR-014, EDGE-001, EDGE-004

Lifecycle Invariant

For every composed step s executed inside StepContractExecutor.execute(...):

> The invocation file produced by ProfileInvocationExecutor.invoke(...) for s MUST contain exactly one started record AND exactly one closing record.

The closing record is:

  • completed with outcome="done" if the per-step body returns normally.
  • failed with outcome="failed" if the per-step body raises.

No other outcome values are produced by this path.
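
The invariant lends itself to a simple check over an invocation file's contents. The sketch below assumes each JSONL line is an object whose record kind is stored under an "event" key; that field name is hypothetical, since this contract does not spell out the record shape, and has_one_start_and_one_close is likewise an illustrative helper.

```python
import json

# Record kinds that close an invocation, per the lifecycle invariant.
CLOSING_EVENTS = {"completed", "failed"}


def has_one_start_and_one_close(jsonl_text: str) -> bool:
    """True when the text holds exactly one started and one closing record."""
    events = [
        json.loads(line)["event"]  # "event" is a hypothetical field name
        for line in jsonl_text.splitlines()
        if line.strip()
    ]
    started = events.count("started")
    closing = sum(1 for event in events if event in CLOSING_EVENTS)
    return started == 1 and closing == 1


paired = '{"event": "started"}\n{"event": "completed", "outcome": "done"}\n'
unclosed = '{"event": "started"}\n'
double_closed = paired + '{"event": "failed", "outcome": "failed"}\n'

print(
    has_one_start_and_one_close(paired),
    has_one_start_and_one_close(unclosed),
    has_one_start_and_one_close(double_closed),
)
```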

Pseudocode (post-change)

for selected_contract in contracts_to_run:
    payload = self._invocation_executor.invoke(
        request_text=...,
        profile_hint=...,
        actor=...,
        mode_of_work=...,
        action_hint=selected_contract.action,   # FR-014
    )
    try:
        # existing per-step body — unchanged
        ...
    except Exception:
        # NOTE: outcome="done" or "failed" reflects whether the
        # *composition* step ran cleanly; it does NOT imply the host LLM
        # finished generation. The composition is a governance/trail unit.
        self._invocation_executor.complete_invocation(
            payload.invocation_id,
            outcome="failed",
        )
        raise
    else:
        self._invocation_executor.complete_invocation(
            payload.invocation_id,
            outcome="done",
        )

Closure API

  • Use ProfileInvocationExecutor.complete_invocation(invocation_id, outcome=...).
  • Do not import InvocationWriter from this module.
  • Do not write to .kittify/events/profile-invocations/*.jsonl directly.
  • These constraints satisfy FR-008, C-006, and C-007.

Outcome Semantics

| Path | Outcome value | Why |
| --- | --- | --- |
| Per-step body returns normally | "done" | Existing literal; composition step completed without error |
| Per-step body raises | "failed" | Existing literal; composition step raised; original exception is re-raised after close |
| User cancellation (out of scope here) | "abandoned" | Reserved for user-initiated cancellation flows; not produced by this path |

The "completion does not imply host LLM finished generation" semantic is documented:

  • as a one-line code comment at the close site;
  • as an explicit test name (test_composed_action_outcome_is_done_even_though_composition_does_not_run_llm).

Multi-Step Pairing (EDGE-004)

Each composed step has its own try/except/else block. Pairing is per-step:

  • if step 1 succeeds and step 2 raises, step 1 is closed with done, step 2 with failed.
  • if all steps succeed, every step is closed with done.
  • the executor never re-uses a single try/except across multiple steps.
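
A runnable sketch of per-step pairing, using an in-memory stand-in for the invocation executor. FakeInvocationExecutor, run_steps, and the step names are hypothetical; the real invoke(...) takes more arguments and returns an InvocationPayload rather than a bare id.

```python
from typing import Callable, List, Tuple


class FakeInvocationExecutor:
    """Hypothetical in-memory stand-in that records opens and closes."""

    def __init__(self) -> None:
        self.log: List[Tuple[str, ...]] = []
        self._next_id = 0

    def invoke(self, action_hint: str) -> str:
        self._next_id += 1
        invocation_id = f"inv-{self._next_id}"
        self.log.append(("started", invocation_id, action_hint))
        return invocation_id

    def complete_invocation(self, invocation_id: str, outcome: str) -> None:
        self.log.append((outcome, invocation_id))


def run_steps(
    executor: FakeInvocationExecutor,
    steps: List[Tuple[str, Callable[[], None]]],
) -> None:
    for action, body in steps:
        invocation_id = executor.invoke(action_hint=action)
        # Each step gets its own try/except/else, so closes pair per step.
        try:
            body()
        except Exception:
            executor.complete_invocation(invocation_id, outcome="failed")
            raise
        else:
            executor.complete_invocation(invocation_id, outcome="done")


def succeeds() -> None:
    pass


def raises() -> None:
    raise ValueError("step body failed")


executor = FakeInvocationExecutor()
propagated = False
try:
    run_steps(executor, [("step-one", succeeds), ("step-two", raises)])
except ValueError:
    propagated = True

print(executor.log)
```

Step one is closed with done before step two opens, and the original ValueError still reaches the caller after step two is closed with failed.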

Test Surface

| Test name | File | Asserts |
| --- | --- | --- |
| test_composed_action_pairs_started_with_completed | test_invocation_e2e.py and test_software_dev_composition.py | every JSONL produced by a composed action has exactly one started and one completed line (FR-006, Scenario B) |
| test_composed_step_failure_writes_failed_completion | test_invocation_e2e.py | per-step body patched to raise; JSONL has started+failed; original exception still propagates (FR-007, Scenario C, EDGE-001) |
| test_executor_uses_complete_invocation_api_only | test_software_dev_composition.py | monkey-patches verify complete_invocation is called from the executor; InvocationWriter.write_* is never called from this module (FR-008) |
| test_step_contract_executor_passes_action_hint | test_software_dev_composition.py | each invoke(...) call from the executor passes action_hint=selected_contract.action (FR-014) |
| test_governance_context_uses_contract_action_when_hint_supplied | test_software_dev_composition.py | governance context for a composed software-dev/specify reads action="specify" (FR-013) |
| test_composed_action_outcome_is_done_even_though_composition_does_not_run_llm | test_invocation_e2e.py | naming-as-documentation: outcome is "done" for composition success regardless of host execution |
| test_composed_action_multistep_pairing | test_software_dev_composition.py | composed action with ≥2 invocations pairs each independently (EDGE-004) |

Failure Modes

  • complete_invocation(...) raises in the else branch (e.g. writer IO failure): existing failure mode for the writer; the per-step body has already returned normally so no original exception is masked. Acceptable; not a new failure mode introduced by this contract.
  • Per-step body raises a BaseException (e.g. KeyboardInterrupt): the except Exception clause does NOT catch it; the invocation is left unclosed. This matches existing semantics for catastrophic interruption and is consistent with charter expectations (don't swallow BaseException).
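
The BaseException case follows from Python's exception hierarchy: KeyboardInterrupt derives from BaseException, not Exception, so neither the except Exception branch nor the else branch runs. A minimal demonstration, with a list standing in for the close call (run_step and closed are illustrative names):

```python
from typing import Callable, List

closed: List[str] = []  # stands in for calls to complete_invocation(...)


def run_step(body: Callable[[], None]) -> None:
    try:
        body()
    except Exception:
        closed.append("failed")
        raise
    else:
        closed.append("done")


def interrupted() -> None:
    raise KeyboardInterrupt


interrupt_seen = False
try:
    run_step(interrupted)
except KeyboardInterrupt:
    interrupt_seen = True

# Neither branch ran, so the invocation would be left unclosed.
print(interrupt_seen, closed)
```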

Non-Goals

  • Inventing a new outcome value ("composed", "governance_only", etc.).
  • Adding a JSONL field describing host LLM execution status.
  • Adding retry / backoff around the close call.
  • Changing the complete_invocation(...) signature.