Feature Specification: Hybrid Prompt and Shim Agent Surface

Feature Branch: 058-hybrid-prompt-and-shim-agent-surface
Created: 2026-03-30
Status: Draft
Input: Complete the 057 thin-shim architecture by restoring full prompts for planning commands while keeping thin shims for execution commands.

User Scenarios & Testing (mandatory)

User Story 1 — Agent Runs Specify in a Fresh Consumer Project (Priority: P1)

A contributor runs spec-kitty init in a new project, then invokes /spec-kitty.specify. The agent reads the full specify workflow prompt (200+ lines of discovery, template structure, and validation instructions) from .claude/commands/spec-kitty.specify.md and successfully conducts the specify workflow without errors, without searching for missing templates, and without hitting "missing required argument" failures.

Why this priority: This is the primary broken user journey. Every new consumer project is currently unable to run planning workflows.

Independent Test: Run spec-kitty init in a temp directory, then verify .claude/commands/spec-kitty.specify.md contains the full prompt (not a 3-line shim). Verify the agent can read and follow the instructions.

Acceptance Scenarios:

1. Given a fresh project after spec-kitty init, When an agent reads .claude/commands/spec-kitty.specify.md, Then the file contains the complete specify workflow prompt with discovery gates, branch context instructions, and spec template structure.
2. Given a fresh project after spec-kitty init, When an agent reads .claude/commands/spec-kitty.plan.md, Then the file contains the complete plan workflow prompt with planning interrogation and phase instructions.
3. Given a fresh project after spec-kitty init, When an agent reads .claude/commands/spec-kitty.tasks.md, Then the file contains the complete tasks workflow prompt with WP sizing guidance, ownership metadata requirements, and prompt generation rules.


User Story 2 — Agent Runs Implement via Thin Shim (Priority: P1)

A contributor invokes /spec-kitty.implement WP03 in a consumer project. The agent reads a thin shim from .claude/commands/spec-kitty.implement.md which dispatches to spec-kitty agent shim implement. The CLI resolves context, creates the worktree, emits the status event, and returns the workspace path and prompt file. The agent works in the worktree.

Why this priority: Execution commands must continue working via thin shims. The existing implement and review handlers are the model for CLI-driven commands.

Independent Test: Run spec-kitty init in a temp directory, then verify .claude/commands/spec-kitty.implement.md is a thin shim (3-4 lines). Verify the shim dispatches to a working CLI handler.

Acceptance Scenarios:

1. Given a project with a feature and WPs, When an agent invokes the implement shim, Then the CLI creates a worktree, emits a status event, and returns the workspace path.
2. Given a project with a feature and a WP in for_review, When an agent invokes the review shim, Then the CLI generates the review prompt and moves the WP to doing.
3. Given a project after init, When an agent reads spec-kitty.implement.md, Then it contains a thin shim (under 5 lines), not a full prompt.
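The shim-versus-prompt checks in the Independent Tests above can be sketched as a small helper. This is a minimal illustration, not the project's test code: the function name classify_command_file is hypothetical, and the thresholds are the ones this specification states (a thin shim is under 5 lines, a full prompt is 100+ lines).

```python
def classify_command_file(text: str) -> str:
    """Classify an installed command file purely by its line count.

    Thresholds follow the spec: under 5 lines is a thin shim,
    100 or more lines is a full prompt. Anything in between is
    reported as "unknown" so a test can flag it for inspection.
    """
    line_count = len(text.splitlines())
    if line_count < 5:
        return "thin-shim"
    if line_count >= 100:
        return "full-prompt"
    return "unknown"
```

A test could read .claude/commands/spec-kitty.implement.md after init and assert that the helper returns "thin-shim", and do the same for spec-kitty.specify.md with "full-prompt".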


User Story 3 — Existing Consumer Project Upgrades (Priority: P1)

A maintainer of an existing spec-kitty project runs spec-kitty upgrade. The migration replaces thin shims for prompt-driven commands with full prompts, while leaving thin shims for CLI-driven commands intact.

Why this priority: Existing consumer projects (spec-kitty-saas, spec-kitty-tracker, spec-kitty-planning) are currently broken. They need the migration to restore working slash commands.

Independent Test: Set up a project with thin shims for all 16 commands. Run spec-kitty upgrade. Verify prompt-driven commands now have full prompts and CLI-driven commands still have thin shims.

Acceptance Scenarios:

1. Given an existing project with thin shims for all commands, When the user runs spec-kitty upgrade, Then spec-kitty.specify.md is replaced with the full prompt and spec-kitty.implement.md remains a thin shim.
2. Given an already-upgraded project, When the user runs spec-kitty upgrade again, Then no changes are made (idempotent).
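One way to get the idempotent behavior described above is to compare file contents before writing. The sketch below is an assumption-laden illustration, not the actual migration: migrate_prompt_files is a hypothetical name, PROMPT_DRIVEN lists only the three commands this specification names as examples, and the source file naming (`<command>.md` in the prompts directory) is assumed.

```python
from pathlib import Path

# Partial, illustrative list: the spec names only these three examples.
PROMPT_DRIVEN = ("specify", "plan", "tasks")


def migrate_prompt_files(commands_dir: Path, prompts_dir: Path) -> list[Path]:
    """Install full prompts for prompt-driven commands; return files changed.

    Files for any other command (for example spec-kitty.implement.md)
    are never touched, so existing thin shims stay intact.
    """
    changed: list[Path] = []
    for command in PROMPT_DRIVEN:
        source = prompts_dir / f"{command}.md"  # assumed source naming
        if not source.exists():
            continue
        target = commands_dir / f"spec-kitty.{command}.md"
        wanted = source.read_text(encoding="utf-8")
        current = target.read_text(encoding="utf-8") if target.exists() else None
        # Writing only on a difference makes a second run a no-op.
        if current != wanted:
            target.write_text(wanted, encoding="utf-8")
            changed.append(target)
    return changed
```

Running the function twice against the same directories returns a non-empty list the first time and an empty list the second time, which is the property the second acceptance scenario asks for.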


User Story 4 — Prompt Content Stays Current Across Upgrades (Priority: P2)

When spec-kitty releases a new version with updated prompt content (e.g., new ownership metadata guidance, corrected terminology), spec-kitty upgrade refreshes the full prompts from the global runtime without losing any project-specific overrides.

Why this priority: Prompt content evolves rapidly. The system must keep consumer projects current.

Independent Test: Modify a prompt in the package source, bump the version, run ensure_runtime(), then spec-kitty upgrade in a consumer project. Verify the prompt was updated.

Acceptance Scenarios:

1. Given a prompt update in the package, When ensure_runtime() runs, Then ~/.kittify/prompts/ contains the updated prompt.
2. Given a consumer project, When spec-kitty upgrade runs after a runtime refresh, Then prompt-driven command files are updated to the latest version.
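The version-gated refresh behind the first scenario might look like the following. This is a sketch under stated assumptions, not the real ensure_runtime(): the function name refresh_runtime_prompts, the `.version` marker file, and the flat `*.md` layout are all hypothetical.

```python
import shutil
from pathlib import Path


def refresh_runtime_prompts(
    package_prompts: Path, runtime_prompts: Path, package_version: str
) -> bool:
    """Copy package prompts into the runtime directory when the version differs.

    Returns True when a refresh happened, False when the runtime was current.
    """
    marker = runtime_prompts / ".version"  # hypothetical marker file
    if marker.exists() and marker.read_text(encoding="utf-8").strip() == package_version:
        return False
    runtime_prompts.mkdir(parents=True, exist_ok=True)
    # Drop old prompts first so the runtime mirrors the package exactly.
    for stale in runtime_prompts.glob("*.md"):
        stale.unlink()
    for source in package_prompts.glob("*.md"):
        shutil.copyfile(source, runtime_prompts / source.name)
    marker.write_text(package_version, encoding="utf-8")
    return True
```

Recording the version after the copy means an interrupted refresh is retried on the next startup rather than being treated as complete.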


User Story 5 — CLI-Driven Shim Dispatches to Real Handler (Priority: P2)

An agent invokes /spec-kitty.accept via the thin shim. The shim calls spec-kitty agent shim accept, which dispatches to the existing accept.py CLI handler. The handler runs the acceptance workflow and returns results. No "context resolution failed" errors occur because accept is classified as CLI-driven and doesn't require WP context resolution.

Why this priority: Shim entrypoints for CLI-driven commands currently fail because shim_dispatch() tries to resolve WP context for all commands.

Independent Test: Run spec-kitty agent shim accept --agent claude --raw-args "--feature 058-test" and verify it delegates to the accept handler.

Acceptance Scenarios:

1. Given a CLI-driven command (accept, merge, status, dashboard), When invoked via shim, Then shim_dispatch() delegates to the existing CLI handler without requiring WP context.
2. Given a prompt-driven command (specify, plan, tasks), When invoked via shim, Then shim_dispatch() returns immediately (the full prompt file handles the workflow, not the CLI).


Edge Cases

  • What happens when a user has custom overrides in .kittify/overrides/prompts/? The override takes precedence over the package prompt.
  • What happens when the global runtime (~/.kittify/prompts/) is stale? ensure_runtime() runs on every CLI startup and refreshes the runtime whenever the installed package version has changed.
  • What happens when a CLI-driven shim command (e.g., merge) is invoked without --feature? The existing handler's error message (with available features list) is shown.
  • What happens when both a full prompt and a thin shim exist for the same command? The full prompt wins (it's what the agent reads from .claude/commands/).

Requirements (mandatory)

Functional Requirements

| ID | Title | User Story | Priority | Status |
|---|---|---|---|---|
| FR-001 | Prompt-driven command classification | As a maintainer, I want commands classified as prompt-driven or CLI-driven so that init and upgrade install the correct file type for each. | High | Open |
| FR-002 | Canonical prompt source | As a maintainer, I want cleaned-up canonical prompts stored in src/doctrine/prompts/ so that all consumer projects get consistent, up-to-date workflow instructions. | High | Open |
| FR-003 | Full prompt installation on init | As a contributor, I want spec-kitty init to install full prompt files for prompt-driven commands so that agents can run planning workflows immediately after init. | High | Open |
| FR-004 | Thin shim installation on init | As a contributor, I want spec-kitty init to install thin shims for CLI-driven commands so that execution workflows dispatch to CLI handlers. | High | Open |
| FR-005 | Global runtime prompt deployment | As a maintainer, I want ensure_runtime() to deploy prompts to ~/.kittify/prompts/ so that init and upgrade can source them from a consistent location. | High | Open |
| FR-006 | CLI-driven shim dispatch | As an agent, I want shim_dispatch() to delegate CLI-driven commands (accept, merge, status, dashboard, tasks-finalize) to their existing CLI handlers so that shim invocations actually work. | Medium | Open |
| FR-007 | Prompt-driven shim passthrough | As an agent, I want shim_dispatch() to return immediately for prompt-driven commands so that it doesn't fail on missing WP context. | High | Open |
| FR-008 | Upgrade migration | As a maintainer of an existing project, I want spec-kitty upgrade to replace thin shims with full prompts for prompt-driven commands while preserving thin shims for CLI-driven commands. | High | Open |
| FR-009 | Idempotent upgrade | As a maintainer, I want the migration to be idempotent so that running upgrade multiple times produces the same result. | Medium | Open |
| FR-010 | Prompt override support | As a power user, I want to place custom overrides in .kittify/overrides/prompts/ that take precedence over package prompts. | Low | Open |

Non-Functional Requirements

| ID | Title | Requirement | Category | Priority | Status |
|---|---|---|---|---|---|
| NFR-001 | Init speed | spec-kitty init completes in under 5 seconds including prompt installation. | Performance | Medium | Open |
| NFR-002 | Prompt freshness | After ensure_runtime(), prompts in ~/.kittify/prompts/ match the installed package version exactly. | Reliability | High | Open |
| NFR-003 | Test coverage | All new code achieves 90%+ test coverage with pytest. mypy --strict passes. | Quality | High | Open |
| NFR-004 | Agent compatibility | Prompts install correctly for all 12 supported agent directories. | Compatibility | High | Open |

Constraints

| ID | Title | Constraint | Category | Priority | Status |
|---|---|---|---|---|---|
| C-001 | No new external dependencies | No new Python package dependencies may be added. | Technical | High | Open |
| C-002 | Backward compatibility | Existing CLI-driven shim behavior must not change. Only prompt-driven commands gain full prompts. | Technical | High | Open |
| C-003 | Python 3.11+ | All code targets Python 3.11+ using typer, rich, ruamel.yaml, pytest, mypy (strict). | Technical | High | Open |
| C-004 | Prompt content is generic | Prompts must not contain dev-repo-specific references (feature 057 slugs, hardcoded paths). They must work in any consumer project. | Content | High | Open |

Key Entities

  • Prompt-Driven Command: A slash command whose workflow is defined by rich markdown prompt content that guides an LLM through discovery, decision-making, and artifact generation. Examples: specify, plan, tasks.
  • CLI-Driven Command: A slash command whose workflow is implemented in Python CLI handlers. The agent invokes a thin shim that dispatches to the handler. Examples: implement, review, merge.
  • Canonical Prompt: A cleaned-up, generic markdown file in src/doctrine/prompts/ that contains the full workflow instructions for a prompt-driven command. Deployed to ~/.kittify/prompts/ by ensure_runtime() and installed to .claude/commands/ by init.
  • Thin Shim: A 3-4 line markdown file that tells the agent to run a CLI command. Installed to .claude/commands/ for CLI-driven commands only.

Success Criteria (mandatory)

Measurable Outcomes

  • SC-001: After spec-kitty init in a fresh project, every prompt-driven command file in .claude/commands/ contains 100+ lines of workflow instructions (not a 3-line shim).
  • SC-002: After spec-kitty init in a fresh project, every CLI-driven command file in .claude/commands/ contains a thin shim (under 5 lines).
  • SC-003: An agent can successfully run /spec-kitty.specify in a newly initialized consumer project without hitting "missing required argument" errors or searching for template files.
  • SC-004: spec-kitty upgrade in spec-kitty-saas, spec-kitty-tracker, and spec-kitty-planning restores working slash commands for all prompt-driven commands.
  • SC-005: The 9 canonical prompts in src/doctrine/prompts/ contain zero references to specific feature slugs, hardcoded dev-repo paths, or .kittify/missions/ template file paths.
  • SC-006: CLI-driven shim dispatch (accept, merge, status, dashboard, tasks-finalize) delegates to existing handlers without errors.
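SC-005 lends itself to an automated scan of each canonical prompt. The sketch below is a heuristic under stated assumptions, not a definitive check: find_violations is a hypothetical name, the feature-slug pattern (three digits, a hyphen, then a lowercase slug) is inferred from slugs such as 058-hybrid-prompt-and-shim-agent-surface, and hardcoded dev-repo paths are left out because this specification does not say what they look like.

```python
import re

# Heuristic patterns; both are assumptions derived from examples in the spec.
FORBIDDEN_PATTERNS = {
    "feature slug": re.compile(r"\b\d{3}-[a-z][a-z0-9-]+"),
    "missions template path": re.compile(r"\.kittify/missions/"),
}


def find_violations(prompt_text: str) -> list[tuple[int, str, str]]:
    """Return (line number, rule label, matched text) for each forbidden reference."""
    violations: list[tuple[int, str, str]] = []
    for line_number, line in enumerate(prompt_text.splitlines(), start=1):
        for label, pattern in FORBIDDEN_PATTERNS.items():
            match = pattern.search(line)
            if match:
                violations.append((line_number, label, match.group(0)))
    return violations
```

A test could run this over every file in src/doctrine/prompts/ and assert that the combined result is empty.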