Work Packages: Autonomous Multi-Agent Orchestrator

Inputs: Design documents from /kitty-specs/020-autonomous-multi-agent-orchestrator/
Prerequisites: plan.md (required), spec.md (user stories), data-model.md

Tests: Only include explicit testing work when stakeholders request it.

Organization: Fine-grained subtasks (Txxx) roll up into work packages (WPxx). Each work package must be independently deliverable and testable.

Prompt Files: Each work package references a matching prompt file in /tasks/ generated by /spec-kitty.tasks.

Subtask Format: [Txxx] [P?] Description

  • [P] indicates the subtask can proceed in parallel (different files/components).
  • Include precise file paths or modules.

Work Package WP01: Foundation & Config (Priority: P0)

Goal: Establish orchestrator package structure and configuration system.
Independent Test: Config loads from YAML, validation rules enforced, sensible defaults work.
Prompt: /tasks/WP01-foundation-and-config.md
Estimated Size: ~350 lines

Included Subtasks

  • ✅ T001 Create orchestrator package structure with __init__.py files
  • ✅ T002 [P] Implement enums: OrchestrationStatus, WPStatus, FallbackStrategy
  • ✅ T003 [P] Implement OrchestratorConfig and AgentConfig dataclasses
  • ✅ T004 Implement config.py with YAML parsing and validation
  • ✅ T005 Implement default config generation for installed agents

Implementation Notes

  • Create src/specify_cli/orchestrator/ package
  • Follow data-model.md schemas exactly
  • Use ruamel.yaml for YAML parsing (consistent with existing codebase)
  • Validate all config constraints from data-model.md
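
The validation step above can be sketched as follows. This is a minimal illustration, not the schema from data-model.md: every field name and default (priority, max_concurrent, timeout_seconds, global_concurrency, max_retries) is an assumption, and the function takes an already-parsed mapping so the sketch stays independent of ruamel.yaml. Only the enum values next_in_list, same_agent, and fail come from this document.

```python
from dataclasses import dataclass, field
from enum import Enum


class FallbackStrategy(Enum):
    NEXT_IN_LIST = "next_in_list"
    SAME_AGENT = "same_agent"
    FAIL = "fail"


class ConfigError(ValueError):
    """Raised when a config value violates a constraint."""


@dataclass
class AgentConfig:
    agent_id: str
    enabled: bool = True
    priority: int = 50           # assumption: lower number = preferred
    max_concurrent: int = 1      # assumption: per-agent process limit
    timeout_seconds: int = 600   # assumption: default timeout


@dataclass
class OrchestratorConfig:
    agents: dict = field(default_factory=dict)  # agent_id -> AgentConfig
    fallback_strategy: FallbackStrategy = FallbackStrategy.NEXT_IN_LIST
    global_concurrency: int = 4  # assumption
    max_retries: int = 2         # assumption


def validate(config: OrchestratorConfig) -> None:
    """Reject values that can never produce a working orchestration."""
    if config.global_concurrency < 1:
        raise ConfigError("global_concurrency must be at least 1")
    if config.max_retries < 0:
        raise ConfigError("max_retries must not be negative")
    for agent in config.agents.values():
        if agent.max_concurrent < 1:
            raise ConfigError(f"{agent.agent_id}: max_concurrent must be at least 1")
        if agent.timeout_seconds <= 0:
            raise ConfigError(f"{agent.agent_id}: timeout_seconds must be positive")


def config_from_mapping(raw: dict) -> OrchestratorConfig:
    """Build and validate a config from a parsed YAML mapping."""
    agents = {
        agent_id: AgentConfig(agent_id=agent_id, **(fields or {}))
        for agent_id, fields in (raw.get("agents") or {}).items()
    }
    name = raw.get("fallback_strategy", FallbackStrategy.NEXT_IN_LIST.value)
    try:
        strategy = FallbackStrategy(name)
    except ValueError as exc:
        raise ConfigError(f"unknown fallback_strategy: {name!r}") from exc
    config = OrchestratorConfig(
        agents=agents,
        fallback_strategy=strategy,
        global_concurrency=raw.get("global_concurrency", 4),
        max_retries=raw.get("max_retries", 2),
    )
    validate(config)
    return config
```

Calling config_from_mapping({}) returns a config built entirely from defaults, which is one way to satisfy the "sensible defaults work" test.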

Parallel Opportunities

  • T002, T003 can proceed in parallel (different files)

Dependencies

  • None (starting package)

Risks & Mitigations

  • Config validation edge cases → comprehensive unit tests

Work Package WP02: Agent Invokers - Core Agents (Priority: P0)

Goal: Implement base AgentInvoker protocol and invokers for the 4 most common agents.
Independent Test: Each invoker correctly builds CLI commands, detects installation, parses output.
Prompt: /tasks/WP02-agent-invokers-core.md
Estimated Size: ~450 lines

Included Subtasks

  • ✅ T006 Implement AgentInvoker protocol in agents/base.py
  • ✅ T007 [P] Implement Claude Code invoker in agents/claude.py
  • ✅ T008 [P] Implement GitHub Codex invoker in agents/codex.py
  • ✅ T009 [P] Implement GitHub Copilot invoker in agents/copilot.py
  • ✅ T010 [P] Implement Google Gemini invoker in agents/gemini.py

Implementation Notes

  • Base protocol defines: is_installed(), build_command(), parse_output()
  • Each invoker implements agent-specific CLI flags from plan.md table
  • Use shutil.which() for installation detection
  • Support both stdin and argument-based prompt passing
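
A minimal sketch of the protocol described above, assuming a structural typing.Protocol. The three method names come from the notes; the signatures, the uses_stdin switch, the executable name example-agent, and the --prompt flag are placeholders for illustration and do not describe any real agent CLI.

```python
import shutil
from typing import Protocol


class AgentInvoker(Protocol):
    """Interface every invoker is expected to satisfy (signatures assumed)."""

    agent_id: str

    def is_installed(self) -> bool: ...

    def build_command(self, prompt: str) -> list: ...

    def parse_output(self, stdout: str, exit_code: int) -> dict: ...


class ExampleInvoker:
    """Hypothetical invoker; executable and flag names are placeholders."""

    agent_id = "example"
    executable = "example-agent"  # placeholder, not a real CLI
    uses_stdin = True             # True: prompt is piped, False: passed as argument

    def is_installed(self) -> bool:
        # shutil.which returns None when the executable is not on PATH.
        return shutil.which(self.executable) is not None

    def build_command(self, prompt: str) -> list:
        if self.uses_stdin:
            # The executor pipes the prompt to stdin, so it is not in argv.
            return [self.executable]
        return [self.executable, "--prompt", prompt]

    def parse_output(self, stdout: str, exit_code: int) -> dict:
        return {"success": exit_code == 0, "output": stdout.strip()}
```

Because Protocol is structural, ExampleInvoker does not need to inherit from AgentInvoker; a type checker accepts it wherever an AgentInvoker is expected.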

Parallel Opportunities

  • T007-T010 are fully parallel (separate files, same interface)

Dependencies

  • Depends on WP01 (needs config dataclasses)

Risks & Mitigations

  • CLI flag changes → document agent versions, add fallback detection

Work Package WP03: Agent Invokers - Additional Agents (Priority: P0)

Goal: Complete the remaining 5 agent invokers plus agent detection utilities.
Independent Test: All 9 agents can be invoked when installed, detection identifies available agents.
Prompt: /tasks/WP03-agent-invokers-additional.md
Estimated Size: ~400 lines

Included Subtasks

  • ✅ T011 [P] Implement Qwen Code invoker in agents/qwen.py
  • ✅ T012 [P] Implement OpenCode invoker in agents/opencode.py
  • ✅ T013 [P] Implement Kilocode invoker in agents/kilocode.py
  • ✅ T014 [P] Implement Augment Code invoker in agents/augment.py
  • ✅ T015 [P] Implement Cursor invoker with timeout wrapper in agents/cursor.py
  • ✅ T016 Implement agent registry and detection in agents/__init__.py

Implementation Notes

  • Cursor requires a timeout wrapper (timeout 300, i.e. 300 seconds) to work around its hanging issue
  • Agent registry maps agent_id → invoker class
  • Detection returns list of installed agents sorted by priority
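
The registry and detection notes above could look roughly like this. The invoker classes, their executable names, and the numeric priorities are all hypothetical; the which parameter exists only so that detection can be exercised without the agents being installed.

```python
import shutil


class AlphaInvoker:
    agent_id = "alpha"
    executable = "alpha-cli"   # placeholder executable name
    priority = 2               # assumption: lower number = preferred


class BetaInvoker:
    agent_id = "beta"
    executable = "beta-cli"
    priority = 1


class GammaInvoker:
    agent_id = "gamma"
    executable = "gamma-cli"
    priority = 0


# Registry: agent_id -> invoker class.
AGENT_REGISTRY = {
    cls.agent_id: cls for cls in (AlphaInvoker, BetaInvoker, GammaInvoker)
}


def detect_installed_agents(registry, which=shutil.which):
    """Return ids of installed agents, most preferred first."""
    installed = [
        cls for cls in registry.values() if which(cls.executable) is not None
    ]
    # Sort by priority, then by id so that ties are deterministic.
    installed.sort(key=lambda cls: (cls.priority, cls.agent_id))
    return [cls.agent_id for cls in installed]
```

With a stub that reports only alpha-cli and gamma-cli as present, detect_installed_agents returns gamma before alpha.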

Parallel Opportunities

  • T011-T015 are parallel (separate agent files)

Dependencies

  • Depends on WP01, WP02 (base protocol)

Risks & Mitigations

  • Cursor hanging → timeout wrapper with configurable duration

Work Package WP04: State Management (Priority: P0)

Goal: Implement orchestration state persistence for resume capability.
Independent Test: State saves to JSON, loads correctly, atomic writes prevent corruption.
Prompt: /tasks/WP04-state-management.md
Estimated Size: ~380 lines

Included Subtasks

  • ✅ T017 [P] Implement OrchestrationRun dataclass with serialization
  • ✅ T018 [P] Implement WPExecution dataclass with status tracking
  • ✅ T019 [P] Implement InvocationResult dataclass
  • ✅ T020 Implement state.py with save/load JSON functions
  • ✅ T021 Implement atomic writes with backup-before-modify pattern

Implementation Notes

  • Follow merge/state.py patterns for consistency
  • State file: .kittify/orchestration-state.json
  • Support datetime serialization/deserialization
  • Validate state transitions per data-model.md rules
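
One common way to get atomic writes with a backup is to copy the current file aside, write the new content to a temporary file in the same directory, and swap it in with os.replace. The sketch below assumes that approach; it is not taken from merge/state.py, and the .bak suffix and the started_at/updated_at field names are illustrative.

```python
import json
import os
import shutil
import tempfile
from datetime import datetime
from pathlib import Path


def _encode(value):
    """Serialize datetimes as ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"cannot serialize {type(value).__name__}")


def save_state(state: dict, path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    if path.exists():
        # Backup before modify: keep the last good state next to the file.
        shutil.copy2(path, path.with_suffix(path.suffix + ".bak"))
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, default=_encode)
            handle.flush()
            os.fsync(handle.fileno())
        # os.replace swaps the file in one step, so readers never see a
        # half-written state file.
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_state(path: Path, datetime_fields=("started_at", "updated_at")) -> dict:
    data = json.loads(path.read_text(encoding="utf-8"))
    for key in datetime_fields:  # assumed field names
        if isinstance(data.get(key), str):
            data[key] = datetime.fromisoformat(data[key])
    return data
```

The temporary file is created in the target directory on purpose: os.replace is only atomic when source and destination are on the same filesystem.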

Parallel Opportunities

  • T017-T019 can proceed in parallel (dataclasses)

Dependencies

  • Depends on WP01 (enums and config)

Risks & Mitigations

  • State corruption → atomic writes, backup before modify

Work Package WP05: Scheduler (Priority: P1)

Goal: Implement WP dependency resolution and agent assignment logic.
Independent Test: Scheduler correctly identifies ready WPs, assigns agents by priority, respects concurrency.
Prompt: /tasks/WP05-scheduler.md
Estimated Size: ~420 lines

Included Subtasks

  • ✅ T022 Implement dependency graph reading from WP frontmatter
  • ✅ T023 Implement ready WP detection (all dependencies satisfied)
  • ✅ T024 Implement agent selection by role and priority
  • ✅ T025 Implement concurrency semaphores (global and per-agent)
  • ✅ T026 Implement single-agent mode handling

Implementation Notes

  • Reuse existing core/dependency_graph.py for graph operations
  • A WP is ready when all of its dependencies are in the "done" lane
  • Selection: filter by role → sort by priority → check concurrency
  • Single-agent mode: same agent for both roles, configurable delay
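
The readiness rule and the selection pipeline (filter by role → sort by priority → check concurrency) can be sketched as two small functions. The AgentSlot type, the role names, the "planned" lane, and the convention that a lower priority number wins are assumptions; only the "done" lane is named in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AgentSlot:  # hypothetical view of one configured agent
    agent_id: str
    roles: tuple
    priority: int           # assumption: lower number = preferred
    max_concurrent: int = 1


def ready_work_packages(dependencies: dict, lanes: dict) -> list:
    """Return not-yet-started WPs whose dependencies are all in "done"."""
    return sorted(
        wp_id
        for wp_id, deps in dependencies.items()
        if lanes.get(wp_id, "planned") == "planned"
        and all(lanes.get(dep) == "done" for dep in deps)
    )


def select_agent(agents, role: str, active_counts: dict) -> Optional[AgentSlot]:
    """Filter by role, sort by priority, skip agents at their limit."""
    candidates = [agent for agent in agents if role in agent.roles]
    for agent in sorted(candidates, key=lambda a: a.priority):
        if active_counts.get(agent.agent_id, 0) < agent.max_concurrent:
            return agent
    return None  # every suitable agent is busy; the caller should wait
```

In single-agent mode the same AgentSlot would simply list both roles, so select_agent returns it for either one.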

Parallel Opportunities

  • Limited - scheduler logic is sequential

Dependencies

  • Depends on WP01, WP04 (config, state tracking)

Risks & Mitigations

  • Circular dependencies → detect and reject at startup
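
Startup rejection of circular dependencies can be illustrated with Kahn's algorithm. This standalone sketch does not use core/dependency_graph.py, whose API is not shown here; the function name is hypothetical.

```python
from collections import deque


def unresolvable_work_packages(dependencies: dict) -> set:
    """Return WPs that can never become ready (in or behind a cycle)."""
    nodes = set(dependencies)
    for deps in dependencies.values():
        nodes.update(deps)
    remaining = {node: set(dependencies.get(node, ())) for node in nodes}
    dependents = {node: [] for node in nodes}
    for node, deps in remaining.items():
        for dep in deps:
            dependents[dep].append(node)

    # Start with WPs that have no dependencies and peel the graph in layers.
    queue = deque(node for node, deps in remaining.items() if not deps)
    resolved = set()
    while queue:
        node = queue.popleft()
        resolved.add(node)
        for dependent in dependents[node]:
            remaining[dependent].discard(node)
            if not remaining[dependent]:
                queue.append(dependent)
    return nodes - resolved
```

An empty result means the graph is acyclic; a non-empty result is the set to report before refusing to start.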

Work Package WP06: Executor (Priority: P1)

Goal: Implement agent process spawning and management with asyncio.
Independent Test: Processes spawn correctly, stdin piping works, output captured, timeouts enforced.
Prompt: /tasks/WP06-executor.md
Estimated Size: ~450 lines

Included Subtasks

  • ✅ T027 Implement async process spawning with asyncio.create_subprocess_exec
  • ✅ T028 Implement stdin piping for WP prompt content
  • ✅ T029 [P] Implement stdout/stderr capture to log files
  • ✅ T030 [P] Implement timeout handling with asyncio.wait_for
  • ✅ T031 Implement worktree creation integration

Implementation Notes

  • Use asyncio.create_subprocess_exec for clean process management
  • Pipe WP prompt file content to agent stdin
  • Capture logs to .kittify/logs/WP##-{role}.log
  • Integrate with existing spec-kitty implement for worktree creation
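
A minimal sketch of the spawn, pipe, capture, and timeout steps above, using asyncio.create_subprocess_exec and asyncio.wait_for as named in the subtasks. The ExecutionResult type, the grace period, and the log layout are assumptions; worktree creation is left out.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class ExecutionResult:  # hypothetical result type for this sketch
    exit_code: Optional[int]
    stdout: str
    stderr: str
    timed_out: bool


async def run_agent(command, prompt: str, timeout: float,
                    log_path: Optional[Path] = None,
                    grace: float = 5.0) -> ExecutionResult:
    process = await asyncio.create_subprocess_exec(
        *command,
        stdin=asyncio.subprocess.PIPE,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout = stderr = b""
    timed_out = False
    try:
        # communicate() writes the prompt to stdin, closes it, and reads
        # both output streams until the process exits.
        stdout, stderr = await asyncio.wait_for(
            process.communicate(prompt.encode("utf-8")), timeout=timeout
        )
    except asyncio.TimeoutError:
        timed_out = True
        process.terminate()  # SIGTERM on POSIX
        try:
            await asyncio.wait_for(process.wait(), timeout=grace)
        except asyncio.TimeoutError:
            process.kill()  # SIGKILL on POSIX
            await process.wait()

    result = ExecutionResult(
        exit_code=process.returncode,
        stdout=stdout.decode("utf-8", errors="replace"),
        stderr=stderr.decode("utf-8", errors="replace"),
        timed_out=timed_out,
    )
    if log_path is not None:
        log_path.parent.mkdir(parents=True, exist_ok=True)
        log_path.write_text(
            f"exit_code={result.exit_code} timed_out={timed_out}\n"
            f"--- stdout ---\n{result.stdout}\n--- stderr ---\n{result.stderr}\n",
            encoding="utf-8",
        )
    return result
```

A caller would run it with asyncio.run(run_agent(command, prompt, timeout=...)), or gather several calls to execute work packages concurrently. Output captured before a timeout is discarded in this sketch; streaming to the log file would preserve it.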

Parallel Opportunities

  • T029, T030 can proceed in parallel (different concerns)

Dependencies

  • Depends on WP02, WP03, WP04 (invokers, state)

Risks & Mitigations

  • Runaway processes → timeout with SIGTERM then SIGKILL

Work Package WP07: Monitor (Priority: P1)

Goal: Implement completion detection, failure handling, and lane status updates.
Independent Test: Exit codes detected, JSON parsed, retries work, fallback strategies execute.
Prompt: /tasks/WP07-monitor.md
Estimated Size: ~480 lines

Included Subtasks

  • ✅ T032 [P] Implement exit code detection and success/failure determination
  • ✅ T033 [P] Implement JSON output parsing for structured results
  • ✅ T034 Implement retry logic with configurable limits
  • ✅ T035 Implement fallback strategy execution (next_in_list, same_agent, fail)
  • ✅ T036 Implement lane status updates via existing commands
  • ✅ T037 Implement human escalation for unrecoverable failures

Implementation Notes

  • Exit code 0 = success, non-zero = failure
  • Parse JSON when available for detailed results
  • Retry with same agent first, then apply fallback strategy
  • Use spec-kitty agent tasks move-task for lane updates
  • Escalation: pause orchestration, print alert, wait for user
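
The "retry with same agent first, then apply fallback strategy" rule might be sketched as below. The document names the three strategies but does not define them, so the semantics here are assumptions: next_in_list moves on to each remaining agent in order, same_agent grants the first agent one more round of attempts, and fail stops immediately. A False result stands for the point where escalation to a human would begin.

```python
from enum import Enum
from typing import Callable


class FallbackStrategy(Enum):
    NEXT_IN_LIST = "next_in_list"
    SAME_AGENT = "same_agent"
    FAIL = "fail"


def run_with_recovery(agents: list, attempt: Callable[[str], bool],
                      max_retries: int, strategy: FallbackStrategy):
    """Return (success, history), where history lists each agent tried."""
    history = []

    def try_agent(agent_id: str) -> bool:
        # One initial attempt plus up to max_retries retries.
        for _ in range(1 + max_retries):
            history.append(agent_id)
            if attempt(agent_id):  # True means exit code 0
                return True
        return False

    first = agents[0]
    if try_agent(first):
        return True, history

    if strategy is FallbackStrategy.NEXT_IN_LIST:
        for agent_id in agents[1:]:
            if try_agent(agent_id):
                return True, history
    elif strategy is FallbackStrategy.SAME_AGENT:
        if try_agent(first):
            return True, history
    # FallbackStrategy.FAIL, or every fallback exhausted: escalate.
    return False, history
```

Returning the history alongside the result keeps the "log everything" mitigation cheap: the caller can record exactly which agents were tried and how often.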

Parallel Opportunities

  • T032, T033 can proceed in parallel (detection vs parsing)

Dependencies

  • Depends on WP06 (executor output)

Risks & Mitigations

  • Ambiguous failure states → log everything, fail closed

Work Package WP08: CLI Commands (Priority: P2)

Goal: Implement the spec-kitty orchestrate CLI command with all options.
Independent Test: All CLI flags work: --feature, --status, --resume, --abort.
Prompt: /tasks/WP08-cli-commands.md
Estimated Size: ~400 lines

Included Subtasks

  • ✅ T038 Implement spec-kitty orchestrate --feature command
  • ✅ T039 [P] Implement spec-kitty orchestrate --status command
  • ✅ T040 [P] Implement spec-kitty orchestrate --resume command
  • ✅ T041 [P] Implement spec-kitty orchestrate --abort command
  • ✅ T042 Add help text and CLI documentation

Implementation Notes

  • Use typer for CLI (consistent with existing commands)
  • --feature: start new orchestration
  • --status: show progress, active WPs, elapsed time
  • --resume: load state, continue from where stopped
  • --abort: cleanup processes, save state, exit cleanly
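
The four modes can be illustrated with a small dispatcher. The plan calls for typer; this sketch swaps in the standard-library argparse only so that it runs without extra dependencies. Two things are assumed rather than stated in this document: that --feature takes a feature slug as its value, and that the four flags are mutually exclusive.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(
        prog="spec-kitty orchestrate",
        description="Run work packages with multiple agents.",
    )
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--feature", metavar="SLUG",
                      help="start a new orchestration for this feature")
    mode.add_argument("--status", action="store_true",
                      help="show progress, active WPs, elapsed time")
    mode.add_argument("--resume", action="store_true",
                      help="load saved state and continue")
    mode.add_argument("--abort", action="store_true",
                      help="stop agent processes, save state, exit")
    return parser


def resolve_action(argv: list):
    """Map command-line flags to an (action, feature) pair."""
    args = build_parser().parse_args(argv)
    if args.feature is not None:
        return "start", args.feature
    if args.status:
        return "status", None
    if args.resume:
        return "resume", None
    return "abort", None
```

Passing two mode flags at once makes argparse exit with a usage error, which keeps the command handlers free of flag-combination checks.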

Parallel Opportunities

  • T039-T041 can proceed in parallel after T038 structure exists

Dependencies

  • Depends on WP05, WP06, WP07 (scheduler, executor, monitor)

Risks & Mitigations

  • Signal handling for Ctrl+C → proper cleanup hooks

Work Package WP09: Integration & Polish (Priority: P2)

Goal: Integrate all components, handle edge cases, add progress display.
Independent Test: Full orchestration runs on a test feature, handles all edge cases.
Prompt: /tasks/WP09-integration-and-polish.md
Estimated Size: ~350 lines

Included Subtasks

  • ✅ T043 Implement main orchestration loop connecting all components
  • ✅ T044 [P] Add progress display during execution (Rich console)
  • ✅ T045 [P] Add summary report on completion
  • ✅ T046 Handle edge cases: circular deps, no agents, worktree failures

Implementation Notes

  • Main loop: schedule → execute → monitor → repeat
  • Progress: show active WPs, completed count, elapsed time
  • Summary: total time, WPs completed, agents used, errors
  • Edge cases: cover those listed in the edge cases section of spec.md
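
The schedule → execute → monitor → repeat cycle can be shown with a synchronous simulation. The real loop would be asynchronous and persist state between rounds; the lane names other than "done", the max_parallel parameter, and the shape of the returned summary are assumptions.

```python
from typing import Callable


def orchestrate(dependencies: dict, execute: Callable[[str], bool],
                max_parallel: int = 2) -> dict:
    """Run WPs in dependency order; execute(wp_id) returns True on success."""
    lanes = {wp_id: "planned" for wp_id in dependencies}
    rounds = []
    while True:
        # Schedule: planned WPs whose dependencies are all done.
        ready = sorted(
            wp_id for wp_id, deps in dependencies.items()
            if lanes[wp_id] == "planned"
            and all(lanes.get(dep) == "done" for dep in deps)
        )
        if not ready:
            break  # finished, or everything left is blocked by a failure
        batch = ready[:max_parallel]
        # Execute and monitor: record the outcome of each WP in the batch.
        for wp_id in batch:
            lanes[wp_id] = "done" if execute(wp_id) else "failed"
        rounds.append(batch)

    def in_lane(lane: str) -> list:
        return sorted(wp for wp, current in lanes.items() if current == lane)

    return {
        "rounds": rounds,
        "completed": in_lane("done"),
        "failed": in_lane("failed"),
        "blocked": in_lane("planned"),
    }
```

Each round moves at least one WP out of "planned", so the loop always terminates; WPs behind a failure end up in "blocked", which is the information a summary report needs.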

Parallel Opportunities

  • T044, T045 can proceed in parallel (display vs summary)

Dependencies

  • Depends on WP08 (CLI entry point)

Risks & Mitigations

  • Integration bugs → comprehensive integration tests

Dependency & Execution Summary

Phase 0 - Foundation:
  WP01 (Foundation) ──────────────────────┐
                                          │
Phase 1 - Components:                     ▼
  WP02 (Core Invokers) ◄─────────────── WP01
  WP03 (Additional Invokers) ◄───────── WP01, WP02
  WP04 (State Management) ◄──────────── WP01
                                          │
Phase 2 - Core Logic:                     ▼
  WP05 (Scheduler) ◄─────────────────── WP01, WP04
  WP06 (Executor) ◄──────────────────── WP02, WP03, WP04
  WP07 (Monitor) ◄───────────────────── WP06
                                          │
Phase 3 - CLI & Integration:              ▼
  WP08 (CLI Commands) ◄──────────────── WP05, WP06, WP07
  WP09 (Integration) ◄───────────────── WP08

Parallelization Opportunities:

  • WP02 and WP04 can run in parallel after WP01; WP03 can start once WP02 is also done
  • WP05, WP06 can start in parallel once their deps complete
  • Individual agent invokers (T007-T015) are highly parallelizable
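
As a cross-check, the dependency edges from the summary diagram can be grouped into execution waves, where each wave holds the work packages whose dependencies were all completed in earlier waves. The function name is hypothetical; the DEPENDENCIES mapping is copied from the diagram.

```python
DEPENDENCIES = {
    "WP01": [],
    "WP02": ["WP01"],
    "WP03": ["WP01", "WP02"],
    "WP04": ["WP01"],
    "WP05": ["WP01", "WP04"],
    "WP06": ["WP02", "WP03", "WP04"],
    "WP07": ["WP06"],
    "WP08": ["WP05", "WP06", "WP07"],
    "WP09": ["WP08"],
}


def execution_waves(dependencies: dict) -> list:
    """Group WPs into waves that could each run fully in parallel."""
    done = set()
    waves = []
    while len(done) < len(dependencies):
        wave = sorted(
            wp_id for wp_id, deps in dependencies.items()
            if wp_id not in done and set(deps) <= done
        )
        if not wave:
            raise ValueError("circular or missing dependency")
        waves.append(wave)
        done.update(wave)
    return waves
```

For the mapping above, execution_waves(DEPENDENCIES) yields WP01; then WP02 and WP04; then WP03 and WP05; then WP06, WP07, WP08, and WP09 one after another.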

MVP Scope: WP01-WP08 (core functionality without polish)


Subtask Index (Reference)

Subtask ID | Summary                                 | Work Package | Priority | Parallel?
-----------|-----------------------------------------|--------------|----------|----------
T001       | Create orchestrator package structure   | WP01         | P0       | No
T002       | Implement enums                         | WP01         | P0       | Yes
T003       | Implement config dataclasses            | WP01         | P0       | Yes
T004       | Implement config.py YAML parsing        | WP01         | P0       | No
T005       | Implement default config generation     | WP01         | P0       | No
T006       | Implement AgentInvoker protocol         | WP02         | P0       | No
T007       | Implement Claude Code invoker           | WP02         | P0       | Yes
T008       | Implement GitHub Codex invoker          | WP02         | P0       | Yes
T009       | Implement GitHub Copilot invoker        | WP02         | P0       | Yes
T010       | Implement Google Gemini invoker         | WP02         | P0       | Yes
T011       | Implement Qwen Code invoker             | WP03         | P0       | Yes
T012       | Implement OpenCode invoker              | WP03         | P0       | Yes
T013       | Implement Kilocode invoker              | WP03         | P0       | Yes
T014       | Implement Augment Code invoker          | WP03         | P0       | Yes
T015       | Implement Cursor invoker with timeout   | WP03         | P0       | Yes
T016       | Implement agent registry and detection  | WP03         | P0       | No
T017       | Implement OrchestrationRun dataclass    | WP04         | P0       | Yes
T018       | Implement WPExecution dataclass         | WP04         | P0       | Yes
T019       | Implement InvocationResult dataclass    | WP04         | P0       | Yes
T020       | Implement state.py save/load            | WP04         | P0       | No
T021       | Implement atomic writes                 | WP04         | P0       | No
T022       | Implement dependency graph reading      | WP05         | P1       | No
T023       | Implement ready WP detection            | WP05         | P1       | No
T024       | Implement agent selection               | WP05         | P1       | No
T025       | Implement concurrency semaphores        | WP05         | P1       | No
T026       | Implement single-agent mode             | WP05         | P1       | No
T027       | Implement async process spawning        | WP06         | P1       | No
T028       | Implement stdin piping                  | WP06         | P1       | No
T029       | Implement stdout/stderr capture         | WP06         | P1       | Yes
T030       | Implement timeout handling              | WP06         | P1       | Yes
T031       | Implement worktree creation integration | WP06         | P1       | No
T032       | Implement exit code detection           | WP07         | P1       | Yes
T033       | Implement JSON output parsing           | WP07         | P1       | Yes
T034       | Implement retry logic                   | WP07         | P1       | No
T035       | Implement fallback strategies           | WP07         | P1       | No
T036       | Implement lane status updates           | WP07         | P1       | No
T037       | Implement human escalation              | WP07         | P1       | No
T038       | Implement orchestrate --feature         | WP08         | P2       | No
T039       | Implement orchestrate --status          | WP08         | P2       | Yes
T040       | Implement orchestrate --resume          | WP08         | P2       | Yes
T041       | Implement orchestrate --abort           | WP08         | P2       | Yes
T042       | Add help text and CLI docs              | WP08         | P2       | No
T043       | Implement main orchestration loop       | WP09         | P2       | No
T044       | Add progress display                    | WP09         | P2       | Yes
T045       | Add summary report                      | WP09         | P2       | Yes
T046       | Handle edge cases                       | WP09         | P2       | No

<!-- status-model:start -->

Canonical Status (Generated)

  • WP01: done
  • WP02: done
  • WP03: done
  • WP04: done
  • WP05: done
  • WP06: done
  • WP07: done
  • WP08: done
  • WP09: done

<!-- status-model:end -->