Work Packages: Autonomous Multi-Agent Orchestrator
Inputs: Design documents from /kitty-specs/020-autonomous-multi-agent-orchestrator/
Prerequisites: plan.md (required), spec.md (user stories), data-model.md
Tests: Only include explicit testing work when stakeholders request it.
Organization: Fine-grained subtasks (Txxx) roll up into work packages (WPxx). Each work package must be independently deliverable and testable.
Prompt Files: Each work package references a matching prompt file in /tasks/ generated by /spec-kitty.tasks.
Subtask Format: [Txxx] [P?] Description
- [P] indicates the subtask can proceed in parallel (different files/components).
- Include precise file paths or modules.
Work Package WP01: Foundation & Config (Priority: P0)
Goal: Establish orchestrator package structure and configuration system.
Independent Test: Config loads from YAML, validation rules enforced, sensible defaults work.
Prompt: /tasks/WP01-foundation-and-config.md
Estimated Size: ~350 lines
Included Subtasks
- ✅ T001 Create orchestrator package structure with __init__.py files
- ✅ T002 Implement enums: OrchestrationStatus, WPStatus, FallbackStrategy
- ✅ T003 Implement OrchestratorConfig and AgentConfig dataclasses
- ✅ T004 Implement config.py with YAML parsing and validation
- ✅ T005 Implement default config generation for installed agents
Implementation Notes
- Create src/specify_cli/orchestrator/ package
- Follow data-model.md schemas exactly
- Use ruamel.yaml for YAML parsing (consistent with existing codebase)
- Validate all config constraints from data-model.md
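The config notes above can be sketched roughly as follows. This is a minimal illustration, not the schema from data-model.md: the field names (priority, max_concurrent, global_concurrency, max_retries) and their defaults are assumptions, and the function takes an already-parsed mapping so the sketch stays independent of the YAML library. Only the enum values come from the fallback strategies named in T035.

```python
from dataclasses import dataclass, field
from enum import Enum


class FallbackStrategy(Enum):
    NEXT_IN_LIST = "next_in_list"
    SAME_AGENT = "same_agent"
    FAIL = "fail"


@dataclass
class AgentConfig:
    agent_id: str
    enabled: bool = True
    priority: int = 50       # assumption: lower number = preferred agent
    max_concurrent: int = 1  # assumption: per-agent concurrency limit


@dataclass
class OrchestratorConfig:
    agents: dict[str, AgentConfig] = field(default_factory=dict)
    fallback_strategy: FallbackStrategy = FallbackStrategy.NEXT_IN_LIST
    global_concurrency: int = 2  # assumed default
    max_retries: int = 1         # assumed default


def config_from_mapping(raw: dict) -> OrchestratorConfig:
    """Build a config from a parsed YAML mapping and validate it."""
    agents = {
        agent_id: AgentConfig(agent_id=agent_id, **(options or {}))
        for agent_id, options in raw.get("agents", {}).items()
    }
    config = OrchestratorConfig(
        agents=agents,
        # Enum lookup raises ValueError for an unknown strategy name.
        fallback_strategy=FallbackStrategy(
            raw.get("fallback_strategy", "next_in_list")
        ),
        global_concurrency=raw.get("global_concurrency", 2),
        max_retries=raw.get("max_retries", 1),
    )

    # Collect every violation so the user sees all problems at once.
    errors = []
    if config.global_concurrency < 1:
        errors.append("global_concurrency must be >= 1")
    if config.max_retries < 0:
        errors.append("max_retries must be >= 0")
    for agent in agents.values():
        if agent.max_concurrent < 1:
            errors.append(f"{agent.agent_id}: max_concurrent must be >= 1")
    if errors:
        raise ValueError("; ".join(errors))
    return config
```

An empty mapping yields the defaults, which is one way to satisfy the "sensible defaults work" part of the independent test.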
Parallel Opportunities
- T002, T003 can proceed in parallel (different files)
Dependencies
- None (starting package)
Risks & Mitigations
- Config validation edge cases → comprehensive unit tests
Work Package WP02: Agent Invokers - Core Agents (Priority: P0)
Goal: Implement base AgentInvoker protocol and invokers for the 4 most common agents.
Independent Test: Each invoker correctly builds CLI commands, detects installation, parses output.
Prompt: /tasks/WP02-agent-invokers-core.md
Estimated Size: ~450 lines
Included Subtasks
- ✅ T006 Implement AgentInvoker protocol in agents/base.py
- ✅ T007 [P] Implement Claude Code invoker in agents/claude.py
- ✅ T008 [P] Implement GitHub Codex invoker in agents/codex.py
- ✅ T009 [P] Implement GitHub Copilot invoker in agents/copilot.py
- ✅ T010 [P] Implement Google Gemini invoker in agents/gemini.py
Implementation Notes
- Base protocol defines: is_installed(), build_command(), parse_output()
- Each invoker implements agent-specific CLI flags from plan.md table
- Use shutil.which() for installation detection
- Support both stdin and argument-based prompt passing
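One possible shape for the protocol described above is sketched here. The three method names come from the notes; the signatures, the ParsedOutput type, and the example-agent binary are hypothetical, and no real agent's CLI flags are shown because those live in the plan.md table.

```python
import shutil
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class ParsedOutput:
    """Hypothetical result type; the real fields may differ."""
    success: bool
    raw: str


class AgentInvoker(Protocol):
    agent_id: str

    def is_installed(self) -> bool: ...

    def build_command(self, prompt: str) -> tuple[list[str], Optional[str]]:
        """Return (argv, stdin_text); stdin_text is None for argument mode."""
        ...

    def parse_output(self, stdout: str, exit_code: int) -> ParsedOutput: ...


class ExampleInvoker:
    """Invoker for a made-up agent whose binary is `example-agent`."""

    agent_id = "example"
    executable = "example-agent"  # hypothetical binary name

    def __init__(self, prompt_via_stdin: bool = True) -> None:
        self.prompt_via_stdin = prompt_via_stdin

    def is_installed(self) -> bool:
        # shutil.which returns None when the binary is not on PATH.
        return shutil.which(self.executable) is not None

    def build_command(self, prompt: str) -> tuple[list[str], Optional[str]]:
        if self.prompt_via_stdin:
            return [self.executable], prompt
        return [self.executable, prompt], None

    def parse_output(self, stdout: str, exit_code: int) -> ParsedOutput:
        return ParsedOutput(success=exit_code == 0, raw=stdout)
```

Returning the stdin text alongside argv lets the executor treat both prompt-passing modes uniformly.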
Parallel Opportunities
- T007-T010 are fully parallel (separate files, same interface)
Dependencies
- Depends on WP01 (needs config dataclasses)
Risks & Mitigations
- CLI flag changes → document agent versions, add fallback detection
Work Package WP03: Agent Invokers - Additional Agents (Priority: P0)
Goal: Complete the remaining 5 agent invokers plus agent detection utilities.
Independent Test: All 9 agents can be invoked when installed, detection identifies available agents.
Prompt: /tasks/WP03-agent-invokers-additional.md
Estimated Size: ~400 lines
Included Subtasks
- ✅ T011 [P] Implement Qwen Code invoker in agents/qwen.py
- ✅ T012 [P] Implement OpenCode invoker in agents/opencode.py
- ✅ T013 [P] Implement Kilocode invoker in agents/kilocode.py
- ✅ T014 [P] Implement Augment Code invoker in agents/augment.py
- ✅ T015 [P] Implement Cursor invoker with timeout wrapper in agents/cursor.py
- ✅ T016 Implement agent registry and detection in agents/__init__.py
Implementation Notes
- Cursor requires special timeout 300 wrapper for hanging issue
- Agent registry maps agent_id → invoker class
- Detection returns list of installed agents sorted by priority
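The registry and detection notes above might look something like this sketch. It maps agent ids to a small record rather than to invoker classes to stay self-contained; the agent ids, binary names, and the "lower priority number wins" convention are all assumptions. The wrapper simply prefixes the coreutils timeout command, as the Cursor note describes.

```python
import shutil
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RegisteredAgent:
    agent_id: str
    executable: str  # hypothetical binary names below
    priority: int    # assumption: lower number = higher priority


# Placeholder entries; the real registry would hold the nine invokers.
REGISTRY = {
    "agent-a": RegisteredAgent("agent-a", "agent-a-cli", priority=30),
    "agent-b": RegisteredAgent("agent-b", "agent-b-cli", priority=10),
    "agent-c": RegisteredAgent("agent-c", "agent-c-cli", priority=20),
}


def detect_installed(
    registry: dict[str, RegisteredAgent],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return ids of installed agents, best priority first."""
    installed = [a for a in registry.values() if which(a.executable)]
    return [a.agent_id for a in sorted(installed, key=lambda a: a.priority)]


def wrap_with_timeout(command: list[str], seconds: int = 300) -> list[str]:
    """Prefix a command with `timeout <seconds>` to guard against hangs."""
    return ["timeout", str(seconds), *command]
```

Passing the lookup function as a parameter keeps detection testable without any agent actually being installed.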
Parallel Opportunities
- T011-T015 are parallel (separate agent files)
Dependencies
- Depends on WP01, WP02 (base protocol)
Risks & Mitigations
- Cursor hanging → timeout wrapper with configurable duration
Work Package WP04: State Management (Priority: P0)
Goal: Implement orchestration state persistence for resume capability.
Independent Test: State saves to JSON, loads correctly, atomic writes prevent corruption.
Prompt: /tasks/WP04-state-management.md
Estimated Size: ~380 lines
Included Subtasks
- ✅ T017 Implement OrchestrationRun dataclass with serialization
- ✅ T018 Implement WPExecution dataclass with status tracking
- ✅ T019 Implement InvocationResult dataclass
- ✅ T020 Implement state.py with save/load JSON functions
- ✅ T021 Implement atomic writes with backup-before-modify pattern
Implementation Notes
- Follow merge/state.py patterns for consistency
- State file: .kittify/orchestration-state.json
- Support datetime serialization/deserialization
- Validate state transitions per data-model.md rules
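A minimal sketch of the backup-before-modify and atomic-write pattern follows, using only the standard library. It does not reproduce merge/state.py; the .bak suffix and the started_at field name are assumptions made for illustration.

```python
import json
import os
import shutil
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def _encode(value):
    """JSON fallback encoder: datetimes become ISO 8601 strings."""
    if isinstance(value, datetime):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def save_state(state: dict, path: Path) -> None:
    path.parent.mkdir(parents=True, exist_ok=True)
    # Backup before modify: keep the last good copy next to the state file.
    if path.exists():
        shutil.copy2(path, path.with_name(path.name + ".bak"))
    # Write to a temp file in the same directory, then rename over the
    # target. os.replace is atomic when both paths share a filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(state, handle, indent=2, default=_encode)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_name, path)
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def load_state(path: Path, datetime_fields=("started_at",)) -> dict:
    """Load state and turn the named ISO strings back into datetimes."""
    state = json.loads(path.read_text(encoding="utf-8"))
    for key in datetime_fields:
        if state.get(key) is not None:
            state[key] = datetime.fromisoformat(state[key])
    return state
```

A crash during the write leaves either the old file or the new one in place, never a half-written state file.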
Parallel Opportunities
- T017-T019 can proceed in parallel (dataclasses)
Dependencies
- Depends on WP01 (enums and config)
Risks & Mitigations
- State corruption → atomic writes, backup before modify
Work Package WP05: Scheduler (Priority: P1)
Goal: Implement WP dependency resolution and agent assignment logic.
Independent Test: Scheduler correctly identifies ready WPs, assigns agents by priority, respects concurrency.
Prompt: /tasks/WP05-scheduler.md
Estimated Size: ~420 lines
Included Subtasks
- ✅ T022 Implement dependency graph reading from WP frontmatter
- ✅ T023 Implement ready WP detection (all dependencies satisfied)
- ✅ T024 Implement agent selection by role and priority
- ✅ T025 Implement concurrency semaphores (global and per-agent)
- ✅ T026 Implement single-agent mode handling
Implementation Notes
- Reuse existing core/dependency_graph.py for graph operations
- Ready = dependencies all in "done" lane
- Selection: filter by role → sort by priority → check concurrency
- Single-agent mode: same agent for both roles, configurable delay
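The readiness and selection rules above can be illustrated with a short sketch. The standard-library graphlib module stands in for core/dependency_graph.py, whose API is not shown here; the "planned" lane name, the agent dictionary keys, and the "lower priority number wins" ordering are assumptions.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Optional


def validate_acyclic(deps: dict[str, list[str]]) -> None:
    """Reject circular dependencies at startup."""
    try:
        TopologicalSorter(deps).prepare()
    except CycleError as exc:
        # args[1] holds the nodes that form the detected cycle.
        raise ValueError(f"Circular dependency: {exc.args[1]}") from exc


def ready_work_packages(
    deps: dict[str, list[str]], lanes: dict[str, str]
) -> list[str]:
    """A WP is ready when it has not started and every dependency is done."""
    return sorted(
        wp
        for wp, requires in deps.items()
        if lanes.get(wp) == "planned"  # assumed name of the not-started lane
        and all(lanes.get(dep) == "done" for dep in requires)
    )


def select_agent(
    agents: list[dict], role: str, active: dict[str, int]
) -> Optional[str]:
    """Filter by role, sort by priority, then check per-agent concurrency."""
    candidates = [a for a in agents if role in a["roles"]]
    for agent in sorted(candidates, key=lambda a: a["priority"]):
        if active.get(agent["id"], 0) < agent["max_concurrent"]:
            return agent["id"]
    return None  # every suitable agent is busy
```

With the dependencies listed in this document and only WP01 in the done lane, ready_work_packages returns WP02 and WP04; WP03 stays blocked until WP02 finishes.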
Parallel Opportunities
- Limited - scheduler logic is sequential
Dependencies
- Depends on WP01, WP04 (config, state tracking)
Risks & Mitigations
- Circular dependencies → detect and reject at startup
Work Package WP06: Executor (Priority: P1)
Goal: Implement agent process spawning and management with asyncio.
Independent Test: Processes spawn correctly, stdin piping works, output captured, timeouts enforced.
Prompt: /tasks/WP06-executor.md
Estimated Size: ~450 lines
Included Subtasks
- ✅ T027 Implement async process spawning with asyncio.create_subprocess_exec
- ✅ T028 Implement stdin piping for WP prompt content
- ✅ T029 Implement stdout/stderr capture to log files
- ✅ T030 Implement timeout handling with asyncio.wait_for
- ✅ T031 Implement worktree creation integration
Implementation Notes
- Use asyncio.create_subprocess_exec for clean process management
- Pipe WP prompt file content to agent stdin
- Capture logs to .kittify/logs/WP##-{role}.log
- Integrate with existing spec-kitty implement for worktree creation
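The spawning, piping, logging, and timeout steps above could be combined along these lines. This is a sketch under assumptions: the function name, the grace period before SIGKILL, and returning None on timeout are illustrative choices, and worktree creation is left out.

```python
import asyncio
from pathlib import Path
from typing import Optional


async def run_agent(
    command: list[str],
    prompt: str,
    log_path: Path,
    timeout_seconds: float,
    kill_grace_seconds: float = 5.0,  # assumed grace period
) -> Optional[int]:
    """Run one agent process; return its exit code, or None on timeout."""
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("wb") as log:
        process = await asyncio.create_subprocess_exec(
            *command,
            stdin=asyncio.subprocess.PIPE,
            stdout=log,                        # capture stdout in the log
            stderr=asyncio.subprocess.STDOUT,  # interleave stderr with it
        )
        try:
            # communicate() writes the prompt to stdin, closes it, and
            # waits for the process to exit.
            await asyncio.wait_for(
                process.communicate(prompt.encode()), timeout_seconds
            )
            return process.returncode
        except asyncio.TimeoutError:
            process.terminate()  # SIGTERM first
            try:
                await asyncio.wait_for(process.wait(), kill_grace_seconds)
            except asyncio.TimeoutError:
                process.kill()   # then SIGKILL
                await process.wait()
            return None
```

For example, `asyncio.run(run_agent(["cat"], "prompt text", Path(".kittify/logs/WP01-implementation.log"), 60))` would echo the prompt into the log on a POSIX system; the role name in that path is made up.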
Parallel Opportunities
- T029, T030 can proceed in parallel (different concerns)
Dependencies
- Depends on WP02, WP03, WP04 (invokers, state)
Risks & Mitigations
- Runaway processes → timeout with SIGTERM then SIGKILL
Work Package WP07: Monitor (Priority: P1)
Goal: Implement completion detection, failure handling, and lane status updates.
Independent Test: Exit codes detected, JSON parsed, retries work, fallback strategies execute.
Prompt: /tasks/WP07-monitor.md
Estimated Size: ~480 lines
Included Subtasks
- ✅ T032 Implement exit code detection and success/failure determination
- ✅ T033 Implement JSON output parsing for structured results
- ✅ T034 Implement retry logic with configurable limits
- ✅ T035 Implement fallback strategy execution (next_in_list, same_agent, fail)
- ✅ T036 Implement lane status updates via existing commands
- ✅ T037 Implement human escalation for unrecoverable failures
Implementation Notes
- Exit code 0 = success, non-zero = failure
- Parse JSON when available for detailed results
- Retry with same agent first, then apply fallback strategy
- Use spec-kitty agent tasks move-task for lane updates
- Escalation: pause orchestration, print alert, wait for user
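The "retry with same agent first, then apply fallback strategy" rule might be sketched as below. The strategy names come from T035, but their exact semantics are not spelled out in this document, so the reading here is an assumption: same_agent grants one extra attempt with the original agent, next_in_list tries each remaining agent once, and fail escalates immediately.

```python
from enum import Enum
from typing import Callable, Optional


class FallbackStrategy(Enum):
    NEXT_IN_LIST = "next_in_list"
    SAME_AGENT = "same_agent"
    FAIL = "fail"


def run_with_recovery(
    agents: list[str],
    attempt: Callable[[str], int],
    max_retries: int,
    strategy: FallbackStrategy,
) -> Optional[str]:
    """Return the id of the agent that succeeded, or None to escalate.

    `attempt` runs one invocation and returns the process exit code;
    0 means success and any other value means failure.
    """
    primary, *others = agents

    # Initial attempt plus the configured retries, all with the same agent.
    for _ in range(1 + max_retries):
        if attempt(primary) == 0:
            return primary

    # Retries exhausted: pick fallback candidates (assumed semantics).
    if strategy is FallbackStrategy.NEXT_IN_LIST:
        candidates = others
    elif strategy is FallbackStrategy.SAME_AGENT:
        candidates = [primary]
    else:
        candidates = []  # FAIL: go straight to human escalation

    for agent in candidates:
        if attempt(agent) == 0:
            return agent
    return None
```

A None result is the point at which the orchestrator would pause and alert the user, matching the fail-closed mitigation listed for this work package.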
Parallel Opportunities
- T032, T033 can proceed in parallel (detection vs parsing)
Dependencies
- Depends on WP06 (executor output)
Risks & Mitigations
- Ambiguous failure states → log everything, fail closed
Work Package WP08: CLI Commands (Priority: P2)
Goal: Implement the spec-kitty orchestrate CLI command with all options.
Independent Test: All CLI flags work: --feature, --status, --resume, --abort.
Prompt: /tasks/WP08-cli-commands.md
Estimated Size: ~400 lines
Included Subtasks
- ✅ T038 Implement spec-kitty orchestrate --feature command
- ✅ T039 Implement spec-kitty orchestrate --status command
- ✅ T040 Implement spec-kitty orchestrate --resume command
- ✅ T041 Implement spec-kitty orchestrate --abort command
- ✅ T042 Add help text and CLI documentation
Implementation Notes
- Use typer for CLI (consistent with existing commands)
  - --feature: start new orchestration
  - --status: show progress, active WPs, elapsed time
  - --resume: load state, continue from where stopped
  - --abort: cleanup processes, save state, exit cleanly
Parallel Opportunities
- T039-T041 can proceed in parallel after T038 structure exists
Dependencies
- Depends on WP05, WP06, WP07 (scheduler, executor, monitor)
Risks & Mitigations
- Signal handling for Ctrl+C → proper cleanup hooks
Work Package WP09: Integration & Polish (Priority: P2)
Goal: Integrate all components, handle edge cases, add progress display.
Independent Test: Full orchestration runs on a test feature, handles all edge cases.
Prompt: /tasks/WP09-integration-and-polish.md
Estimated Size: ~350 lines
Included Subtasks
- ✅ T043 Implement main orchestration loop connecting all components
- ✅ T044 Add progress display during execution (Rich console)
- ✅ T045 Add summary report on completion
- ✅ T046 Handle edge cases: circular deps, no agents, worktree failures
Implementation Notes
- Main loop: schedule → execute → monitor → repeat
- Progress: show active WPs, completed count, elapsed time
- Summary: total time, WPs completed, agents used, errors
- Edge cases from spec.md edge cases section
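The schedule → execute → monitor → repeat cycle can be sketched as a small asyncio loop. This is a simplified, wave-based illustration rather than the real orchestrator: the status names, the run_wp callback, and the concurrency default are assumptions, and each wave waits for all of its WPs before scheduling again.

```python
import asyncio
from typing import Awaitable, Callable


async def orchestrate(
    deps: dict[str, list[str]],
    run_wp: Callable[[str], Awaitable[bool]],
    max_concurrent: int = 2,  # assumed global concurrency limit
) -> dict[str, str]:
    """Run work packages in dependency order; return final statuses."""
    status = {wp: "pending" for wp in deps}
    limit = asyncio.Semaphore(max_concurrent)

    async def execute(wp: str) -> None:
        async with limit:  # respect the global concurrency limit
            succeeded = await run_wp(wp)
        # Monitor step: record the outcome for the next scheduling pass.
        status[wp] = "done" if succeeded else "failed"

    while True:
        # Schedule: pending WPs whose dependencies are all done.
        ready = [
            wp
            for wp, requires in deps.items()
            if status[wp] == "pending"
            and all(status[dep] == "done" for dep in requires)
        ]
        if not ready:
            break  # finished, or blocked behind a failed WP
        for wp in ready:
            status[wp] = "running"
        # Execute the whole wave, then repeat.
        await asyncio.gather(*(execute(wp) for wp in ready))
    return status
```

WPs that remain "pending" when the loop exits are the ones blocked behind a failure, which is useful input for the summary report.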
Parallel Opportunities
- T044, T045 can proceed in parallel (display vs summary)
Dependencies
- Depends on WP08 (CLI entry point)
Risks & Mitigations
- Integration bugs → comprehensive integration tests
Dependency & Execution Summary
Phase 0 - Foundation:
  WP01 (Foundation) ──────────────────────┐
                                          │
Phase 1 - Components:                     ▼
  WP02 (Core Invokers) ◄─────────────── WP01
  WP03 (Additional Invokers) ◄───────── WP01, WP02
  WP04 (State Management) ◄──────────── WP01
                                          │
Phase 2 - Core Logic:                     ▼
  WP05 (Scheduler) ◄─────────────────── WP01, WP04
  WP06 (Executor) ◄──────────────────── WP02, WP03, WP04
  WP07 (Monitor) ◄───────────────────── WP06
                                          │
Phase 3 - CLI & Integration:              ▼
  WP08 (CLI Commands) ◄──────────────── WP05, WP06, WP07
  WP09 (Integration) ◄───────────────── WP08
Parallelization Opportunities:
- WP02 and WP04 can run in parallel after WP01; WP03 can start once WP02's base protocol is in place
- WP05, WP06 can start in parallel once their deps complete
- Individual agent invokers (T007-T015) are highly parallelizable
MVP Scope: WP01-WP08 (core functionality without polish)
Subtask Index (Reference)
| Subtask ID | Summary | Work Package | Priority | Parallel? |
|---|---|---|---|---|
| T001 | Create orchestrator package structure | WP01 | P0 | No |
| T002 | Implement enums | WP01 | P0 | Yes |
| T003 | Implement config dataclasses | WP01 | P0 | Yes |
| T004 | Implement config.py YAML parsing | WP01 | P0 | No |
| T005 | Implement default config generation | WP01 | P0 | No |
| T006 | Implement AgentInvoker protocol | WP02 | P0 | No |
| T007 | Implement Claude Code invoker | WP02 | P0 | Yes |
| T008 | Implement GitHub Codex invoker | WP02 | P0 | Yes |
| T009 | Implement GitHub Copilot invoker | WP02 | P0 | Yes |
| T010 | Implement Google Gemini invoker | WP02 | P0 | Yes |
| T011 | Implement Qwen Code invoker | WP03 | P0 | Yes |
| T012 | Implement OpenCode invoker | WP03 | P0 | Yes |
| T013 | Implement Kilocode invoker | WP03 | P0 | Yes |
| T014 | Implement Augment Code invoker | WP03 | P0 | Yes |
| T015 | Implement Cursor invoker with timeout | WP03 | P0 | Yes |
| T016 | Implement agent registry and detection | WP03 | P0 | No |
| T017 | Implement OrchestrationRun dataclass | WP04 | P0 | Yes |
| T018 | Implement WPExecution dataclass | WP04 | P0 | Yes |
| T019 | Implement InvocationResult dataclass | WP04 | P0 | Yes |
| T020 | Implement state.py save/load | WP04 | P0 | No |
| T021 | Implement atomic writes | WP04 | P0 | No |
| T022 | Implement dependency graph reading | WP05 | P1 | No |
| T023 | Implement ready WP detection | WP05 | P1 | No |
| T024 | Implement agent selection | WP05 | P1 | No |
| T025 | Implement concurrency semaphores | WP05 | P1 | No |
| T026 | Implement single-agent mode | WP05 | P1 | No |
| T027 | Implement async process spawning | WP06 | P1 | No |
| T028 | Implement stdin piping | WP06 | P1 | No |
| T029 | Implement stdout/stderr capture | WP06 | P1 | Yes |
| T030 | Implement timeout handling | WP06 | P1 | Yes |
| T031 | Implement worktree creation integration | WP06 | P1 | No |
| T032 | Implement exit code detection | WP07 | P1 | Yes |
| T033 | Implement JSON output parsing | WP07 | P1 | Yes |
| T034 | Implement retry logic | WP07 | P1 | No |
| T035 | Implement fallback strategies | WP07 | P1 | No |
| T036 | Implement lane status updates | WP07 | P1 | No |
| T037 | Implement human escalation | WP07 | P1 | No |
| T038 | Implement orchestrate --feature | WP08 | P2 | No |
| T039 | Implement orchestrate --status | WP08 | P2 | Yes |
| T040 | Implement orchestrate --resume | WP08 | P2 | Yes |
| T041 | Implement orchestrate --abort | WP08 | P2 | Yes |
| T042 | Add help text and CLI docs | WP08 | P2 | No |
| T043 | Implement main orchestration loop | WP09 | P2 | No |
| T044 | Add progress display | WP09 | P2 | Yes |
| T045 | Add summary report | WP09 | P2 | Yes |
| T046 | Handle edge cases | WP09 | P2 | No |
<!-- status-model:start -->
Canonical Status (Generated)
<!-- status-model:end -->
- WP01: done
- WP02: done
- WP03: done
- WP04: done
- WP05: done
- WP06: done
- WP07: done
- WP08: done
- WP09: done