Research Plan: Autonomous Multi-Agent Orchestration

Branch: 019-autonomous-multi-agent-orchestration-research | Date: 2026-01-18 | Spec: spec.md | Mission: Research

Summary

Investigate the headless/CLI invocation capabilities of all 12 AI coding agents supported by spec-kitty. The research combines documentation review with hands-on local CLI testing to produce a comprehensive capability matrix and orchestration feasibility assessment.

Primary Deliverable: CLI capability matrix with working invocation examples for all agents that support headless operation.

Research Context

  • Methodology: Documentation review + local CLI testing
  • Data Sources: Official docs, GitHub repos, npm/pip packages, local CLI execution
  • Verification: All CLI-capable agents tested locally with --help and basic invocation
  • Output Format: Markdown reports with source links and command examples
  • Scope: Headless/CLI only (no IDE integration)

Constitution Check

No constitution file defined. Proceeding with standard research practices.

  • Research will cite all sources
  • Findings will be verifiable through provided commands
  • No proprietary information will be included

Research Structure

Documentation (this feature)

kitty-specs/019-autonomous-multi-agent-orchestration-research/
├── plan.md              # This file - research methodology
├── spec.md              # Research specification
├── research/            # Individual agent research files
│   ├── 01-claude-code.md
│   ├── 02-github-copilot.md
│   ├── 03-google-gemini.md
│   ├── 04-cursor.md
│   ├── 05-qwen-code.md
│   ├── 06-opencode.md
│   ├── 07-windsurf.md
│   ├── 08-github-codex.md
│   ├── 09-kilocode.md
│   ├── 10-augment-code.md
│   ├── 11-roo-cline.md
│   └── 12-amazon-q.md
├── research.md          # Consolidated findings & capability matrix
├── data-model.md        # Agent profile schema for orchestration config
└── quickstart.md        # How to test each agent CLI locally

No Source Code Changes

This is a research-only mission. No implementation code will be written.

Research Methodology

Phase 1: Agent Investigation (Per Agent)

For each of the 12 agents:

1. Documentation Review

  • Official documentation site
  • GitHub repository README
  • npm/pip package documentation
  • Release notes for CLI features

2. Local CLI Testing (if available)

  • Check if installed: which <cli-name> or <cli-name> --version
  • Inspect help: <cli-name> --help
  • Test basic invocation with a simple prompt
  • Document authentication requirements
  • Test task file input methods
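
The first two testing steps (check installation, inspect help) could be scripted roughly as follows. This is a minimal sketch, not part of the plan itself: `ProbeResult` and `probe_cli` are hypothetical names, and it assumes the agent CLI accepts `--help`, which should be confirmed per agent.

```python
import shutil
import subprocess
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ProbeResult:
    """Outcome of probing one CLI (hypothetical record, for illustration)."""
    cli_name: str
    installed: bool
    path: Optional[str] = None
    help_exit_code: Optional[int] = None
    help_output: str = ""


def probe_cli(
    cli_name: str,
    help_args: Sequence[str] = ("--help",),  # assumption: the CLI supports --help
    timeout: float = 15.0,
) -> ProbeResult:
    """Check whether cli_name is on PATH and, if so, capture its help output."""
    path = shutil.which(cli_name)  # equivalent of `which <cli-name>`
    if path is None:
        return ProbeResult(cli_name=cli_name, installed=False)
    try:
        proc = subprocess.run(
            [path, *help_args],
            capture_output=True,
            text=True,
            timeout=timeout,
            stdin=subprocess.DEVNULL,  # never block waiting for interactive input
        )
    except (subprocess.TimeoutExpired, OSError) as exc:
        return ProbeResult(cli_name, True, path, None, f"probe failed: {exc}")
    # Some tools print help to stderr, so fall back to it when stdout is empty.
    return ProbeResult(cli_name, True, path, proc.returncode, proc.stdout or proc.stderr)
```

The captured `help_output` can be pasted into the "Local Test Results" section of the agent's research file.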

3. Capability Assessment

  • Can it run headless (no IDE)?
  • How do you pass a task/prompt?
  • How do you detect completion?
  • What are the exit codes?
  • Can multiple instances run in parallel?
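
Completion detection, exit codes, and parallel execution can be checked empirically with a small harness. The sketch below is one possible approach under stated assumptions: `RunResult`, `run_headless`, and `run_parallel` are hypothetical helpers, the `-1` and `127` codes are conventions of this sketch only, and the demo uses the Python interpreter as a stand-in for an agent CLI.

```python
import subprocess
import sys
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class RunResult:
    command: List[str]
    exit_code: int  # -1 = timed out, 127 = executable not found (sketch conventions)
    stdout: str
    stderr: str
    seconds: float


def run_headless(command: Sequence[str], timeout: float = 300.0) -> RunResult:
    """Run one command with stdin closed and output captured, then wait for exit."""
    argv = list(command)
    start = time.monotonic()
    try:
        proc = subprocess.run(
            argv,
            capture_output=True,
            text=True,
            timeout=timeout,
            stdin=subprocess.DEVNULL,
        )
        code, out, err = proc.returncode, proc.stdout, proc.stderr
    except subprocess.TimeoutExpired:
        code, out, err = -1, "", f"timed out after {timeout}s"
    except FileNotFoundError:
        code, out, err = 127, "", f"executable not found: {argv[0]}"
    return RunResult(argv, code, out, err, time.monotonic() - start)


def run_parallel(commands: Sequence[Sequence[str]], max_workers: int = 2) -> List[RunResult]:
    """Run several commands concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_headless, commands))


if __name__ == "__main__":
    # Stand-in commands: the Python interpreter plays the role of an agent CLI.
    demo = [
        [sys.executable, "-c", "print('ok')"],
        [sys.executable, "-c", "import sys; sys.exit(3)"],
    ]
    for result in run_parallel(demo):
        print(result.exit_code, result.stdout.strip())
```

A process exiting is the simplest completion signal. Whether a given agent's exit code actually distinguishes success from failure is one of the things the investigation needs to record.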

Phase 2: Synthesis

1. Build Capability Matrix

  • All 12 agents in rows
  • Columns: CLI available, invocation command, task input method, completion detection, parallel support
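
One way to keep the matrix consistent is to generate it from per-agent records rather than editing it by hand. The following is a sketch only: `CapabilityRow` and `render_matrix` are hypothetical names, the column set mirrors the list above, and the example rows use placeholder agents rather than real findings.

```python
from dataclasses import dataclass
from typing import Iterable

HEADERS = (
    "Agent",
    "CLI available",
    "Invocation command",
    "Task input method",
    "Completion detection",
    "Parallel support",
)


@dataclass
class CapabilityRow:
    """One matrix row; empty strings mean 'not yet researched'."""
    agent: str
    cli_available: bool
    invocation: str = ""
    task_input: str = ""
    completion_detection: str = ""
    parallel_support: str = ""


def render_matrix(rows: Iterable[CapabilityRow]) -> str:
    """Render the rows as a Markdown pipe table."""
    lines = [
        "| " + " | ".join(HEADERS) + " |",
        "|" + "|".join("---" for _ in HEADERS) + "|",
    ]
    for row in rows:
        cells = (
            row.agent,
            "Yes" if row.cli_available else "No",
            row.invocation or "TBD",
            row.task_input or "TBD",
            row.completion_detection or "TBD",
            row.parallel_support or "TBD",
        )
        # Escape literal pipes so a command such as `a | b` does not split the cell.
        lines.append("| " + " | ".join(c.replace("|", "\\|") for c in cells) + " |")
    return "\n".join(lines)


if __name__ == "__main__":
    # Placeholder rows, not research results.
    print(render_matrix([
        CapabilityRow("example-agent", True, "example-agent <prompt>", "CLI argument", "exit code", "unknown"),
        CapabilityRow("ide-only-agent", False),
    ]))
```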

2. Identify Orchestration-Ready Agents

  • Which agents can participate in autonomous workflows?
  • What are the limiting factors for others?

3. Propose Configuration Schema

  • YAML schema for .kittify/agents.yaml
  • Implementation vs review role assignment
  • Fallback strategies
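
Before settling on a YAML layout, the shape of the configuration can be explored as plain data structures. The sketch below is purely illustrative: every field name (`command`, `roles`, `max_parallel`, `role_preferences`) is an assumption, not the schema that will end up in .kittify/agents.yaml, and `agent-a` / `agent-b` are placeholders.

```python
from dataclasses import dataclass
from typing import AbstractSet, Dict, List, Optional


@dataclass
class AgentProfile:
    """Hypothetical per-agent entry."""
    name: str
    command: List[str]      # argv prefix used to invoke the agent headlessly
    roles: List[str]        # e.g. "implementation", "review"
    max_parallel: int = 1   # concurrent instances considered safe


@dataclass
class OrchestrationConfig:
    agents: Dict[str, AgentProfile]
    # role -> agent names in preference order; later entries act as fallbacks
    role_preferences: Dict[str, List[str]]

    def select(self, role: str, unavailable: AbstractSet[str] = frozenset()) -> Optional[AgentProfile]:
        """Return the first preferred agent that supports the role and is available."""
        for name in self.role_preferences.get(role, []):
            profile = self.agents.get(name)
            if profile is not None and role in profile.roles and name not in unavailable:
                return profile
        return None


if __name__ == "__main__":
    config = OrchestrationConfig(
        agents={
            "agent-a": AgentProfile("agent-a", ["agent-a"], ["implementation"]),
            "agent-b": AgentProfile("agent-b", ["agent-b"], ["implementation", "review"], max_parallel=2),
        },
        role_preferences={"implementation": ["agent-a", "agent-b"], "review": ["agent-b"]},
    )
    print(config.select("implementation", unavailable={"agent-a"}))
```

An ordered preference list per role keeps the fallback strategy declarative: the orchestrator walks the list until it finds an agent that is both capable and currently available.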

Agent Investigation Template

Each agent research file (research/XX-agent-name.md) will follow this structure:

# Agent: [Name]

## Basic Info
- **Directory**: `.agent/`
- **Primary Interface**: CLI / IDE / API
- **Vendor**: [Company]
- **Documentation**: [URL]

## CLI Availability

### Installation
[How to install the CLI tool]

### Verification

[Command to verify installation]


### Local Test Results

[Actual output from running --help or --version]


## Task Specification

### How to Pass Instructions
- [ ] Command line argument
- [ ] Stdin
- [ ] File path (--file, -f)
- [ ] Prompt file in working directory
- [ ] Environment variable

### Example Invocation

[Working command example]


### Context Handling
[How does it handle codebase context?]

## Completion Detection

### Exit Codes
| Code | Meaning |
|------|---------|
| 0 | Success |
| 1 | Error |
| ... | ... |

### Output Format
- [ ] Stdout (plain text)
- [ ] Stdout (JSON)
- [ ] File output
- [ ] Structured logs

## Parallel Execution

### Rate Limits
[Documented rate limits or quotas]

### Concurrent Sessions
[Can multiple instances run simultaneously?]

### Resource Requirements
[Memory, CPU, tokens]

## Orchestration Assessment

### Can participate in autonomous workflow?
[ ] Yes / [ ] No / [ ] Partial

### Limitations
[What prevents full participation?]

### Integration Complexity
Low / Medium / High

## Sources
- [Link 1]
- [Link 2]

Data Collection Checklist

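| # | Agent | Docs Reviewed | CLI Tested | Assessment Complete |
|---|-------|---------------|------------|---------------------|
| 1 | Claude Code | [ ] | [ ] | [ ] |
| 2 | GitHub Copilot | [ ] | [ ] | [ ] |
| 3 | Google Gemini | [ ] | [ ] | [ ] |
| 4 | Cursor | [ ] | [ ] | [ ] |
| 5 | Qwen Code | [ ] | [ ] | [ ] |
| 6 | OpenCode | [ ] | [ ] | [ ] |
| 7 | Windsurf | [ ] | [ ] | [ ] |
| 8 | GitHub Codex | [ ] | [ ] | [ ] |
| 9 | Kilocode | [ ] | [ ] | [ ] |
| 10 | Augment Code | [ ] | [ ] | [ ] |
| 11 | Roo Cline | [ ] | [ ] | [ ] |
| 12 | Amazon Q | [ ] | [ ] | [ ] |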

Quality Gates

From the spec, the research must satisfy:

  • QG-001: At least 6 of 12 agents have documented CLI invocation paths
  • QG-002: Cursor CLI specifically documented (user priority)
  • QG-003: All findings include source links
  • QG-004: Parallel execution constraints documented for CLI-capable agents
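
The four gates are mechanical enough to check with a script once the findings are recorded. This is a sketch under assumptions: `AgentFinding` and its fields are hypothetical, and QG-002 is interpreted here as "a Cursor entry exists and its CLI invocation is documented", which may be stricter or looser than the spec intends.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class AgentFinding:
    """Hypothetical summary of one agent's research file."""
    name: str
    cli_invocation_documented: bool
    source_links: List[str] = field(default_factory=list)
    parallel_constraints_documented: bool = False


def check_quality_gates(findings: Iterable[AgentFinding]) -> Dict[str, bool]:
    """Evaluate QG-001 through QG-004 against the recorded findings."""
    all_findings = list(findings)
    cli_capable = [f for f in all_findings if f.cli_invocation_documented]
    cursor = next((f for f in all_findings if f.name == "Cursor"), None)
    return {
        # QG-001: at least 6 agents with a documented CLI invocation path
        "QG-001": len(cli_capable) >= 6,
        # QG-002: Cursor specifically documented (interpretation, see lead-in)
        "QG-002": cursor is not None and cursor.cli_invocation_documented,
        # QG-003: every finding carries at least one source link
        "QG-003": bool(all_findings) and all(f.source_links for f in all_findings),
        # QG-004: parallel constraints documented for every CLI-capable agent
        "QG-004": all(f.parallel_constraints_documented for f in cli_capable),
    }
```

A gate that evaluates to `False` points at the research files that still need work before the consolidated report is finalized.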

Risk Mitigation

| Risk | Mitigation |
|------|------------|
| Agent has no CLI | Document API alternative if available |
| CLI is undocumented | Test empirically, note "unofficial" status |
| Rate limits block testing | Document limits, test one request only |
| Auth requirements | Note setup steps, don't store credentials |