Context and Problem Statement

Agents rely on JSON output from spec-kitty commands to determine next actions. However, the output was too vague to confirm what a command had actually done, which confused agents when unrelated files in the working tree were dirty.

Real example from ~/tmp:

Agent: Run finalize-tasks
Output: {"result": "success", "updated_wp_count": 0}

Agent: Did the commit happen? Let me check...
Git status: Shows 41 deleted template files (UNRELATED)
Agent: Must have failed! Let me commit again...
Git: "nothing to commit" for kitty-specs/
Agent: 🤔 Confused - did my commit work or not?

Root cause: The vague JSON output left these questions unanswered:

  • Did the commit actually happen?
  • What was the commit hash (for verification)?
  • Which files were committed?
  • Why are there dirty files (related or unrelated)?

Question: Should agent commands provide explicit verification data in JSON output instead of vague status messages?

Decision Drivers

  • Agent decision-making - LLMs need explicit confirmation, not implicit success
  • Debugging - Agents should be able to verify operations succeeded
  • Unrelated dirty files - Common in repos (templates, config, experimental code)
  • Idempotency - Agents should know when operation already completed
  • Verification - Agents should check results without re-running operations

Considered Options

  • Option 1: Rich JSON with commit_hash, commit_created, files_committed
  • Option 2: Verbose mode flag (--verbose adds details)
  • Option 3: Separate verify command (spec-kitty verify finalize-tasks)
  • Option 4: Status quo (vague "success" message)

Decision Outcome

Chosen option: "Option 1: Rich JSON outputs", because:

  • Explicit confirmation (commit_created: true/false)
  • Verifiable (commit_hash for git verification)
  • Clear (files_committed lists what changed)
  • No extra commands needed (all in one response)
  • Prevents redundant operations (agent checks commit_created)
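
To illustrate the reasoning above, here is a minimal sketch of how an agent-side consumer might act on these fields. It assumes the payload shape described in this ADR. The function name `next_action` and the action labels it returns are hypothetical and are not part of spec-kitty.

```python
import json


def next_action(raw_output: str) -> str:
    """Map finalize-tasks JSON output to a hypothetical agent action label."""
    payload = json.loads(raw_output)

    if payload.get("result") != "success":
        return "investigate-failure"

    # Older payloads lack commit_created; treat that as "unknown", not as failure.
    if "commit_created" not in payload:
        return "verify-manually"

    if payload["commit_created"]:
        # A commit was made; the hash lets the agent verify it without re-running.
        return "verify-commit" if payload.get("commit_hash") else "verify-manually"

    # commit_created is false: nothing new was committed, so do not commit again.
    return "skip-commit"
```

The point of the sketch is that the decision depends only on the JSON response, never on `git status`, so unrelated dirty files cannot influence it.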

Consequences

Positive

  • Explicit confirmation - commit_created: true means commit happened
  • Verifiable results - commit_hash for git rev-parse verification
  • Clear scope - files_committed shows exactly what changed
  • Prevents confusion - Unrelated dirty files don't mislead agent
  • Idempotency check - commit_created: false means no new commit was made because the operation had already completed
  • Debugging-friendly - JSON contains all info needed to verify operation
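
The verification step mentioned above could look roughly like the following sketch. It assumes the agent runs inside the repository and that the reported commit is expected to be the current HEAD. The helper names `run_git` and `commit_is_head` are hypothetical; only the `git rev-parse` invocations are real git commands.

```python
import subprocess
from typing import Callable, List


def run_git(args: List[str]) -> str:
    """Run a git command in the current directory and return trimmed stdout."""
    completed = subprocess.run(
        ["git", *args], check=True, capture_output=True, text=True
    )
    return completed.stdout.strip()


def commit_is_head(commit_hash: str, git: Callable[[List[str]], str] = run_git) -> bool:
    """Return True if the reported commit resolves to the current HEAD."""
    # Resolve both sides to full SHAs so an abbreviated hash still compares equal.
    # With run_git, an unknown hash raises subprocess.CalledProcessError.
    reported = git(["rev-parse", "--verify", f"{commit_hash}^{{commit}}"])
    head = git(["rev-parse", "HEAD"])
    return reported == head
```

The `git` parameter exists only so the check can be exercised without a real repository, for example by passing a stub in a unit test.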

Negative

  • Larger JSON payloads - More fields (commit_hash is 40 chars, files_committed is array)
  • Breaking change - Agents relying on old schema need updates
  • Not all commands updated - Only finalize-tasks enhanced so far (more work needed)
  • Complexity - More logic to populate additional fields

Neutral

  • Backward compatibility - Old fields still present (result, updated_wp_count), so the change is additive for agents that read only those fields
  • JSON only - Human output unchanged (separate logic)
  • Optional fields - commit_hash can be null if no commit created
  • Extensible pattern - Can apply to other commands (move-task, mark-status)

Confirmation

We validated this decision by:

  • ✅ 6 tests for finalize-tasks JSON output
  • ✅ Test simulating ~/tmp scenario (unrelated dirty files)
  • ✅ commit_created prevents redundant commits
  • ✅ commit_hash enables verification
  • ✅ files_committed provides clarity on scope

Pros and Cons of the Options

Option 1: Rich JSON outputs (CHOSEN)

Add commit_hash, commit_created, files_committed to JSON response.

Pros:

  • Explicit: Agent knows if commit happened
  • Verifiable: Can check that the output of git rev-parse HEAD equals commit_hash
  • Clear scope: files_committed lists what changed
  • One response: All info in JSON (no follow-up needed)
  • Prevents confusion: Distinguishes related from unrelated changes
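
As a sketch of how files_committed can separate related from unrelated changes, the function below compares it against `git status --porcelain` output. It assumes both sets of paths are relative to the repository root and that the porcelain text is passed in unmodified (leading spaces intact). The name `split_dirty_paths` is hypothetical, and quoted paths with special characters are not handled.

```python
from typing import Iterable, List, Tuple


def split_dirty_paths(
    porcelain: str, files_committed: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Split dirty paths into (overlapping with the commit, unrelated)."""
    committed = set(files_committed)
    overlapping: List[str] = []
    unrelated: List[str] = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue  # Too short to hold "XY <path>".
        path = line[3:]  # Skip the two status letters and the separator.
        if " -> " in path:
            path = path.split(" -> ", 1)[1]  # Renames: keep the new path.
        (overlapping if path in committed else unrelated).append(path)
    return overlapping, unrelated
```

In the scenario from the problem statement, the deleted template files would all land in the second list, which tells the agent they have nothing to do with the finalize-tasks commit.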

Cons:

  • Larger JSON: More data to transmit
  • Breaking change: Requires agent updates
  • Code complexity: More fields to populate

Option 2: Verbose mode flag

Add --verbose flag to include details.

Pros:

  • Opt-in: Agents choose verbosity level
  • No breaking change: Default behavior unchanged
  • Smaller default response

Cons:

  • Agents must know to use --verbose
  • Two modes to maintain
  • Still doesn't prevent confusion (agent might not use it)

Option 3: Separate verify command

Add spec-kitty verify finalize-tasks to check results.

Pros:

  • Separation of concerns
  • No changes to existing command

Cons:

  • Two commands instead of one
  • Agent must know to run verify
  • Duplicates logic (finalize-tasks knows what it did)
  • Slower (extra command execution)

Option 4: Status quo

Keep vague "success" message.

Pros:

  • No changes needed
  • Minimal JSON payload

Cons:

  • Agents get confused by unrelated dirty files
  • No verification mechanism
  • Cannot distinguish "already done" from "just done"
  • Poor debugging experience

More Information

Implementation:

  • src/specify_cli/cli/commands/agent/feature.py::finalize_tasks() (enhanced)
  • JSON schema (annotated example response; the // comments are explanatory and not part of the output):
    {
      "result": "success",
      "commit_created": true,           // NEW: Explicit boolean
      "commit_hash": "5030c9c98d...",  // NEW: Verification SHA
      "files_committed": [              // NEW: Scope clarity
        "kitty-specs/.../tasks.md",
        "kitty-specs/.../WP01.md",
        ...
      ],
      "updated_wp_count": 2             // Original field
    }
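
For consumers that prefer typed access, the response above could be modelled roughly as follows. This is a sketch under the assumption that the field names match the example. The class `FinalizeTasksResult` and the function `parse_finalize_output` are hypothetical names, not part of specify_cli.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FinalizeTasksResult:
    result: str
    updated_wp_count: int
    # None means the field was absent (older payload), not "no commit".
    commit_created: Optional[bool] = None
    # May be null when no commit was created.
    commit_hash: Optional[str] = None
    files_committed: List[str] = field(default_factory=list)


def parse_finalize_output(raw: str) -> FinalizeTasksResult:
    """Parse finalize-tasks JSON, tolerating payloads without the new fields."""
    data = json.loads(raw)
    return FinalizeTasksResult(
        result=data["result"],
        updated_wp_count=data.get("updated_wp_count", 0),
        commit_created=data.get("commit_created"),
        commit_hash=data.get("commit_hash"),
        files_committed=list(data.get("files_committed") or []),
    )
```

Keeping commit_created as an optional value lets a consumer tell "field missing" apart from an explicit false, which matters while not all commands emit the new fields.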
    

Tests:

  • tests/integration/test_finalize_tasks_json_output.py (6 tests)
  • test_json_output_prevents_agent_confusion - Simulates ~/tmp scenario

Template Updates:

  • src/specify_cli/missions/software-dev/command-templates/tasks.md
  • Added: "DO NOT commit after finalize-tasks (commits automatically)"
  • Added: "Check commit_created and commit_hash from JSON"

Future Enhancements:

  • Apply to move-task, mark-status (status commit confirmations)
  • Apply to implement (worktree creation confirmation)
  • Apply to merge (merge commit hash and files)

Related ADRs:

  • ADR-15: Merge-First Suggestion (reduces need for auto-merge, but this improves fallback UX)

Version: 0.13.8 improvement (agent UX enhancement)

Pattern: "Explicit Confirmation over Implicit Success" - Agents need verification data, not vague status