Phase 0 Research: Mission Dossier & Parity Export

Date: 2026-02-21 | Feature: 042-local-mission-dossier-authority-parity-export

Research Questions & Findings

RQ1: What artifacts exist in current missions?

Question: Which files are currently created by the spec-kitty workflow for each mission type?

Findings:

Software-Dev Mission (src/specify_cli/missions/software-dev/):

  • spec.md (spec command template output)
  • plan.md (planning command output)
  • tasks.md (tasks command output)
  • tasks/WP*.md (one per work package, generated by tasks command)
  • Optional: research.md, data-model.md, quickstart.md, contracts/ (from /spec-kitty.plan)

Research Mission (src/specify_cli/missions/research/):

  • Typically only spec.md, plan.md, tasks.md (similar pattern)
  • May include research-specific outputs (e.g., literature review, methodology)

Documentation Mission (src/specify_cli/missions/documentation/):

  • spec.md, plan.md, tasks.md
  • Multiple generated doc files depending on iteration mode and generators

Key Finding: No formal manifest exists. Artifacts are implicit in command-template outputs. We must define explicitly what is "required" vs "optional" for completeness detection.


RQ2: How to compute deterministic parity hashes?

Question: Given a set of artifact files, how do we compute a hash that is deterministic (identical content → identical hash) across machines, timezones, and repeated scans?

Findings:

Approach: SHA256 hash of each artifact's content (bytes), then combine:

  1. Hash each artifact file independently
  2. Sort artifact hashes lexicographically
  3. Concatenate sorted hashes
  4. Compute SHA256 of concatenation

Determinism Guarantees:

  • No system-dependent path separators (use artifact_key, not filesystem path)
  • No timestamps (content-based only)
  • No git metadata (commit SHA optional, for provenance only)
  • No locale/timezone effects (UTF-8 enforced, no locale-aware string ops)

Encoding Safety:

  • Read files as binary (bytes)
  • Validate UTF-8 only for display/filtering
  • If UTF-8 fails: record error in anomaly event, continue with binary hash

Order Independence:

  • Sort hashes before combining → order doesn't matter
  • Enables incremental updates (recompute only if artifacts change)
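
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the hasher.py implementation: the function names and the dict-of-bytes input are hypothetical, and the sorted hex digests are concatenated with no separator because the findings do not specify one.

```python
import hashlib


def artifact_hash(content: bytes) -> str:
    """SHA256 hex digest of one artifact's raw bytes (no decoding, no normalization)."""
    return hashlib.sha256(content).hexdigest()


def parity_hash(artifacts: dict[str, bytes]) -> str:
    """Combine per-artifact hashes into one order-independent parity hash.

    `artifacts` maps a hypothetical artifact_key to the file's bytes.
    """
    # Steps 1 and 2: hash each artifact, then sort the hex digests.
    per_artifact = sorted(artifact_hash(data) for data in artifacts.values())
    # Step 3: concatenate (assumption: no separator between digests).
    combined = "".join(per_artifact).encode("ascii")
    # Step 4: hash the concatenation.
    return hashlib.sha256(combined).hexdigest()


if __name__ == "__main__":
    a = {"input.spec.main": b"# Spec\n", "output.plan.main": b"# Plan\n"}
    b = dict(reversed(list(a.items())))  # same content, different insertion order
    assert parity_hash(a) == parity_hash(b)
    print(parity_hash(a))
```

Because only content digests enter the combined hash, neither filesystem paths nor scan order can influence the result.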

RQ3: What event infrastructure already exists?

Question: Can we use spec-kitty-events contracts and OfflineQueue?

Findings:

Existing Infrastructure (src/specify_cli/sync/):

  • spec_kitty_events: Shared event schema library
  • event.schema.json: Base event envelope (type, actor, timestamp, lamport_clock, etc.)
  • OfflineQueue: Async event routing (emit → queue → webhook/file)
  • Event types registered in events.py

Compatibility:

  • ✅ Can extend event.schema.json with 4 new dossier payload schemas
  • ✅ OfflineQueue works with async/await (FastAPI compatible)
  • ✅ Webhook simulator already exists for testing

Constraint: Events are immutable and append-only. No event replay or update in this phase.
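
To make the append-only constraint concrete, here is a small sketch of an event log that only ever appends. It is a hypothetical stand-in, not the OfflineQueue or spec_kitty_events API: the class name, the `payload` field, and the float timestamp are assumptions. Only the envelope fields type, actor, timestamp, and lamport_clock come from the findings above.

```python
import json
import time


class AppendOnlyEventLog:
    """Hypothetical in-memory log: events can be appended and read, never changed."""

    def __init__(self) -> None:
        self._lines: list[str] = []
        self._lamport_clock = 0

    def emit(self, event_type: str, actor: str, payload: dict) -> dict:
        # The Lamport clock increases by one for every locally emitted event.
        self._lamport_clock += 1
        event = {
            "type": event_type,
            "actor": actor,
            "timestamp": time.time(),  # assumption: real schema may use ISO 8601
            "lamport_clock": self._lamport_clock,
            "payload": payload,  # assumption: field name is illustrative
        }
        # Store a serialized copy so later mutation of `event` cannot alter the log.
        self._lines.append(json.dumps(event, sort_keys=True))
        return event

    def read_all(self) -> list[dict]:
        """Return fresh copies of every event in emission order."""
        return [json.loads(line) for line in self._lines]
```

Storing serialized lines rather than live objects is one simple way to keep already-emitted events from being modified by callers.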


RQ4: How does the dashboard API layer work?

Question: What is the FastAPI pattern for dashboard endpoints?

Findings:

Existing Pattern (src/specify_cli/dashboard/api.py):

  • FastAPI with async handlers
  • Routes: /api/{resource}/{id}, /api/{resource}?filters
  • Response: JSON (Pydantic models or plain dicts)
  • No authentication (local-only, assumes trusted environment)

Dashboard Server (src/specify_cli/dashboard/server.py):

  • Uvicorn ASGI server
  • Serves Vue.js frontend + API routes
  • Accessible at http://localhost:8000 (default)

Vue Integration:

  • Dashboard is single-page app (SPA)
  • API calls via fetch or axios
  • Components reactive with Vue 3 composition API

Compatibility: ✅ Can add new dossier endpoints following existing pattern.


RQ5: What are encoding edge cases?

Question: How do we handle UTF-8 issues, special characters, binary files?

Findings:

UTF-8 Robustness:

  • spec.md, plan.md, tasks.md are always UTF-8 text
  • Optional research.md, data-model.md: assume UTF-8 if present
  • tasks/WP*.md: always UTF-8 (generated by spec-kitty)

Edge Cases:

  1. File deleted between scans: emit MissionDossierArtifactMissing with reason_code="deleted_after_scan"
  2. File permission denied: emit MissionDossierArtifactMissing with reason_code="unreadable"
  3. Symlink or hard link: read content via symlink (use inode, not path, for dedup detection)
  4. BOM (byte order mark): Include in hash (rare in spec-kitty artifacts, but handle consistently)
  5. CRLF vs LF: Hash as-is (no normalization). Users responsible for consistent line endings.

Decision: No silent failures. Every read error → explicit anomaly event.
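
A sketch of how the "no silent failures" decision might look in code follows. The function name, the dict-shaped anomaly records, the "DossierAnomaly" event label, and the "invalid_utf8" reason code are assumptions made for illustration. The MissionDossierArtifactMissing event name and the "deleted_after_scan" and "unreadable" reason codes come from the edge cases above.

```python
import hashlib
from pathlib import Path
from typing import Optional


def read_artifact(path: Path) -> tuple[Optional[str], list[dict]]:
    """Return (sha256 hex digest or None, anomaly records) for one artifact."""
    anomalies: list[dict] = []
    try:
        data = path.read_bytes()  # binary read; symlinks are followed
    except FileNotFoundError:
        anomalies.append({
            "event": "MissionDossierArtifactMissing",
            "reason_code": "deleted_after_scan",
            "path": str(path),
        })
        return None, anomalies
    except PermissionError:
        anomalies.append({
            "event": "MissionDossierArtifactMissing",
            "reason_code": "unreadable",
            "path": str(path),
        })
        return None, anomalies

    try:
        data.decode("utf-8")  # validation only; the hash uses the raw bytes
    except UnicodeDecodeError as exc:
        anomalies.append({
            "event": "DossierAnomaly",      # hypothetical event label
            "reason_code": "invalid_utf8",  # hypothetical reason code
            "path": str(path),
            "detail": str(exc),
        })

    # BOM and CRLF bytes are hashed as-is: no stripping, no normalization.
    return hashlib.sha256(data).hexdigest(), anomalies
```

Every failure path returns an explicit anomaly record, and an invalid UTF-8 file still receives a hash of its bytes, matching the encoding-safety rule from RQ2.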


RQ6: How to structure work packages for parallelization?

Question: Can WP02 and WP03 be parallelized?

Findings:

Dependency Analysis:

  • WP01 (Models + Hashing): No dependencies
  • WP02 (Manifests): No code dependencies, but uses WP01 models in tests
  • WP03 (Indexing + Detection): Requires WP01 models, WP02 manifest schema

Parallelization Strategy:

  • WP01 and WP02 can run in parallel (share model contracts)
  • WP03 depends on both → must start after
  • WP04 (Events) depends on WP03 output schemas
  • WP05 (Snapshot) depends on WP04

Optimal Sequence:

WP01 ──────┐
           ├─→ WP03 ──→ WP04 ──→ WP05
WP02 ──────┘

WP01 and WP02 can proceed in parallel: WP02 has no production-code dependency on WP01 (only its tests use WP01 models).
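
The sequence can be cross-checked mechanically. The sketch below encodes the dependencies listed above and uses the standard-library graphlib module (Python 3.9+) to group work packages into batches that can run concurrently. The function name and the DEPENDENCIES constant are illustrative.

```python
from graphlib import TopologicalSorter

# Each work package maps to the set of packages it depends on.
DEPENDENCIES = {
    "WP01": set(),
    "WP02": set(),
    "WP03": {"WP01", "WP02"},
    "WP04": {"WP03"},
    "WP05": {"WP04"},
}


def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group work packages into batches whose members can run concurrently."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose dependencies are done
        batches.append(ready)
        sorter.done(*ready)
    return batches


if __name__ == "__main__":
    for number, batch in enumerate(parallel_batches(DEPENDENCIES), start=1):
        print(number, batch)
```

The first batch contains WP01 and WP02 together, and each later batch contains a single package, which agrees with the diagram.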


RQ7: What are test strategy implications?

Question: How do we test determinism and reproducibility?

Findings:

Determinism Tests:

  1. Create identical artifacts on 2 filesystem paths
  2. Scan both, compute parity hashes
  3. Assert hashes are identical
  4. Repeat with different artifact order → hash must be identical

Encoding Tests:

  1. UTF-8 with BOM
  2. UTF-8 with special chars (emoji, CJK)
  3. Binary artifacts (if any in future)
  4. CRLF vs LF

Missing Detection Tests:

  1. Each mission type with all required artifacts → completeness_status="complete"
  2. Each required artifact removed one-by-one → blocker event emitted
  3. Optional artifact missing → no blocker event

Integration Tests:

  1. Create a full feature (spec → plan → tasks)
  2. Scan dossier
  3. Emit all 4 event types
  4. Verify SaaS webhook simulator can parse all events
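
As an illustration of the determinism and encoding tests, the pytest-style sketch below builds the same artifacts under two temporary directories and compares parity hashes. The helpers scan_parity_hash and write_artifacts are hypothetical stand-ins for the real scanner, written only so the example is self-contained.

```python
import hashlib
import tempfile
from pathlib import Path


def scan_parity_hash(root: Path) -> str:
    """Hash every file under root as raw bytes, then hash the sorted digests."""
    digests = sorted(
        hashlib.sha256(path.read_bytes()).hexdigest()
        for path in root.rglob("*")
        if path.is_file()
    )
    return hashlib.sha256("".join(digests).encode("ascii")).hexdigest()


def write_artifacts(root: Path, artifacts: dict[str, bytes]) -> None:
    """Create each artifact under root, making parent directories as needed."""
    for relative_path, content in artifacts.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)


def test_identical_content_on_two_paths():
    artifacts = {
        "spec.md": b"# Spec\n",
        "tasks/WP01.md": "# WP01 任務 🚀\n".encode("utf-8"),
    }
    with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
        write_artifacts(Path(first), artifacts)
        # Write in reverse order to show that creation order is irrelevant.
        write_artifacts(Path(second), dict(reversed(list(artifacts.items()))))
        assert scan_parity_hash(Path(first)) == scan_parity_hash(Path(second))


def test_bom_and_crlf_are_hashed_as_is():
    variants = [b"a\nb\n", b"a\r\nb\r\n", b"\xef\xbb\xbfa\nb\n"]
    hashes = set()
    for content in variants:
        with tempfile.TemporaryDirectory() as root:
            write_artifacts(Path(root), {"spec.md": content})
            hashes.add(scan_parity_hash(Path(root)))
    # No normalization: each byte-level variant yields a distinct hash.
    assert len(hashes) == len(variants)
```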


RQ8: What is the SaaS parity contract?

Question: How does SaaS reconstruct artifact state from events?

Findings:

Proposed Contract:

  1. SaaS receives MissionDossierArtifactIndexed events (one per artifact)
  2. SaaS builds artifact catalog from all indexed events
  3. SaaS receives MissionDossierSnapshotComputed event
  4. Parity check: compare SaaS artifact count vs snapshot artifact_count
  5. Optional: SaaS receives MissionDossierParityDriftDetected event if local and SaaS disagree

Event Ordering:

  • All MissionDossierArtifactIndexed events MUST precede MissionDossierSnapshotComputed
  • Ensures SaaS can reconstruct state deterministically

Immutability: Events are append-only. No update or delete in this phase.
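
The receiving side of the contract can be sketched as follows, for the event stream of a single scan. The event names and artifact_count come from the contract above. The function name, the `payload` wrapper, and the artifact_key and content_hash payload fields are assumptions, since the payload schemas are not defined in this document.

```python
def reconstruct_and_check(events: list[dict]) -> dict:
    """Rebuild the artifact catalog from one scan's events and check the count."""
    catalog: dict[str, str] = {}
    expected_count = None
    for event in events:
        kind = event["type"]
        payload = event["payload"]
        if kind == "MissionDossierArtifactIndexed":
            if expected_count is not None:
                # Ordering rule: indexed events must precede the snapshot.
                raise ValueError("ArtifactIndexed event arrived after the snapshot")
            catalog[payload["artifact_key"]] = payload["content_hash"]
        elif kind == "MissionDossierSnapshotComputed":
            expected_count = payload["artifact_count"]
    if expected_count is None:
        raise ValueError("no MissionDossierSnapshotComputed event received")
    return {"catalog": catalog, "parity_ok": len(catalog) == expected_count}
```

A count comparison only detects missing or extra artifacts. Comparing a parity hash carried in the snapshot, if the payload includes one, would also catch content differences.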


Manifest Design Decision: Step-Aware (Not Phase-Locked)

Expected Artifacts YAML Schema

Per mission type, define completeness requirements by mission step (from mission.yaml state machine), not hardcoded phases:

# src/specify_cli/missions/software-dev/expected-artifacts.yaml
schema_version: "1.0"
mission_type: "software-dev"
manifest_version: "1"

# Artifacts required at any workflow step
required_always:
  - artifact_key: "input.spec.main"
    artifact_class: "input"
    path_pattern: "spec.md"

# Artifacts required at specific mission steps
# Step names from mission.yaml states: discover, specify, plan, implement, review, done
required_by_step:
  discover:
    # No additional requirements beyond required_always

  specify:
    # Staying in specify step requires spec.md (already in required_always)

  plan:
    # Advancing to plan step requires plan.md
    - artifact_key: "output.plan.main"
      artifact_class: "output"
      path_pattern: "plan.md"

  implement:
    # Advancing to implement requires plan + tasks
    - artifact_key: "output.tasks.main"
      artifact_class: "output"
      path_pattern: "tasks.md"

    - artifact_key: "output.tasks.per_wp"
      artifact_class: "output"
      path_pattern: "tasks/*.md"

  review:
    # Advancing to review (post-implementation) requires all above

# Artifacts checked if present, but never block completeness
optional_always:
  - artifact_key: "evidence.research"
    artifact_class: "evidence"
    path_pattern: "research.md"

  - artifact_key: "evidence.data_model"
    artifact_class: "evidence"
    path_pattern: "data-model.md"

  - artifact_key: "evidence.contracts"
    artifact_class: "evidence"
    path_pattern: "contracts/*"

Why Step-Aware (Not Phase-Locked)?

  • Manifests use mission-defined states (from mission.yaml), not generic phases
  • Supports diverse workflows:
      • software-dev: discover → specify → plan → implement → review → done
      • research: scoping → methodology → gathering → synthesis → output → done
      • documentation: (mission-specific)
  • Decoupled from hardcoded workflow assumptions
  • Git-friendly (plain-text YAML diffs cleanly under version control)
  • YAML → pydantic model (type-safe runtime)
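
A sketch of how a step-aware manifest might drive missing-artifact detection is shown below. It makes several assumptions: the manifest is already parsed into a plain dict (standing in for the pydantic model), requirements accumulate across steps in state-machine order (suggested by the "requires all above" comment but not stated as a rule), and path_pattern is matched with fnmatch-style globbing. The function names are hypothetical.

```python
from fnmatch import fnmatch

# Step order taken from the software-dev states listed in the manifest comments.
STEP_ORDER = ["discover", "specify", "plan", "implement", "review", "done"]

# Parsed form of the software-dev manifest, reduced to the fields used here.
MANIFEST = {
    "required_always": [
        {"artifact_key": "input.spec.main", "path_pattern": "spec.md"},
    ],
    "required_by_step": {
        "discover": None,  # a YAML key with only a comment parses as null
        "plan": [
            {"artifact_key": "output.plan.main", "path_pattern": "plan.md"},
        ],
        "implement": [
            {"artifact_key": "output.tasks.main", "path_pattern": "tasks.md"},
            {"artifact_key": "output.tasks.per_wp", "path_pattern": "tasks/*.md"},
        ],
    },
}


def required_for_step(manifest: dict, step: str) -> list[dict]:
    """Collect required_always plus every requirement up to and including step."""
    required = list(manifest["required_always"])
    for name in STEP_ORDER[: STEP_ORDER.index(step) + 1]:
        # Treat a missing or null step entry as "no additional requirements".
        required.extend(manifest["required_by_step"].get(name) or [])
    return required


def missing_artifacts(manifest: dict, step: str, present_paths: list[str]) -> list[str]:
    """Return artifact_keys whose path_pattern matches none of the present paths."""
    return [
        spec["artifact_key"]
        for spec in required_for_step(manifest, step)
        if not any(fnmatch(path, spec["path_pattern"]) for path in present_paths)
    ]
```

For example, calling missing_artifacts(MANIFEST, "plan", ["spec.md"]) reports output.plan.main as missing, while the same file set is complete for the specify step.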

Artifact Classes (6 total, deterministic):

  • input: Requirements, specs, PRDs (user-provided)
  • workflow: Plan, roadmap, design docs
  • output: Generated artifacts (tasks, code, docs)
  • evidence: Research, analysis, proofs, test results
  • policy: Standards, guidelines, templates
  • runtime: Configuration, deployment, operational artifacts

Code Organization Decision

Dossier Subsystem Structure

src/specify_cli/dossier/
├── __init__.py              # Public API, exports
├── models.py                # Pydantic models (ArtifactRef, MissionDossier, etc.)
├── manifest.py              # ExpectedArtifactManifest, registry loader
├── indexer.py               # Artifact indexing, class detection
├── hasher.py                # SHA256 hashing, parity computation
├── events.py                # Dossier event emission (wraps sync.events)
├── store.py                 # Snapshot persistence (JSON/JSONL)
└── drift_detector.py        # Local parity-drift detection

Rationale:

  • Self-contained subsystem (can be imported independently)
  • Clear separation: models → indexing → hashing → events
  • Tests mirror structure
  • Optional dashboard integration via import (not modification of core)

SaaS Integration Plan

Phase 1 (this feature): Local emission, offline baseline

  • Emit dossier events to OfflineQueue
  • Store local parity baseline in .kittify/dossier-baseline.json
  • Detect drift locally (no SaaS call)

Phase 2+ (future): SaaS backend integration

  • SaaS receives events via webhook or polling
  • SaaS stores artifacts and verifies parity
  • SaaS dashboard displays artifact catalog

Current Phase 1 Scope: Local runtime owns determinism and drift detection.
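
Local drift detection against the stored baseline could look roughly like this. The .kittify/dossier-baseline.json location comes from the plan above. The single-key JSON layout ({"parity_hash": ...}), the function name, and the behavior of creating the baseline on first scan are assumptions for illustration.

```python
import json
from pathlib import Path


def detect_drift(baseline_path: Path, current_parity_hash: str) -> dict:
    """Compare the current parity hash with the locally stored baseline."""
    if not baseline_path.exists():
        # First scan: nothing to compare against, so record the baseline.
        baseline_path.parent.mkdir(parents=True, exist_ok=True)
        baseline_path.write_text(
            json.dumps({"parity_hash": current_parity_hash}), encoding="utf-8"
        )
        return {"drift": False, "baseline_created": True}

    baseline = json.loads(baseline_path.read_text(encoding="utf-8"))
    return {
        "drift": baseline["parity_hash"] != current_parity_hash,
        "baseline_created": False,
    }
```

The check is a pure local file comparison, so it works offline and needs no SaaS call, in line with the Phase 1 scope.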


Key Assumptions Validated

  1. ✅ Filesystem Stability: Files not being written during scan (user responsibility)
  2. ✅ UTF-8 Encoding: All spec-kitty artifacts are UTF-8 or binary (no mixed)
  3. ✅ No Real-Time Sync: Batch scans sufficient (no need for file watchers)
  4. ✅ Sync Infrastructure Ready: spec_kitty_events and OfflineQueue are stable
  5. ✅ Dashboard Server Running: Local HTTP server assumed available during scans
  6. ✅ Determinism Achievable: No platform-dependent code needed (SHA256 + sort)


Open Questions Resolved

  • How to handle missing manifests? → Degrade gracefully (index-only, no missing detection)
  • How to handle new mission types? → Manifest registry extensible, fallback to index-only
  • How to compute parity hash? → SHA256 of sorted artifact hashes
  • Can dashboard work offline? → Yes, all indexing local
  • Can events be replayed? → No (Phase 1 is emit-only, append-only, immutable)

References

  • Existing Dashboard: src/specify_cli/dashboard/api.py, server.py
  • Sync Infrastructure: src/specify_cli/sync/events.py, OfflineQueue
  • Mission Structure: src/specify_cli/missions/{mission}/
  • Spec-Kitty-Events: /Users/robert/ClaudeCowork/Spec-Kitty-Cowork/spec-kitty-events/