DRG Phase Zero --- Graph Model, Context Parity, and Surface Calibration

Mission ID: 01KP2YCESBSG61KQH5PQZ9662H
Mission type: software-dev
Parent epic: #461 --- Charter as Synthesis & Doctrine Reference Graph
Phase: 0 (DRG prototype + invariant test)
Scoped issues: #462, #470, #471, #472, #473, #474

Problem Statement

Spec Kitty's governance context assembly depends on implicit inline references scattered across doctrine artifact YAML files (tactic_refs, directive_refs, references, opposed_by) and action index files (directives, tactics, styleguides, toolguides, procedures). These references are not queryable as a graph, not independently testable, and not calibrated per action. Adding, removing, or reweighting doctrine artifacts requires editing multiple YAML files with no automated consistency check.

Phase 0 introduces the Doctrine Reference Graph (DRG) as an explicit, validated, queryable graph model that replaces the implicit reference web. It ships alongside the existing inline references --- Phase 1 deletes the inline references after Phase 0 proves parity.

Motivation

  • Governance context is assembled ad hoc. build_charter_context() loads action indices, intersects with project selections, and renders inline. There is no centralized model of what references what, so changes to doctrine artifacts can silently alter governance context.
  • Action surfaces are uncalibrated. The specify action may receive the same governance surface as implement, violating the minimum-effective-dose principle. There is no automated way to measure or enforce surface size inequalities across actions.
  • Phase 1 excision has no safety net. Deleting inline references and the curation pipeline requires proving the DRG path produces identical or intentionally-improved output. Without an invariant test, excision is a blind deletion.

Actors

  • Agent operator: Runs spec-kitty commands; receives governance context in prompts. Primary beneficiary of right-sized action surfaces.
  • Doctrine maintainer: Edits doctrine artifacts and action indices. Benefits from a single graph model instead of scattered inline references.
  • CI pipeline: Runs invariant and calibration tests on every PR that touches doctrine artifacts or context assembly code.

User Scenarios & Testing

Scenario 1: Migration produces a valid DRG from existing artifacts

A doctrine maintainer runs the migration extractor against the current shipped artifacts. The extractor walks all directive, tactic, paradigm, and action index YAML files, extracts every inline reference, and emits a graph.yaml file with typed edges. The resulting graph validates against the DRG Pydantic model: no dangling references, no unknown relation types, no malformed URNs.

Verification: Load graph.yaml via the Pydantic model; assert zero validation errors. Assert that the edge count is >= the total number of entries across all inline reference fields in all shipped artifacts.
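
The edge-count half of this check can be sketched as follows. This is a minimal illustration, not the extractor itself: the field names come from the Problem Statement, while the sample artifacts, the (source, relation, target) triple shape, and the choice of requires as the emitted relation are assumptions made for the example.

```python
from typing import Any

# Inline reference field names listed in the Problem Statement.
INLINE_REF_FIELDS = ("tactic_refs", "directive_refs", "references", "opposed_by")


def count_inline_references(artifacts: list[dict[str, Any]]) -> int:
    """Count every entry in every inline reference field across parsed artifacts."""
    total = 0
    for artifact in artifacts:
        for field in INLINE_REF_FIELDS:
            # A missing or null field contributes zero entries.
            total += len(artifact.get(field) or [])
    return total


# Hypothetical parsed YAML documents standing in for shipped artifacts.
artifacts = [
    {"id": "DIRECTIVE_001", "tactic_refs": ["tdd-red-green-refactor"], "references": []},
    {"id": "domain-driven-design", "directive_refs": ["DIRECTIVE_001"], "opposed_by": None},
]

# Hypothetical extractor output; the relation chosen here is an assumption.
edges = [
    ("directive:DIRECTIVE_001", "requires", "tactic:tdd-red-green-refactor"),
    ("paradigm:domain-driven-design", "requires", "directive:DIRECTIVE_001"),
]

inline_total = count_inline_references(artifacts)
assert len(edges) >= inline_total, "extractor dropped at least one inline reference"
```

The inequality is >= rather than == because action index files contribute additional edges beyond the artifact-level reference fields.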

Scenario 2: DRG-driven context matches legacy context

For every supported (profile, action, depth) combination, build_context_v2() resolves the same set of governance artifacts (by URN) as the currently shipped src/charter/context.py path. Each combination either matches exactly or has its reachability difference itemized as an intentional improvement (e.g., the legacy path was wrong-sized). Rendered-text differences (formatting, section ordering) are out of scope for Phase 0.

Verification: Run invariant test matrix; assert artifact-set identity or accepted difference with explicit justification per entry.
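
A per-combination parity check might look like the sketch below. The function name parity_failure, the shape of the accepted mapping (key to justification text), and the profile name in the example are all hypothetical; the real test would obtain both URN sets from build_charter_context() and build_context_v2().

```python
from typing import Optional

Key = tuple[str, str, int]  # (profile, action, depth)


def parity_failure(
    key: Key,
    legacy_urns: set[str],
    drg_urns: set[str],
    accepted: dict[Key, str],
) -> Optional[str]:
    """Return None when parity holds or the difference is accepted, else a message."""
    if legacy_urns == drg_urns:
        return None
    # An accepted difference must carry a non-empty justification.
    if accepted.get(key, "").strip():
        return None
    missing = sorted(legacy_urns - drg_urns)
    extra = sorted(drg_urns - legacy_urns)
    return f"{key}: missing from DRG {missing}; extra in DRG {extra}"


# Hypothetical inputs for one cell of the test matrix.
key = ("default", "specify", 1)
legacy = {"directive:DIRECTIVE_001", "tactic:tdd-red-green-refactor"}
drg = {"directive:DIRECTIVE_001"}

message = parity_failure(key, legacy, drg, accepted={})
```

Reporting the missing and extra URNs separately keeps a failing CI run actionable: the contributor can see whether the DRG lost an artifact or gained one.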

Scenario 3: Action surfaces respect calibration inequalities

The surface size (measured as artifact count or token estimate) for each action respects the minimum-effective-dose ordering:

|context(specify)| < |context(plan)| < |context(implement)|
|context(tasks)|   < |context(implement)|
|context(review)|  ~= |context(implement)|

Verification: Calibration test asserts these inequalities. When violated, the fix is adjusting the migration calibrator inputs or the action index files and regenerating graph.yaml, never adding filtering logic in the context builder. Post-Phase-0, graph.yaml becomes the authoritative source and can be edited directly.
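
The inequalities above can be expressed as a small checker. This is a sketch under stated assumptions: sizes are artifact counts, and "~=" is interpreted as "within a relative tolerance of the implement surface", where the 20% default is an illustrative value that the spec does not define.

```python
def calibration_violations(
    sizes: dict[str, int], review_tolerance: float = 0.2
) -> list[str]:
    """Return one message per violated minimum-effective-dose inequality."""
    violations: list[str] = []

    # Strict orderings taken from the scenario text.
    strict_pairs = [("specify", "plan"), ("plan", "implement"), ("tasks", "implement")]
    for smaller, larger in strict_pairs:
        if not sizes[smaller] < sizes[larger]:
            violations.append(
                f"context({smaller})={sizes[smaller]} must be < "
                f"context({larger})={sizes[larger]}"
            )

    # Approximate equality; the tolerance is an assumption, not a spec value.
    allowed_gap = review_tolerance * sizes["implement"]
    if abs(sizes["review"] - sizes["implement"]) > allowed_gap:
        violations.append(
            f"context(review)={sizes['review']} must be within {allowed_gap:g} "
            f"of context(implement)={sizes['implement']}"
        )
    return violations


# Hypothetical artifact counts per action.
sizes = {"specify": 4, "plan": 7, "tasks": 6, "implement": 10, "review": 9}
bloated = dict(sizes, specify=8)  # specify now exceeds plan
```

The checker only reports violations; consistent with the scenario, the remedy is to change the graph inputs rather than to filter inside the context builder.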

Scenario 4: CI catches governance artifact reachability regressions

A contributor modifies a directive YAML file and opens a PR. The invariant test detects that the DRG path and the canonical build_charter_context() path now resolve different artifact sets for one (profile, action) pair. The PR is blocked until the contributor either regenerates graph.yaml to reflect the change or updates the expected-differences manifest.

Verification: CI runs both test harnesses on every PR touching src/doctrine/, src/charter/, or graph.yaml.

Baseline (already shipped or landing with PR #608)

The following are NOT greenfield --- they exist on main and must be preserved:

| Surface | Location | Status |
|---|---|---|
| Action-scoped build_charter_context() with depth parameter | src/charter/context.py | Canonical baseline; parity oracle |
| Legacy build_charter_context() without depth | src/specify_cli/charter/context.py | Compatibility surface; 2 callers remain |
| Action index files per mission/action | src/doctrine/missions/software-dev/actions/*/index.yaml | Input to migration extractor |
| DoctrineService with 8 lazy-loaded repositories | src/doctrine/service.py | Preserved; DRG does not replace it |
| Glossary scope/store/middleware/CLI | src/specify_cli/glossary/ | Untouched by this mission |
| Typed WPMetadata, 9-lane Lane, InReviewState | Various | Untouched by this mission |
| mission_id mandatory, drift fallback removed | PR #608 (sync/emitter, status/models) | Landing; treated as baseline |
| Schema version gating on artifacts (1.0) | src/doctrine/shared/schema_utils.py | Preserved |
| Charter bundle layout | .kittify/charter/ | Preserved |
| Prompt builder governance integration | src/specify_cli/next/prompt_builder.py | Call-site reroute needed (see FR-001) |

Call-Site Audit Finding

Two callers still import from the OLD src/specify_cli/charter/context.py:

  • src/specify_cli/next/prompt_builder.py:13
  • src/specify_cli/cli/commands/agent/workflow.py:20

One caller uses the canonical NEW path:

  • src/specify_cli/cli/commands/charter.py:13

Phase 0 does NOT reroute these callers. The behavioral delta between the two implementations (canonical has depth/action-doctrine/guidelines; legacy does not) means rerouting is a user-visible behavior change, not a transparent import swap. The reroute is Phase 1 scope. The invariant test (FR-007) calls the canonical path directly as the parity oracle without changing any production call sites.

Functional Requirements

| ID | Requirement | Status |
|---|---|---|
| FR-001 | Audit all build_charter_context() call sites, document the behavioral delta between the canonical src/charter/context.py and the legacy src/specify_cli/charter/context.py, and confirm the canonical path is the correct parity oracle for FR-007. The actual reroute of callers is Phase 1 scope (after parity is confirmed). | Proposed |
| FR-002 | Define a DRG schema as a single YAML file (graph.yaml) with node URN format (kind:id), typed edges (v1 relation types: requires, suggests, applies, scope, vocabulary, instantiates, replaces, delegates_to), and a Pydantic model that validates the graph. | Proposed |
| FR-003 | The Pydantic model rejects malformed graphs: dangling references (edge target not a known node), unknown relation types, malformed URNs, and cycles in requires edges. | Proposed |
| FR-004 | Provide a migration extractor that walks all shipped doctrine artifacts (directives, tactics, paradigms) and action index files, extracts every inline reference field, and emits equivalent typed edges into graph.yaml. | Proposed |
| FR-005 | The migration extractor applies per-action surface calibration: each action (specify, plan, tasks, implement, review) receives scope edges that respect the minimum-effective-dose principle. | Proposed |
| FR-006 | Implement build_context_v2(profile, action, depth) that queries the merged DRG (shipped + project-local layers), walks scope edges from the action node to depth 1, walks requires transitively, walks suggests to user-configured depth, includes vocabulary edges as glossary scope, materializes each resolved artifact, and returns a structured prompt block. (applies is defined in the v1 schema but not populated by the Phase 0 migration; it is reserved for Phase 2+ when artifacts self-declare applicability.) | Proposed |
| FR-007 | Provide an invariant regression test that compares the artifact reachability of build_context_v2(profile, action, depth) against the canonical src/charter/context.py build_charter_context() for every shipped profile x action x depth combination. Parity means: the same set of artifact URNs is resolved by both paths. Rendered-text parity (guidelines, reference filtering, section formatting) is a Phase 1 concern when callers are switched to build_context_v2. Intentional reachability differences must be itemized in an accepted-differences manifest. | Proposed |
| FR-008 | Provide an action surface calibration test that asserts the minimum-effective-dose inequalities for every shipped action. Violations are fixed by adjusting scope edges in graph.yaml, never by adding filtering logic in the context builder. | Proposed |
| FR-009 | The DRG is the only knob for calibrating action surfaces. No per-action filtering logic exists in build_context_v2 or the prompt builder. Context size is determined entirely by graph topology. | Proposed |
| FR-010 | Both the invariant test and the calibration test run in CI on every PR that touches src/doctrine/, src/charter/, or graph.yaml. | Proposed |
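
The traversal described in FR-006 can be sketched as a URN resolver over (source, relation, target) triples. This is an illustration only, and it fills several gaps with assumptions: suggests edges are followed from the scope-plus-requires set, requires edges of suggested artifacts are not followed, and vocabulary edges are collected from the action node and every resolved artifact. Layer merging and artifact materialization are omitted, and glossary_scope:core is a hypothetical URN.

```python
from collections import defaultdict, deque

Edge = tuple[str, str, str]  # (source URN, relation, target URN)


def resolve_urns(edges: list[Edge], action_urn: str, suggests_depth: int) -> set[str]:
    """Resolve the URNs reachable from an action node under the FR-006 walk rules."""
    out: dict[tuple[str, str], list[str]] = defaultdict(list)
    for source, relation, target in edges:
        out[(source, relation)].append(target)

    # Step 1: scope edges from the action node, depth 1 only.
    resolved = set(out.get((action_urn, "scope"), []))

    # Step 2: requires edges, followed transitively.
    queue = deque(resolved)
    while queue:
        node = queue.popleft()
        for target in out.get((node, "requires"), []):
            if target not in resolved:
                resolved.add(target)
                queue.append(target)

    # Step 3: suggests edges, breadth-first up to the configured number of hops.
    frontier = set(resolved)
    for _ in range(suggests_depth):
        next_frontier: set[str] = set()
        for node in frontier:
            for target in out.get((node, "suggests"), []):
                if target not in resolved:
                    resolved.add(target)
                    next_frontier.add(target)
        frontier = next_frontier

    # Step 4: vocabulary edges contribute glossary scopes (assumed sources).
    glossary: set[str] = set()
    for node in {action_urn} | resolved:
        glossary.update(out.get((node, "vocabulary"), []))
    return resolved | glossary


edges = [
    ("action:software-dev/specify", "scope", "directive:DIRECTIVE_001"),
    ("directive:DIRECTIVE_001", "requires", "tactic:tdd-red-green-refactor"),
    ("tactic:tdd-red-green-refactor", "suggests", "paradigm:domain-driven-design"),
    ("action:software-dev/specify", "vocabulary", "glossary_scope:core"),
]
shallow = resolve_urns(edges, "action:software-dev/specify", suggests_depth=0)
deep = resolve_urns(edges, "action:software-dev/specify", suggests_depth=1)
```

Because the function contains no per-action branches, the resolved set depends only on graph topology and the depth argument, which is the property FR-009 asks for.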

Non-Functional Requirements

| ID | Requirement | Threshold | Status |
|---|---|---|---|
| NFR-001 | graph.yaml loads and validates in under 500ms for a graph with up to 500 nodes and 2000 edges. | < 500ms cold load | Proposed |
| NFR-002 | build_context_v2() resolves a single (profile, action) query in under 200ms including artifact materialization. | < 200ms per query | Proposed |
| NFR-003 | The full invariant test matrix (all profile x action x depth combinations) completes in under 60 seconds. | < 60s CI wall time | Proposed |
| NFR-004 | New code ships with 90%+ test coverage (project standard). | >= 90% line coverage | Proposed |
| NFR-005 | All new code passes mypy --strict with zero type errors. | 0 type errors | Proposed |

Constraints

| ID | Constraint | Status |
|---|---|---|
| C-001 | Inline references in doctrine YAML files must remain in place. Phase 0 ships the DRG alongside them. Phase 1 deletes them after both test harnesses pass. | Active |
| C-002 | The DoctrineService and its 8 repositories are preserved. The DRG is an additional index, not a replacement for the service layer. | Active |
| C-003 | The curation pipeline (_proposed/ directories, doctrine curate/promote/reset/status commands, src/doctrine/curation/ package) is untouched. Phase 1 excises it. | Active |
| C-004 | No cross-repo changes. spec-kitty-events, spec-kitty-saas, spec-kitty-tracker, and spec-kitty-runtime are out of scope unless a hard contract dependency is proven. | Active |
| C-005 | The glossary module (src/specify_cli/glossary/) is not modified. vocabulary edges in the DRG reference glossary scopes but do not alter glossary internals. | Active |
| C-006 | Mission identity surfaces (mission_id, sync emission) are not reopened. PR #608 is the final word. | Active |
| C-007 | graph.yaml uses YAML format consistent with existing doctrine YAML conventions (schema_version: "1.0", sorted keys, human-readable). | Active |

Success Criteria

1. The invariant regression test passes for 100% of shipped profile x action x depth combinations (artifact reachability parity), with any accepted differences explicitly itemized and reviewed.
2. The calibration test confirms all minimum-effective-dose inequalities hold for every shipped action.
3. Both test harnesses run green in CI before Phase 1 work begins.
4. The behavioral delta between src/charter/context.py and src/specify_cli/charter/context.py is documented, and the canonical path is confirmed as the correct parity oracle.
5. graph.yaml validates with zero errors against the DRG Pydantic model.
6. The migration extractor accounts for every inline reference field across all shipped artifacts (zero missed references).

Scope Boundary: What Phase 1 Will Delete

Once Phase 0's test harnesses pass, Phase 1 (#463) will delete the following. Phase 0 must NOT delete any of these:

| Surface | Location | Phase 1 action |
|---|---|---|
| _proposed/ directories | src/doctrine/*/_proposed/ | Delete entirely |
| Curation package | src/doctrine/curation/ | Delete entirely |
| Curation CLI commands | doctrine curate/promote/reset/status | Remove from CLI |
| Inline tactic_refs fields | Directive and paradigm YAMLs | Remove field |
| Inline directive_refs fields | Paradigm YAMLs | Remove field |
| Inline references arrays | Directive and tactic YAMLs | Remove field |
| Inline opposed_by arrays | Paradigm YAMLs | Remove field |
| Action index inline lists | actions/*/index.yaml directives, tactics, etc. | Replace with DRG edge queries |
| Call-site reroute (prompt_builder, workflow) | src/specify_cli/next/prompt_builder.py, src/specify_cli/cli/commands/agent/workflow.py | Switch imports to charter.context + test rendered-text parity |
| Legacy context compatibility surface | src/specify_cli/charter/context.py | Delete after reroute confirmed |
| Validators for inline refs | Schema validation for above fields | Remove validators |
| build_charter_context() in src/charter/context.py | src/charter/context.py | Replace with build_context_v2 |

Key Entities

DRG Node

A doctrine artifact addressable by URN (kind:id).

  • URN format: {kind}:{id} (e.g., directive:DIRECTIVE_001, tactic:tdd-red-green-refactor, paradigm:domain-driven-design, action:software-dev/specify)
  • Kind: One of directive, tactic, paradigm, styleguide, toolguide, procedure, agent_profile, action, glossary_scope
  • ID: Artifact's canonical identifier as defined in its YAML id field
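
A URN check consistent with the format above could be sketched like this. The function name parse_urn is hypothetical, and the spec calls for a Pydantic model; plain Python is used here only to keep the example dependency-free. Splitting on the first colon lets IDs such as software-dev/specify pass through unchanged.

```python
# Node kinds listed in the DRG Node definition.
KINDS = frozenset({
    "directive", "tactic", "paradigm", "styleguide", "toolguide",
    "procedure", "agent_profile", "action", "glossary_scope",
})


def parse_urn(urn: str) -> tuple[str, str]:
    """Split a {kind}:{id} URN and reject malformed or unknown-kind values."""
    kind, separator, artifact_id = urn.partition(":")
    if not separator or not artifact_id:
        raise ValueError(f"malformed URN (expected kind:id): {urn!r}")
    if kind not in KINDS:
        raise ValueError(f"unknown node kind {kind!r} in URN {urn!r}")
    return kind, artifact_id


parsed = parse_urn("action:software-dev/specify")
```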

DRG Edge

A typed relationship between two nodes.

  • Relation types (v1): requires, suggests, applies, scope, vocabulary, instantiates, replaces, delegates_to
  • Source/target: Node URNs
  • Metadata: Optional when (applicability context), reason (for opposition/conflict edges)
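
The structural checks that FR-003 assigns to the Pydantic model can be illustrated with a plain function. This is a sketch, not the model: validate_graph and its error strings are hypothetical, edges are simplified to (source, relation, target) triples without metadata, and both edge endpoints are checked against the node set even though FR-003 names only the target.

```python
# v1 relation types listed in the DRG Edge definition.
RELATIONS = frozenset({
    "requires", "suggests", "applies", "scope",
    "vocabulary", "instantiates", "replaces", "delegates_to",
})

WHITE, GREY, BLACK = 0, 1, 2  # DFS states: unvisited, on the stack, finished


def validate_graph(nodes: set[str], edges: list[tuple[str, str, str]]) -> list[str]:
    """Return a list of validation errors; an empty list means the graph is valid."""
    errors: list[str] = []
    requires: dict[str, list[str]] = {}

    for source, relation, target in edges:
        if relation not in RELATIONS:
            errors.append(f"unknown relation type: {relation}")
        for urn in (source, target):
            if urn not in nodes:
                errors.append(f"dangling reference: {urn}")
        if relation == "requires":
            requires.setdefault(source, []).append(target)

    # Depth-first search over requires edges; reaching a GREY node means a cycle.
    color: dict[str, int] = {}

    def has_cycle(node: str) -> bool:
        color[node] = GREY
        for successor in requires.get(node, []):
            state = color.get(successor, WHITE)
            if state == GREY or (state == WHITE and has_cycle(successor)):
                return True
        color[node] = BLACK
        return False

    for node in list(requires):
        if color.get(node, WHITE) == WHITE and has_cycle(node):
            errors.append("cycle detected in requires edges")
            break
    return errors


nodes = {"directive:DIRECTIVE_001", "tactic:tdd-red-green-refactor"}
valid_edges = [("directive:DIRECTIVE_001", "requires", "tactic:tdd-red-green-refactor")]
cyclic_edges = valid_edges + [
    ("tactic:tdd-red-green-refactor", "requires", "directive:DIRECTIVE_001")
]
```

Only requires edges are checked for cycles, matching FR-003; other relation types may legitimately form loops.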

Accepted-Differences Manifest

A structured file listing intentional divergences between legacy and DRG context output.

  • Per entry: (profile, action, depth) tuple, legacy artifact set, DRG artifact set, reason for divergence, reviewer who accepted it
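
One possible in-memory shape for a manifest entry is sketched below. The class name, the validation rules, and the example values (profile name, reviewer handle) are assumptions; the spec fixes only the list of fields, not the file format or the checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AcceptedDifference:
    """One reviewed divergence between the legacy and DRG artifact sets."""

    profile: str
    action: str
    depth: int
    legacy_urns: frozenset[str]
    drg_urns: frozenset[str]
    reason: str
    reviewer: str

    def __post_init__(self) -> None:
        # An entry that records no divergence is stale and should be removed.
        if self.legacy_urns == self.drg_urns:
            raise ValueError("entry records no divergence")
        if not self.reason.strip():
            raise ValueError("reason for divergence is required")
        if not self.reviewer.strip():
            raise ValueError("reviewer is required")

    @property
    def key(self) -> tuple[str, str, int]:
        """The (profile, action, depth) tuple this entry applies to."""
        return (self.profile, self.action, self.depth)


entry = AcceptedDifference(
    profile="default",
    action="specify",
    depth=1,
    legacy_urns=frozenset({"directive:DIRECTIVE_001", "tactic:tdd-red-green-refactor"}),
    drg_urns=frozenset({"directive:DIRECTIVE_001"}),
    reason="legacy path was wrong-sized for specify",
    reviewer="reviewer-handle",
)
```

Storing both artifact sets in the entry means a later, unrelated divergence for the same tuple does not silently inherit the earlier approval.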

Assumptions

1. The set of shipped doctrine artifacts on main at mission start is the canonical input for the migration extractor. No new artifacts will be added during this mission.
2. The software-dev mission is the primary mission type for calibration testing. research and documentation missions are included if they have action indices; otherwise they are documented as future scope.
3. The "profile" dimension in the invariant test matrix uses shipped agent profiles from src/doctrine/agent_profiles/shipped/. If no profiles meaningfully alter context assembly today, the test degenerates to action-only and that is acceptable.
4. graph.yaml lives at src/doctrine/graph.yaml alongside the artifacts it indexes.

Risks

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| Inline reference extraction misses edge cases (e.g., directive ID format mismatch between DIRECTIVE_NNN and NNN-slug in action indices) | Medium | High | Migration extractor normalizes both ID formats; validation step counts extracted edges vs source field counts |
| Call-site reroute to src/charter/context.py introduces subtle behavior change | Low | Medium | Reroute is a separate, independently testable WP with before/after output comparison |
| Invariant test produces too many accepted differences, reducing Phase 1 confidence | Low | High | Differences must be individually reviewed and justified; a threshold (e.g., >10% divergence) triggers mission pause |
| graph.yaml becomes a merge-conflict hotspot | Medium | Low | Graph is generated by migration script, not hand-edited. Regeneration is idempotent. |
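
The ID normalization named in the first mitigation could be sketched as below. The two patterns are guesses based only on the DIRECTIVE_NNN and NNN-slug shapes mentioned in the risk; the real artifacts may use different padding or slug characters, and the slug in the example is invented.

```python
import re

# Assumed shapes of the two directive ID formats named in the risk table.
_CANONICAL = re.compile(r"^DIRECTIVE_(\d+)$")
_SLUGGED = re.compile(r"^(\d+)-[a-z0-9-]+$")


def normalize_directive_id(raw: str) -> str:
    """Map either ID format to the DIRECTIVE_NNN form, preserving the digits."""
    for pattern in (_CANONICAL, _SLUGGED):
        match = pattern.match(raw)
        if match:
            return f"DIRECTIVE_{match.group(1)}"
    # Failing loudly prevents an unrecognized ID from becoming a dangling edge.
    raise ValueError(f"unrecognized directive ID format: {raw!r}")


from_index = normalize_directive_id("001-example-slug")  # hypothetical slug
from_artifact = normalize_directive_id("DIRECTIVE_001")
```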

Rollback Criteria

Phase 0 can be rolled back cleanly because:

1. Inline references remain in place (C-001). Deleting graph.yaml and build_context_v2 restores the prior state.
2. No production call sites are changed. Phase 0 adds new code and tests; it does not modify any existing import paths or prompt behavior.
3. No existing tests are modified. New tests are additive.

Rollback trigger: If the invariant test reveals that the DRG resolves a fundamentally different artifact set than the canonical path and the root cause is unclear, pause the mission and escalate. Do not paper over divergence with a large accepted-differences manifest.

Non-Goals (Explicit)

  • Deleting _proposed/ directories (Phase 1)
  • Removing doctrine curate/promote/reset/status commands (Phase 1)
  • Charter Synthesizer pipeline (Phase 3)
  • spec-kitty do/ask/advise (Phase 4)
  • ProfileInvocationExecutor (Phase 4)
  • StepContractExecutor (Phase 6)
  • Retrospective contract (Phase 6)
  • Mission identity cleanup (PR #608)
  • Cross-repo changes to spec-kitty-events, spec-kitty-saas, spec-kitty-tracker, spec-kitty-runtime
  • Doctrine-specific compatibility registry in src/doctrine/versioning.py (Phase 7)