Phase 1 Data Model: Phase 3 Charter Synthesizer Pipeline

Mission: phase-3-charter-synthesizer-pipeline-01KPE222
Source: confirmed planning answers (KD-1…KD-6) + Phase 0 research.
Runtime language: Python 3.11+. All entities are Pydantic v2 models or @dataclass(frozen=True), chosen per field's mutability and schema-surfacing needs.


E-1 · SynthesisRequest (frozen dataclass)

The input envelope handed to a single adapter generate(...) call.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| target | SynthesisTarget | yes | What to synthesize. |
| interview_snapshot | InterviewAnswersSnapshot | yes | Frozen copy of current interview answers. Never mutated. |
| doctrine_snapshot | DoctrineCatalogSnapshot | yes | Frozen read-only view of shipped doctrine relevant to this target. |
| drg_snapshot | DRGGraphSnapshot | yes | Merged (shipped + pre-existing project layer, if any) DRG used as resolution context. Shipped-only when first-time synthesizing. |
| adapter_hints | Mapping[str, str] \| None | no | |
| run_id | ULID | yes | Run-scoped identity; NOT included in fixture-hash (see normalization rule 4). |

Invariants:

  • All *_snapshot fields are immutable.
  • Equality is structural, excluding run_id.

Normalization for fixture keying (R-0-6): canonical JSON over {target, interview_snapshot, doctrine_snapshot, drg_snapshot, adapter_hints, adapter_id, adapter_version} with rules 1-3 from R-0-6; run_id excluded.
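The keying rule above can be sketched as follows. This is a minimal illustration, not the R-0-6 implementation: rules 1-3 are not reproduced in this document, so the canonical form used here (sorted keys, compact separators, UTF-8) is an assumption, the request is modelled as a plain mapping, and SHA-256 stands in for blake3-256 because blake3 is not in the standard library. The name fixture_key is hypothetical.

```python
import hashlib
import json
from typing import Any, Mapping

# Fields of the request that participate in the key; run_id is deliberately absent.
KEYED_FIELDS = (
    "target",
    "interview_snapshot",
    "doctrine_snapshot",
    "drg_snapshot",
    "adapter_hints",
)


def fixture_key(request: Mapping[str, Any], adapter_id: str, adapter_version: str) -> str:
    """Return a hex digest over the normalized request plus adapter identity."""
    payload = {name: request[name] for name in KEYED_FIELDS}
    payload["adapter_id"] = adapter_id
    payload["adapter_version"] = adapter_version
    # Assumed canonical JSON: sorted keys, no insignificant whitespace, UTF-8 bytes.
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    # Stand-in digest; the spec names blake3-256.
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Two requests that differ only in run_id produce the same key under this sketch, which is the property the fixture lookup relies on.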


E-2 · SynthesisTarget (frozen dataclass)

One unit of synthesis.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | Literal["directive", "tactic", "styleguide"] | yes | Artifact kind (C-005 bounds this to three values in tranche 1). |
| slug | str | yes | Kebab-case artifact slug. Unique per (kind,). |
| source_section | str \| None | maybe | |
| source_urns | tuple[str, ...] | maybe | DRG URNs (e.g. directive:DIRECTIVE_003) this target derives from. |
| title | str | yes | Human-readable title (flows to artifact YAML). |
| artifact_id | str | yes | Canonical artifact identity. For directive, conforms to Directive.id regex ^[A-Z][A-Z0-9_-]*$ (tranche-1 default: PROJECT_<NNN>, disjoint from shipped DIRECTIVE_<NNN>). For tactic / styleguide, equal to slug. Used as the URN identifier (see below) and, for directives, as the id field in the emitted artifact body. |

URN rule (computed, not stored): urn = f"{kind}:{artifact_id}". This is the node URN emitted to the project DRG layer. For tactic / styleguide this reduces to f"{kind}:{slug}" because artifact_id == slug; for directive it is f"directive:{PROJECT_<NNN>}".

Filename rule (computed, used by the storage writer): matches existing repository globs.

  • directive: <NNN>-<slug>.directive.yaml where <NNN> is the numeric segment extracted from artifact_id (e.g. PROJECT_001 → 001).
  • tactic: <slug>.tactic.yaml.
  • styleguide: <slug>.styleguide.yaml.

Validation:

  • slug matches ^[a-z][a-z0-9-]*$ (aligned with Tactic.id and Styleguide.id regex).
  • For kind == "directive", artifact_id matches ^[A-Z][A-Z0-9_-]*$ (aligned with Directive.id regex) and must not start with DIRECTIVE_ (namespace reserved for shipped directives).
  • At least one of source_section or source_urns is non-empty (otherwise the target has no provenance story).
  • Every URN in source_urns must resolve in drg_snapshot.
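The URN rule, filename rule, and the first three validation rules can be sketched together as a frozen dataclass. This is an illustration under assumptions, not the project's actual class: the numeric segment is taken to be the trailing digits of artifact_id, errors are plain ValueError, and the check that every source URN resolves in drg_snapshot is left out because it needs the snapshot.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Literal

SLUG_RE = re.compile(r"^[a-z][a-z0-9-]*$")
DIRECTIVE_ID_RE = re.compile(r"^[A-Z][A-Z0-9_-]*$")


@dataclass(frozen=True)
class SynthesisTarget:
    kind: Literal["directive", "tactic", "styleguide"]
    slug: str
    title: str
    artifact_id: str
    source_section: str | None = None
    source_urns: tuple[str, ...] = ()

    def __post_init__(self) -> None:
        if not SLUG_RE.fullmatch(self.slug):
            raise ValueError(f"invalid slug: {self.slug!r}")
        if self.kind == "directive":
            if not DIRECTIVE_ID_RE.fullmatch(self.artifact_id):
                raise ValueError(f"invalid directive id: {self.artifact_id!r}")
            if self.artifact_id.startswith("DIRECTIVE_"):
                raise ValueError("DIRECTIVE_ is reserved for shipped directives")
        elif self.artifact_id != self.slug:
            raise ValueError("artifact_id must equal slug for tactic / styleguide")
        if not self.source_section and not self.source_urns:
            raise ValueError("target needs source_section or source_urns")

    @property
    def urn(self) -> str:
        # Computed, not stored.
        return f"{self.kind}:{self.artifact_id}"

    @property
    def filename(self) -> str:
        if self.kind == "directive":
            # Assumption: the numeric segment is the trailing run of digits.
            match = re.search(r"\d+$", self.artifact_id)
            if match is None:
                raise ValueError(f"no numeric segment in {self.artifact_id!r}")
            return f"{match.group(0)}-{self.slug}.directive.yaml"
        return f"{self.slug}.{self.kind}.yaml"
```

Keeping urn and filename as properties means they cannot drift from kind, slug, and artifact_id, which matches the "computed, not stored" wording above.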

E-3 · AdapterOutput (frozen dataclass)

What an adapter returns from generate(...).

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| body | Mapping[str, Any] | yes | The artifact body, matching the shipped-layer Pydantic schema for kind. |
| adapter_id_override | str \| None | no | |
| adapter_version_override | str \| None | no | |
| generated_at | datetime (aware, UTC) | yes | When the adapter produced this output. |
| notes | str \| None | no | |

Validation (performed by orchestrator, not adapter):

  • body parses against shipped-layer schema for kind (FR-019). Failure → SynthesisSchemaError; artifact rejected; no provenance written.

E-4 · ProvenanceEntry (Pydantic model, round-tripped via ruamel.yaml)

Per-artifact provenance sidecar at .kittify/charter/provenance/<kind>-<slug>.yaml. Lives under the bookkeeping tree, separate from the content it describes (.kittify/doctrine/<kind-dir>/…), so doctrine loaders never see it.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema_version | Literal["1"] | yes | Reserved for future provenance schema changes. |
| artifact_urn | str | yes | e.g. tactic:how-we-apply-directive-003. |
| artifact_kind | Literal["directive","tactic","styleguide"] | yes | |
| artifact_slug | str | yes | |
| artifact_content_hash | str | yes | blake3-256 hex over the emitted artifact YAML bytes. |
| inputs_hash | str | yes | blake3-256 hex over the normalized SynthesisRequest (R-0-6). |
| adapter_id | str | yes | The effective adapter id for this call (override-first, fallback to adapter.id). |
| adapter_version | str | yes | The effective adapter version. |
| source_section | str \| None | maybe | |
| source_urns | list[str] | yes | Copied from SynthesisTarget.source_urns (may be empty if source_section is set). |
| generated_at | str (ISO 8601 UTC) | yes | Copied from AdapterOutput.generated_at. |
| adapter_notes | str \| None | no | |

Invariants:

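  • artifact_urn == f"{artifact_kind}:{artifact_slug}" for tactic / styleguide; for directive the identifier segment is the artifact_id (e.g. directive:PROJECT_001), per the E-2 URN rule.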
  • inputs_hash is byte-stable under normalization (NFR-006 / test lock).
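The "override-first, fallback to adapter.id" resolution recorded in adapter_id and adapter_version might look like the sketch below. AdapterIdentity and effective_identity are hypothetical names introduced only for illustration; the override arguments correspond to AdapterOutput.adapter_id_override and AdapterOutput.adapter_version_override.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterIdentity:
    id: str
    version: str


def effective_identity(
    adapter_id: str,
    adapter_version: str,
    id_override: str | None = None,
    version_override: str | None = None,
) -> AdapterIdentity:
    """Prefer the per-call override; fall back to the adapter's own identity."""
    return AdapterIdentity(
        id=adapter_id if id_override is None else id_override,
        version=adapter_version if version_override is None else version_override,
    )
```

The two fields resolve independently here, so an output may override only the version while keeping the adapter's id. Whether the real pipeline allows that split is an assumption.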

E-5 · ProjectDRGOverlay (Pydantic model)

Additive overlay graph. Emitted to .kittify/doctrine/graph.yaml — the exact path the existing src/charter/_drg_helpers.py project-layer loader already reads. No loader change is required.

Reuses the existing src/doctrine/drg/models.py :: DRGGraph schema verbatim. Additional discipline:

  • Every DRGNode.urn in the overlay is a <kind>:<artifact_id> (e.g. directive:PROJECT_001, tactic:how-we-apply-directive-003) that is NOT present in the shipped graph's nodes — synthesized artifacts carry new URNs; they do not shadow shipped URNs.
  • Every DRGEdge.source is either a shipped URN or a newly-emitted overlay URN.
  • Every DRGEdge.target is either a shipped URN or a newly-emitted overlay URN (never a dangling reference).
  • The generated_by field is set to "spec-kitty charter synthesize <version>" for auditability.

Validation gate (FR-008, NFR-009, US-5): the merged graph (shipped + overlay via existing merge_layers()) must pass validate_graph with zero errors before promote.
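The overlay discipline listed above can be pre-checked with plain set arithmetic before the merged graph reaches validate_graph. The sketch below does not use the real DRGGraph, merge_layers(), or validate_graph; URNs are modelled as strings and edges as (source, target) tuples, and check_overlay is a hypothetical helper.

```python
def check_overlay(
    shipped_urns: set[str],
    overlay_urns: set[str],
    overlay_edges: list[tuple[str, str]],
) -> list[str]:
    """Return human-readable violations of the additive-overlay rules."""
    errors: list[str] = []
    # Overlay nodes must carry new URNs; they may not shadow shipped ones.
    for urn in sorted(overlay_urns & shipped_urns):
        errors.append(f"overlay node shadows shipped URN: {urn}")
    # Every edge endpoint must exist in the shipped graph or in the overlay.
    known = shipped_urns | overlay_urns
    for source, target in overlay_edges:
        if source not in known:
            errors.append(f"edge source is dangling: {source}")
        if target not in known:
            errors.append(f"edge target is dangling: {target}")
    return errors
```

An empty list means the overlay respects the rules above; it does not replace the merged-graph validation gate.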


E-6 · SynthesisManifest (Pydantic model, manifest-last commit marker)

Top-of-bundle manifest at .kittify/charter/synthesis-manifest.yaml. The manifest lives under bookkeeping but lists content paths under .kittify/doctrine/ — it is the explicit bridge between the two trees.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| schema_version | Literal["1"] | yes | Reserved. |
| mission_id | str \| None | no | |
| created_at | str (ISO 8601 UTC) | yes | When the manifest was written (== commit time). |
| run_id | str (ULID) | yes | Matches the staging dir that promoted. |
| adapter_id | str | yes | Primary adapter id used for this run (aggregated from ProvenanceEntry.adapter_id; for runs that mixed overrides, this field is the empty string and per-artifact provenance is the authoritative record). |
| adapter_version | str | yes | Primary adapter version (see above). |
| artifacts | list[ManifestArtifactEntry] | yes | One entry per committed artifact. |

E-6a · ManifestArtifactEntry

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | Literal["directive","tactic","styleguide"] | yes | |
| slug | str | yes | |
| path | str | yes | Repo-relative path to the artifact YAML under .kittify/doctrine/<kind-dir>/. Filename matches the existing repository glob: <NNN>-<slug>.directive.yaml / <slug>.tactic.yaml / <slug>.styleguide.yaml. |
| provenance_path | str | yes | Repo-relative path to the provenance YAML under .kittify/charter/provenance/<kind>-<slug>.yaml. |
| content_hash | str | yes | blake3-256 hex of the artifact YAML bytes. |

Invariants:

  • For every entry, the file at path exists and its blake3-256 hash equals content_hash. Readers verify this before trusting the live tree.
  • provenance_path exists and contains artifact_content_hash == content_hash.
  • run_id matches the staging dir that produced it — forensically useful when .staging/<runid>.failed markers need to be correlated.

Authority rule (KD-2): live tree is authoritative IFF manifest is present AND all content_hash checks pass. Otherwise treat as partial-and-rerunnable.
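A reader-side check for the authority rule could be sketched as below. It assumes the manifest has already been parsed into a list of entry mappings (YAML loading is out of scope here), uses SHA-256 as a stand-in for blake3-256, and omits the provenance cross-check because that requires parsing each provenance file. The function names are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Callable, Iterable, Mapping


def sha256_hex(data: bytes) -> str:
    # Stand-in digest; the spec names blake3-256.
    return hashlib.sha256(data).hexdigest()


def live_tree_is_authoritative(
    repo_root: Path,
    manifest_path: Path,
    entries: Iterable[Mapping[str, str]],
    digest: Callable[[bytes], str] = sha256_hex,
) -> bool:
    """True only if the manifest exists and every listed artifact hashes as recorded."""
    if not manifest_path.is_file():
        return False
    for entry in entries:
        artifact = repo_root / entry["path"]
        if not artifact.is_file():
            return False
        if digest(artifact.read_bytes()) != entry["content_hash"]:
            return False
    return True
```

A False result corresponds to the partial-and-rerunnable case: the caller should rerun synthesis rather than trust the live tree.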


E-7 · TopicSelector (discriminated union, Pydantic)

Input to resynthesize --topic <selector>.

TopicSelector = Annotated[
    DRGUrnSelector | KindSlugSelector | InterviewSectionSelector,
    Field(discriminator="kind"),
]

E-7a · DRGUrnSelector

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | Literal["drg_urn"] | yes | Discriminator. |
| urn | str | yes | e.g. directive:DIRECTIVE_003. Must match ^[a-z_]+:[A-Za-z0-9_.-]+$. |

E-7b · KindSlugSelector

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | Literal["kind_slug"] | yes | Discriminator. |
| artifact_kind | Literal["directive","tactic","styleguide"] | yes | |
| artifact_slug | str | yes | |

E-7c · InterviewSectionSelector

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| kind | Literal["interview_section"] | yes | Discriminator. |
| section | str | yes | Must match a known interview section label. |

Parsing rule (FR-012 order, local-first for synthesizable kinds):

  1. If the string contains : AND LHS ∈ {"directive","tactic","styleguide"}, try KindSlugSelector against the project-local artifact set first. Hit → resolve, done. This is the "local-first for synthesizable kinds" rule: operators editing their project doctrine naturally type tactic:how-we-apply-directive-003, and we must not route that to a shipped DRG URN lookup when a project artifact exists.
  2. Else if the string contains :, try DRGUrnSelector against the merged (shipped + project) DRG graph.
  3. Else (no :) try InterviewSectionSelector (exact match against interview section labels).
  4. Else raise TopicSelectorUnresolvedError with candidates.

Disambiguation: for a string like directive:PROJECT_001 where PROJECT_001 is both a project-local directive artifact AND a project-layer DRG URN (which it will be after synthesis, because synthesis emits the corresponding node), step 1 matches first. The resolution is unambiguous — the local artifact and the DRG node refer to the same thing; regenerating the artifact regenerates the node. For directive:DIRECTIVE_003 (a shipped URN), step 1 does not match (no project-local artifact of that slug), so step 2 resolves it as a DRG URN and the resynthesizer regenerates every project-local artifact whose provenance references it.


E-8 · Error taxonomy

All errors inherit from SynthesisError(Exception). All carry structured fields for rich-rendered CLI output.

| Error | Trigger | Key fields |
| --- | --- | --- |
| PathGuardViolation | Write target under src/doctrine/ (FR-016, US-7). | attempted_path, caller |
| SynthesisSchemaError | AdapterOutput.body fails shipped schema (FR-019). | artifact_kind, artifact_slug, validation_errors |
| ProjectDRGValidationError | validate_graph returns ≥1 errors on merged graph (FR-008). | errors: list[str], merged_graph_summary |
| DuplicateTargetError | Two targets in one run share (kind, slug) (EC-7). | kind, slug, occurrences |
| TopicSelectorUnresolvedError | --topic selector does not resolve (US-6). | raw, candidates: list[str] |
| TopicSelectorAmbiguousError | (Reserved; the ambiguity rule above makes this rare. Raised only if an explicit disambiguation call flags it.) | raw, candidates: list[str] |
| FixtureAdapterMissingError | Fixture adapter cannot find fixture for hash (test-only). | expected_path, kind, slug, inputs_hash |
| ProductionAdapterUnavailableError | Production adapter cannot instantiate (R-0-5). | adapter_id, reason, remediation |
| StagingPromoteError | os.replace or manifest write fails during promote; orchestration rolls back. | run_id, staging_dir, cause |
| ManifestIntegrityError | A reader finds manifest-listed content_hash not matching disk content. | manifest_path, offending_artifact |

Every error is structured: it carries fields, not just a message. CLI renders via a shared rich panel helper in src/charter/synthesizer/errors.py.
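One way to make errors carry fields rather than only a message is to declare each error as a dataclass on top of the shared base. The sketch below shows this for PathGuardViolation; the details() helper and the message format are assumptions, and the real rich panel helper in src/charter/synthesizer/errors.py is not reproduced.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any


class SynthesisError(Exception):
    """Base class; subclasses declare their structured fields as dataclass fields."""

    def details(self) -> dict[str, Any]:
        # Expose the structured fields so a renderer does not parse the message.
        if not is_dataclass(self):
            return {}
        return {f.name: getattr(self, f.name) for f in fields(self)}

    def __str__(self) -> str:
        pairs = ", ".join(f"{k}={v!r}" for k, v in self.details().items())
        return f"{type(self).__name__}({pairs})"


# eq=False keeps the default identity-based hash, so instances stay hashable.
@dataclass(eq=False)
class PathGuardViolation(SynthesisError):
    attempted_path: str
    caller: str
```

A CLI layer can then catch SynthesisError once and render err.details() uniformly, whichever subclass was raised.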


E-9 · State transitions

Run lifecycle

CREATED  ──▶  STAGING  ──▶  VALIDATING  ──▶  PROMOTING  ──▶  COMMITTED
   │              │               │               │
   │              ▼               ▼               ▼
   │           FAILED          FAILED          FAILED
   │         (adapter/       (schema/         (os.replace
   │          schema)         DRG/path-        or manifest
   │                          guard)           error)
   ▼
ABORTED

  • CREATED → new staging dir opened, no writes yet.
  • STAGING → writes inside staging only (never in live tree, per path guard).
  • VALIDATING → schema + DRG + path-guard + cross-checks on staged tree.
  • PROMOTING → ordered os.replace of artifact + provenance files; finally manifest.
  • COMMITTED → manifest written; staging dir wiped.
  • Any FAILED transition preserves staging as .staging/<runid>.failed/ with a cause.yaml diagnostic and a nonzero CLI exit.
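The run lifecycle above can be encoded as an explicit transition table so that illegal moves fail loudly. This is a sketch read directly off the diagram; RunState, ALLOWED, and advance are hypothetical names, and side effects such as preserving the staging dir on failure are not modelled.

```python
from enum import Enum


class RunState(Enum):
    CREATED = "created"
    STAGING = "staging"
    VALIDATING = "validating"
    PROMOTING = "promoting"
    COMMITTED = "committed"
    FAILED = "failed"
    ABORTED = "aborted"


# Edges taken from the lifecycle diagram; terminal states have no successors.
ALLOWED: dict[RunState, set[RunState]] = {
    RunState.CREATED: {RunState.STAGING, RunState.ABORTED},
    RunState.STAGING: {RunState.VALIDATING, RunState.FAILED},
    RunState.VALIDATING: {RunState.PROMOTING, RunState.FAILED},
    RunState.PROMOTING: {RunState.COMMITTED, RunState.FAILED},
    RunState.COMMITTED: set(),
    RunState.FAILED: set(),
    RunState.ABORTED: set(),
}


def advance(current: RunState, nxt: RunState) -> RunState:
    """Return nxt if the transition is legal, otherwise raise ValueError."""
    if nxt not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt
```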

Resynthesis lifecycle

  • Identical to run lifecycle above, but STAGING only stages the targeted artifacts; PROMOTING replaces only those files; manifest is rewritten (not appended) with the new run_id and updated entries for the regenerated artifacts. Untouched artifacts retain their prior content_hash in the manifest.
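The "rewritten, not appended" manifest update can be sketched with plain dictionaries in place of the Pydantic models. Entries are keyed by (kind, slug); refreshing created_at on rewrite is an assumption that follows from its definition as the manifest write time. rewrite_manifest is a hypothetical helper.

```python
from typing import Any


def rewrite_manifest(
    previous: dict[str, Any],
    regenerated: list[dict[str, Any]],
    new_run_id: str,
    created_at: str,
) -> dict[str, Any]:
    """Build a fresh manifest: regenerated entries replace old ones, the rest carry over."""
    by_key = {(e["kind"], e["slug"]): e for e in previous["artifacts"]}
    for entry in regenerated:
        by_key[(entry["kind"], entry["slug"])] = entry
    # A new dict is returned; the previous manifest object is left untouched.
    return {
        **previous,
        "run_id": new_run_id,
        "created_at": created_at,
        "artifacts": list(by_key.values()),
    }
```

Untouched artifacts keep their prior content_hash because their entries are copied over unchanged.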

E-10 · Entity → requirement traceability

Confirms every FR/NFR has at least one entity footprint:

| Req | Entities |
| --- | --- |
| FR-001 | E-1 |
| FR-002 | E-2 (Literal bound) |
| FR-003 | SynthesisAdapter Protocol (see contracts/adapter.py) |
| FR-004 | FixtureAdapter; E-8 FixtureAdapterMissingError |
| FR-005 | E-6 paths; R-0-2 layout |
| FR-006 | E-4 |
| FR-007 | E-5 |
| FR-008 | E-8 ProjectDRGValidationError; validation gate in E-9 |
| FR-009 | compiler.py/context.py DoctrineService wiring (plan §Modified) |
| FR-010 / FR-011 | CLI surfaces; contracts in contracts/topic-selector.md |
| FR-012 / FR-013 | E-7 + E-8 TopicSelectorUnresolvedError |
| FR-014 | E-4 inputs_hash byte-stability |
| FR-015 | E-6 + bundle manifest additive fields (R-0-4) |
| FR-016 | E-8 PathGuardViolation |
| FR-017 | E-9 resynthesis lifecycle (only targeted artifacts replaced) |
| FR-018 | compiler.py/context.py DoctrineService wiring |
| FR-019 | E-3 + E-8 SynthesisSchemaError |
| FR-020 | E-5 additive-only invariants |
| NFR-001…010 | Tracked via plan §Review & Validation Strategy |
| C-001…012 | Tracked via path guard, E-2 Literal bound, CLI selector contract, etc. |