Context and Problem Statement
Through 3.1.x, retrospective learning in Spec Kitty was effectively gated by two environment variables (SPEC_KITTY_RETROSPECTIVE, SPEC_KITTY_MODE), and the runtime wired facilitator_callback=None so even an "enabled" strict path failed closed instead of producing useful learning. The user-facing spec-kitty agent retrospect synthesize command — a proposal preview/apply tool — became the de facto authoring surface because it had a silent fallback that fabricated empty completed records when artifacts looked sufficient. As a result, completed missions rarely produced useful retrospective records, and post-merge documentation (PR #1136) overstated what summary and synthesize actually capture.
The product thesis from the 3.2.0 epic: retrospective learning should be a core feedback loop, not an env-gated experimental terminus trap. Specifically:
- Every completed mission should produce a useful retrospective.yaml.
- Policy lives in durable, project-level configuration (.kittify/config.yaml and charter frontmatter), not in environment variables.
- The default behavior is post-completion best-effort with warn-on-failure; strict governed projects can opt into pre-completion blocking gates.
- Authoring and proposal-application are distinct surfaces with distinct semantics.
- Doctrine/DRG/glossary changes never auto-apply by default.
This mission decides the architecture for that overhaul. Two architectural questions had to be answered before tasks could land:
- Generator shape: should the runtime invoke an agent profile (retrospective-facilitator.agent.yaml) for richer mediated analysis, or call a pure-Python module directly?
- Policy precedence: when both .kittify/config.yaml and charter frontmatter define retrospective: settings, which wins?
A third operational question — how to respond to the pytest collection blocker tracked in #1137 — was decided in the same planning session and is recorded here for completeness.
Decision Drivers
- Default-on is non-negotiable: every completed mission must attempt generation. Latency budget therefore matters more than richness.
- Determinism and testability: the generator must be unit-testable without an agent harness and must produce the same output for the same inputs (load-bearing for FR-021's "existing reductions remain byte-identical" guarantee).
- Governance scope (DIRECTIVE_031, DIRECTIVE_032): retrospective authoring is a distinct bounded context from doctrine/DRG/glossary mutation; the boundary must be explicit and crossing it requires an anti-corruption layer.
- Charter authority (DIRECTIVE_001): governed projects expect charter to be authoritative for policy; config-file precedence cannot leak in by default.
- FR-024 frozen public surface for spec_kitty_events: any "fix" for #1137 that imports from spec_kitty_events.models.* (instead of the top-level re-exports) would violate the architectural contract; the issue's closing comment makes this explicit.
- No SaaS dependency: this mission is local CLI/runtime work. Generator must run without hosted services and without SPEC_KITTY_ENABLE_SAAS_SYNC=1.
- No structural auto-apply: doctrine/DRG/glossary mutation always requires explicit human approval (carried in C-005 of the spec).
Considered Options
For generator shape
- Option A: Pure-Python module. Deterministic function in src/specify_cli/retrospective/generator.py that reads mission artifacts and returns a RetrospectiveRecord. Runtime calls it directly.
- Option B: Profile invocation. Runtime invokes the retrospective-facilitator.agent.yaml profile via the existing profile-invocation pipeline; output is an agent-authored record.
- Option C: Hybrid. Pure-Python is the runtime default; policy can opt into profile invocation via retrospective.generator: profile.
For policy precedence
- Option P-A: .kittify/config.yaml wins by default; charter must explicitly opt in to authority.
- Option P-B: Charter wins by default; charter may delegate to config via an explicit retrospective.precedence: config directive.
- Option P-C: Both contribute additively; conflicting fields produce a resolution error.
For #1137 resolution
- Option E-A: Upstream fix in spec_kitty_events 5.1.1 that restores normalize_event_id and Event to the top-level exports.
- Option E-B: Local import fallback in src/specify_cli/status/validate.py to use a stable subpath (spec_kitty_events.models).
- Option E-C: Pin spec_kitty_events to a known-good version.
- Option E-D: Documentation-only. Diagnose as local env corruption (PEP 420 namespace package state), add CONTRIBUTING note, no code change.
Decision Outcome
Generator shape — Option A (Pure-Python module)
Decision Moment: 01KS051316C8Z0SDEKZ2B088CS.
Chosen because:
- Default-on at every mission boundary requires sub-second latency. Profile invocation routinely takes 5–30s (agent dispatch, network round-trip, completion).
- Determinism is a hard requirement for FR-021 (byte-identical historical reductions). A pure-Python function is byte-deterministic given the same inputs; an agent-invoked generator is not.
- Testability: unit tests scaffold mission artifacts on disk and call the generator directly; no agent harness mocks needed.
- The retrospective-facilitator profile remains useful as a human-mediated tool for richer post-mortems via the existing spec-kitty agent action retrospect style invocations. It is simply not the runtime default.
- Forward compatibility: the policy schema lands with a generator field that today accepts only "python". Adding "profile" later requires no schema break, so Option C is preserved as a future opt-in without committing to it now.
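A minimal sketch of what a deterministic pure-Python generator could look like, assuming a simplified record shape. The RetrospectiveRecord fields, the findings format, and the helper names (validate_generator, generate, serialize) are hypothetical illustrations, not the module's actual API; only the "python" value for the generator field comes from the decision above.

```python
from __future__ import annotations

import json
from dataclasses import asdict, dataclass

# Hypothetical set of accepted values for the policy's `generator` field.
# Per the decision above, only "python" is accepted today.
ACCEPTED_GENERATORS = frozenset({"python"})


@dataclass(frozen=True)
class RetrospectiveRecord:
    """Illustrative stand-in; the real record has its own schema."""

    mission: str
    findings: tuple[str, ...] = ()


def validate_generator(value: str) -> str:
    """Reject any generator value outside the accepted set."""
    if value not in ACCEPTED_GENERATORS:
        raise ValueError(f"unsupported retrospective.generator: {value!r}")
    return value


def generate(mission: str, artifacts: dict[str, str]) -> RetrospectiveRecord:
    """Pure function: the same artifacts always yield the same record."""
    # Sort by artifact name so dict insertion order cannot affect the output.
    findings = tuple(
        f"{name}: {len(text.splitlines())} lines"
        for name, text in sorted(artifacts.items())
    )
    return RetrospectiveRecord(mission=mission, findings=findings)


def serialize(record: RetrospectiveRecord) -> bytes:
    """Byte-stable serialization: sorted keys and fixed separators."""
    payload = json.dumps(asdict(record), sort_keys=True, separators=(",", ":"))
    return payload.encode("utf-8")
```

Because the function takes no clock, network, or agent input, a unit test can compare serialized bytes directly, which is the property the FR-021 argument relies on.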
Policy precedence — Option P-B (Charter wins; config may delegate)
Chosen because:
- DIRECTIVE_001 ("Architectural Integrity Standard") and the project's charter-first governance model require charter to be the authoritative governance surface. Letting config silently override would violate that expectation.
- The escape hatch (retrospective.precedence: config in charter frontmatter) covers the rare case where a charter explicitly wants config-level flexibility, without requiring boilerplate in every charter.
- The resolver returns (policy, source_map) where source_map records the origin of every leaf field; observers can therefore always tell which file/key drove a given decision.
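The charter-wins rule and the source_map can be illustrated with a small sketch. It assumes both inputs are the already-parsed retrospective: sections as nested dicts and flattens them to dotted keys for simplicity; the function name resolve_policy and the example field names (mode, gate.blocking, timeout_s) are hypothetical, and the real resolver may differ.

```python
from __future__ import annotations

from typing import Any, Iterator


def _leaves(tree: dict[str, Any], prefix: str = "") -> Iterator[tuple[str, Any]]:
    """Yield (dotted_key, value) for every leaf of a nested mapping."""
    for key, value in tree.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            yield from _leaves(value, dotted)
        else:
            yield dotted, value


def resolve_policy(
    charter: dict[str, Any], config: dict[str, Any]
) -> tuple[dict[str, Any], dict[str, str]]:
    """Return (policy, source_map); source_map names the origin of each leaf."""
    charter_leaves = dict(_leaves(charter))
    config_leaves = dict(_leaves(config))
    # Assumption: only the charter may delegate, so a `precedence` key that
    # appears in config is ignored.
    config_leaves.pop("precedence", None)
    delegated = charter_leaves.get("precedence") == "config"

    # Apply the lower-priority source first so the winner overwrites it.
    if delegated:
        order = [("charter", charter_leaves), ("config", config_leaves)]
    else:
        order = [("config", config_leaves), ("charter", charter_leaves)]

    policy: dict[str, Any] = {}
    source_map: dict[str, str] = {}
    for source, leaves in order:
        for key, value in leaves.items():
            policy[key] = value
            source_map[key] = source
    return policy, source_map
```

With charter {"mode": "strict"} and config {"mode": "best_effort", "timeout_s": 2}, this sketch resolves mode from the charter and timeout_s from config; adding precedence: config to the charter flips mode to the config value while precedence itself stays attributed to the charter.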
#1137 — Option E-D (Documentation only)
Decision Moment: 01KS0513SEHSEE82WN4RJBFDRG.
Chosen because:
- The issue is closed as not-a-bug. The 5.1.0 wheel is fine; the symptom is local PEP 420 namespace-package corruption from a partial pip uninstall. CI is unaffected (per the closing comment, tests/agent/test_orchestrator_commands_integration.py::TestAcceptMission collects cleanly in fresh CI venvs).
- Option E-B (local fallback) was explicitly rejected by the issue's closing comment: importing from spec_kitty_events.models.* instead of the top-level surface would either violate the FR-024 frozen public-surface contract (enforced by tests/architectural/test_events_tracker_public_imports.py) or require expanding that contract. Both outcomes trade architectural integrity for a workaround to a local-env problem.
- Option E-A (upstream fix) would put a separate package release on the critical path of the 3.2.0 release; that is not justified for a not-a-bug.
- Option E-C (version pin) defers contract drift and may regress other 5.1.x improvements.
Action: add a diagnostic note to CONTRIBUTING.md (the python -c "import spec_kitty_events; print(spec_kitty_events.__file__, spec_kitty_events.__path__)" check, the _NamespacePath(...) symptom, and the uv sync --reinstall-package spec-kitty-events fix command).
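The one-line check can also be expressed as a small helper. This is a sketch under the assumption that the corrupted state shows up as a module with a __path__ but no __file__ (the usual shape of a PEP 420 namespace package); the function names are hypothetical and are not part of the CONTRIBUTING note.

```python
from __future__ import annotations

import importlib
from types import ModuleType


def looks_like_namespace_package(module: ModuleType) -> bool:
    """True when the module has a __path__ but no usable __file__."""
    has_file = getattr(module, "__file__", None) is not None
    has_path = hasattr(module, "__path__")
    return has_path and not has_file


def diagnose(name: str) -> str:
    """Describe how a package resolved in the current environment."""
    try:
        module = importlib.import_module(name)
    except ImportError:
        return f"{name}: not importable"
    if looks_like_namespace_package(module):
        # A leftover directory without __init__.py resolves as a namespace
        # package; a reinstall restores the regular package.
        return f"{name}: namespace package, reinstall recommended"
    return f"{name}: regular install at {getattr(module, '__file__', None)}"


if __name__ == "__main__":
    print(diagnose("spec_kitty_events"))
```

If the namespace-package branch fires for spec_kitty_events, the fix command named in the note above (uv sync --reinstall-package spec-kitty-events) is the remedy.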
Consequences
Positive
- The default-on retrospective path adds < 2s wall-clock to mission completion (NFR-005) — meets the budget for routine use.
- Unit tests do not require agent dispatch infrastructure; the entire retrospective surface can be exercised without a live LLM or external service.
- Charter remains the authoritative governance surface for governed projects. Operators reading the resolved policy can always point to a specific charter or config key as its source.
- The frozen spec_kitty_events public surface (FR-024 from the consumer-contract dossier) remains intact. Future contributors who hit the #1137 namespace-package symptom locally get a clear diagnostic in CONTRIBUTING.md instead of a hidden code path that masks the corruption.
- Forward compatibility for profile-invocation generators is preserved without committing to the implementation cost now.
Negative
- The default generator produces a less "thoughtful" record than a profile-mediated one might. Mitigated by SC-004 (the generator is validated against three real completed missions in kitty-specs/ during the mission-review report) and by keeping the profile-invocation path available as an explicit operator action.
- Governed projects must understand that charter wins by default. This is a learning cost for operators coming from systems where config-files-override is conventional. Mitigated by the documented retrospective.precedence: config escape hatch and by the resolver's source_map always being inspectable.
- The CONTRIBUTING note for #1137 places the fix in a contributor-facing doc rather than a code path. Contributors who never read CONTRIBUTING and hit the symptom waste cycles. Mitigated by the fix command (uv sync --reinstall-package spec-kitty-events) being short and the error signature being googleable.
Neutral
- The retrospective-facilitator.agent.yaml profile is retained but reframed as descriptive metadata for human-mediated retrospectives rather than the runtime's authoring path. No deletion.
- The agent retrospect synthesize command keeps its existing signature; only the default-path behavior (when no record exists) tightens: the legacy fabrication path is preserved behind an explicit --fabricate-empty flag.
Confirmation
We will know this decision was correct if, post-merge:
- uv run pytest tests/retrospective/ -q exits 0 and the generator unit tests reach ≥ 90% coverage (NFR-004, matching charter).
- A representative mission completion under default policy adds ≤ 2s wall-clock (NFR-005), verified by a focused integration test.
- The shipped spec-kitty retrospect create --mission <handle> command produces real, schema-valid records for at least three already-completed missions in this repo (068-post-merge-reliability-and-release-hardening, 034-feature-status-state-model-remediation, 047-namespace-aware-artifact-body-sync, per SC-004).
- The policy_source map on emitted retrospective events lets operators trace every blocking or warning event back to a specific .kittify/config.yaml or charter frontmatter key.
- The FR-024 architectural test (tests/architectural/test_events_tracker_public_imports.py) remains green, confirming no code path imports spec_kitty_events.models.* directly.
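For readers unfamiliar with this style of architectural test, the idea of a frozen-surface check can be sketched with the standard ast module. This is an illustration only and not the contents of the real test file; the function name forbidden_imports and the scanning strategy are assumptions.

```python
import ast

# Subpackage that consumers must not import from directly (per FR-024).
FORBIDDEN_PREFIX = "spec_kitty_events.models"


def forbidden_imports(source: str) -> list[str]:
    """Return the module names in `source` that bypass the top-level surface."""
    found: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            # `from spec_kitty_events.models import Event`
            names = [node.module]
            # `from spec_kitty_events import models` reaches the same subpackage.
            if node.module == "spec_kitty_events":
                names = [f"{node.module}.{alias.name}" for alias in node.names]
        elif isinstance(node, ast.Import):
            # `import spec_kitty_events.models`
            names = [alias.name for alias in node.names]
        else:
            continue
        found.extend(
            name
            for name in names
            if name == FORBIDDEN_PREFIX or name.startswith(FORBIDDEN_PREFIX + ".")
        )
    return found
```

A test built on this sketch would read each source file under the package, call forbidden_imports on its text, and assert that the combined result is empty.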
Confidence: high for the generator and #1137 decisions (the latter is constrained by an existing immutable architectural contract). Medium-high for the policy-precedence decision — if operator feedback during the 3.2.0 stabilization window reveals charter-wins-by-default is confusing for adopters, a future ADR may flip the default after gathering data; the retrospective.precedence field is the forward-compat surface for that.
Pros and Cons of the Options
Option A (Pure-Python generator) — chosen
- ✅ Deterministic, byte-stable output for FR-021 reduction guarantee
- ✅ Sub-second latency budget
- ✅ Unit-testable without agent harness
- ✅ Forward-compat: schema accepts generator: python today, generator: profile later
- ❌ Less "thoughtful" than a profile-mediated record could be (mitigated by SC-004 quality gate)
Option B (Profile invocation)
- ✅ Aligns with the agent-mediated learning doctrine direction
- ✅ Richer post-mortem signal per mission
- ❌ 5–30s latency per mission completion — incompatible with default-on
- ❌ Couples retrospective generation to whichever coding agent is configured
- ❌ Testing requires agent harness mocks
- ❌ Non-deterministic output complicates FR-021
Option C (Hybrid)
- ✅ Best of both worlds in principle
- ❌ Doubles the test surface for the runtime default
- ❌ Requires shipping the profile-invocation path in this mission even if no project uses it
- 💡 Preserved as a future opt-in via the generator field's enum. If operator demand materializes during 3.2.x, a follow-up ADR can extend the resolver and runtime without a schema break.
Option P-A (Config wins by default)
- ✅ Matches conventional config-file precedence
- ❌ Violates charter-first governance for governed projects without explicit opt-in
- ❌ Charter would need boilerplate to assert authority on every governed project
Option P-B (Charter wins by default; config may delegate) — chosen
- ✅ Charter-first governance preserved
- ✅ Single, documented escape hatch (retrospective.precedence: config) for projects that want config-flexibility
- ✅ source_map makes resolution decisions inspectable
- ❌ Learning curve for operators who expect config-wins
- 💡 If feedback during 3.2.x stabilization shows this is consistently confusing, the decision is reversible via a follow-up ADR; retrospective.precedence is the forward-compat lever.
Option P-C (Additive)
- ❌ Conflicting fields would raise resolution errors, blocking healthy projects
- ❌ Operators would need to keep both files in sync field-by-field
Option E-A (Upstream fix #1137)
- ✅ Cleanest long-term
- ❌ Separate package release on the 3.2.0 critical path
- ❌ Not justified for a not-a-bug
Option E-B (Local fallback for #1137)
- ❌ Violates the FR-024 frozen public-surface contract
- ❌ Masks local-env corruption that future contributors should still hit visibly
- ❌ The issue's closing comment explicitly rules this out
Option E-C (Version pin)
- ❌ Defers contract drift; may regress 5.1.x improvements
Option E-D (Documentation only) — chosen
- ✅ Respects the issue's closing decision (not-a-bug)
- ✅ Preserves the frozen public surface
- ✅ Zero blast radius — single doc edit
- ❌ Contributors who skip CONTRIBUTING and hit the symptom waste cycles (mitigated by short fix command)
Bounded-Context Map (per DIRECTIVE_031)
This mission spans four bounded contexts. The crossings are explicit and mediated:
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ Retrospective Authoring │ │ Mission Lifecycle / │
│ │ │ Event Log │
│ - RetrospectivePolicy │ │ │
│ - RetrospectiveRecord │ │ - runtime_bridge.py │
│ - generator (pure Python) ├────────►│ - retrospective_terminus.py │
│ - writer (merge/overwrite) │ emits │ - status.events.jsonl │
│ - events (additive) │ events │ - reducer (no-op for │
│ │ │ retrospective events) │
└────────┬─────────────────────┘ └──────────────────────────────┘
│
│ proposals[] (data only)
▼
┌──────────────────────────────┐ ┌──────────────────────────────┐
│ agent retrospect synthesize │ │ Doctrine / DRG / Glossary │
│ (anti-corruption layer) ├────────►│ │
│ │ apply │ - human-approved mutations │
│ - preview proposals │ (gated) │ - structural changes always │
│ - apply with human approval │ │ require explicit consent │
└──────────────────────────────┘ └──────────────────────────────┘
- Retrospective Authoring → Event Log: explicit, additive event payloads (RetrospectiveCaptured, RetrospectiveCaptureFailed). Reducer treats them as no-ops for lane state.
- Retrospective Authoring → Doctrine/DRG/Glossary: never direct. Proposals are data; application goes through the synthesize anti-corruption layer with human approval.
- CLI Surface → Retrospective Authoring: through documented JSON contracts in kitty-specs/retrospective-default-policy-01KS049J/contracts/.
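The first crossing, additive events that the reducer ignores for lane state, can be illustrated with a toy reducer over JSONL lines. Only the two retrospective event names come from this ADR; the LaneChanged event, its wp and lane fields, and the type key are hypothetical stand-ins for the real event schema.

```python
from __future__ import annotations

import json
from typing import Iterable

# Event names taken from this ADR; the reducer skips them for lane state.
RETROSPECTIVE_EVENT_TYPES = frozenset(
    {"RetrospectiveCaptured", "RetrospectiveCaptureFailed"}
)


def reduce_lanes(lines: Iterable[str]) -> dict[str, str]:
    """Fold JSONL event lines into {work_package: lane}."""
    lanes: dict[str, str] = {}
    for line in lines:
        if not line.strip():
            continue  # tolerate blank lines in the log
        event = json.loads(line)
        kind = event.get("type")
        if kind in RETROSPECTIVE_EVENT_TYPES:
            continue  # additive event: no effect on lane state
        if kind == "LaneChanged":  # hypothetical lane-transition event
            lanes[event["wp"]] = event["lane"]
    return lanes
```

Under these assumptions, appending retrospective events to an existing log leaves the reduced lane state unchanged, which is the behavior the byte-identical reduction guarantee (FR-021) depends on.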
References
- Mission spec: kitty-specs/retrospective-default-policy-01KS049J/spec.md
- Implementation plan: kitty-specs/retrospective-default-policy-01KS049J/plan.md
- Decision rationale: kitty-specs/retrospective-default-policy-01KS049J/research.md
- Data model: kitty-specs/retrospective-default-policy-01KS049J/data-model.md
- Contracts: kitty-specs/retrospective-default-policy-01KS049J/contracts/
- Decision Moment artifacts: kitty-specs/retrospective-default-policy-01KS049J/decisions/DM-01KS051316C8Z0SDEKZ2B088CS.md, DM-01KS0513SEHSEE82WN4RJBFDRG.md
- Related ADR: 2026-04-27-1-retrospective-gate-shared-module.md (prior retrospective gate architecture this mission builds on)
- Related ADR: 2026-04-25-1-shared-package-boundary.md (establishes the frozen spec_kitty_events public surface that constrains the #1137 decision)
- DIRECTIVES applied: DIRECTIVE_001 (Architectural Integrity), DIRECTIVE_003 (Decision Documentation), DIRECTIVE_031 (Context-Aware Design), DIRECTIVE_032 (Conceptual Alignment)