description: "Work package task list template for feature implementation"


Work Packages: Mutmut Mutation Testing CI Integration

Inputs: Design documents from /kitty-specs/047-mutmut-mutation-testing-ci/ Prerequisites: plan.md (required), spec.md (user stories), research.md, quickstart.md

Execution model: Sequential only. Each WP must reach done before the next begins. No worktrees: All work is performed directly on the architecture/restructure_and_proposals branch.


Work Package WP01: Toolchain Setup (Priority: P0)

Goal: Add mutmut>=3.5.0 to the project and configure it so that mutmut run works locally against src/specify_cli/ without errors. Independent Test: mutmut run completes for at least one module and mutmut results shows killed/surviving/timeout counts without configuration errors. Prompt: tasks/WP01-toolchain-setup.md Requirement Refs: FR-001, FR-002, NFR-002, C-001, C-002, SC-001

Included Subtasks

  • □ T001 Add mutmut>=3.5.0 to [project.optional-dependencies].test in pyproject.toml
  • □ T002 Add [tool.mutmut] configuration section to pyproject.toml (paths, runner, excludes)
  • □ T003 Add mutmut.db, mutmut-cache/, and .mutmut-cache to .gitignore
  • □ T004 Install dependencies locally (pip install -e ".[test]") and run mutmut run --paths-to-mutate src/specify_cli/status/ to verify config loads
  • □ T005 Run mutmut results and confirm output shows killed/surviving/timeout counts

Implementation Notes

  • The [tool.mutmut] section goes in pyproject.toml (PEP 517 standard, consistent with ruff/mypy/pytest in this project).
  • Runner must use python -m pytest -x --timeout=30 -q to avoid spawning a new interpreter per mutant.
  • Exclude src/specify_cli/upgrade/migrations/ (idempotent, hard to mutate meaningfully) and __pycache__.
  • T004 is a local verification step — the implementer runs mutmut and captures the output to confirm no config errors.

Parallel Opportunities

  • None. Steps are sequential within this WP.

Dependencies

  • None (first WP).

Risks & Mitigations

  • mutmut 3.x [tool.mutmut] schema may differ from researched format → check mutmut --help and mutmut run --help for actual CLI flags if the toml config doesn't load correctly.
  • Long run time for full codebase → scope T004 to a single small module (status/) to keep verification fast.

Work Package WP02: CI Integration (Priority: P0)

Goal: A mutation-testing job exists in ci-quality.yml, runs on push/dispatch, uploads HTML + JSON artifacts, and enforces a 0% floor (always passes initially). Independent Test: Push a commit; verify the mutation-testing job appears and completes with artifacts downloadable from the CI run. Verify PR runs skip the job. Prompt: tasks/WP02-ci-integration.md Requirement Refs: FR-003, FR-004, FR-005, FR-006, FR-007, FR-008, NFR-001, C-003, C-004, SC-002, SC-003, SC-004, SC-005

Included Subtasks

  • □ T006 Write scripts/check_mutation_floor.py — reads out/reports/mutation/mutation-stats.json, computes score, enforces MUTATION_FLOOR env var (default 0)
  • □ T007 Add mutation-testing job to .github/workflows/ci-quality.yml with needs: unit-tests, correct if: guard, timeout-minutes: 75
  • □ T008 Add job steps: checkout → Python setup → install .[test] → prepare output dir → mutmut runexport-cicd-statsmutmut html → floor check → upload artifacts
  • □ T009 Set initial MUTATION_FLOOR: 0 as a job-level env var in ci-quality.yml
  • □ T010 Verify the if: condition skips the job on pull_request events (read the condition, confirm logic is correct)

Implementation Notes

  • Mirror the integration-smoke job pattern: if: always() && (github.event_name == 'push' || (github.event_name == 'workflow_dispatch' && inputs.run_extended)).
  • mutmut export-cicd-stats must run before the floor-check script (ordered steps).
  • Artifact name: mutation-reports; upload path: out/reports/mutation/.
  • check_mutation_floor.py must handle: zero mutants (warn + exit 0), missing JSON (exit 1 with clear error), score below floor (exit 1 with descriptive message).
  • The floor env var lives in the job definition (env: block) so it can be edited in one place when raised later.

Parallel Opportunities

  • T006 (script authoring) can be done independently of T007 (CI YAML editing).

Dependencies

  • Depends on WP01.

Risks & Mitigations

  • mutmut run may exceed 60-minute budget on full codebase → set timeout-minutes: 75 to give buffer; mutmut will mark timed-out mutants separately, not fail.
  • export-cicd-stats output schema differs from researched format → adjust check_mutation_floor.py to log the raw JSON if key not found, to aid debugging.

Work Package WP03: Squash Survivors — Batch 1 (status/, glossary/) (Priority: P1)

Goal: All killable surviving mutants in src/specify_cli/status/ and src/specify_cli/glossary/ are killed by targeted tests; mutation score for these modules is measurably higher than the pre-campaign baseline. Independent Test: Run mutmut run --paths-to-mutate src/specify_cli/status/ src/specify_cli/glossary/ after adding the new tests; confirm surviving mutant count is lower than before and all remaining survivors are documented as equivalent. Prompt: tasks/WP03-squash-batch1-status-glossary.md Requirement Refs: FR-009, FR-010, FR-011, FR-012, SC-006, SC-007

Included Subtasks

  • ✅ T011 Run mutmut run --paths-to-mutate src/specify_cli/status/ and record surviving mutant IDs and locations
  • ✅ T012 Triage status/ survivors: classify each as killable or equivalent; for each killable, inspect diff with mutmut show <id>
  • ✅ T013 Write targeted tests in tests/unit/status/ (or extend existing ones) to kill each killable status/ mutant; re-run to verify kills
  • ✅ T014 Run mutmut run --paths-to-mutate src/specify_cli/glossary/ and record surviving mutant IDs and locations
  • ✅ T015 Triage glossary/ survivors: classify each as killable or equivalent
  • ✅ T016 Write targeted tests in tests/unit/glossary/ (or extend existing ones) to kill each killable glossary/ mutant; re-run to verify kills
  • ✅ T017 Create kitty-specs/047-mutmut-mutation-testing-ci/mutmut-equivalents.md documenting any equivalent mutants with rationale

Implementation Notes

  • A killable mutant is one where a reasonable test can distinguish the original from the mutated behaviour.
  • An equivalent mutant is one where the mutation produces semantically identical behaviour (e.g., x + 0x - 0); these are documented, not killed.
  • Re-run mutmut after each batch of new tests to confirm kills before moving to the next group.
  • Focus on correctness-critical code paths: state machine transitions (status/transitions.py), JSONL append/read (status/store.py), term parsing (glossary/).

Parallel Opportunities

  • None (sequential within WP; run status/ first, then glossary/).

Dependencies

  • Depends on WP02.

Risks & Mitigations

  • Many surviving mutants → prioritise by module criticality; kill state machine guards first, then parsers.
  • Slow mutmut run per re-check → scope the re-run to a single file once you know which mutants you targeted.

Work Package WP04: Squash Survivors — Batch 2 (merge/, core/) (Priority: P1)

Goal: All killable surviving mutants in src/specify_cli/merge/ and src/specify_cli/core/ are killed by targeted tests. Independent Test: Run mutmut run --paths-to-mutate src/specify_cli/merge/ src/specify_cli/core/ after test additions; confirm surviving count lower than baseline and all remaining survivors are documented as equivalent. Prompt: tasks/WP04-squash-batch2-merge-core.md Requirement Refs: FR-009, FR-010, FR-012, SC-006, SC-007

Included Subtasks

  • ✅ T018 Run mutmut run --paths-to-mutate src/specify_cli/merge/ and record surviving mutant IDs
  • ✅ T019 Triage merge/ survivors and write targeted tests in tests/unit/merge/; re-run to verify kills
  • ✅ T020 Run mutmut run --paths-to-mutate src/specify_cli/core/ and record surviving mutant IDs
  • ✅ T021 Triage core/ survivors and write targeted tests in tests/unit/core/; re-run to verify kills
  • ✅ T022 Update kitty-specs/047-mutmut-mutation-testing-ci/mutmut-equivalents.md with any equivalent mutants from merge/ and core/

Implementation Notes

  • Focus for merge/: MergeState persistence (merge/state.py), preflight validation (merge/preflight.py), conflict forecasting (merge/forecast.py).
  • Focus for core/: Dependency graph utilities, any shared ABCs or protocol definitions.
  • Keep the same triage → test → re-run loop as WP03.
  • All new tests go under the existing tests/ hierarchy (e.g., tests/unit/merge/, tests/unit/core/); create subdirectories if they don't exist.

Parallel Opportunities

  • None (sequential within WP; run merge/ first, then core/).

Dependencies

  • Depends on WP03.

Risks & Mitigations

  • merge/ has integration-level tests (preflight requires git repo) → use tmp_path fixtures with a real git init, or mock git calls at the boundary.
  • core/ may have few units to mutate → if module is thin, move on quickly.

Work Package WP05: Enforce Floor (Priority: P1)

Goal: After the squashing campaign, the mutation score floor is raised to the achieved score (rounded down to nearest 5%), CI enforces it, and the feature is complete per spec constraint C-003. Independent Test: MUTATION_FLOOR=<new-value> in ci-quality.yml; push a commit; CI mutation-testing job passes. Manually set floor 5% above actual score and confirm job fails with a descriptive message. Prompt: tasks/WP05-enforce-floor.md Requirement Refs: FR-013, C-003, SC-004, SC-006

Included Subtasks

  • ✅ T023 Run full mutation suite against all four priority modules (status/, glossary/, merge/, core/) and capture the final achieved score from mutation-stats.json
  • ✅ T024 Compute floor value: floor(score_percent / 5) * 5 (round down to nearest 5%)
  • ✅ T025 Update MUTATION_FLOOR env var in ci-quality.yml job to the computed value
  • ✅ T026 Push and verify the mutation-testing CI job passes at the new floor; spot-check by temporarily setting floor 5% above actual to confirm failure path works
  • ✅ T027 Update spec.md constraint C-003 status from Open to Done with a note of the achieved floor value

Implementation Notes

  • The floor must be > 0 for C-003 to be satisfied. If the squashing campaign produced no kills, escalate before closing this WP.
  • The floor value is stored as a plain integer (0–100) in the MUTATION_FLOOR env var.
  • The failure-path spot-check (T026) is important: it validates the enforcement mechanism is actually wired up and not silently ignored.
  • After updating spec.md, the feature can be considered done.

Parallel Opportunities

  • T025 and T027 can be done in parallel (different files).

Dependencies

  • Depends on WP04.

Risks & Mitigations

  • Achieved score is very low (e.g., 5–10%) → still update the floor; the mechanism exists and future campaigns will raise it further.
  • CI timeout hit during full-scope run → if the four-module run exceeds the 75-minute timeout, split into two CI runs (status+glossary on one push, merge+core on another) and take the lower score.

Dependency & Execution Summary

  • Sequence: WP01 → WP02 → WP03 → WP04 → WP05 (strictly sequential).
  • Parallelization: None — this feature uses sequential execution by design to keep the history traceable and avoid merge conflicts in test files.
  • MVP Scope: WP01 + WP02 constitute the minimum release (toolchain + CI). WP03–WP05 deliver the baseline squashing campaign required by C-003.

Requirements Coverage Summary

Requirement IDCovered By Work Package(s)
FR-001WP01
FR-002WP01
FR-003WP02
FR-004WP02
FR-005WP02
FR-006WP02
FR-007WP02
FR-008WP02
FR-009WP03, WP04
FR-010WP03, WP04
FR-011WP03 (plan.md defines scope)
FR-012WP03, WP04
FR-013WP05
NFR-001WP02
NFR-002WP01
C-001WP01
C-002WP01
C-003WP02 (initial 0%), WP05 (raise after squashing)
C-004WP02

Subtask Index (Reference)

Subtask IDSummaryWork PackagePriorityParallel?
T001Add mutmut to pyproject.toml test depsWP01P0No
T002Add [tool.mutmut] config sectionWP01P0No
T003Update .gitignore for mutmut artifactsWP01P0No
T004Local verification runWP01P0No
T005Confirm mutmut results outputWP01P0No
T006Write check_mutation_floor.pyWP02P0Yes
T007Add mutation-testing CI jobWP02P0Yes
T008Add CI job stepsWP02P0No
T009Set MUTATION_FLOOR=0 in CIWP02P0No
T010Verify PR skip logicWP02P0No
T011Run mutmut on status/WP03P1No
T012Triage status/ survivorsWP03P1No
T013Write tests to kill status/ mutantsWP03P1No
T014Run mutmut on glossary/WP03P1No
T015Triage glossary/ survivorsWP03P1No
T016Write tests to kill glossary/ mutantsWP03P1No
T017Create mutmut-equivalents.md (batch 1)WP03P1No
T018Run mutmut on merge/WP04P1No
T019Triage and kill merge/ survivorsWP04P1No
T020Run mutmut on core/WP04P1No
T021Triage and kill core/ survivorsWP04P1No
T022Update mutmut-equivalents.md (batch 2)WP04P1No
T023Run full priority-scope mutation suiteWP05P1No
T024Compute floor valueWP05P1No
T025Update MUTATION_FLOOR in ci-quality.ymlWP05P1Yes
T026Push and verify CI at new floorWP05P1No
T027Update spec.md C-003 statusWP05P1Yes

> Sequential execution model: start WP01, complete it fully, then move to WP02, and so on. > No worktrees are used for this feature — all work happens on the target branch directly.

<!-- status-model:start -->

Canonical Status (Generated)

<!-- status-model:end -->

  • WP01: done
  • WP02: done
  • WP03: done
  • WP04: done
  • WP05: done