Tasks: Phase 4 Auth Identity-Boundary Canary Gate

Mission: phase4-canary-gate-01KS1W46 | Branch: main → main | Date: 2026-05-20 | Total WPs: 8 | Total Subtasks: 43


Subtask Index

| ID | Description | WP | Parallel |
| --- | --- | --- | --- |
| T001 | Query #1141 state via gh issue view | WP01 | |
| T002 | Query #1182 state via gh issue view | WP01 | [P] |
| T003 | Gate: either OPEN → stop and report | WP01 | |
| T004 | Inspect #1141 merge commit diff (queue.py, adapters.py) | WP01 | |
| T005 | Verify #1141 fix has test coverage (fail-without, pass-with) | WP01 | |
| T006 | Inspect #1182 merge commit diff (error-classification change) | WP01 | |
| T007 | Determine latest prerelease RC tag from PyPI/GitHub | WP02 | |
| T008 | Gate: if latest is still rc15 → stop and report | WP02 | |
| T009 | Kill any orphan sync daemons (pkill) | WP02 | |
| T010 | Install post-rc15 RC via pipx | WP02 | |
| T011 | Verify installed CLI version output | WP02 | |
| T012 | Verify sync.owner + sync.preflight boundary imports | WP02 | |
| T013 | Check /health/ → 200 + events version | WP03 | [P] |
| T014 | Check /health/ready/ → 200 | WP03 | [P] |
| T015 | Check infra terminal_failed count via Fly SSH | WP03 | |
| T016 | Confirm business_rule_rejected_count == 22 | WP03 | |
| T017 | Export required env vars for canary | WP04 | |
| T018 | uv sync in spec-kitty-end-to-end-testing | WP04 | |
| T019 | Kill any orphan sync daemons before canary | WP04 | |
| T020 | Run harness unit test preflight (3 preflight test files) | WP04 | |
| T021 | Run --single canary | WP04 | |
| T022 | Assert all 4 scenario statuses in latest.json | WP04 | |
| T023 | On failure: re-open issue(s), preserve evidence, stop | WP04 | |
| T024 | Run full four-run canary protocol | WP05 | |
| T025 | Assert "outcome":"pass" in all 4 run-N.json files | WP05 | |
| T026 | Confirm zero interventions between runs | WP05 | |
| T027 | On failure: re-open issue(s), preserve evidence, stop | WP05 | |
| T028 | Gather environment metadata (CLI, SaaS, events) | WP06 | [P] |
| T029 | Take post-run health snapshot | WP06 | [P] |
| T030 | Bundle evidence tarball | WP06 | |
| T031 | Post evidence comment to e2e#41 (template) | WP06 | |
| T032 | Close e2e#41 | WP06 | |
| T033 | Verify #41 is closed | WP06 | |
| T034 | Check #1038 for latest comment (Teamspace still required?) | WP07 | |
| T035 | Teamspace MVP canary run 1 | WP07 | |
| T036 | Teamspace MVP canary run 2 | WP07 | |
| T037 | Teamspace MVP canary run 3 | WP07 | |
| T038 | Teamspace MVP canary run 4 | WP07 | |
| T039 | Preserve logs at /tmp/teamspace-canary-run-{1..4}.log | WP07 | |
| T040 | Bundle Teamspace log tarball | WP08 | |
| T041 | Gather final environment metadata | WP08 | |
| T042 | Post evidence comment to spec-kitty#1038 (template) | WP08 | |
| T043 | Verify #1038 remains OPEN | WP08 | |

WP01 — Blocker Verification

Priority: Critical (hard gate; every WP except the independent WP03 depends on this) | Execution Mode: planning_artifact | Dependencies: none | Estimated prompt: ~320 lines

Goal: Confirm both Phase-4 blocker issues are CLOSED and that #1141's fix is substantive (test-backed behavioral change, not diagnostic logging).

Included subtasks:

  • ✅ T001 Query #1141 state via gh issue view (WP01)
  • ✅ T002 Query #1182 state via gh issue view (WP01)
  • ✅ T003 Gate: either OPEN → stop and report (WP01)
  • □ T004 Inspect #1141 merge commit diff (queue.py, adapters.py) (WP01)
  • □ T005 Verify #1141 fix has test coverage (fail-without, pass-with) (WP01)
  • □ T006 Inspect #1182 merge commit diff (error-classification change) (WP01)

Parallel opportunities: T001 and T002 can run concurrently.
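
The T001 to T003 gate can be sketched as below. This is a minimal illustration, not the WP01 prompt itself: the function names and the `repo` parameter are hypothetical, and the only CLI call assumed is `gh issue view <number> --repo <owner/repo> --json state`, which reports `OPEN` or `CLOSED`.

```python
import json
import subprocess
from concurrent.futures import ThreadPoolExecutor


def issue_state(number: int, repo: str) -> str:
    """Return the issue state ("OPEN" or "CLOSED") as reported by the gh CLI."""
    out = subprocess.run(
        ["gh", "issue", "view", str(number), "--repo", repo, "--json", "state"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)["state"]


def open_blockers(states: dict) -> list:
    """T003 gate: return the blocker numbers that are not CLOSED, ascending."""
    return sorted(n for n, state in states.items() if state.upper() != "CLOSED")


def check_blockers(repo: str, numbers=(1141, 1182)) -> list:
    """Query both blockers concurrently (T001, T002), then apply the gate."""
    with ThreadPoolExecutor() as pool:
        states = dict(zip(numbers, pool.map(lambda n: issue_state(n, repo), numbers)))
    return open_blockers(states)
```

A non-empty return value from `check_blockers` means at least one blocker is still open, so the run should stop and report rather than continue to T004.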

Risks:

  • #1141 was previously "fixed" with only diagnostic logging — the gate must reject that pattern.
  • GitHub token scope issues: unset GITHUB_TOKEN and use keyring if needed (see CLAUDE.md).

Prompt file: WP01-blocker-verification.md


WP02 — RC Install and Boundary Verification

Priority: Critical (hard gate) | Execution Mode: planning_artifact | Dependencies: WP01 | Estimated prompt: ~350 lines

Goal: Install the post-rc15 CLI RC, confirm it contains both #1141 and #1182 fixes, and verify that auth boundary imports are clean.

Included subtasks:

  • ✅ T007 Determine latest prerelease RC tag from PyPI/GitHub (WP02)
  • ✅ T008 Gate: if latest is still rc15 → stop and report (WP02)
  • □ T009 Kill any orphan sync daemons (pkill) (WP02)
  • □ T010 Install post-rc15 RC via pipx (WP02)
  • □ T011 Verify installed CLI version output (WP02)
  • □ T012 Verify sync.owner + sync.preflight boundary imports (WP02)
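
The T008 gate amounts to a version comparison. The sketch below assumes version strings shaped like `X.Y.ZrcN`; the base version used in the usage note is a placeholder, since this plan only names the RC number. A full PEP 440 parser such as `packaging.version.Version` would be the more thorough choice where that package is available.

```python
import re

# Matches "1.2.3" or "1.2.3rc15"; anything else is rejected rather than guessed at.
_VERSION = re.compile(r"^(\d+(?:\.\d+)*)(?:rc(\d+))?$")


def version_key(version: str) -> tuple:
    """Return a sortable key: (release tuple, rc number); a final release sorts after any rc."""
    match = _VERSION.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version: {version!r}")
    release = tuple(int(part) for part in match.group(1).split("."))
    rc = int(match.group(2)) if match.group(2) else float("inf")
    return (release, rc)


def is_post_baseline(candidate: str, baseline: str) -> bool:
    """True only if candidate is strictly newer than baseline (the rc15 build)."""
    return version_key(candidate) > version_key(baseline)
```

For example, with a hypothetical base version, `is_post_baseline("1.0.0rc16", "1.0.0rc15")` is true and `is_post_baseline("1.0.0rc15", "1.0.0rc15")` is false, which is the stop-and-report case.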

Parallel opportunities: WP03 (SaaS Preflight) is independent and can run concurrently with WP02.

Risks:

  • No post-rc15 RC may exist yet — WP02 stops and reports rather than cutting autonomously.
  • pipx symlink path mismatch (same as rc13 regression) — verify both spec-kitty --version and the Python import.
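
Checking both the CLI and the Python import, as the pipx risk above suggests, might look like the following sketch. The helper names are hypothetical, and the fully qualified module names for `sync.owner` and `sync.preflight` are deliberately left as a parameter because this plan does not spell out the package prefix.

```python
import importlib.util
import subprocess


def installed_cli_version() -> str:
    """Return the raw output of `spec-kitty --version` from whatever is first on PATH."""
    result = subprocess.run(
        ["spec-kitty", "--version"], check=True, capture_output=True, text=True
    )
    return result.stdout.strip()


def missing_modules(names) -> list:
    """Return the module names that cannot be located by the current interpreter."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except (ImportError, ValueError):
            # A missing parent package raises instead of returning None.
            spec = None
        if spec is None:
            missing.append(name)
    return missing
```

`find_spec` only inspects the interpreter that runs the snippet, so for a pipx install it would need to run under the Python of the pipx-managed environment. A mismatch between that result and `installed_cli_version()` is the symptom the risk describes.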

Prompt file: WP02-rc-install.md


WP03 — SaaS Preflight

Priority: High (read-only; gates WP04) | Execution Mode: planning_artifact | Dependencies: none (independent) | Estimated prompt: ~260 lines

Goal: Confirm the deployed SaaS environment is healthy before running any canary. Read-only checks only.

Included subtasks:

  • ✅ T013 Check /health/ → 200 + events version (WP03)
  • ✅ T014 Check /health/ready/ → 200 (WP03)
  • ✅ T015 Check infra terminal_failed count via Fly SSH (WP03)
  • ✅ T016 Confirm business_rule_rejected_count == 22 (WP03)

Parallel opportunities: T013 and T014 can run concurrently.

Risks:

  • SaaS may be redeploying — wait and retry once if 503.
  • Fly SSH console may require flyctl authentication.
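
The "wait and retry once if 503" rule for T013 and T014 can be sketched with the standard library alone. The function names, the 30-second delay, and the injectable `fetch` and `sleep` parameters are assumptions made for illustration; the plan does not specify a wait time.

```python
import time
import urllib.error
import urllib.request


def http_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code for a GET, including non-2xx codes."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code


def check_endpoint(url, fetch=http_status, retry_delay=30.0, sleep=time.sleep) -> bool:
    """Read-only check: pass on 200, retry exactly once after a 503, fail otherwise."""
    status = fetch(url)
    if status == 503:
        sleep(retry_delay)  # the SaaS may be mid-redeploy
        status = fetch(url)
    return status == 200
```

Calling `check_endpoint(base_url + "/health/")` and `check_endpoint(base_url + "/health/ready/")`, where `base_url` is the deployed SaaS origin, covers the two HTTP checks. The events version, `terminal_failed`, and `business_rule_rejected_count` checks are out of scope for this sketch.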

Prompt file: WP03-saas-preflight.md


WP04 — Single-Run Identity-Boundary Canary

Priority: Critical (hard gate) | Execution Mode: planning_artifact | Dependencies: WP02, WP03 | Estimated prompt: ~420 lines

Goal: Run the identity-boundary canary in --single mode. All four scenarios must pass. On any failure, re-open the appropriate issue and preserve evidence.

Included subtasks:

  • ✅ T017 Export required env vars for canary (WP04)
  • ✅ T018 uv sync in spec-kitty-end-to-end-testing (WP04)
  • ✅ T019 Kill any orphan sync daemons before canary (WP04)
  • ✅ T020 Run harness unit test preflight (3 preflight test files) (WP04)
  • □ T021 Run --single canary (WP04)
  • □ T022 Assert all 4 scenario statuses in latest.json (WP04)
  • □ T023 On failure: re-open issue(s), preserve evidence, stop (WP04)

Risks:

  • Scenario 4 regression: if from='for_review' to='in_review' reappears, re-open #1141.
  • Scenarios 1+2 regression: if unknown: N + sync.event_loop_unavailable reappears, re-open #1182.
  • Do NOT burn a canary cycle if fix substance was questionable (WP01 gate catches this first).
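
The regression signatures listed above can drive the T023 triage. The sketch below is one possible reading: it treats the evidence as plain text (for example a captured log) and assumes that `unknown: N` and `sync.event_loop_unavailable` must appear together to implicate #1182. It does not check which scenario produced the text.

```python
import re

# Issue number -> patterns that must ALL appear in the evidence text.
REGRESSION_SIGNATURES = {
    1141: [re.compile(re.escape("from='for_review' to='in_review'"))],
    1182: [
        re.compile(r"unknown: \d+"),
        re.compile(re.escape("sync.event_loop_unavailable")),
    ],
}


def issues_to_reopen(evidence: str) -> list:
    """Return the blocker issues whose regression signature reappears in the evidence."""
    return sorted(
        issue
        for issue, patterns in REGRESSION_SIGNATURES.items()
        if all(pattern.search(evidence) for pattern in patterns)
    )
```

An empty result on a failed run means the failure matches neither known signature, so the evidence should still be preserved and the run stopped, but the choice of issue needs a human decision.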

Prompt file: WP04-single-run-canary.md


WP05 — Four-Run Canary Protocol

Priority: Critical | Execution Mode: planning_artifact | Dependencies: WP04 | Estimated prompt: ~280 lines

Goal: Run the full four-run canary protocol. All four JSON result documents must contain "outcome": "pass". No interventions between runs.

Included subtasks:

  • □ T024 Run full four-run canary protocol (WP05)
  • □ T025 Assert "outcome":"pass" in all 4 run-N.json files (WP05)
  • □ T026 Confirm zero interventions between runs (WP05)
  • □ T027 On failure: re-open issue(s), preserve evidence, stop (WP05)

Risks:

  • Flaky SaaS state: if a run fails, preserve evidence and stop; do not clean up state and retry.
  • The four-run protocol is sequential; runs cannot be parallelized.
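
The T025 assertion can be sketched as follows. It assumes the `run-N.json` files sit in one results directory and carry `"outcome"` as a top-level key; both are assumptions, since the plan only states the file names and the required key/value pair.

```python
import json
from pathlib import Path


def load_run_docs(results_dir, runs: int = 4) -> list:
    """Load run-1.json .. run-N.json; a missing or unparsable file becomes None."""
    docs = []
    for n in range(1, runs + 1):
        path = Path(results_dir) / f"run-{n}.json"
        try:
            docs.append(json.loads(path.read_text(encoding="utf-8")))
        except (OSError, json.JSONDecodeError):
            docs.append(None)
    return docs


def failed_runs(docs) -> list:
    """Return the 1-based run numbers whose document lacks "outcome": "pass"."""
    return [
        n
        for n, doc in enumerate(docs, start=1)
        if not (isinstance(doc, dict) and doc.get("outcome") == "pass")
    ]
```

The gate passes only when `failed_runs(load_run_docs(results_dir))` is empty. Treating a missing file as a failure keeps an interrupted protocol from being counted as a pass.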

Prompt file: WP05-four-run-protocol.md


WP06 — Evidence Collection and Close e2e#41

Priority: High | Execution Mode: planning_artifact | Dependencies: WP05 | Estimated prompt: ~340 lines

Goal: Bundle evidence, post the required comment to e2e#41, and close the issue.

Included subtasks:

  • □ T028 Gather environment metadata (CLI, SaaS, events) (WP06)
  • □ T029 Take post-run health snapshot (WP06)
  • □ T030 Bundle evidence tarball (WP06)
  • □ T031 Post evidence comment to e2e#41 (template) (WP06)
  • □ T032 Close e2e#41 (WP06)
  • □ T033 Verify #41 is closed (WP06)

Parallel opportunities: T028 and T029 can run concurrently.

Risks:

  • PR #42 is already merged — do not attempt to merge it again (no-op sub-step).
  • GitHub token scope: unset GITHUB_TOKEN if posting/closing fails (see CLAUDE.md).
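
Bundling the evidence (T030) could be done with the standard `tarfile` module, as in this sketch. The function name and the directory layout are hypothetical; the plan does not say which files make up the evidence.

```python
import tarfile
from pathlib import Path


def bundle_evidence(source_dir, archive_path) -> list:
    """Write every file under source_dir into a gzip tarball; return the member names."""
    source = Path(source_dir)
    # Collect first so an archive written inside source_dir is not swept into itself.
    files = sorted(path for path in source.rglob("*") if path.is_file())
    names = [path.relative_to(source).as_posix() for path in files]
    with tarfile.open(archive_path, "w:gz") as tar:
        for path, name in zip(files, names):
            tar.add(path, arcname=name)
    return names
```

Storing members under relative names keeps local paths out of the archive, and the returned list can be pasted into the evidence comment as a manifest.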

Prompt file: WP06-evidence-and-close-41.md


WP07 — Teamspace MVP Canary Suite

Priority: High | Execution Mode: planning_artifact | Dependencies: WP06 | Estimated prompt: ~360 lines

Goal: Run the Teamspace MVP canary suite four consecutive times without SaaS mutation. Preserve logs.

Included subtasks:

  • □ T034 Check #1038 for latest comment (Teamspace still required?) (WP07)
  • □ T035 Teamspace MVP canary run 1 (WP07)
  • □ T036 Teamspace MVP canary run 2 (WP07)
  • □ T037 Teamspace MVP canary run 3 (WP07)
  • □ T038 Teamspace MVP canary run 4 (WP07)
  • □ T039 Preserve logs at /tmp/teamspace-canary-run-{1..4}.log (WP07)

Risks:

  • 413 on sync: investigate payload size — do NOT raise ingress cap.
  • Materialization timeout: verify e2e#40 polling helper is in use.
  • /health/ready/ 503: investigate infra terminal_failed — do NOT fix by deleting rows.
  • If #1038's latest comment says Teamspace canary is no longer required: skip to WP08.
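
Runs T035 to T038 and the log preservation in T039 can be sketched as a sequential loop. The canary command itself is not given in this plan, so it is passed in as a parameter. Stopping at the first failing run is an assumption based on the "four consecutive times" wording.

```python
import subprocess
from pathlib import Path


def run_canary_suite(command, runs: int = 4, log_dir="/tmp",
                     prefix: str = "teamspace-canary-run"):
    """Run `command` sequentially, one log per run; stop at the first non-zero exit.

    Returns (all_passed, list_of_log_paths_written).
    """
    logs = []
    for n in range(1, runs + 1):
        log_path = Path(log_dir) / f"{prefix}-{n}.log"
        with open(log_path, "w", encoding="utf-8") as log:
            # stdout and stderr go to the same file so ordering is preserved.
            result = subprocess.run(command, stdout=log, stderr=subprocess.STDOUT)
        logs.append(log_path)
        if result.returncode != 0:
            return False, logs
    return True, logs
```

With the defaults, the logs land at `/tmp/teamspace-canary-run-1.log` through `/tmp/teamspace-canary-run-4.log`. Nothing in the loop cleans up or mutates state between runs.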

Prompt file: WP07-teamspace-canary-suite.md


WP08 — Release Tracker Evidence Comment

Priority: High | Execution Mode: planning_artifact | Dependencies: WP07 | Estimated prompt: ~270 lines

Goal: Post the release tracker evidence comment to spec-kitty#1038. Do NOT close the issue.

Included subtasks:

  • □ T040 Bundle Teamspace log tarball (WP08)
  • □ T041 Gather final environment metadata (WP08)
  • □ T042 Post evidence comment to spec-kitty#1038 (template) (WP08)
  • □ T043 Verify #1038 remains OPEN (WP08)

Risks:

  • Accidentally closing #1038 — the gate explicitly prohibits this.
  • Token scope: unset GITHUB_TOKEN if posting fails.

Prompt file: WP08-release-tracker-evidence.md


Execution Order Summary

WP01 (gate) ──► WP02 (gate) ──► WP04 (gate) ──► WP05 ──► WP06 ──► WP07 ──► WP08
                                 ↑
              WP03 (runs independently, must complete before WP04)

WP03 has no dependencies, so it can run in parallel with WP01 and WP02, but it must complete before WP04 starts. All other WPs are sequential due to hard gate dependencies.
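
The same ordering can be derived mechanically from the Dependencies field of each WP. The sketch below uses `graphlib` from the Python standard library (3.9+); the `execution_waves` helper is illustrative and not part of any tooling named in this plan.

```python
from graphlib import TopologicalSorter

# Each WP mapped to the WPs it depends on, copied from the WP sections.
DEPENDENCIES = {
    "WP01": set(),
    "WP02": {"WP01"},
    "WP03": set(),
    "WP04": {"WP02", "WP03"},
    "WP05": {"WP04"},
    "WP06": {"WP05"},
    "WP07": {"WP06"},
    "WP08": {"WP07"},
}


def execution_waves(dependencies) -> list:
    """Group WPs into waves; every WP in a wave has all its dependencies done."""
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises CycleError if the dependencies are circular
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves
```

`execution_waves(DEPENDENCIES)` places WP03 in the first wave alongside WP01. That is the earliest point at which WP03 may start, not a requirement; the only constraint is that it finishes before WP04.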

MVP Scope

WP01 through WP05 are the MVP: blocker verification → RC install → preflight → single run → four-run. WP06–WP08 are required for the gate to close but produce no new identity-boundary canary evidence.