Tasks — 082 Stealth-Gated SaaS Sync Hardening
Feature dir: /Users/robert/spec-kitty-dev/linear2/spec-kitty/kitty-specs/082-stealth-gated-saas-sync-hardening/ Planning base branch: main Merge target branch: main Generated: 2026-04-11T06:22:58Z Inputs: spec.md, plan.md, research.md, data-model.md, contracts/, quickstart.md
Overview
This mission preserves the stealth rollout posture of the CLI's hosted SaaS/tracker surface and hardens the enabled-mode experience inside that gate. Work is decomposed into 6 work packages along three concurrent threads (rollout+readiness, daemon+config, CLI integration) plus a documentation package. No customer-facing surface appears unless the SPEC_KITTY_ENABLE_SAAS_SYNC env var is set; inside enabled mode, a new shared HostedReadiness evaluator replaces ad hoc preflight checks and the background sync daemon becomes intent-gated with an operator-facing config toggle.
Total subtasks: 29 Work packages: 6 Parallelization: WP01 and WP03 are entry points with no dependencies; WP02 depends on WP01; WP04 depends on WP01+WP03; WP05 depends on WP01+WP02+WP03+WP04; WP06 depends on WP01+WP02+WP03+WP04.
Subtask Index
| ID | Description | WP | Parallel |
|---|---|---|---|
| T001 | Create src/specify_cli/saas/__init__.py with rollout-symbol exports | WP01 | — |
| T002 | Create canonical src/specify_cli/saas/rollout.py (is_saas_sync_enabled, saas_sync_disabled_message) | WP01 | — |
| T003 | Convert src/specify_cli/tracker/feature_flags.py to a re-export shim | WP01 | [D] |
| T004 | Convert src/specify_cli/sync/feature_flags.py to a re-export shim | WP01 | [D] |
| T005 | Write tests/saas/test_rollout.py (env-var variants, shim re-export, stable message) | WP01 | — |
| T006 | Create src/specify_cli/saas/readiness.py with ReadinessState, ReadinessResult, evaluate_readiness | WP02 | — |
| T007 | Implement the stable failure-message catalog per contracts/hosted_readiness.md | WP02 | — |
| T008 | Implement check ordering, short-circuit, and exception→HOST_UNREACHABLE conversion | WP02 | — |
| T009 | Create tests/saas/conftest.py with rollout + auth/config/binding fixtures | WP02 | — |
| T010 | Write tests/saas/test_readiness_unit.py (stubbed probes, every state, wording, ordering) | WP02 | [D] |
| T011 | Write tests/saas/test_readiness_integration.py (real evaluator, tmp_path fixtures, local stub server) | WP02 | [D] |
| T012 | Add BackgroundDaemonPolicy enum and background_daemon field to SyncConfig | WP03 | — |
| T013 | Update TOML loader: parse [sync].background_daemon (case-insensitive, warn+default, reject empty) | WP03 | — |
| T014 | Write tests/sync/test_config_background_daemon.py | WP03 | — |
| T015 | Add DaemonIntent enum and DaemonStartOutcome dataclass to src/specify_cli/sync/daemon.py | WP04 | — |
| T016 | Refactor ensure_sync_daemon_running() — mandatory intent=, full decision matrix, outcome return | WP04 | — |
| T017 | Update src/specify_cli/dashboard/server.py to pass intent=LOCAL_ONLY | WP04 | [D] |
| T018 | Update src/specify_cli/dashboard/handlers/api.py (read endpoints LOCAL_ONLY; sync-now REMOTE_REQUIRED) | WP04 | [D] |
| T019 | Update src/specify_cli/sync/events.py to pass intent=REMOTE_REQUIRED | WP04 | [D] |
| T020 | Write tests/sync/test_daemon_intent_gate.py (decision matrix, missing-intent TypeError, audit-grep guard) | WP04 | — |
| T021 | Rewrite tracker callback in src/specify_cli/cli/commands/tracker.py — per-command evaluate_readiness() | WP05 | — |
| T022 | Implement manual-mode CLI surfacing for sync run/pull/push/publish (exit 0, stable wording) | WP05 | — |
| T023 | Update src/specify_cli/cli/commands/__init__.py conditional import to source from specify_cli.saas.rollout | WP05 | — |
| T024 | Parametrize tests/agent/cli/commands/test_tracker.py (rollout-on/off × prerequisite-state matrix) | WP05 | [D] |
| T025 | Update tests/agent/cli/commands/test_tracker_discover.py to readiness-aware assertions | WP05 | [D] |
| T026 | Update tests/agent/cli/commands/test_tracker_status.py (readiness-aware + manual-mode behavior) | WP05 | [D] |
| T027 | Write architecture/ADR-XXXX-saas-rollout-and-readiness.md (DIRECTIVE_003) | WP06 | — |
| T028 | Update architecture/README.md index to reference the new ADR | WP06 | — |
| T029 | Refresh docs/ cross-references if any existing SaaS-sync doc points at the old single gate | WP06 | — |
The [P] column marks subtasks that can be parallelized within their WP (different files, no shared state). It is a reference column, not a status column.
Work Package WP01 — Canonical rollout gate and BC shims
Prompt file: tasks/WP01-canonical-rollout-gate-and-bc-shims.md Priority: P1 (foundation for all enabled-mode work) Dependencies: none Estimated prompt size: ~400 lines Independent test: Run pytest tests/saas/test_rollout.py plus a smoke-import check on the two BC shims; verify the existing tests/conftest.py autouse fixture still works unchanged.
Goal: Consolidate the duplicated is_saas_sync_enabled() implementations into a single canonical module at src/specify_cli/saas/rollout.py, expose it at the package root, and convert tracker/feature_flags.py and sync/feature_flags.py into thin re-export shims so existing callers keep working with no rename.
Included subtasks
- ✅ T001 Create
src/specify_cli/saas/__init__.pywith rollout-symbol exports (WP01) - ✅ T002 Create canonical
src/specify_cli/saas/rollout.py(WP01) - ✅ T003 Convert
src/specify_cli/tracker/feature_flags.pyto a re-export shim (WP01) - ✅ T004 Convert
src/specify_cli/sync/feature_flags.pyto a re-export shim (WP01) - ✅ T005 Write
tests/saas/test_rollout.py(WP01)
Implementation sketch
1. Create the new saas/ package directory with __init__.py importing and re-exporting is_saas_sync_enabled and saas_sync_disabled_message from specify_cli.saas.rollout. Do not import readiness here — WP02 owns readiness and accesses are via the module path specify_cli.saas.readiness. 2. Move the canonical implementation into rollout.py, matching byte-for-byte the wording from contracts/saas_rollout.md. 3. Rewrite both feature_flags.py modules to be 3-to-5-line re-exports. Add a # Retained as backwards-compatibility shim; canonical home is specify_cli.saas.rollout. comment so future contributors know why. 4. Parametrize the new unit tests over every truthy/falsy env value listed in the contract.
Parallel opportunities
- T003 and T004 are
[P]— the two shim conversions are independent files.
Risks
- The existing autouse fixture at
tests/conftest.py:57-60importsis_saas_sync_enabledvia one of the shim modules; if the shim rewrite forgets to preserve the symbol name, the entire test suite breaks. Mitigation: run the full test suite after the shim conversion, not justtests/saas/. - The shims are imported by
src/specify_cli/cli/commands/__init__.py:37-40, 71-72(see R-001 research). WP05 will switch that import tospecify_cli.saas.rollout, but WP01 must preserve the shim-level import so the CLI still works between WP01 and WP05 landing.
Work Package WP02 — HostedReadiness evaluator and test layers
Prompt file: tasks/WP02-hosted-readiness-evaluator-and-tests.md Priority: P1 Dependencies: WP01 Estimated prompt size: ~500 lines Independent test: pytest tests/saas/test_readiness_unit.py tests/saas/test_readiness_integration.py — all six ReadinessState members reached with stable wording; integration layer drives the real evaluator against tmp_path fixtures.
Goal: Introduce the shared HostedReadiness evaluator that commands call instead of ad hoc preflight checks. Deliver both a stubbed unit layer and a smaller integration layer exercising the real evaluator.
Included subtasks
- ✅ T006 Create
src/specify_cli/saas/readiness.pywithReadinessState,ReadinessResult,evaluate_readiness(WP02) - ✅ T007 Implement the stable failure-message catalog per contract (WP02)
- ✅ T008 Implement check ordering, short-circuit, and exception→
HOST_UNREACHABLEconversion (WP02) - ✅ T009 Create
tests/saas/conftest.pywith rollout + auth/config/binding fixtures (WP02) - ✅ T010 Write
tests/saas/test_readiness_unit.pywith stubbed probes (WP02) - ✅ T011 Write
tests/saas/test_readiness_integration.pywith the real evaluator (WP02)
Implementation sketch
1. readiness.py defines the ReadinessState str-enum with the six members in declaration order (which is also check order); the frozen ReadinessResult dataclass; and evaluate_readiness() with keyword-only arguments per contracts/hosted_readiness.md. 2. Each prerequisite probe is implemented as a separate module-level helper (_probe_rollout, _probe_auth, _probe_host_config, _probe_reachability, _probe_mission_binding) so unit tests can monkey-patch them individually. 3. The top-level evaluator wraps the probe dispatch in a try/except that converts any exception into ReadinessState.HOST_UNREACHABLE with details["error"] set — the evaluator never raises to callers (contract requirement). 4. conftest.py provides rollout_enabled, rollout_disabled monkeypatch fixtures plus factory fixtures that build tmp_path-backed auth, config, and binding state used by both unit and integration tests. 5. Unit tests stub the probe helpers; integration tests let them run against real fixture files plus a http.server-based local stub for the reachability probe (no real network).
Parallel opportunities
- T010 and T011 are
[P](unit and integration tests are independent files once the evaluator and conftest exist).
Risks
- Drift between stub-shaped and real-shaped fixtures. Mitigation: share the
_Fixturesdataclass from conftest across both test files so a stub cannot diverge silently. - Reachability stub server port conflicts in CI. Mitigation: bind to
127.0.0.1:0and read the assigned port back.
Work Package WP03 — BackgroundDaemonPolicy config extension
Prompt file: tasks/WP03-background-daemon-policy-config.md Priority: P2 Dependencies: none Estimated prompt size: ~250 lines Independent test: pytest tests/sync/test_config_background_daemon.py — missing-key default to AUTO, case-variants round-trip, unknown-value warns, empty string rejected.
Goal: Extend the existing SyncConfig dataclass with a new user-level TOML key [sync].background_daemon = "auto" | "manual" (default "auto"). This is the config surface that WP04 will consult to decide whether to auto-start the daemon on REMOTE_REQUIRED intent.
Included subtasks
- ✅ T012 Add
BackgroundDaemonPolicyenum andbackground_daemonfield toSyncConfig(WP03) - ✅ T013 Update TOML loader for case-insensitive parsing with warn+default and empty-string rejection (WP03)
- ✅ T014 Write
tests/sync/test_config_background_daemon.py(WP03)
Implementation sketch
1. Add the new enum to src/specify_cli/sync/config.py next to the existing dataclass definition (lines 12–70); wire background_daemon: BackgroundDaemonPolicy = BackgroundDaemonPolicy.AUTO into SyncConfig. 2. The loader path reads the raw value, .casefold()s it, matches against known members, warns to stderr on unknown (falling back to AUTO), and raises a typed config error on empty string. 3. The test module covers: missing key → AUTO; "auto"/"AUTO"/"Auto" → AUTO; "manual"/"Manual" → MANUAL; "banana" → one-line warning + AUTO; empty string → raises.
Parallel opportunities
- None internal. This WP runs in parallel with WP01/WP02 because it touches a different file tree.
Risks
- SyncConfig is loaded early in many code paths (daemon, dashboard, sync events). A typo in the loader rejects any existing config. Mitigation: regression-test that a config file without the new key still loads cleanly.
Work Package WP04 — Intent-gated daemon and caller audit
Prompt file: tasks/WP04-intent-gated-daemon-and-caller-audit.md Priority: P1 Dependencies: WP01, WP03 Estimated prompt size: ~550 lines Independent test: pytest tests/sync/test_daemon_intent_gate.py — every row of the rollout × intent × policy decision matrix asserted; TypeError regression guard; audit-grep guard scans the repo for unauthorized ensure_sync_daemon_running call sites.
Goal: Convert ensure_sync_daemon_running() from unconditional auto-start to an intent-gated startup function governed by the new DaemonIntent enum and the BackgroundDaemonPolicy config from WP03. Update the three existing call sites to pass explicit intent. Ensure help and local-only commands never spawn the daemon on enabled machines.
Included subtasks
- ✅ T015 Add
DaemonIntentenum andDaemonStartOutcomedataclass tosrc/specify_cli/sync/daemon.py(WP04) - ✅ T016 Refactor
ensure_sync_daemon_running()for mandatoryintent=, apply decision matrix, returnDaemonStartOutcome(WP04) - ✅ T017 Update
src/specify_cli/dashboard/server.pycaller tointent=LOCAL_ONLY(WP04) - ✅ T018 Update
src/specify_cli/dashboard/handlers/api.pycallers (LOCAL_ONLY reads, REMOTE_REQUIRED sync-now) (WP04) - ✅ T019 Update
src/specify_cli/sync/events.pyevent-upload caller toREMOTE_REQUIRED(WP04) - ✅ T020 Write
tests/sync/test_daemon_intent_gate.py(decision matrix, TypeError guard, audit-grep guard) (WP04)
Implementation sketch
1. Add the enum and the frozen dataclass at the top of daemon.py. DaemonStartOutcome includes started, skipped_reason, and pid. 2. Rewrite ensure_sync_daemon_running() to accept *, intent: DaemonIntent, config: SyncConfig | None = None. The function:
3. Update the three call sites per contracts/background_daemon_policy.md. Dashboard server startup → LOCAL_ONLY; dashboard read handlers → LOCAL_ONLY; dashboard sync-now handler → REMOTE_REQUIRED; sync/events.py upload path → REMOTE_REQUIRED. 4. The audit-grep guard in the test module walks src/ and asserts that every ensure_sync_daemon_running( match appears in the contract's caller-allowlist. Any new caller must be added to both the allowlist and the matrix tests.
- Returns
(started=False, skipped_reason="rollout_disabled")ifis_saas_sync_enabled()isFalse. - Returns
(started=False, skipped_reason="intent_local_only")if intent isLOCAL_ONLY. - Returns
(started=False, skipped_reason="policy_manual")if intent isREMOTE_REQUIREDandconfig.background_daemon == MANUAL. - Otherwise delegates to the existing start logic.
Parallel opportunities
- T017, T018, T019 are
[P]once T015/T016 land.
Risks
- The dashboard currently relies on implicit daemon startup to populate status data. With
LOCAL_ONLY, that implicit path disappears. Mitigation: dashboard should render daemon state from the on-disk state file (DAEMON_STATE_FILE) rather than from a live process handle, and the WP must verify that pattern holds before landing. sync/events.pymay call the function from inside async contexts; ensure the new outcome return is handled synchronously (the existing code appears synchronous based on exploration).REMOTE_REQUIRED+MANUALreturnsstarted=Falseand the caller must not crash. Dashboard handler callers log at INFO; CLI callers are handled by WP05.
Work Package WP05 — Tracker CLI readiness wiring and dual-mode tests
Prompt file: tasks/WP05-tracker-cli-readiness-wiring.md Priority: P1 Dependencies: WP01, WP02, WP03, WP04 Estimated prompt size: ~600 lines Independent test: pytest tests/agent/cli/commands/test_tracker.py tests/agent/cli/commands/test_tracker_discover.py tests/agent/cli/commands/test_tracker_status.py — parametrized over rollout-on and rollout-off modes; prerequisite-state matrix assertions; manual-mode surfacing verified.
Goal: Replace the generic _require_enabled() guard in the tracker CLI with per-command evaluate_readiness() calls that produce actionable, per-prerequisite failure messages. Ensure manual-mode policy prints the stable wording and exits 0. Parametrize existing tracker tests over both rollout modes.
Included subtasks
- ✅ T021 Rewrite tracker callback in
src/specify_cli/cli/commands/tracker.pyto use per-commandevaluate_readiness()(WP05) - ✅ T022 Implement manual-mode CLI surfacing for
sync run/pull/push/publish(WP05) - ✅ T023 Update
src/specify_cli/cli/commands/__init__.pyconditional import to source fromspecify_cli.saas.rollout(WP05) - ✅ T024 Parametrize
tests/agent/cli/commands/test_tracker.pyover rollout × prerequisite-state matrix (WP05) - ✅ T025 Update
tests/agent/cli/commands/test_tracker_discover.pyto readiness-aware assertions (WP05) - ✅ T026 Update
tests/agent/cli/commands/test_tracker_status.py(readiness-aware + manual-mode behavior) (WP05)
Implementation sketch
1. tracker.py — replace the tracker_callback()'s call to _require_enabled() with a helper that looks up the command name from the Typer context, consults a dispatch table for the correct (require_mission_binding, probe_reachability) flags per contracts/hosted_readiness.md, calls evaluate_readiness(), and either proceeds or exits with code 1 printing result.message + blank line + result.next_action. 2. sync run, sync pull, sync push, sync publish additionally wrap their actual daemon call in a try that catches DaemonStartOutcome(skipped_reason="policy_manual") and prints the stable manual-mode wording, exiting 0. 3. cli/commands/__init__.py keeps its conditional import pattern — only rename the source to specify_cli.saas.rollout. The BC shims from WP01 still work, but the canonical import prevents future drift. 4. Test parametrization uses pytest.mark.parametrize("mode", [("rollout_disabled", False), ("rollout_enabled", True)]) fixtures from tests/saas/conftest.py (import from the shared conftest, or create a local conftest that imports through). 5. Byte-wise wording assertions from the contract table: every prerequisite failure is asserted via assert result.stdout.startswith("No SaaS authentication token") etc.
Parallel opportunities
- T024, T025, T026 are
[P]once T021–T023 land.
Risks
- The conditional Typer registration in
cli/commands/__init__.pyis evaluated at module-import time, which means the env-var read happens once per process. Tests that flip the env var after the CLI module is imported will not see the change. Mitigation: test harness should construct a fresh Typer app per test, mirroring the existing pattern intest_tracker.py:22-66. - The existing autouse fixture in
tests/conftest.py:57-60sets the env var ON by default; the dual-mode parametrization mustmonkeypatch.delenv(...)inside the rollout-off case before importing the CLI module. - Stable wording assertions are fragile against copy-edits. Use a shared
WORDINGconstant in the contract/module so tests and implementation stay aligned.
Work Package WP06 — ADR and documentation
Prompt file: tasks/WP06-adr-and-documentation.md Priority: P3 Dependencies: WP01, WP02, WP03, WP04 Estimated prompt size: ~250 lines Independent test: rg "saas-rollout-and-readiness" architecture/ docs/ returns the new ADR in the index; pytest -q still passes end-to-end.
Goal: Capture the rollout/readiness split, daemon-policy-as-config choice, tracker-remains-ungated rationale, and future migration paths in an architecture decision record. Update the architecture index. Optionally refresh user-facing docs if existing SaaS-sync pointers need updating (DIRECTIVE_003 — Decision Documentation Requirement).
Included subtasks
- ✅ T027 Write
architecture/ADR-XXXX-saas-rollout-and-readiness.md(WP06) - ✅ T028 Update
architecture/README.mdindex to reference the new ADR (WP06) - ✅ T029 Refresh
docs/cross-references if any existing SaaS-sync doc points at the old single gate (WP06)
Implementation sketch
1. Inspect architecture/README.md for the next available ADR number; rename the placeholder. Follow the ADR template already used in that directory. 2. ADR sections: Context (the stealth rollout posture and why), Decision (split rollout-gate from readiness evaluator, config-driven daemon policy, tracker stays ungated), Consequences (BC shims live on, future project-level config layering is documented but deferred, HostedReadiness becomes the single fan-in point for any new prerequisite), Alternatives Considered (single gate with per-command overrides; adding rollout to tracker package — both rejected with reasons from research.md). 3. Update architecture/README.md's index with one new row. 4. Grep docs/ for SPEC_KITTY_ENABLE_SAAS_SYNC and any "coming soon" notes — if a user-facing doc implies the gate is a single yes/no for all readiness, add a pointer to the new ADR. Skip if no such doc exists (the subtask is conditional).
Parallel opportunities
- None internal. Depends on the substantive WPs landing so the ADR accurately reports what shipped.
Risks
- The ADR risks drifting from the shipped implementation if written before the preceding WPs are merged. Mitigation: schedule this WP last in its lane; reviewer must diff the ADR against the actual implementation before approving.
MVP Scope Recommendation
Minimum viable delivery = WP01 + WP02 + WP05. Those three packages restore the stealth posture, ship the shared readiness abstraction, and plumb it through the tracker CLI with dual-mode tests — the three highest-priority user stories (US1, US2, US3) plus FR-004 and NFR-002.
Strongly recommended to bundle with MVP: WP03 + WP04 (intent-gated daemon + config policy) address FR-005, FR-006, and NFR-003. Without them, US4 is not satisfied and help/local-only commands can still spawn the daemon on enabled machines.
Defer-safe: WP06 (ADR + docs) — the code ships correct without it, but DIRECTIVE_003 requires the decision trail before acceptance, so it should land in the same release window.
Parallelization Highlights
- Lane A (rollout+readiness): WP01 → WP02 (sequential; WP02 imports from WP01's package)
- Lane B (config+daemon): WP03 → WP04 (sequential; WP04 consumes WP03's new enum)
- Lane C (CLI integration): WP05 (depends on Lane A tip and Lane B tip — runs after both)
- Lane D (docs): WP06 (depends on Lanes A+B; can run concurrently with WP05 if an agent starts the ADR skeleton early and updates it from merged code)
In practice, Lane A and Lane B can run in parallel from day one, Lane C merges afterward, and Lane D closes the loop.