Contracts

README.md

Contracts: Safe Sync Daemon Orphan Cleanup

These are the normative interface contracts for the mission. They are CLI/process contracts (not REST) — the product surface is the spec-kitty auth doctor command and the loopback /api/health endpoint.

| Contract | File | Requirements |
| --- | --- | --- |
| Cleanup classification engine (pure) | cleanup-classification.md | FR-001, FR-002, FR-003, FR-008 |
| auth doctor --json / --reset --json output | auth-doctor-json.md | FR-004, FR-005, FR-009 |
| /api/health payload (extended) | health-payload.md | FR-001, FR-003 |

Behavioral contracts enforced by tests rather than schema:

  • Startup auto-clean acts on safe_auto only (FR-006, FR-007).
  • --reset guards operator_required behind --force/confirmation (D-02, FR-009).
  • Self-retirement transitions (FR-010, FR-011) — see ../data-model.md.
  • Port-range and cross-family boundaries (NFR-001, NFR-002, NFR-003, C-002).

auth-doctor-json.md

Contract: auth doctor JSON output

Surface: src/specify_cli/cli/commands/auth.py (doctor) → _auth_doctor.py (doctor_impl)
Requirements: FR-004, FR-005, FR-009

Without --reset the command stays read-only, which preserves the existing read-only invariant; --reset is the only mutating path. This contract adds a new --force flag (D-02).

--json (read-only scan) — FR-004

schema_version bumps 1 → 2. Each entry in orphans[] is extended to a full identity record (additive, superset of today's {port,pid,package_version,protocol_version}):

{
  "schema_version": 2,
  "generated_at": "2026-06-30T10:59:00+00:00",
  "auth_root": "/Users/<u>/.spec-kitty",
  "session": { "present": true, "user_email": "u@example.com", "...": "..." },
  "refresh_lock": { "held": false, "...": "..." },
  "daemon": { "active": true, "pid": 4321, "port": 9400, "package_version": "3.2.4", "protocol_version": 1 },
  "orphans": [
    {
      "daemon_family": "sync",
      "pid": 5001, "port": 9401,
      "protocol_version": 1, "package_version": "3.2.2",
      "singleton_scope_id": "/Users/<u>/.spec-kitty",
      "daemon_root": "/Users/<u>/.spec-kitty",
      "queue_db_path": "/Users/<u>/.spec-kitty/queues/queue-aaaaaaaa.db",
      "auth_scope": "https://…|u@example.com|t-private",
      "server_url": "https://…", "owner_present": true,
      "identity_source": "health_self_report",
      "executable_summary": "…/bin/python",
      "spawn_shape_ok": true,
      "self_report_matches_listener": true,
      "is_recorded_singleton": false,
      "cleanup_class": "safe_auto",
      "skip_reason": null
    },
    {
      "daemon_family": "sync", "pid": null, "port": 9405,
      "package_version": "3.2.3", "singleton_scope_id": null,
      "identity_source": "cmdline_marker", "owner_present": false,
      "cleanup_class": "operator_required", "skip_reason": "pre_marker"
    }
  ],
  "findings": [ { "id": "F-002", "severity": "warn", "summary": "…", "remediation": { "command": "spec-kitty auth doctor --reset" } } ]
}
  • Consumers MUST switch on cleanup_class; counts alone are non-conformant (FR-004).
  • never_touch listeners (third-party / out-of-range) are excluded from orphans[].
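
A minimal consumer sketch of the first rule, assuming the report has already been captured as text. It groups orphans[] by cleanup_class instead of counting entries. The sample payload is a trimmed stand-in for the schema above, and group_orphans_by_class is a hypothetical helper name, not part of the CLI.

```python
import json
from collections import defaultdict

# Trimmed stand-in for `spec-kitty auth doctor --json` output (schema_version 2).
SAMPLE = """
{
  "schema_version": 2,
  "orphans": [
    {"pid": 5001, "port": 9401, "cleanup_class": "safe_auto", "skip_reason": null},
    {"pid": null, "port": 9405, "cleanup_class": "operator_required", "skip_reason": "pre_marker"}
  ]
}
"""


def group_orphans_by_class(report: dict) -> dict:
    """Group orphans[] by cleanup_class instead of relying on a bare count."""
    if report.get("schema_version") != 2:
        raise ValueError("this sketch only understands schema_version 2")
    grouped = defaultdict(list)
    for orphan in report.get("orphans", []):
        grouped[orphan["cleanup_class"]].append(orphan)
    return dict(grouped)


grouped = group_orphans_by_class(json.loads(SAMPLE))
safe_ports = [o["port"] for o in grouped.get("safe_auto", [])]
manual_ports = [o["port"] for o in grouped.get("operator_required", [])]
```

A consumer that only reported `len(report["orphans"])` would hide the difference between the two entries, which is the case FR-004 rules out.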

--reset --json — FR-005, FR-009

Adds a top-level reset_result object with three explicit arrays:

{
  "schema_version": 2,
  "reset_result": {
    "swept":   [ { "pid": 5001, "port": 9401, "package_version": "3.2.2", "protocol_version": 1, "cleanup_path": "http_shutdown", "reason": "safe_auto stale-version" } ],
    "skipped": [ { "pid": null, "port": 9405, "cleanup_class": "operator_required", "skip_reason": "pre_marker" } ],
    "failed":  [ { "pid": 5009, "port": 9402, "failure_reason": "process survived terminate+kill" } ]
  }
}

  • Without --force: operator_required candidates appear in skipped[] with their cleanup_class/skip_reason. Human output prints a one-line remediation hint (… run with --force to clean N operator_required daemon(s)), satisfying FR-009.
  • With --force (or interactive y): operator_required candidates are attempted; successes move to swept[], survivors to failed[].
  • cleanup_path ∈ {http_shutdown, terminate, kill} records which escalation step closed the port.
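
The --force gating above can be sketched as a pure function that sorts orphans into the three arrays. This is an illustration under stated assumptions, not the doctor_impl code: build_reset_result and the sweep callback are hypothetical names, and the callback is assumed to return the cleanup_path that closed the port, or None when the process survived.

```python
from typing import Callable, Optional

# Hypothetical sweep callback: returns "http_shutdown", "terminate" or "kill"
# for the step that closed the port, or None if the process survived.
SweepFn = Callable[[dict], Optional[str]]


def build_reset_result(orphans: list, force: bool, sweep: SweepFn) -> dict:
    """Sort each orphan into swept / skipped / failed following the --force rule."""
    result = {"swept": [], "skipped": [], "failed": []}
    for orphan in orphans:
        cls = orphan["cleanup_class"]
        if cls not in ("safe_auto", "operator_required"):
            raise ValueError(f"unexpected cleanup_class in orphans[]: {cls!r}")
        ident = {"pid": orphan["pid"], "port": orphan["port"]}
        if cls == "operator_required" and not force:
            # Guarded: reported with its class and reason, never signalled.
            result["skipped"].append(
                {**ident, "cleanup_class": cls, "skip_reason": orphan["skip_reason"]}
            )
            continue
        cleanup_path = sweep(orphan)
        if cleanup_path is None:
            result["failed"].append(
                {**ident, "failure_reason": "process survived terminate+kill"}
            )
        else:
            result["swept"].append({**ident, "cleanup_path": cleanup_path})
    return result


orphans = [
    {"pid": 5001, "port": 9401, "cleanup_class": "safe_auto", "skip_reason": None},
    {"pid": None, "port": 9405, "cleanup_class": "operator_required", "skip_reason": "pre_marker"},
]


def fake_sweep(orphan: dict) -> Optional[str]:
    return "http_shutdown"  # stand-in that always succeeds on the first step


without_force = build_reset_result(orphans, force=False, sweep=fake_sweep)
with_force = build_reset_result(orphans, force=True, sweep=fake_sweep)
```

Keeping the decision separate from the sweep callback means the skipped[] behavior can be tested without signalling any process.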

Human output (FR-004/FR-005)

The existing "Orphans" table gains a class column (safe_auto/operator_required) and a reason column; --reset prints compact swept/skipped/failed lines mirroring the JSON. Count-only output is removed.

Back-compat note

The schema_version bump to 2 is the signal for consumers. Fields are additive on orphans[]; the pre-existing keys remain present, so a v1 reader degrades gracefully.
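
As a rough illustration of that degradation, a v1-style reader that only knows the four pre-existing keys can ignore everything else in a v2 entry. The helper name read_orphan_v1 is hypothetical.

```python
# The four keys a v1 reader already understood, per the note above.
LEGACY_KEYS = ("port", "pid", "package_version", "protocol_version")


def read_orphan_v1(entry: dict) -> dict:
    """Project an orphans[] entry onto the v1 keys, ignoring v2 additions."""
    return {key: entry.get(key) for key in LEGACY_KEYS}


v2_entry = {
    "daemon_family": "sync",
    "pid": 5001,
    "port": 9401,
    "protocol_version": 1,
    "package_version": "3.2.2",
    "cleanup_class": "safe_auto",
    "skip_reason": None,
}
legacy_view = read_orphan_v1(v2_entry)
```

Such a reader keeps working, but it cannot see cleanup_class, so it does not meet the FR-004 consumer rule until it is updated.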

cleanup-classification.md

Contract: Cleanup Classification Engine

Module (new): src/specify_cli/sync/classification.py
Requirements: FR-001, FR-002, FR-003, FR-008

Function

def classify_candidate(
    *,
    port: int,
    listener_pid: int | None,
    health: HealthProbe | None,       # parsed /api/health, or None if unresponsive
    cmdline: Sequence[str] | None,    # process argv via psutil, or None
    foreground_scope: str,            # _daemon_scope_root() of this runtime
    foreground_exec_scope: str,       # canonical_executable_scope() of this runtime
    recorded_singleton: SingletonRef | None,  # state-file (pid, port)
) -> DaemonIdentityRecord:
    ...
  • Pure / deterministic: no process signals, no filesystem writes, no network. All probing happens in the caller; the classifier only decides. This makes it unit-testable in isolation (Sonar-friendly extraction).
  • Returns a fully-populated DaemonIdentityRecord (fields in ../data-model.md) including daemon_family="sync", cleanup_class, and skip_reason.

Classification rules (normative)

The engine implements the decision table in ../data-model.md (rows 1–9). Key guarantees:

1. Primary kill authority is the daemon-root scope marker (singleton_scope_id), not owner.json (FR-003). owner_present is recorded for reporting but never affects cleanup_class.
2. safe_auto requires a live self-report whose pid/port match the listener (D-01). Unresponsive ⇒ operator_required (skip_reason=unresponsive).
3. Version/executable mismatch is evidence, not a gate (FR-008): once scope + responsiveness + spawn-shape + not-singleton hold, a differing package_version/executable_summary yields safe_auto, not a skip.
4. port is always in [9400,9450) for any record emitted (NFR-001); the caller never hands the sync engine an out-of-range or dashboard port.
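
The guarantees above can be sketched as a check on the safe_auto preconditions. This does not reproduce the decision table in ../data-model.md: the Evidence type, the safe_auto_blocker name, the order of the checks, and the labels "scope_marker", "spawn_shape" and "recorded_singleton" are assumptions made for illustration. Only "unresponsive" and "pid_port_mismatch" are skip_reason values named in these contracts.

```python
from dataclasses import dataclass, replace
from typing import Optional

SYNC_PORTS = range(9400, 9450)  # [9400, 9450), per NFR-001


@dataclass(frozen=True)
class Evidence:
    """Hypothetical, simplified stand-in for the classifier's gathered inputs."""
    port: int
    scope_marker_matches: bool          # cmdline scope marker equals the foreground scope
    responsive: bool                    # /api/health answered
    self_report_matches_listener: bool  # owner.pid/owner.port equal the listener's
    spawn_shape_ok: bool
    is_recorded_singleton: bool
    owner_present: bool                 # reporting only, never consulted below
    version_differs: bool               # evidence only, never consulted below


def safe_auto_blocker(e: Evidence) -> Optional[str]:
    """Return None when every safe_auto precondition holds, else the first that fails."""
    if e.port not in SYNC_PORTS:
        raise ValueError(f"port {e.port} is outside the sync range; caller bug")
    if not e.scope_marker_matches:
        return "scope_marker"           # illustrative label
    if not e.responsive:
        return "unresponsive"
    if not e.self_report_matches_listener:
        return "pid_port_mismatch"
    if not e.spawn_shape_ok:
        return "spawn_shape"            # illustrative label
    if e.is_recorded_singleton:
        return "recorded_singleton"     # illustrative label
    return None


stale = Evidence(
    port=9401, scope_marker_matches=True, responsive=True,
    self_report_matches_listener=True, spawn_shape_ok=True,
    is_recorded_singleton=False, owner_present=False, version_differs=True,
)
stale_blocker = safe_auto_blocker(stale)
hung_blocker = safe_auto_blocker(replace(stale, responsive=False))
```

Neither owner_present nor version_differs appears in any branch, which is how the sketch expresses guarantees 1 and 3.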

Caller obligations (boundary — C-002, NFR-001)

  • The sync scan enumerates only range(DAEMON_PORT_START, DAEMON_PORT_START + DAEMON_PORT_MAX_ATTEMPTS) = [9400,9450).
  • Any signal/kill derived from a record MUST assert record.port is in range and record.daemon_family == "sync" before calling _sweep_daemon_process.
  • The classifier MUST NOT be invoked with dashboard-range ports; cross-family inputs are a caller bug, caught by the boundary regression matrix (IC-05).
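
A minimal sketch of the pre-sweep guard described above. The constant values are derived from the stated range [9400,9450); SweepTarget and guard_sweep are hypothetical names, and the real record type is the DaemonIdentityRecord in ../data-model.md.

```python
from dataclasses import dataclass

DAEMON_PORT_START = 9400
DAEMON_PORT_MAX_ATTEMPTS = 50  # 9400 + 50 gives the exclusive upper bound 9450
SYNC_PORT_RANGE = range(DAEMON_PORT_START, DAEMON_PORT_START + DAEMON_PORT_MAX_ATTEMPTS)


@dataclass(frozen=True)
class SweepTarget:
    """Hypothetical two-field view of an identity record."""
    daemon_family: str
    port: int


def guard_sweep(record: SweepTarget) -> None:
    """Raise before any signal is sent if the record crosses the sync boundary."""
    if record.daemon_family != "sync":
        raise ValueError(f"refusing to sweep daemon_family={record.daemon_family!r}")
    if record.port not in SYNC_PORT_RANGE:
        raise ValueError(f"refusing to sweep out-of-range port {record.port}")


def is_sweepable(record: SweepTarget) -> bool:
    """Boolean wrapper around guard_sweep for callers that prefer a check."""
    try:
        guard_sweep(record)
    except ValueError:
        return False
    return True
```

The guard raises explicitly instead of using `assert`, because Python drops assert statements when run with -O.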

Test surface

  • Unit: feed synthetic inputs covering every decision-table row (no subprocess).
  • Integration: drive via the live _DaemonHarness (IC-04) with real listeners and real PIDs.

health-payload.md

Contract: /api/health payload (extended)

Surface: SyncDaemonHandler.handle_health (src/specify_cli/sync/daemon.py:487-520)
Requirements: FR-001, FR-003

The loopback-only health endpoint gains a single field: daemon_family. Everything else is unchanged (the redacted owner block already carries the identity fields).

{
  "status": "ok",
  "token": "<redacted>",
  "daemon_family": "sync",
  "protocol_version": 1,
  "package_version": "3.2.4",
  "sync": { "running": false, "last_sync": null, "consecutive_failures": 0 },
  "websocket_status": "Offline",
  "owner": {
    "pid": 4321, "port": 9400,
    "package_version": "3.2.4",
    "executable_path": "…/bin/python",
    "source_checkout_path": "…",
    "server_url": "https://…",
    "auth_principal": "u@example.com", "auth_team": "t-private",
    "auth_scope": "https://…|u@example.com|t-private",
    "queue_db_path": "…/queues/queue-aaaaaaaa.db",
    "started_at": "2026-06-30T10:40:00+00:00"
  }
}

Rules

  • daemon_family is always "sync" for the sync daemon. It lets a scanner confirm family from the self-report (defense-in-depth on top of port-range isolation).
  • The endpoint remains unauthenticated and loopback-only (127.0.0.1); do not add auth or non-loopback exposure (Sonar loopback exception applies; keep it).
  • owner is reporting data only. Classification authority is the daemon-root scope marker in the process cmdline, never this payload (FR-003). A daemon that returns owner but whose cmdline marker can't be proven is still operator_required.
  • self_report_matches_listener (in the identity record) is computed by comparing owner.pid/owner.port here against the actual listener pid/port; a daemon that misreports is downgraded to operator_required (pid_port_mismatch).
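
The family check and the pid/port comparison in these rules can be sketched as two small predicates over the parsed payload. Both function names are hypothetical, and treating an unknown listener pid as a non-match is an assumption of this sketch, not something the contract states.

```python
from typing import Optional


def reports_sync_family(health: dict) -> bool:
    """True when the self-report declares the sync family."""
    return health.get("daemon_family") == "sync"


def self_report_matches_listener(
    health: dict, listener_pid: Optional[int], listener_port: int
) -> bool:
    """Compare owner.pid/owner.port against what was observed on the socket."""
    owner = health.get("owner") or {}
    if listener_pid is None:
        return False  # assumption: an unknown listener pid cannot be proven to match
    return owner.get("pid") == listener_pid and owner.get("port") == listener_port


payload = {
    "status": "ok",
    "daemon_family": "sync",
    "owner": {"pid": 4321, "port": 9400},
}
honest = self_report_matches_listener(payload, listener_pid=4321, listener_port=9400)
misreport = self_report_matches_listener(payload, listener_pid=4999, listener_port=9400)
```

Neither predicate grants kill authority on its own; per FR-003 that still comes from the scope marker in the process cmdline.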