Functional Testing Requirements: Encoding & Plan Validation Guardrails

Overview

This document specifies the functional tests required to lock in the encoding validation and plan validation guardrails implemented in PR #XXX.

Target Test Files:

tests/test_encoding_validation_functional.py (new)
tests/test_plan_validation_functional.py (new)
tests/test_dashboard_encoding_resilience.py (new)
tests/test_pre_commit_hook.py (new)

Test Suite 1: Encoding Validation Module

File: tests/test_encoding_validation_functional.py

Test 1.1: Detect All Problematic Character Types

Objective: Verify sanitizer detects all 15+ problematic character types

Setup:

test_content = """
User's "favorite" feature
Temperature: 72°F outside
Price: $100 ± $10
Grid: 3 × 4 matrix
Long dash — short dash –
Ellipsis… here
Bullet • point
Copyright © 2024
Trademark™ symbol
Registered® mark
Non breaking space (invisible)
"""

Expected Behavior:

detect_problematic_characters(test_content) returns at least 15 issues
Each issue tuple contains: (line_number, column, character, replacement)
Line numbers are 1-indexed
Replacements match PROBLEMATIC_CHARS mapping

Assertions:

issues = detect_problematic_characters(test_content)
assert len(issues) >= 15
assert any(char == '\u2019' and repl == "'" for _, _, char, repl in issues)  # Smart quote
assert any(char == '\u00b1' and repl == "+/-" for _, _, char, repl in issues)  # Plus-minus
assert any(char == '\u00b0' and repl == " degrees" for _, _, char, repl in issues)  # Degree
assert any(char == '\u00d7' and repl == "x" for _, _, char, repl in issues)  # Multiply

Test 1.2: Sanitize Text Preserves Content

Objective: Verify sanitization replaces characters without corrupting text

Setup:

original = "User's "favorite" feature costs $100 ± $10 at 72°F"
expected = 'User\'s "favorite" feature costs $100 +/- $10 at 72 degrees F'

Expected Behavior:

sanitize_markdown_text(original) returns expected
No extra whitespace added
No content lost
Idempotent (running twice produces same result)

Assertions:

result = sanitize_markdown_text(original)
assert result == expected
# Idempotent check
assert sanitize_markdown_text(result) == expected

Test 1.3: Sanitize File Creates Backup

Objective: Verify file sanitization creates .bak file before modifying

Setup:

from pathlib import Path
from tempfile import TemporaryDirectory

with TemporaryDirectory() as tmpdir:
    test_file = Path(tmpdir) / "test.md"
    test_file.write_text("User's test", encoding='utf-8')

Expected Behavior:

sanitize_file(test_file, backup=True) returns (True, None)
Backup file test.md.bak exists
Backup contains original content
Main file contains sanitized content

Assertions:

was_modified, error = sanitize_file(test_file, backup=True, dry_run=False)
assert was_modified is True
assert error is None

backup = test_file.with_suffix(test_file.suffix + '.bak')
assert backup.exists()
assert backup.read_text() == "User's test"
assert test_file.read_text() == "User's test"

Test 1.4: Sanitize File Handles cp1252 Encoding

Objective: Verify sanitizer can read and fix Windows-1252 encoded files

Setup:

# Write file with Windows-1252 encoding
bad_content = "User's "test""
test_file.write_bytes(bad_content.encode('cp1252'))

# Verify it's broken for UTF-8
try:
    test_file.read_text(encoding='utf-8')
    assert False, "Should have raised UnicodeDecodeError"
except UnicodeDecodeError:
    pass  # Expected

Expected Behavior:

sanitize_file(test_file) returns (True, None)
File is now valid UTF-8
Smart quotes replaced with ASCII

Assertions:

was_modified, error = sanitize_file(test_file, backup=True, dry_run=False)
assert was_modified is True
assert error is None

# Should now be valid UTF-8
fixed_content = test_file.read_text(encoding='utf-8')
assert fixed_content == 'User\'s "test"'

Test 1.5: Sanitize Directory Recursively

Objective: Verify directory sanitization finds all .md files recursively

Setup:

with TemporaryDirectory() as tmpdir:
    base = Path(tmpdir)
    # Create nested structure
    (base / "level1").mkdir()
    (base / "level1" / "level2").mkdir()

    files = [
        base / "root.md",
        base / "level1" / "mid.md",
        base / "level1" / "level2" / "deep.md",
    ]

    for f in files:
        f.write_text("User's test")

Expected Behavior:

sanitize_directory(base, pattern="**/*.md") finds all 3 files
All files sanitized
No false positives (e.g., .txt files)

Assertions:

results = sanitize_directory(base, pattern="**/*.md", backup=False, dry_run=False)
assert len(results) == 3
assert all(was_modified for was_modified, _ in results.values())

# Verify all files fixed
for f in files:
    assert f.read_text() == "User's test"

Test 1.6: Dry Run Mode Doesn't Modify

Objective: Verify dry_run=True detects issues without modifying files

Setup:

test_file.write_text("User's test")
original_content = test_file.read_text()
original_mtime = test_file.stat().st_mtime

Expected Behavior:

sanitize_file(test_file, dry_run=True) returns (True, None)
File content unchanged
File mtime unchanged
No backup created

Assertions:

was_modified, error = sanitize_file(test_file, backup=True, dry_run=True)
assert was_modified is True  # Would modify
assert error is None

assert test_file.read_text() == original_content
assert test_file.stat().st_mtime == original_mtime
assert not test_file.with_suffix(test_file.suffix + '.bak').exists()

Test Suite 2: CLI Encoding Validation Command

File: tests/test_encoding_validation_cli.py

Test 2.1: Validate Clean Feature

Objective: Verify command exits 0 when no issues found

Setup:

from typer.testing import CliRunner
from specify_cli import app

# Create clean feature structure
feature_dir = tmp_path / "kitty-specs" / "001-test-feature"
feature_dir.mkdir(parents=True)
(feature_dir / "spec.md").write_text("Clean content")
(feature_dir / "plan.md").write_text("No issues here")

Expected Behavior:

Command exits with code 0
Output contains "✓ All files are properly UTF-8 encoded!"
No fixes applied

Assertions:

runner = CliRunner()
result = runner.invoke(app, ["validate-encoding", "--feature", "001-test-feature"])
assert result.exit_code == 0
assert "✓ All files are properly UTF-8 encoded!" in result.stdout

Test 2.2: Detect Issues Without Fix

Objective: Verify command exits 1 when issues found and --fix not specified

Setup:

(feature_dir / "bad.md").write_text("User's test")

Expected Behavior:

Command exits with code 1
Output shows table of files with issues
Output shows example problematic characters with line numbers
Suggests running with --fix

Assertions:

result = runner.invoke(app, ["validate-encoding", "--feature", "001-test-feature"])
assert result.exit_code == 1
assert "bad.md" in result.stdout
assert "Needs Fix" in result.stdout
assert "Line" in result.stdout  # Shows line numbers
assert "--fix" in result.stdout  # Suggests fix

Test 2.3: Fix Issues With Backup

Objective: Verify --fix flag repairs files and creates backups

Setup:

bad_file = feature_dir / "broken.md"
bad_file.write_text("User's "test"")

Expected Behavior:

Command exits with code 0
Output shows "Fixed" status
File sanitized
Backup created

Assertions:

result = runner.invoke(app, ["validate-encoding", "--feature", "001-test-feature", "--fix"])
assert result.exit_code == 0
assert "Fixed" in result.stdout
assert "Backup files (.bak) were created" in result.stdout

# Verify file fixed
assert bad_file.read_text() == 'User\'s "test"'

# Verify backup exists
backup = bad_file.with_suffix(".md.bak")
assert backup.exists()
assert backup.read_text() == "User's "test""

Test 2.4: Fix Without Backup

Objective: Verify --no-backup flag skips backup creation

Setup:

bad_file.write_text("User's test")

Expected Behavior:

Command exits 0
File fixed
No backup created

Assertions:

result = runner.invoke(app, [
    "validate-encoding",
    "--feature", "001-test-feature",
    "--fix",
    "--no-backup"
])
assert result.exit_code == 0
assert bad_file.read_text() == "User's test"
assert not bad_file.with_suffix(".md.bak").exists()

Test 2.5: Validate All Features

Objective: Verify --all flag scans multiple features

Setup:

for i in range(1, 4):
    feat_dir = tmp_path / "kitty-specs" / f"00{i}-feature"
    feat_dir.mkdir(parents=True)
    (feat_dir / "spec.md").write_text(f"User's test {i}")

Expected Behavior:

Scans all 3 features
Reports total issues
Can fix all with --fix

Assertions:

result = runner.invoke(app, ["validate-encoding", "--all"])
assert result.exit_code == 1
assert "3 features" in result.stdout or "001-feature" in result.stdout

# Fix all
result = runner.invoke(app, ["validate-encoding", "--all", "--fix"])
assert result.exit_code == 0

Test Suite 3: Dashboard Encoding Resilience

File: tests/test_dashboard_encoding_resilience.py

Test 3.1: Dashboard Read Resilient Auto-Fix

Objective: Verify dashboard auto-fixes encoding errors on read

Setup:

from specify_cli.dashboard.scanner import read_file_resilient

bad_file = tmp_path / "bad.md"
bad_file.write_bytes("User's test".encode('cp1252'))

Expected Behavior:

read_file_resilient(bad_file, auto_fix=True) returns (content, None)
Content is valid string with sanitized characters
File is fixed on disk
Backup created

Assertions:

content, error = read_file_resilient(bad_file, auto_fix=True)
assert content is not None
assert error is None
assert content == "User's test"

# File should now be UTF-8
assert bad_file.read_text(encoding='utf-8') == "User's test"

# Backup should exist
assert bad_file.with_suffix('.md.bak').exists()

Test 3.2: Dashboard Read Without Auto-Fix

Objective: Verify non-auto-fix mode returns clear error message

Setup:

bad_file.write_bytes("User's test".encode('cp1252'))

Expected Behavior:

read_file_resilient(bad_file, auto_fix=False) returns (None, error_msg)
Error message contains file name
Error message contains byte offset
Error message suggests fix command

Assertions:

content, error = read_file_resilient(bad_file, auto_fix=False)
assert content is None
assert error is not None
assert "bad.md" in error
assert "byte" in error.lower()
assert "spec-kitty validate-encoding" in error

Test 3.3: Dashboard Scanner Creates Error Cards

Objective: Verify dashboard creates error card for broken files instead of crashing

Setup:

from specify_cli.dashboard.scanner import scan_feature_kanban

# Create feature with bad work package file
feature_dir = tmp_path / "kitty-specs" / "001-test"
tasks_dir = feature_dir / "tasks" / "planned"
tasks_dir.mkdir(parents=True)

wp_file = tasks_dir / "WP01-test.md"
wp_file.write_bytes("""---
work_package_id: WP01
---
# Work Package Prompt: User's Test
""".encode('cp1252'))

Expected Behavior:

Scanner doesn't crash
Returns lanes dict with error card in planned lane
Error card has encoding_error: True flag
Error card title contains "⚠️ Encoding Error"
Error card markdown contains error description

Assertions:

lanes = scan_feature_kanban(tmp_path, "001-test")
assert "planned" in lanes
assert len(lanes["planned"]) == 1

error_card = lanes["planned"][0]
assert error_card.get("encoding_error") is True
assert "⚠️ Encoding Error" in error_card["title"]
assert "WP01" in error_card["title"]
assert "Encoding Error" in error_card["prompt_markdown"]

Test 3.4: Dashboard Scanner Fixes and Loads

Objective: Verify auto-fix allows successful load after initial error

Setup:

# Same as 3.3 but verify successful load after auto-fix

Expected Behavior:

First call auto-fixes file
Card loaded successfully (not error card)
Content properly parsed
Frontmatter extracted

Assertions:

lanes = scan_feature_kanban(tmp_path, "001-test")
task = lanes["planned"][0]
assert task.get("encoding_error") is not True
assert task["id"] == "WP01"
assert "User's Test" in task["title"]  # Fixed smart quote

Test Suite 4: Plan Validation Guardrail

File: tests/test_plan_validation_functional.py

Test 4.1: Research Command Blocks Unfilled Plan

Objective: Verify /spec-kitty.research blocks when plan.md is template

Setup:

from specify_cli.cli.commands.research import research
from typer.testing import CliRunner

# Create feature with template plan
feature_dir = tmp_path / "kitty-specs" / "001-test"
feature_dir.mkdir(parents=True)

plan_file = feature_dir / "plan.md"
plan_file.write_text("""
# Implementation Plan: [FEATURE]
**Date**: [DATE]
**Language/Version**: [e.g., Python 3.11 or NEEDS CLARIFICATION]
**Primary Dependencies**: [e.g., FastAPI or NEEDS CLARIFICATION]
**Testing**: [e.g., pytest or NEEDS CLARIFICATION]
[Gates determined based on charter file]
# [REMOVE IF UNUSED] Option 1
# [REMOVE IF UNUSED] Option 2
ACTION REQUIRED: Replace the content
""")

Expected Behavior:

Command exits with code 1
Output contains "appears to be unfilled"
Output shows count of template markers
Output provides next steps
No research artifacts created

Assertions:

runner = CliRunner()
result = runner.invoke(research_command, ["--feature", "001-test"])
assert result.exit_code == 1
assert "appears to be unfilled" in result.stdout
assert "template markers" in result.stdout
assert "/spec-kitty.plan" in result.stdout
assert not (feature_dir / "research.md").exists()

Test 4.2: Research Command Allows Filled Plan

Objective: Verify research proceeds when plan is properly filled

Setup:

plan_file.write_text("""
# Implementation Plan: User Auth System
**Date**: 2025-11-13
**Language/Version**: Python 3.11
**Primary Dependencies**: FastAPI, bcrypt
**Testing**: pytest
✓ Password hashing required
✓ Rate limiting on auth endpoints
backend/
├── src/
│   ├── models/
│   └── api/
""")

Expected Behavior:

Command exits 0
Research artifacts created
No validation errors

Assertions:

result = runner.invoke(research_command, ["--feature", "001-test"])
assert result.exit_code == 0
assert (feature_dir / "research.md").exists()
assert (feature_dir / "data-model.md").exists()

Test 4.3: Tasks Command Blocks Unfilled Plan

Objective: Verify prerequisite check script blocks tasks generation

Setup:

# Create feature with template plan (same as 4.1)

Expected Behavior:

check-prerequisites.sh --include-tasks exits non-zero
stderr contains "plan.md appears to be unfilled"
stderr shows marker count
stderr provides remediation steps

Assertions:

import subprocess

result = subprocess.run(
    [".kittify/scripts/bash/check-prerequisites.sh", "--include-tasks"],
    cwd=feature_worktree_path,
    capture_output=True,
    text=True
)
assert result.returncode != 0
assert "plan.md appears to be unfilled" in result.stderr
assert "template markers" in result.stderr
assert "/spec-kitty.plan" in result.stderr

Test 4.4: Tasks Command Allows Filled Plan

Objective: Verify tasks proceeds with properly filled plan

Setup:

# Use filled plan from 4.2

Expected Behavior:

Script exits 0
JSON output includes FEATURE_DIR
JSON output includes plan.md in validation

Assertions:

result = subprocess.run(
    [".kittify/scripts/bash/check-prerequisites.sh", "--include-tasks", "--json"],
    cwd=feature_worktree_path,
    capture_output=True,
    text=True
)
assert result.returncode == 0
output = json.loads(result.stdout)
assert "FEATURE_DIR" in output

Test 4.5: Plan Validation Threshold

Objective: Verify 5-marker threshold works correctly

Setup:

# Create plan with exactly 4 markers (should pass)
plan_4_markers = """
# Implementation Plan: My Feature
**Date**: 2025-11-13
**Language/Version**: Python 3.11 or NEEDS CLARIFICATION
**Primary Dependencies**: FastAPI or NEEDS CLARIFICATION
**Testing**: pytest or NEEDS CLARIFICATION
[Gates determined based on charter file]
✓ Password hashing required
backend/src/
"""

# Create plan with exactly 5 markers (should fail)
plan_5_markers = plan_4_markers + """
# [REMOVE IF UNUSED] Option 1
"""

Expected Behavior:

4 markers → validation passes
5 markers → validation fails
Threshold is configurable

Assertions:

from specify_cli.plan_validation import detect_unfilled_plan

plan_file.write_text(plan_4_markers)
is_unfilled, markers = detect_unfilled_plan(plan_file)
assert is_unfilled is False
assert len(markers) == 4

plan_file.write_text(plan_5_markers)
is_unfilled, markers = detect_unfilled_plan(plan_file)
assert is_unfilled is True
assert len(markers) == 5

Test Suite 5: Legacy Git Hook Retirement (Deprecated in 2.x)

File: tests/test_pre_commit_hook_functional.py

Test 5.1: Managed Hook Is Removed During Migration

Objective: Verify legacy managed hooks are retired safely without deleting custom hooks

Setup:

import subprocess
import os
from pathlib import Path

# Create temporary git repo
git_repo = tmp_path / "test-repo"
git_repo.mkdir()
os.chdir(git_repo)

subprocess.run(["git", "init"], check=True)
subprocess.run(["git", "config", "user.email", "test@test.com"], check=True)
subprocess.run(["git", "config", "user.name", "Test User"], check=True)

# Install hook
hook_dir = git_repo / ".git" / "hooks"
hook_dir.mkdir(parents=True, exist_ok=True)
hook_file = hook_dir / "pre-commit"
hook_file.write_text("# legacy managed hook fixture content")
hook_file.chmod(0o755)

# Stage bad file
bad_file = git_repo / "test.md"
bad_file.write_text("User's test")
subprocess.run(["git", "add", "test.md"], check=True)

Expected Behavior:

git commit fails
Error message shows "Encoding errors detected"
Error shows file name and line number
Suggests fix command

Assertions:

result = subprocess.run(
    ["git", "commit", "-m", "Test commit"],
    capture_output=True,
    text=True
)
assert result.returncode != 0
assert "Encoding errors detected" in result.stderr or result.stdout
assert "test.md" in result.stderr or result.stdout
assert "spec-kitty validate-encoding" in result.stderr or result.stdout

Test 5.2: Hook Allows Clean Files

Objective: Verify hook passes for properly encoded files

Setup:

# Clean repo from 5.1
clean_file = git_repo / "clean.md"
clean_file.write_text("This is clean content")
subprocess.run(["git", "add", "clean.md"], check=True)

Expected Behavior:

git commit succeeds
Output shows "✓ All staged markdown files are properly UTF-8 encoded"

Assertions:

result = subprocess.run(
    ["git", "commit", "-m", "Clean commit"],
    capture_output=True,
    text=True
)
assert result.returncode == 0
assert "properly UTF-8 encoded" in result.stdout or result.stderr

Test 5.3: Hook Skips Non-Markdown

Objective: Verify hook only checks .md files

Setup:

# Stage mixed files
(git_repo / "code.py").write_text("print('User's test')")  # Bad chars in Python OK
(git_repo / "doc.md").write_text("Clean markdown")
subprocess.run(["git", "add", "."], check=True)

Expected Behavior:

Hook only validates .md files
Python file ignored (even with smart quotes)
Commit succeeds if markdown clean

Assertions:

result = subprocess.run(
    ["git", "commit", "-m", "Mixed files"],
    capture_output=True,
    text=True
)
assert result.returncode == 0

Test 5.4: Hook Bypass With --no-verify

Objective: Verify git commit --no-verify bypasses hook

Setup:

bad_file = git_repo / "bad.md"
bad_file.write_text("User's test")
subprocess.run(["git", "add", "bad.md"], check=True)

Expected Behavior:

git commit --no-verify succeeds
Hook not executed

Assertions:

result = subprocess.run(
    ["git", "commit", "--no-verify", "-m", "Bypass hook"],
    capture_output=True,
    text=True
)
assert result.returncode == 0

Test Suite 6: Integration Tests

File: tests/test_encoding_plan_integration.py

Test 6.1: End-to-End Encoding Workflow

Objective: Test complete workflow from detection to fix to dashboard

Scenario:

Create feature with Windows-1252 encoded files
Dashboard scanner detects and auto-fixes
CLI validation confirms clean
Pre-commit hook allows commit

Assertions:

# 1. Create bad files
feature_dir = setup_feature_with_bad_encoding()

# 2. Dashboard auto-fixes
lanes = scan_feature_kanban(project_dir, "001-test")
assert all(not task.get("encoding_error") for lane in lanes.values() for task in lane)

# 3. Validation passes
result = runner.invoke(app, ["validate-encoding", "--feature", "001-test"])
assert result.exit_code == 0

# 4. Hook allows commit
commit_result = git_commit_in_feature(feature_dir)
assert commit_result.returncode == 0

Test 6.2: End-to-End Plan Validation Workflow

Objective: Test complete plan validation workflow

Scenario:

Create feature with template plan
Research command blocks
Fill in plan
Research command proceeds
Tasks command proceeds

Assertions:

# 1. Template plan
plan_file.write_text(TEMPLATE_PLAN)

# 2. Research blocks
result = runner.invoke(research_command, ["--feature", "001-test"])
assert result.exit_code == 1

# 3. Fill plan
plan_file.write_text(FILLED_PLAN)

# 4. Research proceeds
result = runner.invoke(research_command, ["--feature", "001-test"])
assert result.exit_code == 0

# 5. Tasks proceeds
result = subprocess.run(["check-prerequisites.sh", "--include-tasks"])
assert result.returncode == 0

Test 6.3: Multiple Features Mixed State

Objective: Verify validation handles mixed state across features

Scenario:

Feature 001: clean encoding, filled plan
Feature 002: bad encoding, filled plan
Feature 003: clean encoding, template plan

Assertions:

# All-features scan shows correct counts
result = runner.invoke(app, ["validate-encoding", "--all"])
assert "1 file(s) with encoding issues" in result.stdout  # Feature 002

# Research blocks only on 003
for feature_id in ["001", "002"]:
    result = runner.invoke(research_command, ["--feature", f"00{feature_id}-test"])
    assert result.exit_code == 0

result = runner.invoke(research_command, ["--feature", "003-test"])
assert result.exit_code == 1

Test Coverage Requirements

Minimum Coverage Targets:

src/specify_cli/text_sanitization.py: 95%
src/specify_cli/plan_validation.py: 95%
src/specify_cli/cli/commands/validate_encoding.py: 85%
src/specify_cli/dashboard/scanner.py (encoding portions): 90%

Critical Paths (Must Be 100%):

Character mapping in PROBLEMATIC_CHARS
Backup file creation
Plan marker detection
Dashboard auto-fix logic

Performance Requirements

Encoding Validation

Single file validation: < 50ms (for 10KB file)
Directory scan (100 files): < 2 seconds
Dashboard auto-fix: < 200ms (first-time per file)

Plan Validation

Template detection: < 20ms (for typical plan.md)
Research command gate: < 100ms total overhead

Error Case Testing

Must Test These Failure Modes

Binary file mistaken as markdown
- Verify sanitizer handles gracefully
- Verify dashboard doesn't crash
Corrupted UTF-8 (invalid byte sequences)
- Verify fallback to cp1252/latin-1
- Verify error message clarity
Mixed encodings in same file
- Verify best-effort sanitization
- Verify no data corruption
Very large files (>10MB)
- Verify no memory issues
- Verify timeout handling
Symbolic links
- Verify sanitizer follows/doesn't follow appropriately
- Verify no infinite loops
Permission denied
- Verify clear error message
- Verify doesn't crash pipeline
Plan with exactly 5 markers
- Verify threshold edge case
- Verify consistent behavior
Empty plan.md
- Verify doesn't crash
- Verify sensible default

Regression Test Requirements

These must NEVER break:

Existing clean files remain untouched by validation
Dashboard loads features with all UTF-8 files (no regression)
Research command works normally with filled plan
Backup files never overwrite existing .bak files
Legacy hook retirement does not delete custom project hooks

Documentation Tests

Verify documentation examples work:

All code examples in docs/encoding-validation.md execute successfully
Migration and encoding documentation examples are valid
AGENTS.md character examples actually trigger detection
CLI help text matches documented behavior

Acceptance Criteria

All tests must:

✅ Run in CI/CD pipeline
✅ Complete in < 30 seconds total
✅ Be deterministic (no flaky tests)
✅ Clean up temp files
✅ Work on Linux, macOS, Windows
✅ Not require internet connection
✅ Use fixtures for shared setup
✅ Have descriptive names matching test IDs above

Test execution:

# Run all encoding/plan tests
pytest tests/test_encoding_validation_functional.py -v
pytest tests/test_plan_validation_functional.py -v
pytest tests/test_dashboard_encoding_resilience.py -v
pytest tests/test_pre_commit_hook_functional.py -v
pytest tests/test_encoding_plan_integration.py -v

# Run with coverage
pytest tests/test_*encoding*.py tests/test_*plan*.py --cov=src/specify_cli --cov-report=html

# Quick smoke test
pytest tests/ -k "encoding or plan" -x  # Stop on first failure

Success Metrics

These metrics prove the guardrails work:

Zero dashboard crashes from encoding errors in test suite
Zero false positives in clean file validation
100% detection rate for all 15+ problematic character types
Zero data loss during sanitization (content preserved)
Pre-commit blocks 100% of files with encoding errors
Research/tasks block 100% of template plans (≥5 markers)

Test Maintenance Notes

When adding new problematic characters:

Update PROBLEMATIC_CHARS in text_sanitization.py
Add test case in Test 1.1
Update AGENTS.md examples
Add to pre-commit hook detection

When changing plan template:

Update marker list in plan_validation.py
Update Test 4.5 threshold tests
Update bash script markers
Regenerate test fixtures

When modifying dashboard scanner:

Verify Test 3.3 still creates error cards
Verify Test 3.4 auto-fix still works
Check performance benchmarks
Update integration tests