Research: Modular Code Refactoring

Path: kitty-specs/004-modular-code-refactoring/research.md

Feature: 004-modular-code-refactoring
Date: 2025-11-11
Status: Complete

Executive Summary

This research documents the analysis and decisions for refactoring two monolithic Python files (totaling 5,730 lines) into a modular architecture with ~21 modules, each under 200 lines. The refactoring strategy enables parallel development by up to 6 agents simultaneously.

Key Decisions

Decision 1: Refactoring Approach

Choice: Hybrid Layer-Module Strategy

Rationale:

  • Minimizes file editing conflicts between parallel agents
  • Creates clear ownership boundaries
  • Respects dependency ordering (foundation → services → commands)
  • Enables 6 agents to work simultaneously during peak phases

Alternatives Considered:

  • Module-first approach: Rejected due to cross-cutting concerns in some modules
  • Layer-first approach: Rejected as it would create too many interdependencies
  • Feature-first approach: Rejected as features span multiple files

Decision 2: Module Size Limit

Choice: 200 lines per module (excluding comments/docstrings)

Rationale:

  • Industry best practice for readable, maintainable code
  • Forces single responsibility principle
  • Makes code review manageable
  • Aligns with cognitive load theory (7±2 items in working memory)

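Evidence: Martin Fowler's "Refactoring" treats long functions and large classes as code smells, and Google's Python style guide suggests reconsidering any function that exceeds about 40 lines. Neither source sets a 200-line module limit; that figure is this project's convention, chosen in the same spirit.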
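The limit above excludes comments and docstrings, so a plain line count would overstate module size. A minimal sketch of one way to measure it with the standard library follows; the function names are hypothetical and not part of the project.

```python
import ast
import io
import tokenize

# Token types that never make a line count as code on their own.
_IGNORED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
    tokenize.DEDENT, tokenize.ENCODING, tokenize.ENDMARKER,
}
_DOC_OWNERS = (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)


def docstring_lines(source: str) -> set:
    """Line numbers occupied by module, class, and function docstrings."""
    lines = set()
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, _DOC_OWNERS) or not node.body:
            continue
        first = node.body[0]
        if (isinstance(first, ast.Expr)
                and isinstance(first.value, ast.Constant)
                and isinstance(first.value.value, str)):
            lines.update(range(first.lineno, first.end_lineno + 1))
    return lines


def count_code_lines(source: str) -> int:
    """Count lines that hold code, ignoring blanks, comments, docstrings."""
    code = set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in _IGNORED:
            continue
        # Multi-line tokens (for example triple-quoted strings) span rows.
        code.update(range(tok.start[0], tok.end[0] + 1))
    return len(code - docstring_lines(source))
```

A known limitation of this sketch: a one-line definition such as `def f(): """doc"""` shares its line with the docstring and is therefore not counted.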

Decision 3: Import Strategy

Choice: Try/except pattern for subprocess compatibility

Rationale:

  • Handles three execution contexts: local dev, pip install, subprocess
  • Dashboard spawns detached processes requiring absolute imports
  • Maintains backward compatibility

Implementation:

try:
    from .diagnostics import run_diagnostics  # Package context
except ImportError:
    from specify_cli.dashboard.diagnostics import run_diagnostics  # Subprocess

Decision 4: Directory Structure

Choice: Domain-based package organization

Rationale:

  • Groups related functionality (cli/, core/, template/, dashboard/)
  • Matches mental model of system architecture
  • Enables independent testing per package
  • Supports future plugin architecture if needed

Alternatives Considered:

  • Flat structure: Rejected as it doesn't scale
  • Feature-based: Rejected as features cross-cut concerns
  • Layered (MVC): Rejected as CLI apps don't fit MVC well

Technical Analysis

Current State Metrics

| File | Lines | Functions | Classes | Complexity |
|---|---|---|---|---|
| __init__.py | 2,700 | 47 | 2 | High (15 subsystems) |
| dashboard.py | 3,030 | 23 | 1 | Very High (7 subsystems + embedded HTML/JS) |

Target State Metrics

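| Package | Modules | Max Lines | Total Lines | Agent Owner |
|---|---|---|---|---|
| cli/ | 8 | 200 | ~1,100 | Agent E, F |
| core/ | 5 | 200 | ~600 | Agent C |
| template/ | 4 | 200 | ~700 | Agent B |
| dashboard/ | 11 | 200 | ~1,400 | Agent A, D |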

Dependency Analysis

Critical Path Dependencies:

1. core/config.py → All modules (constants)
2. core/utils.py → Most modules (utilities)
3. cli/ui.py → CLI commands (UI components)
4. template/manager.py → init command
5. dashboard/server.py → dashboard command

Circular Dependency Risks:

  • None identified with proposed structure
  • Each package has clear upstream/downstream relationships
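The absence of cycles can be checked mechanically rather than by inspection. The sketch below encodes the critical-path dependencies as a graph and asks `graphlib` (Python 3.9+) for a valid extraction order. The edges are an illustrative reading of the list above, not an analysis of the real code: in particular, treating core/utils.py as depending on core/config.py and giving both commands a dependency on cli/ui.py are assumptions.

```python
from graphlib import CycleError, TopologicalSorter

# Each key depends on every module in its set (assumed edges, see lead-in).
DEPENDS_ON = {
    "core/config.py": set(),
    "core/utils.py": {"core/config.py"},
    "cli/ui.py": {"core/config.py", "core/utils.py"},
    "template/manager.py": {"core/config.py", "core/utils.py"},
    "dashboard/server.py": {"core/config.py", "core/utils.py"},
    "init command": {"cli/ui.py", "template/manager.py"},
    "dashboard command": {"cli/ui.py", "dashboard/server.py"},
}


def extraction_order(graph):
    """Return modules with dependencies first; raise ValueError on a cycle."""
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # The second element of args holds the nodes forming the cycle.
        raise ValueError(f"circular dependency: {exc.args[1]}") from exc
```

Calling `extraction_order(DEPENDS_ON)` yields an order in which core/config.py comes first, while a graph such as `{"a": {"b"}, "b": {"a"}}` raises `ValueError`.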

Parallel Execution Plan

Wave Structure

Sequential Foundation (1 agent, Day 1)
  ↓
Parallel Wave 1 (3 agents, Days 2-3)
  • Agent A: Dashboard Infrastructure
  • Agent B: Template System
  • Agent C: Core Services
  ↓
Parallel Wave 2 (3 agents, Days 4-5)
  • Agent D: Dashboard Handlers
  • Agent E: CLI Commands
  • Agent F: GitHub & Init

Coordination Mechanisms

1. File Locking: Each agent owns specific files exclusively
2. Interface Contracts: Define function signatures before implementation
3. Import Stubs: Create placeholder imports for not-yet-extracted modules
4. Daily Sync: Merge completed work at day end
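Interface contracts and import stubs can work together: the contract fixes the signature up front, and a stub satisfies imports until the real module lands. The sketch below is illustrative only. `run_diagnostics` is named in this document, but its signature here (a project directory in, a dict out) is an assumption, and `render_report` is a hypothetical caller.

```python
from typing import Protocol


class DiagnosticsRunner(Protocol):
    """Agreed signature for the diagnostics entry point (assumed shape)."""

    def __call__(self, project_dir: str) -> dict: ...


def run_diagnostics_stub(project_dir: str) -> dict:
    """Import stub: keeps callers importable until extraction is done."""
    raise NotImplementedError("diagnostics module not yet extracted")


def render_report(project_dir: str, run: DiagnosticsRunner) -> str:
    """Hypothetical caller written against the contract, not an implementation."""
    result = run(project_dir)
    return ", ".join(f"{key}={value}" for key, value in sorted(result.items()))
```

An agent working on the caller can develop and test against a fake runner that matches the contract, then switch to the real function once the owning agent merges it.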

Best Practices Research

Python Module Design

  • PEP 8: Style Guide for Python Code
  • PEP 257: Docstring Conventions
  • PEP 484: Type Hints (for interface contracts)

Refactoring Patterns

1. Extract Method: Break large functions into smaller ones
2. Extract Class: Group related functions into classes
3. Move Method: Relocate functions to appropriate modules
4. Replace Magic Numbers: Use named constants from config
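As a small illustration of patterns 1 and 4 together, the sketch below replaces a bare `200` with a named constant and pulls the size check out into its own function. All names are hypothetical; in the refactored layout the constant would presumably live in core/config.py.

```python
# Named constant instead of a literal 200 scattered through the code.
MAX_MODULE_LINES = 200


def is_oversized(line_count: int, limit: int = MAX_MODULE_LINES) -> bool:
    """Extracted predicate: does a module exceed the size limit?"""
    return line_count > limit


def oversized_modules(line_counts: dict) -> list:
    """Return the names of modules over the limit, sorted for stable output."""
    return sorted(
        name for name, count in line_counts.items() if is_oversized(count)
    )
```

With the check isolated, changing the limit or the counting rule touches one place instead of every call site.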

Testing Strategy

  • Unit tests per module (pytest)
  • Integration tests for command flows
  • Import tests for all execution contexts
  • Regression tests against existing test suite
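Import tests for the subprocess context need a fresh interpreter, because a module already loaded in the test process can mask a broken import. A minimal sketch, assuming a hypothetical helper name and using only the standard library:

```python
import subprocess
import sys
from typing import Optional


def imports_in_subprocess(module: str, cwd: Optional[str] = None) -> bool:
    """Return True if `module` imports cleanly in a fresh interpreter.

    `cwd` selects the working directory: the repository root approximates
    local development, while an unrelated directory approximates an
    installed package, since the source tree is no longer on the path.
    """
    result = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        cwd=cwd,
        capture_output=True,
        text=True,
        timeout=30,
    )
    return result.returncode == 0
```

In a pytest suite this helper could be parametrized over the extracted module names, for example `specify_cli.dashboard.diagnostics`, and over the working directories that stand in for each execution context.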

Implementation Risks & Mitigations

Risk 1: Import Resolution Failures

Likelihood: High
Impact: High

Mitigation:

  • Test in all three contexts (local, pip, subprocess)
  • Use try/except pattern consistently
  • Create import test suite

Risk 2: Behavioral Changes

Likelihood: Medium
Impact: High

Mitigation:

  • Comprehensive test coverage before refactoring
  • Keep original files as reference during development
  • A/B comparison of each command's output (original vs. refactored code)
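One way to catch behavioral changes while the original files are still available is to run the old and new implementations on the same inputs and report any differences. The helper below is a hypothetical sketch of that idea, not an existing project utility.

```python
def compare_outputs(old, new, cases):
    """Run both callables on each argument tuple and collect mismatches.

    Returns a list of (args, old_result, new_result) for every case
    where the two implementations disagree.
    """
    mismatches = []
    for args in cases:
        expected = old(*args)
        actual = new(*args)
        if expected != actual:
            mismatches.append((args, expected, actual))
    return mismatches
```

An empty list means the refactored function matched the original on every case supplied; it says nothing about inputs that were not exercised.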

Risk 3: Merge Conflicts

Likelihood: Low (with proper coordination)
Impact: Medium

Mitigation:

  • Clear file ownership per agent
  • Daily sync points
  • Feature flags for gradual rollout
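File ownership is easier to enforce when it is checked automatically at each sync point. The sketch below flags any changed file that falls outside its author's assigned patterns. The agent identifiers and glob patterns are hypothetical, and because two agents share some packages in the target layout, a real map would need file-level entries.

```python
from fnmatch import fnmatchcase

# Hypothetical ownership map: agent -> glob patterns it may modify.
OWNERSHIP = {
    "agent-b": ["template/*"],
    "agent-c": ["core/*"],
}


def ownership_violations(changes, ownership=OWNERSHIP):
    """Return (agent, path) pairs where the agent does not own the path.

    `changes` is an iterable of (agent, path) tuples, for example derived
    from the files touched on each agent's branch.
    """
    violations = []
    for agent, path in changes:
        patterns = ownership.get(agent, [])
        if not any(fnmatchcase(path, pattern) for pattern in patterns):
            violations.append((agent, path))
    return violations
```

Running this before the daily merge would surface accidental edits to another agent's files while they are still cheap to revert.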

Open Questions

1. Q: Should we create a compatibility shim for external tools that might import directly?

  • A: No, this is internal refactoring. Public API (CLI) remains unchanged.

2. Q: How do we handle the embedded HTML/CSS/JS in dashboard.py?

  • A: Extract to separate files in dashboard/static/ and dashboard/templates/

3. Q: Should we use type hints throughout?

  • A: Yes, for public interfaces. Optional for internal functions.
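For question 2, once the HTML/CSS/JS lives in dashboard/static/ and dashboard/templates/, the server code needs a way to read those files back. The following is a minimal sketch under stated assumptions: `load_asset` and the file name in the usage note are hypothetical, and the path check is an added precaution rather than something this document specifies.

```python
from pathlib import Path

# Subdirectory names taken from the answer to question 2.
ASSET_DIRS = ("static", "templates")


def load_asset(package_dir: Path, kind: str, name: str) -> str:
    """Read a text asset from <package_dir>/<kind>/<name>."""
    if kind not in ASSET_DIRS:
        raise ValueError(f"unknown asset kind: {kind!r}")
    base = (package_dir / kind).resolve()
    path = (base / name).resolve()
    # Refuse names that resolve outside the asset directory.
    if base not in path.parents:
        raise ValueError(f"asset outside {base}: {name!r}")
    return path.read_text(encoding="utf-8")
```

A module inside the dashboard package could call `load_asset(Path(__file__).parent, "templates", "index.html")`, which keeps working from any current working directory, provided the asset files are included when the package is installed.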

Evidence Sources

See research/source-register.csv for full source list including:

  • Python Enhancement Proposals (PEPs)
  • "Clean Code" by Robert Martin
  • "Refactoring" by Martin Fowler
  • Google Python Style Guide
  • Real Python's guide on Python project structure

Next Steps

1. Complete data-model.md with module interfaces
2. Generate contracts for inter-module communication
3. Create quickstart.md for developers
4. Run agent context update script