Architecture Improvements for Research Workflow

Problem Statement

The original architecture required too much upfront structure:

  • Define Permanence/Process classes
  • Create TOML configs
  • Register everything in builder
  • Go through CLI/runner

This creates friction for researchers who just want to experiment!


Solution: Progressive Enhancement Architecture

Core Principle

"Start simple, scale when needed"

Researchers can:

  1. Start with plain Python scripts
  2. Add features incrementally (progress bars → logging → device management)
  3. Extract to functions when code stabilizes
  4. Convert to pipeline only when going to production

New Components

1. Helper Module (helpers.py)

Provides script-level utilities that work standalone OR inside the pipeline:

from tipi.helpers import progress_bar, logger, device_manager

# Works in plain scripts!
for epoch in progress_bar(range(10)):
    device = device_manager.get_device()
    loss = train()
    logger.log({"loss": loss})

Key Features:

  • progress_bar(): tqdm-like progress bars (uses Rich standalone, pipeline's ProgressManager when available)
  • logger: WandB logging (initializes manually or uses pipeline's WandBManager)
  • device_manager: Smart GPU selection (simple selection standalone, uses pipeline's Device when available)

How it works:

  • Checks global _pipeline_context to detect if running in pipeline
  • If context exists, uses pipeline permanences
  • If not, provides standalone implementations
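
A minimal sketch of this mechanism (the names _pipeline_context, set_pipeline_context, and clear_pipeline_context match those used later in this document; the bodies are an assumption about how they could be implemented):

# helpers.py (sketch)
_pipeline_context = None

def set_pipeline_context(context):
    """Called by the executor before processes run."""
    global _pipeline_context
    _pipeline_context = context

def clear_pipeline_context():
    """Called by the executor after processes finish, even on error."""
    global _pipeline_context
    _pipeline_context = None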

2. Decorator Module (decorators.py)

Makes functions pipeline-ready with zero code changes:

from tipi import helpers
from tipi.decorators import pipeline_process

@pipeline_process
def train(epochs: int = 10):
    """This function works standalone AND in pipeline!"""
    for epoch in helpers.progress_bar(range(epochs)):
        loss = train_step()
        helpers.logger.log({"loss": loss})

# Run as script
if __name__ == "__main__":
    train(epochs=5)

# Or register in pipeline config:
# [processes.training]
# type = "train"
# params = { epochs = 10 }

Key Features:

  • Decorated functions remain callable as normal functions
  • Decorator creates a PipelineProcess class dynamically
  • Function parameters become config parameters
  • Automatic pipeline context injection

3. Progressive Enhancement Guide (docs/progressive_enhancement.md)

Complete guide walking from a raw script (Level 0) through five levels of enhancement:

  • Level 0: Raw script
  • Level 1: Add progress bars (+2 lines)
  • Level 2: Add logging (+3 lines)
  • Level 3: Better device management (+2 lines)
  • Level 4: Extract to reusable functions
  • Level 5: Full pipeline (config file + class wrapper)

Each level adds ONE concept, building on previous levels.
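
For example, Level 3's device management in standalone mode could be as simple as the following sketch (assumes PyTorch is installed; get_device mirrors the device_manager.get_device() call shown earlier):

import torch

def get_device():
    # Standalone fallback: prefer a CUDA GPU when one is available.
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")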


Architecture Changes

Before (Originally Proposed)

CLI → Runner → Builder → Controller → Executor → Processes
                                                      ↑
                                              Requires class definition
                                              Requires TOML config
                                              Requires registration

Entry barrier: HIGH (must understand entire pipeline)

After (With Progressive Enhancement)

┌─────────────────────────────────────────────────────┐
│  Script Mode                                        │
│  ──────────────────────────────────────────────     │
│  Plain Python → helpers.py (standalone mode)        │
│                                                     │
│  Entry barrier: ZERO (just import helpers)          │
└─────────────────────────────────────────────────────┘
                      │
                      │ (when ready)
                      ▼
┌─────────────────────────────────────────────────────┐
│  Hybrid Mode                                        │
│  ──────────────────────────────────────────────     │
│  Functions + @pipeline_process → helpers (dual)     │
│                                                     │
│  Entry barrier: LOW (add decorator)                 │
└─────────────────────────────────────────────────────┘
                      │
                      │ (when productionizing)
                      ▼
┌─────────────────────────────────────────────────────┐
│  Pipeline Mode                                      │
│  ──────────────────────────────────────────────     │
│  CLI → Runner → Builder → Controller → Executor     │
│                                    ↓                │
│                            Processes → helpers      │
│                                    (pipeline mode)  │
│                                                     │
│  Entry barrier: MEDIUM (config + classes)           │
└─────────────────────────────────────────────────────┘

Technical Implementation

Pipeline Context Injection

The PipelineExecutor sets context before running processes:

class PipelineExecutor:
    def _run_processes(self):
        # Set context for helpers
        from tipi.helpers import set_pipeline_context, clear_pipeline_context

        context = {
            "progress_manager": self.controller.get_permanence("progress_manager", None),
            "wandb_logger": self.controller.get_permanence("wandb_logger", None),
            "device": self.controller.get_permanence("device", None),
        }
        set_pipeline_context(context)

        try:
            # Run processes (they use helpers which now see the context)
            for process in self.controller.iterate_processes():
                process.execute()
        finally:
            clear_pipeline_context()

Helper Auto-Detection

Each helper checks for pipeline context:

from rich.progress import track

def progress_bar(iterable, desc="Processing"):
    # Check if running in pipeline
    if _pipeline_context:
        progress_manager = _pipeline_context.get("progress_manager")
        if progress_manager:
            # Use pipeline's progress manager
            return pipeline_progress(iterable, desc, progress_manager)

    # Fallback to rich.track (tqdm-like)
    return track(iterable, description=desc)
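
The logger follows the same detection pattern; a sketch, assuming the wandb_logger permanence exposes a log() method:

import wandb

class _Logger:
    def init(self, **kwargs):
        # Standalone: start a WandB run directly.
        # In pipeline mode the run is owned by the WandBManager permanence.
        if not _pipeline_context:
            wandb.init(**kwargs)

    def log(self, metrics):
        wandb_logger = _pipeline_context.get("wandb_logger") if _pipeline_context else None
        if wandb_logger:
            wandb_logger.log(metrics)  # assumed permanence interface
        else:
            wandb.log(metrics)

logger = _Logger()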

Function-to-Process Conversion

The @pipeline_process decorator dynamically creates a PipelineProcess class:

@pipeline_process
def train(epochs: int = 10):
    # Function body...
    pass

# Decorator generates:
class TrainProcess(PipelineProcess):
    def __init__(self, controller, force=False, epochs=10):
        self.controller = controller
        self.force = force
        self.epochs = epochs

    def execute(self):
        set_pipeline_context({...})
        try:
            return train(epochs=self.epochs)  # Call original function
        finally:
            clear_pipeline_context()
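
A minimal sketch of how the decorator itself could generate that class (PipelineProcess, get_permanence, and the context helpers come from the surrounding document; the process_class attribute and the parameter handling are assumptions):

import inspect

def pipeline_process(func):
    signature = inspect.signature(func)

    class GeneratedProcess(PipelineProcess):
        def __init__(self, controller, force=False, **params):
            self.controller = controller
            self.force = force
            # Defaults from the function signature, overridden by config params.
            defaults = {
                name: p.default
                for name, p in signature.parameters.items()
                if p.default is not inspect.Parameter.empty
            }
            defaults.update(params)
            self.params = defaults

        def execute(self):
            set_pipeline_context({
                "progress_manager": self.controller.get_permanence("progress_manager", None),
                "wandb_logger": self.controller.get_permanence("wandb_logger", None),
                "device": self.controller.get_permanence("device", None),
            })
            try:
                return func(**self.params)
            finally:
                clear_pipeline_context()

    GeneratedProcess.__name__ = f"{func.__name__.title()}Process"

    # The original function stays directly callable for script use; the
    # generated class is what the builder would register for pipeline runs.
    func.process_class = GeneratedProcess
    return func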

Benefits

For Researchers

  1. Zero friction start: Just write normal Python
  2. Incremental enhancement: Add features one at a time
  3. No forced migration: Can stay at any level
  4. Copy-paste friendly: Code works across levels
  5. Familiar tools: tqdm-like progress bars, standard WandB

For Production

  1. Full pipeline features: When needed
  2. Config-driven: Easy to manage experiments
  3. Testable: Each level is testable
  4. Team collaboration: Standardized structure
  5. CI/CD ready: CLI integration

For Both

  1. Same code: Training logic doesn't change between levels
  2. Progressive complexity: Match tool to task
  3. Smooth transition: No rewrites needed
  4. Dual-mode helpers: Work everywhere

Migration Path

Existing Scripts → Enhanced Scripts

# Before
for epoch in range(10):
    loss = train()
    print(loss)

# After (two lines added, two changed)
from tipi.helpers import progress_bar, logger
logger.init(project="exp")

for epoch in progress_bar(range(10)):
    loss = train()
    logger.log({"loss": loss})

Enhanced Scripts → Pipeline-Ready Functions

# Add decorator (1 line)
@pipeline_process
def train(epochs: int = 10):
    # Same code as before!
    pass

Pipeline-Ready Functions → Full Pipeline

# Create config.toml
[processes.training]
type = "train"  # Function name
params = { epochs = 10 }

Comparison with Original Architecture

Aspect                  Original                    With Progressive Enhancement
Entry point             CLI/Runner                  Plain Python script
Initial complexity      High (classes + config)     Zero (just helpers)
Learning curve          Steep                       Gradual
Research workflow       Forced into pipeline        Natural script evolution
Production workflow     Ready                       Ready (same endpoint)
Code reuse              Medium                      High (same code everywhere)
Testing                 Full pipeline only          Each level testable
Team onboarding         Full architecture           Start with helpers

Example: Real Research Scenario

Week 1: New idea

# quick_test.py
from tipi.helpers import progress_bar
for epoch in progress_bar(range(5)):
    print(train())

Week 2: Looks promising

# experiment_v1.py
from tipi.helpers import progress_bar, logger
logger.init(project="new_idea")
for epoch in progress_bar(range(10)):
    logger.log({"loss": train()})

Week 3: Multiple variations

# experiment_v2.py, v3.py, v4.py
# All using helpers - easy to compare in WandB

Month 1: Extract common logic

# train_utils.py
from tipi.decorators import pipeline_process

@pipeline_process
def train(learning_rate: float = 0.001):
    # Shared logic
    pass

# experiment_v5.py
from train_utils import train
train(learning_rate=0.01)

Month 2: Production

# pipeline.toml
[processes.training]
type = "train"
params = { learning_rate = 0.001 }

Same training code from Week 1 to Month 2!


Implementation Status

Completed

  • helpers.py module with progress_bar, logger, device_manager
  • decorators.py module with @pipeline_process
  • docs/progressive_enhancement.md full guide
  • Architecture diagrams updated

TODO (for full implementation)

  • [ ] Integrate helper context setting in PipelineExecutor
  • [ ] Implement pipeline progress integration in helpers
  • [ ] Add tests for dual-mode helpers
  • [ ] Create example scripts at each level
  • [ ] Update existing processes to use helpers
  • [ ] Add CLI flag to detect pipeline mode
  • [ ] Create migration guide for existing pipelines

Questions & Answers

Q: Does this change the core architecture? A: No, it adds a new entry layer. Full pipeline architecture remains unchanged.

Q: Can existing pipelines use helpers? A: Yes! Processes can import helpers and they'll automatically use permanences.

Q: What if someone doesn't want helpers? A: Completely optional. Original architecture still works.

Q: Performance impact? A: Minimal. Context checking is a simple dict lookup.

Q: Does this work with existing permanences? A: Yes! Helpers just access them through context.


Recommendation

Implement progressive enhancement in this order:

  1. Phase 1: Helper module (standalone mode only)
       • Researchers can start using it immediately
       • No pipeline changes needed
       • Low risk

  2. Phase 2: Executor integration
       • Connect helpers to pipeline permanences
       • Existing pipelines keep working unchanged
       • Medium risk

  3. Phase 3: Decorator support
       • Enable function-to-process conversion
       • Add builder registration for decorated functions
       • Medium risk

  4. Phase 4: Documentation & examples
       • Create example scripts at each level
       • Write migration guides
       • Low risk

This allows incremental rollout with early value delivery.