Abstractions

Abstract base classes for TensorImgPipeline.

This package provides the core abstractions used throughout the pipeline framework.

Copyright (C) 2025 Matti Kaupenjohann

This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.

AbstractConfig dataclass

Bases: ABC

Abstract base class for configuration objects.

Provides common functionality for:

  • Path string to Path object conversion
  • Parameter validation
  • Configuration validation

Subclasses must implement the validate() method to define their specific validation logic.
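
The behaviour described above can be sketched roughly as follows. This is a self-contained illustration, not the library's implementation: `SketchConfig`, `ResizeConfig`, and the local `InvalidConfigError` are hypothetical stand-ins, and the rule "convert every field annotated as `Path`" is an assumption about how the path conversion might work.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields
from pathlib import Path


class InvalidConfigError(ValueError):
    """Local stand-in for the framework's error type (assumed name only)."""


@dataclass
class SketchConfig(ABC):
    """Hypothetical stand-in for AbstractConfig."""

    def __post_init__(self) -> None:
        # Assumption: string values of fields annotated as Path become Path objects.
        for f in fields(self):
            value = getattr(self, f.name)
            if f.type in (Path, "Path") and isinstance(value, str):
                setattr(self, f.name, Path(value))
        # Validation runs after the conversions, as the docs describe.
        self.validate()

    @abstractmethod
    def validate(self) -> None:
        """Raise InvalidConfigError if the configuration is invalid."""


@dataclass
class ResizeConfig(SketchConfig):
    """Illustrative subclass with one path field and one numeric field."""

    output_dir: Path
    size: int = 256

    def validate(self) -> None:
        if self.size <= 0:
            raise InvalidConfigError(f"size must be positive, got {self.size}")


cfg = ResizeConfig(output_dir="out/resized")
```

Because validation is triggered from `__post_init__`, an invalid configuration fails at construction time rather than later during pipeline execution.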

__post_init__()

Post-initialization hook.

Applies path conversions and runs validation.

validate() abstractmethod

Validate the configuration.

Subclasses should implement this method to check that all configuration values are valid and meet requirements.

Raises:

Type Description
InvalidConfigError

If configuration is invalid.

validate_params(params, cls)

Validate that parameters match a class constructor signature.

Parameters:

Name Type Description Default
params dict[str, Any]

Dictionary of parameter names to values.

required
cls type

The class whose constructor signature to validate against.

required

Raises:

Type Description
InvalidConfigError

If params contain unexpected keys.
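
One plausible way to perform such a check is with `inspect.signature` from the standard library. The sketch below is a free function rather than a method, uses a local stand-in for `InvalidConfigError`, and the `Blur` class is purely illustrative; the actual implementation may differ.

```python
import inspect
from typing import Any


class InvalidConfigError(ValueError):
    """Local stand-in for the framework's error type (assumed name only)."""


def validate_params(params: dict[str, Any], cls: type) -> None:
    """Raise InvalidConfigError if params has keys cls.__init__ does not accept."""
    signature = inspect.signature(cls.__init__)
    accepted = {name for name in signature.parameters if name != "self"}
    unexpected = sorted(set(params) - accepted)
    if unexpected:
        raise InvalidConfigError(
            f"Unexpected parameters for {cls.__name__}: {unexpected}"
        )


class Blur:
    """Hypothetical target class used only for this example."""

    def __init__(self, radius: int, sigma: float = 1.0) -> None:
        self.radius = radius
        self.sigma = sigma


# Both keys appear in Blur.__init__, so this call passes silently.
validate_params({"radius": 3, "sigma": 2.0}, Blur)
```

Checking keys before instantiation lets a configuration error name the offending parameter instead of surfacing as a generic `TypeError` from the constructor.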

AbstractController

Bases: ABC

Abstract base class for pipeline controllers.

Controllers manage the execution flow of pipeline processes, including progress reporting and process lifecycle management.

add_process(process) abstractmethod

Add a process to the controller's execution queue.

Parameters:

Name Type Description Default
process PipelineProcess

The pipeline process to add.

required
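
As a rough illustration of the queue semantics, a concrete controller might look like the sketch below. `ControllerSketch`, `QueueController`, the `run` helper, and the `Load`/`Save` processes are all hypothetical; only `add_process` comes from the documented interface.

```python
from abc import ABC, abstractmethod
from typing import Any


class ControllerSketch(ABC):
    """Hypothetical stand-in for AbstractController."""

    @abstractmethod
    def add_process(self, process: Any) -> None:
        """Add a process to the execution queue."""


class QueueController(ControllerSketch):
    """Runs processes in the order they were added."""

    def __init__(self) -> None:
        self._queue: list[Any] = []

    def add_process(self, process: Any) -> None:
        self._queue.append(process)

    def run(self) -> list[str]:
        # Assumed helper: execute each queued process and report simple progress.
        executed = []
        total = len(self._queue)
        for index, process in enumerate(self._queue, start=1):
            name = type(process).__name__
            print(f"[{index}/{total}] {name}")
            process.execute()
            executed.append(name)
        return executed


class Load:
    def execute(self) -> None:
        pass


class Save:
    def execute(self) -> None:
        pass


controller = QueueController()
controller.add_process(Load())
controller.add_process(Save())
order = controller.run()
```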

Permanence

Bases: ABC

Base class for objects that persist through the entire pipeline lifecycle.

Permanences are stateful resources that:

  • Store structured data needed throughout pipeline execution
  • Are accessed by processes via controller.get_permanence(name)
  • Have managed lifecycles with hooks
  • Are extensible through abstraction

Example
from pathlib import Path

class MyDataPermanence(Permanence):
    # _load_data, _validate_data and _save_checkpoint are helpers
    # that the subclass is expected to define itself.
    def __init__(self, path: Path):
        self.data = self._load_data(path)

    def initialize(self) -> None:
        # Setup phase - called before any process runs
        self._validate_data()

    def checkpoint(self) -> None:
        # Save intermediate state
        self._save_checkpoint()

    def cleanup(self) -> None:
        # Release resources
        del self.data

checkpoint()

Save intermediate state during pipeline execution.

Called at configurable checkpoints during execution. Use for saving progress, creating backups, or logging state.

Raises:

Type Description
Exception

If checkpointing fails

cleanup() abstractmethod

Clean up data held in RAM or VRAM.

Called after all processes complete or on error. Should release any held resources (memory, file handles, connections).

Raises:

Type Description
Exception

If cleanup fails

get_state()

Get serializable state for inspection or debugging.

Returns a dictionary representation of the permanence state. Useful for logging, debugging, or state inspection.

Returns:

Type Description
dict[str, Any]

Dictionary containing permanence state.

initialize()

Initialize the permanence before pipeline execution.

Called once after all permanences are constructed but before any process runs. Use for validation, resource allocation, or setup that depends on other permanences.

Raises:

Type Description
Exception

If initialization fails

is_initialized()

Check if the permanence has been initialized.

Override to provide custom logic for determining initialization state.

Returns:

Type Description
bool

True if initialized, False otherwise.

validate()

Validate the permanence state.

Called to verify permanence is in valid state. Use for health checks, data validation, or consistency checks.

Raises:

Type Description
Exception

If validation fails
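
The hooks documented for Permanence suggest a lifecycle along the lines of the sketch below: initialize everything, do the work, checkpoint, and always clean up. `PermanenceSketch`, `Counter`, and `run_with_lifecycle` are hypothetical names, and the exact order in which the framework calls the hooks is an assumption.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class PermanenceSketch(ABC):
    """Hypothetical stand-in mirroring the documented hooks."""

    def initialize(self) -> None:
        pass

    def checkpoint(self) -> None:
        pass

    def get_state(self) -> dict[str, Any]:
        return {}

    @abstractmethod
    def cleanup(self) -> None:
        """Release held resources."""


class Counter(PermanenceSketch):
    """Records which hooks were called, in order."""

    def __init__(self) -> None:
        self.events: list[str] = []
        self.count = 0

    def initialize(self) -> None:
        self.events.append("initialize")

    def checkpoint(self) -> None:
        self.events.append("checkpoint")

    def cleanup(self) -> None:
        self.events.append("cleanup")

    def get_state(self) -> dict[str, Any]:
        return {"count": self.count}


def run_with_lifecycle(
    permanences: dict[str, PermanenceSketch],
    work: Callable[[dict[str, PermanenceSketch]], None],
) -> None:
    # Initialize all permanences before any work starts.
    for permanence in permanences.values():
        permanence.initialize()
    try:
        work(permanences)
        for permanence in permanences.values():
            permanence.checkpoint()
    finally:
        # cleanup runs after completion and also on error.
        for permanence in permanences.values():
            permanence.cleanup()


def bump(permanences: dict[str, PermanenceSketch]) -> None:
    permanences["counter"].count = 3


counter = Counter()
run_with_lifecycle({"counter": counter}, bump)
```

Placing the cleanup loop in a `finally` block matches the documented guarantee that `cleanup()` is called after all processes complete or on error.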

PipelineProcess

Bases: ABC

Abstract base class for pipeline processes.

A process represents a unit of work within the pipeline that:

  • Can access permanences via a controller/manager
  • Can be skipped based on conditions
  • Executes its main logic via execute()
  • Can be forced to run via the force parameter

Example
class MyProcess(PipelineProcess):
    def __init__(self, controller, force: bool):
        super().__init__(controller, force)
        self.data = controller.get_permanence("data")

    def skip(self) -> bool:
        return not self.force and self.data.is_cached()

    def execute(self) -> None:
        # Process logic here
        self.data.process()

__init__(controller, force)

Initialize the process.

When overriding this method, make sure to call super().__init__(controller, force).

Parameters:

Name Type Description Default
controller Any

The controller/manager providing access to permanences. Should have a get_permanence(name: str) method.

required
force bool

If True, process should run even if outputs exist.

required

execute() abstractmethod

Execute the process logic.

This method should contain the main work of the process. It should handle any errors internally or let them propagate.

Raises:

Type Description
Exception

Any exceptions during execution.

skip() abstractmethod

Determine if the process should be skipped.

Returns:

Type Description
bool

True if the process should be skipped, False otherwise.

Common reasons to skip:

  • Outputs already exist and force=False
  • Required inputs are missing
  • Conditional execution based on config
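
The interplay between skip(), execute(), and the force flag can be sketched as follows. Everything here is a stand-in written for illustration: `ProcessSketch`, `DictController`, `WriteOnce`, and `run_process` are hypothetical, and the "skip first, then execute" dispatch is an assumption about how a controller might drive a process.

```python
from abc import ABC, abstractmethod
from typing import Any


class ProcessSketch(ABC):
    """Hypothetical stand-in for PipelineProcess."""

    def __init__(self, controller: Any, force: bool) -> None:
        self.controller = controller
        self.force = force

    @abstractmethod
    def skip(self) -> bool:
        """Return True if the process should not run."""

    @abstractmethod
    def execute(self) -> None:
        """Do the work."""


class DictController:
    """Minimal controller exposing get_permanence(name)."""

    def __init__(self) -> None:
        self._permanences: dict[str, Any] = {"cache": {}}

    def get_permanence(self, name: str) -> Any:
        return self._permanences[name]


class WriteOnce(ProcessSketch):
    """Writes a result unless one already exists and force is False."""

    def __init__(self, controller: Any, force: bool) -> None:
        super().__init__(controller, force)
        self.cache = controller.get_permanence("cache")

    def skip(self) -> bool:
        return not self.force and "result" in self.cache

    def execute(self) -> None:
        self.cache["result"] = self.cache.get("result", 0) + 1


def run_process(process: ProcessSketch) -> bool:
    """Return True if the process ran, False if it was skipped."""
    if process.skip():
        return False
    process.execute()
    return True


controller = DictController()
first = run_process(WriteOnce(controller, force=False))
second = run_process(WriteOnce(controller, force=False))
forced = run_process(WriteOnce(controller, force=True))
```

In this sketch the second run is skipped because the output already exists, while the forced run executes regardless.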

ProcessConfig dataclass

Bases: AbstractConfig

Base configuration for pipeline processes.

Attributes:

Name Type Description
force bool

If True, forces execution even if outputs exist.

__post_init__()

Post-initialization hook.

Applies path conversions and runs validation.

validate()

Validate the process configuration.

Raises:

Type Description
InvalidConfigError

If force is not a boolean.
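
Since dataclasses do not enforce type annotations at runtime, an explicit check of this kind is what catches a value such as the string "yes". A minimal sketch, assuming a required `force` field and using hypothetical names (`ProcessConfigSketch` and a local `InvalidConfigError`):

```python
from dataclasses import dataclass


class InvalidConfigError(ValueError):
    """Local stand-in for the framework's error type (assumed name only)."""


@dataclass
class ProcessConfigSketch:
    """Hypothetical stand-in for ProcessConfig."""

    force: bool

    def __post_init__(self) -> None:
        self.validate()

    def validate(self) -> None:
        # The annotation alone is not checked at runtime, so test it explicitly.
        if not isinstance(self.force, bool):
            raise InvalidConfigError(
                f"force must be a bool, got {type(self.force).__name__}"
            )


config = ProcessConfigSketch(force=True)
```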

validate_params(params, cls)

Validate that parameters match a class constructor signature.

Parameters:

Name Type Description Default
params dict[str, Any]

Dictionary of parameter names to values.

required
cls type

The class whose constructor signature to validate against.

required

Raises:

Type Description
InvalidConfigError

If params contain unexpected keys.