Abstractions
Abstract base classes for TensorImgPipeline.
This package provides the core abstractions used throughout the pipeline framework.
Copyright (C) 2025 Matti Kaupenjohann
This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with this program. If not, see https://www.gnu.org/licenses/.
AbstractConfig
dataclass
Bases: ABC
Abstract base class for configuration objects.
Provides common functionality for:
- Path string to Path object conversion
- Parameter validation
- Configuration validation
Subclasses must implement the validate() method to define their specific validation logic.
__post_init__()
Post-initialization hook.
Applies path conversions and runs validation.
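The path conversion and validation performed in `__post_init__` might look roughly like the following. This is a minimal sketch under stated assumptions, not the library's implementation: `SketchConfig`, `DatasetConfig`, and the stand-in `InvalidConfigError` are hypothetical names, and the rule that only fields annotated as `Path` are converted is a guess.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, fields
from pathlib import Path


class InvalidConfigError(ValueError):
    """Stand-in for the framework's error type (assumed to be an exception)."""


@dataclass
class SketchConfig(ABC):
    """Hypothetical stand-in for AbstractConfig, written for illustration only."""

    def __post_init__(self) -> None:
        # Assumption: string values in fields annotated as Path become Path objects.
        for f in fields(self):
            value = getattr(self, f.name)
            if f.type in (Path, "Path") and isinstance(value, str):
                setattr(self, f.name, Path(value))
        # Validation runs after conversion, so validate() sees Path objects.
        self.validate()

    @abstractmethod
    def validate(self) -> None:
        """Subclasses raise InvalidConfigError for invalid values."""


@dataclass
class DatasetConfig(SketchConfig):
    """Hypothetical concrete configuration."""

    root: Path
    batch_size: int = 8

    def validate(self) -> None:
        if self.batch_size <= 0:
            raise InvalidConfigError("batch_size must be positive")
```

With this sketch, `DatasetConfig(root="data/images")` ends up holding a `Path` in `root`, while `DatasetConfig(root="data", batch_size=0)` raises the stand-in error during construction.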
validate()
abstractmethod
Validate the configuration.
Subclasses should implement this method to check that all configuration values are valid and meet requirements.
Raises:
| Type | Description |
|---|---|
| InvalidConfigError | If configuration is invalid. |
validate_params(params, cls)
Validate that parameters match a class constructor signature.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | dict[str, Any] | Dictionary of parameter names to values. | required |
| cls | type | The class whose constructor signature to validate against. | required |
Raises:
| Type | Description |
|---|---|
| InvalidConfigError | If params contain unexpected keys. |
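One way to check a parameter dictionary against a constructor signature is with the standard library's `inspect.signature`. The sketch below is an assumption about how such a check could work, not the framework's actual code: the free function `validate_params`, the example class `Resize`, and the stand-in `InvalidConfigError` are hypothetical, and the choice to accept everything when the constructor takes `**kwargs` is a design guess.

```python
import inspect
from typing import Any


class InvalidConfigError(ValueError):
    """Stand-in for the framework's error type (assumed to be an exception)."""


def validate_params(params: dict[str, Any], cls: type) -> None:
    """Raise InvalidConfigError if params has keys the constructor does not accept."""
    signature = inspect.signature(cls)  # constructor parameters, without self
    parameters = signature.parameters.values()
    # Assumption: a constructor with **kwargs accepts any key.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in parameters):
        return
    unexpected = set(params) - set(signature.parameters)
    if unexpected:
        raise InvalidConfigError(
            f"Unexpected parameters for {cls.__name__}: {sorted(unexpected)}"
        )


class Resize:
    """Hypothetical class used only to demonstrate the check."""

    def __init__(self, width: int, height: int = 224) -> None:
        self.width = width
        self.height = height
```

Here `validate_params({"width": 64}, Resize)` passes silently, while adding a `"depth"` key raises the stand-in error because `Resize` has no such constructor parameter.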
AbstractController
Bases: ABC
Abstract base class for pipeline controllers.
Controllers manage the execution flow of pipeline processes, including progress reporting and process lifecycle management.
add_process(process)
abstractmethod
Add a process to the controller's execution queue.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| process | PipelineProcess | The pipeline process to add. | required |
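A concrete controller could keep its processes in a simple first-in, first-out queue. The following is a sketch only: `SketchController` stands in for `AbstractController`, and `QueueController`, its `run()` method, the progress-line format, and the two example processes are all hypothetical. Only `add_process` is taken from the documentation above.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchController(ABC):
    """Hypothetical stand-in for AbstractController."""

    @abstractmethod
    def add_process(self, process: Any) -> None:
        """Add a process to the execution queue."""


class QueueController(SketchController):
    """Runs queued processes in insertion order and records progress lines."""

    def __init__(self) -> None:
        self.queue: list[Any] = []

    def add_process(self, process: Any) -> None:
        self.queue.append(process)

    def run(self) -> list[str]:
        # Hypothetical method: the documentation does not name a run method.
        log = []
        for index, process in enumerate(self.queue, start=1):
            label = f"[{index}/{len(self.queue)}] {type(process).__name__}"
            if process.skip():
                log.append(f"{label}: skipped")
                continue
            process.execute()
            log.append(f"{label}: done")
        return log


class LoadData:
    """Example process that always runs."""

    def __init__(self) -> None:
        self.ran = False

    def skip(self) -> bool:
        return False

    def execute(self) -> None:
        self.ran = True


class TrainModel:
    """Example process that always asks to be skipped."""

    def skip(self) -> bool:
        return True

    def execute(self) -> None:
        raise RuntimeError("should not run when skipped")
```

Adding a `LoadData` and then a `TrainModel` instance and calling `run()` executes the first and skips the second, in the order they were added.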
Permanence
Bases: ABC
Base class for objects that persist through the entire pipeline lifecycle.
Permanences are stateful resources that:
- Store structured data needed throughout pipeline execution
- Are accessed by processes via controller.get_permanence(name)
- Have managed lifecycles with hooks
- Are extensible through abstraction
Example
    class MyDataPermanence(Permanence):
        def __init__(self, path: Path):
            self.data = self._load_data(path)

        def initialize(self) -> None:
            # Setup phase - called before any process runs
            self._validate_data()

        def checkpoint(self) -> None:
            # Save intermediate state
            self._save_checkpoint()

        def cleanup(self) -> None:
            # Release resources
            del self.data
checkpoint()
Save intermediate state during pipeline execution.
Called at configurable checkpoints during execution. Use for saving progress, creating backups, or logging state.
Raises:
| Type | Description |
|---|---|
| Exception | If checkpointing fails. |
cleanup()
abstractmethod
Cleans up data from RAM or VRAM.
Called after all processes complete or on error. Should release any held resources (memory, file handles, connections).
Raises:
| Type | Description |
|---|---|
| Exception | If cleanup fails. |
get_state()
Get serializable state for inspection or debugging.
Returns a dictionary representation of the permanence state. Useful for logging, debugging, or state inspection.
Returns:
| Type | Description |
|---|---|
| dict[str, Any] | Dictionary containing permanence state. |
initialize()
Initialize the permanence before pipeline execution.
Called once after all permanences are constructed but before any process runs. Use for validation, resource allocation, or setup that depends on other permanences.
Raises:
| Type | Description |
|---|---|
| Exception | If initialization fails. |
is_initialized()
Check if the permanence has been initialized.
Override to provide custom logic for determining initialization state.
Returns:
| Type | Description |
|---|---|
| bool | True if initialized, False otherwise. |
validate()
Validate the permanence state.
Called to verify permanence is in valid state. Use for health checks, data validation, or consistency checks.
Raises:
| Type | Description |
|---|---|
| Exception | If validation fails. |
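The order in which the hooks documented above fire can be sketched with a small driver. Everything here is illustrative and assumed: `SketchPermanence` stands in for `Permanence`, `run_with_lifecycle` is a hypothetical helper rather than a framework function, and checkpointing after every step is only one possible reading of "configurable checkpoints".

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class SketchPermanence(ABC):
    """Hypothetical stand-in for Permanence with no-op default hooks."""

    def initialize(self) -> None:
        pass

    def checkpoint(self) -> None:
        pass

    @abstractmethod
    def cleanup(self) -> None:
        """Release held resources."""


class Counter(SketchPermanence):
    """Example permanence that records which hooks were called."""

    def __init__(self) -> None:
        self.total = 0
        self.events: list[str] = []

    def initialize(self) -> None:
        self.events.append("initialize")

    def checkpoint(self) -> None:
        self.events.append(f"checkpoint:{self.total}")

    def cleanup(self) -> None:
        self.events.append("cleanup")

    def get_state(self) -> dict[str, Any]:
        return {"total": self.total}


Step = Callable[[dict[str, SketchPermanence]], None]


def run_with_lifecycle(permanences: dict[str, SketchPermanence], steps: list[Step]) -> None:
    # Initialize every permanence before any step runs.
    for permanence in permanences.values():
        permanence.initialize()
    try:
        for step in steps:
            step(permanences)
            # Assumption: checkpoint after each step.
            for permanence in permanences.values():
                permanence.checkpoint()
    finally:
        # Cleanup runs after all steps complete or when a step raises.
        for permanence in permanences.values():
            permanence.cleanup()
```

The `try`/`finally` block is what lets cleanup run both after normal completion and on error, matching the behavior described for `cleanup()`.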
PipelineProcess
Bases: ABC
Abstract base class for pipeline processes.
A process represents a unit of work within the pipeline that:
- Can access permanences via a controller/manager
- Can be skipped based on conditions
- Executes its main logic via execute()
- Can be forced to run via the force parameter
Example
    class MyProcess(PipelineProcess):
        def __init__(self, controller, force: bool):
            super().__init__(controller, force)
            self.data = controller.get_permanence("data")

        def skip(self) -> bool:
            return not self.force and self.data.is_cached()

        def execute(self) -> None:
            # Process logic here
            self.data.process()
__init__(controller, force)
Initialize the process.
When overriding this method, make sure to call super().__init__(controller, force).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| controller | Any | The controller/manager providing access to permanences. Should have a get_permanence(name: str) method. | required |
| force | bool | If True, process should run even if outputs exist. | required |
execute()
abstractmethod
Execute the process logic.
This method should contain the main work of the process. It should handle any errors internally or let them propagate.
Raises:
| Type | Description |
|---|---|
| Exception | Any exceptions during execution. |
skip()
abstractmethod
Determine if the process should be skipped.
Returns:
| Type | Description |
|---|---|
| bool | True if the process should be skipped, False otherwise. |

A common reason to skip, as in the MyProcess example above, is that the outputs already exist and force is not set.
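How `skip()`, `execute()`, and `force` interact can be shown with a self-contained sketch. The names below are hypothetical stand-ins, not framework classes: `SketchProcess` mirrors the documented `PipelineProcess` interface, `DictController` provides only the `get_permanence(name)` method mentioned above, and `run_process` is an assumed helper for the skip-then-execute decision.

```python
from abc import ABC, abstractmethod
from typing import Any


class SketchProcess(ABC):
    """Hypothetical stand-in for PipelineProcess."""

    def __init__(self, controller: Any, force: bool) -> None:
        self.controller = controller
        self.force = force

    @abstractmethod
    def execute(self) -> None:
        """Do the main work."""

    @abstractmethod
    def skip(self) -> bool:
        """Return True if the process should not run."""


class DictController:
    """Minimal controller exposing get_permanence(name)."""

    def __init__(self, permanences: dict[str, Any]) -> None:
        self._permanences = permanences

    def get_permanence(self, name: str) -> Any:
        return self._permanences[name]


class CacheResults(SketchProcess):
    """Example process that skips when a cached result already exists."""

    def __init__(self, controller: Any, force: bool) -> None:
        super().__init__(controller, force)
        self.cache = controller.get_permanence("cache")

    def skip(self) -> bool:
        return not self.force and "result" in self.cache

    def execute(self) -> None:
        self.cache["result"] = self.cache.get("result", 0) + 1


def run_process(process: SketchProcess) -> bool:
    """Run the process unless it asks to be skipped; report whether it ran."""
    if process.skip():
        return False
    process.execute()
    return True
```

In this sketch the first unforced run executes because the cache is empty, a second unforced run is skipped, and a run with `force=True` executes again despite the cached result.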
ProcessConfig
dataclass
Bases: AbstractConfig
Base configuration for pipeline processes.
Attributes:
| Name | Type | Description |
|---|---|---|
| force | bool | If True, forces execution even if outputs exist. |
__post_init__()
Post-initialization hook.
Applies path conversions and runs validation.
validate()
Validate the process configuration.
Raises:
| Type | Description |
|---|---|
| InvalidConfigError | If force is not a boolean. |
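The boolean check described for `validate()` could be as small as an `isinstance` test. This sketch assumes a stand-in `InvalidConfigError` and a hypothetical `SketchProcessConfig` that does not inherit from the real `AbstractConfig`; whether the real `force` field has a default value is not stated in the documentation, so none is given here.

```python
from dataclasses import dataclass


class InvalidConfigError(ValueError):
    """Stand-in for the framework's error type (assumed to be an exception)."""


@dataclass
class SketchProcessConfig:
    """Hypothetical stand-in for ProcessConfig."""

    force: bool

    def __post_init__(self) -> None:
        # Dataclasses do not enforce annotations at runtime, so check explicitly.
        self.validate()

    def validate(self) -> None:
        if not isinstance(self.force, bool):
            raise InvalidConfigError(
                f"force must be a bool, got {type(self.force).__name__}"
            )
```

Because the check uses `isinstance(..., bool)`, truthy values such as `"yes"` or `1` are rejected rather than silently coerced.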
validate_params(params, cls)
Validate that parameters match a class constructor signature.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| params | dict[str, Any] | Dictionary of parameter names to values. | required |
| cls | type | The class whose constructor signature to validate against. | required |
Raises:
| Type | Description |
|---|---|
| InvalidConfigError | If params contain unexpected keys. |