Manager

Pipeline and artifact manager

Data:

CONFIGURATION_FILE

The expected configuration filename.

TIMESTAMP_FORMAT

The datetime format string used for timestamps in pipeline run reference names.

Classes:

Manager([database_path, cache_path, ...])

curifactory.experimental.manager.CONFIGURATION_FILE = 'curifactory_config.json'

The expected configuration filename.

class curifactory.experimental.manager.Manager(database_path='data/store.db', cache_path='data/cache', reports_path='reports', default_pipeline_modules=None, **additional_configuration)

Methods:

add_pipeline_to_ref_names(module_str, ...)

Add the pipeline to the ref names dictionary under all logical names.

db_connection()

default_manager()

Try to find a configuration file; if none is found, fall back to default settings.

divide_reference_parts(ref_str)

ensure_dir_paths()

ensure_store_tables()

ensure_sys_path()

from_config(config)

get_artifact_obj_repr(artifact)

get_manager()

get_next_pipeline_run_number(pipeline)

get_reference_name(pipeline)

Get the reference name of this run in the pipeline registry.

get_str_timestamp(pipeline)

Convert the manager's run timestamp into a string representation.

import_pipelines_from_module(module_str)

init_cf_logging()

init_logging()

init_root_logging()

The CLI sets this.

load_artifact_metadata_by_id(db_id, artifact)

load_default_pipeline_imports()

load_jinja_env()

pipeline_keys_matching(prefix)

quietly_import_module(module_str)

record_artifact(artifact)

record_pipeline_run(pipeline)

record_pipeline_run_completion(pipeline)

record_pipeline_run_target(pipeline, target)

record_stage(stage)

record_stage_artifact_input(stage, artifact, ...)

record_stage_completion(stage)

record_stage_dependency(stage, dependency_stage)

record_stage_start(stage)

resolve_reference(ref_str[, types])

search_for_artifact_generating_run(artifact_id)

search_for_db_artifact(artifact)

Attributes:

additional_configuration

Anything in the additional configuration can be accessed or referenced in stages.

config

imported_module_names

List of module strings that have been successfully imported and parsed for pipelines.

logger

parameterized_pipelines

Dictionary of initialized (parameterized) pipelines keyed by dataclass type.

pipeline_ref_names

Dictionary of possible names/references to use to refer to specific pipelines.

pipelines

Dictionary of pipeline dataclasses keyed by dataclass name.

runs

Parameters:
  • database_path (str)

  • cache_path (str)

  • reports_path (str)

  • default_pipeline_modules (list[str])

  • additional_configuration (dict[str, Any])

add_pipeline_to_ref_names(module_str, attr_name, pipeline)

Add the pipeline to the ref names dictionary under all logical names.

Parameters:
  • module_str (str)

  • attr_name (str)

additional_configuration: dict[str, Any]

Anything in the additional configuration can be accessed or referenced in stages.

property config: dict[str, Any]
currently_recording: bool
db_connection()
static default_manager()

Try to find a configuration file; if none is found, fall back to default settings.
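
The fallback behaviour described above can be illustrated with a minimal sketch. This is not the library's implementation: the helper name `load_config_or_defaults` is hypothetical, and the assumption that the JSON keys mirror the constructor's keyword names is mine, not something this page confirms. The filename and default values are copied from `CONFIGURATION_FILE` and the documented `Manager` signature.

```python
import json
import tempfile
from pathlib import Path

# Filename copied from the documented CONFIGURATION_FILE constant.
CONFIGURATION_FILE = "curifactory_config.json"

# Defaults copied from the documented Manager constructor signature.
# ASSUMPTION: the config file uses these same key names.
DEFAULTS = {
    "database_path": "data/store.db",
    "cache_path": "data/cache",
    "reports_path": "reports",
    "default_pipeline_modules": None,
}


def load_config_or_defaults(directory: str = ".") -> dict:
    """Hypothetical helper: read the config file if present, else use defaults."""
    config = dict(DEFAULTS)
    config_path = Path(directory) / CONFIGURATION_FILE
    if config_path.is_file():
        with config_path.open() as handle:
            # Values in the file override the defaults key by key.
            config.update(json.load(handle))
    return config


# Demonstrate both branches inside a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    missing = load_config_or_defaults(tmp)  # no file yet: pure defaults
    (Path(tmp) / CONFIGURATION_FILE).write_text(
        json.dumps({"cache_path": "scratch/cache"})
    )
    found = load_config_or_defaults(tmp)  # file present: one value overridden
```

Merging the file over a copy of the defaults keeps partially specified config files usable, which is one plausible reading of "assume some defaults".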

default_pipeline_modules: list[str]
divide_reference_parts(ref_str)
Parameters:

ref_str (str)

Return type:

dict[str, str]

ensure_dir_paths()
ensure_store_tables()
ensure_sys_path()
static from_config(config)
Parameters:

config (dict)

get_artifact_obj_repr(artifact)
Return type:

str

classmethod get_manager()
get_next_pipeline_run_number(pipeline)
Return type:

int

get_reference_name(pipeline)

Get the reference name of this run in the pipeline registry.

The format for this name is [pipeline_name]_[run_number]_[timestamp].

Return type:

str
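
As an illustration of the `[pipeline_name]_[run_number]_[timestamp]` format, the sketch below builds and splits such a name outside the library. The helper names are hypothetical, and it assumes the run number is written without padding and that the timestamp uses `TIMESTAMP_FORMAT`; the actual method may differ in those details.

```python
from datetime import datetime

# Value copied from the documented TIMESTAMP_FORMAT constant.
TIMESTAMP_FORMAT = "%Y-%m-%d-T%H%M%S"


def build_reference_name(pipeline_name: str, run_number: int, started: datetime) -> str:
    """Hypothetical helper: join the three parts with underscores."""
    # ASSUMPTION: the run number is not zero-padded.
    return f"{pipeline_name}_{run_number}_{started.strftime(TIMESTAMP_FORMAT)}"


def split_reference_name(reference_name: str) -> list[str]:
    """Hypothetical helper: recover the three parts from a reference name."""
    # Split from the right, since a pipeline name may itself contain
    # underscores while the run number and timestamp do not.
    return reference_name.rsplit("_", 2)


name = build_reference_name("train_model", 3, datetime(2024, 3, 5, 14, 7, 9))
parts = split_reference_name(name)
print(name)   # train_model_3_2024-03-05-T140709
print(parts)  # ['train_model', '3', '2024-03-05-T140709']
```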

get_str_timestamp(pipeline)

Convert the manager’s run timestamp into a string representation.

Return type:

str

import_pipelines_from_module(module_str)
Parameters:

module_str (str)

imported_module_names

List of module strings that have been successfully imported and parsed for pipelines.

init_cf_logging()
init_logging()
init_root_logging()

The CLI sets this.

load_artifact_metadata_by_id(db_id, artifact)
Return type:

bool

load_default_pipeline_imports()
load_jinja_env()
property logger
logging_initialized: bool
parameterized_pipelines

Dictionary of initialized (parameterized) pipelines keyed by dataclass type.

pipeline_keys_matching(prefix)
Parameters:

prefix (str)

Return type:

list[str]

pipeline_ref_names

Dictionary of possible names/references to use to refer to specific pipelines.

pipelines

Dictionary of pipeline dataclasses keyed by dataclass name.

project_root: str
quietly_import_module(module_str)
Parameters:

module_str (str)

record_artifact(artifact)
record_pipeline_run(pipeline)
record_pipeline_run_completion(pipeline)
record_pipeline_run_target(pipeline, target)
record_stage(stage)
record_stage_artifact_input(stage, artifact, arg_index, arg_name)
record_stage_completion(stage)
record_stage_dependency(stage, dependency_stage)
record_stage_start(stage)
repr_functions: dict[type, callable]
resolve_reference(ref_str, types=None)
Parameters:
  • ref_str (str)

  • types (list[str])

property runs: DataFrame
search_for_artifact_generating_run(artifact_id)
search_for_db_artifact(artifact)
curifactory.experimental.manager.TIMESTAMP_FORMAT = '%Y-%m-%d-T%H%M%S'

The datetime format string used for timestamps in pipeline run reference names.
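
To show what this format string produces, here is a short sketch using only the standard library. The helper names `format_timestamp` and `parse_timestamp` are hypothetical and not part of curifactory; only the format string itself comes from this page.

```python
from datetime import datetime

# Value copied from the documented TIMESTAMP_FORMAT constant.
TIMESTAMP_FORMAT = "%Y-%m-%d-T%H%M%S"


def format_timestamp(moment: datetime) -> str:
    """Hypothetical helper: render a datetime with the documented format."""
    return moment.strftime(TIMESTAMP_FORMAT)


def parse_timestamp(text: str) -> datetime:
    """Hypothetical helper: parse a string produced by format_timestamp."""
    return datetime.strptime(text, TIMESTAMP_FORMAT)


example = datetime(2024, 3, 5, 14, 7, 9)
stamp = format_timestamp(example)
print(stamp)  # 2024-03-05-T140709

# The format has second resolution, so this value round-trips exactly.
round_trip = parse_timestamp(stamp)
```

Because the pattern is zero-padded and ordered from year to second, the resulting strings sort chronologically when compared as plain text.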