Utils

Helper and utility functions for the library.

Data:

CONFIGURATION_FILE

The expected configuration filename.

EDITORS

The list of possible editor exec names to test for getting a valid text editor.

TIMESTAMP_FORMAT

The datetime format string used for timestamps in experiment run reference names.

Classes:

PrefixedLogFactory(original_factory, prefix)

Note that we have to use this to prevent weird recursion issues.

Functions:

check_git_dirty_workingdir()

Checks if git working directory is dirty or not.

get_command_output(cmd[, silent])

Runs the command passed and returns the full string output of the command (minus the final newline).

get_conda_env()

Returns printed output from running conda env export --from-history command.

get_conda_list()

Returns printed output from running conda list command.

get_configuration()

Load the configuration file if available, with defaults for any keys not found.

get_current_commit()

Returns printed output from running git rev-parse HEAD command.

get_editor()

Returns a text editor to use for richer text entry, such as in providing experiment notes.

get_os()

Get the current OS name and version.

get_pip_freeze()

Returns printed output from running pip freeze command.

get_py_opening_comment(lines)

Parse the passed lines of a Python script file for a top-of-file docstring.

human_readable_mem_usage(byte_count)

Takes the given byte count and returns a nicely formatted string that includes the suffix (KB/MB/GB).

human_readable_time(seconds)

Takes the given time in seconds and returns a nicely formatted string that includes the suffix.

init_logging([log_path, level, log_errors, …])

Sets up logging configuration, including the associated file output.

preview_object(object)

Get a small string representation of an object that fits nicely in a single line and includes shape information if relevant.

run_command(cmd)

Prints output from running a command as it occurs.

set_logging_prefix(prefix)

Set the prefix content of the logger, which is incorporated in the log formatter.

curifactory.utils.CONFIGURATION_FILE = 'curifactory_config.json'

The expected configuration filename.

curifactory.utils.EDITORS = ['vim', 'nvim', 'emacs', 'nano', 'vi']

The list of possible editor exec names to test for getting a valid text editor.

class curifactory.utils.PrefixedLogFactory(original_factory, prefix)

Note that we have to use this to prevent weird recursion issues.

My understanding is that because logging lives in the global context, after many tests the old_factory keeps getting set to the previous new_factory, and you end up with a massive chain of function calls. With this class-based approach, we can check whether the factory is already an instance of this class and, if so, just update its prefix.

https://stackoverflow.com/questions/59585861/using-logrecordfactory-in-python-to-add-custom-fields-for-logging
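The chaining problem described above can be illustrated with a minimal sketch built only on the standard logging module. The names PrefixedFactorySketch and set_prefix_sketch are hypothetical stand-ins, not the library's own code; the sketch only assumes that the factory wraps the previous factory and stores the prefix as an attribute on each record.

```python
import logging


class PrefixedFactorySketch:
    """Hypothetical record factory that wraps another factory and tags records with a prefix."""

    def __init__(self, original_factory, prefix):
        self.original_factory = original_factory
        self.prefix = prefix

    def __call__(self, *args, **kwargs):
        # Build the record with the wrapped factory, then attach the prefix.
        record = self.original_factory(*args, **kwargs)
        record.prefix = self.prefix
        return record


def set_prefix_sketch(prefix: str) -> None:
    """Install the factory once; on later calls only update the stored prefix."""
    current = logging.getLogRecordFactory()
    if isinstance(current, PrefixedFactorySketch):
        # Already wrapped: update in place so wrappers never stack up.
        current.prefix = prefix
    else:
        logging.setLogRecordFactory(PrefixedFactorySketch(current, prefix))
```

Calling set_prefix_sketch repeatedly leaves exactly one wrapper around the original factory, which is the property the note above is after.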

curifactory.utils.TIMESTAMP_FORMAT = '%Y-%m-%d-T%H%M%S'

The datetime format string used for timestamps in experiment run reference names.
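As an illustration of what this format produces, the following sketch applies the documented format string with the standard datetime module. The helper name format_timestamp is hypothetical and only exists for this example.

```python
from datetime import datetime

# Value copied from the documented constant above.
TIMESTAMP_FORMAT = "%Y-%m-%d-T%H%M%S"


def format_timestamp(moment: datetime) -> str:
    """Hypothetical helper: render a datetime using the documented format."""
    return moment.strftime(TIMESTAMP_FORMAT)


example = format_timestamp(datetime(2023, 4, 5, 13, 7, 9))
print(example)  # 2023-04-05-T130709
```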

curifactory.utils.check_git_dirty_workingdir() → bool

Checks if git working directory is dirty or not. This is used to indicate potential reproducibility problems in the report and console output.
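One common way to make such a check is to look at the output of git status --porcelain, which is empty when there are no changes. The sketch below assumes that approach; the documentation here does not say which git command the library actually runs, and both function names are hypothetical.

```python
import subprocess


def porcelain_is_dirty(porcelain_output: str) -> bool:
    """Return True if `git status --porcelain` listed any changed or untracked paths."""
    return porcelain_output.strip() != ""


def git_workingdir_dirty_sketch() -> bool:
    """Assumed approach: run git and inspect its porcelain output."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    # Outside a git repository stdout is empty, so this sketch reports "not dirty".
    return porcelain_is_dirty(result.stdout)
```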

curifactory.utils.get_command_output(cmd: list, silent: bool = False) → str

Runs the command passed and returns the full string output of the command (minus the final newline).

Parameters

cmd – Either a string command or array of strings, as one would pass to subprocess.run()
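A minimal sketch of this behavior using subprocess.run is shown below. It is not the library's implementation: the name command_output_sketch is hypothetical, merging stderr into stdout is an assumption, and the silent flag is left out because its effect is not described here.

```python
import subprocess
import sys


def command_output_sketch(cmd) -> str:
    """Run a command and return its output without the final newline."""
    result = subprocess.run(
        cmd,
        shell=isinstance(cmd, str),  # a single string is handed to the shell
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: fold stderr into the returned text
        text=True,
    )
    output = result.stdout
    # Remove only the final newline, keeping any interior line breaks.
    if output.endswith("\n"):
        output = output[:-1]
    return output


greeting = command_output_sketch([sys.executable, "-c", "print('hello')"])
```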

curifactory.utils.get_conda_env() → str

Returns printed output from running conda env export --from-history command.

curifactory.utils.get_conda_list() → str

Returns printed output from running conda list command.

curifactory.utils.get_configuration() → dict

Load the configuration file if available, with defaults for any keys not found. The config file should be “curifactory_config.json” in the project root.

The defaults are:

{
    "experiments_module_name": "experiments",
    "params_module_name": "params",
    "manager_cache_path": "data/",
    "cache_path": "data/cache",
    "runs_path": "data/runs",
    "logs_path": "logs/",
    "notebooks_path": "notebooks/",
    "reports_path": "reports/",
    "report_css_path": "reports/style.css",
}

Returns

the dictionary of configuration keys/values.
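The "defaults for any keys not found" behavior can be sketched as a dictionary merge. This is an illustration under assumptions, not the library's code: load_configuration_sketch and its project_root parameter are hypothetical, and the defaults are copied from the list above.

```python
import json
import os
import tempfile

CONFIGURATION_FILE = "curifactory_config.json"

# Defaults copied from the documentation above.
DEFAULTS = {
    "experiments_module_name": "experiments",
    "params_module_name": "params",
    "manager_cache_path": "data/",
    "cache_path": "data/cache",
    "runs_path": "data/runs",
    "logs_path": "logs/",
    "notebooks_path": "notebooks/",
    "reports_path": "reports/",
    "report_css_path": "reports/style.css",
}


def load_configuration_sketch(project_root: str = ".") -> dict:
    """Start from the defaults and overlay whatever keys the file provides."""
    config = dict(DEFAULTS)
    path = os.path.join(project_root, CONFIGURATION_FILE)
    if os.path.exists(path):
        with open(path) as infile:
            config.update(json.load(infile))
    return config


# Demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as root:
    untouched = load_configuration_sketch(root)  # no file yet: all defaults
    with open(os.path.join(root, CONFIGURATION_FILE), "w") as outfile:
        json.dump({"cache_path": "scratch/cache"}, outfile)
    merged = load_configuration_sketch(root)  # one key overridden
```

In this sketch a partial configuration file overrides only the keys it names, and every other key keeps its default.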

curifactory.utils.get_current_commit() → str

Returns printed output from running git rev-parse HEAD command.

curifactory.utils.get_editor() → str

Returns a text editor to use for richer text entry, such as in providing experiment notes.

e.g. "vim"
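Testing a list of executable names, as the EDITORS description suggests, could look like the sketch below. It assumes shutil.which is used to probe for each name; the real function may consult other sources first (for example an environment variable), which is not documented here. find_editor_sketch and its parameters are hypothetical.

```python
import shutil

# Value copied from the documented EDITORS constant.
EDITORS = ["vim", "nvim", "emacs", "nano", "vi"]


def find_editor_sketch(candidates=EDITORS, which=shutil.which):
    """Return the first candidate found on the PATH, or None if none is installed."""
    for name in candidates:
        if which(name) is not None:
            return name
    return None
```

The which parameter exists only so the lookup can be replaced in tests; by default the sketch probes the real PATH.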

curifactory.utils.get_os() → str

Get the current OS name and version.
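For reference, the standard platform module exposes this kind of information. The sketch below is an assumption about how such a string might be assembled; the exact format the library returns is not documented here, and os_description_sketch is a hypothetical name.

```python
import platform


def os_description_sketch() -> str:
    """Combine the OS name and release into one string, e.g. 'Linux <release>'."""
    return f"{platform.system()} {platform.release()}"
```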

curifactory.utils.get_pip_freeze() → str

Returns printed output from running pip freeze command.

curifactory.utils.get_py_opening_comment(lines: list) → str

Parse the passed lines of a Python script file for a top-of-file docstring. This is used to get parameter and experiment file descriptions, so be sure to always comment your code!
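One way to extract a top-of-file docstring is with the standard ast module, sketched below. This is not necessarily how the library parses the lines; the sketch assumes the lines keep their trailing newlines (as returned by readlines()) and returns an empty string when no docstring is found, which is also an assumption. opening_docstring_sketch is a hypothetical name.

```python
import ast


def opening_docstring_sketch(lines: list) -> str:
    """Return the module docstring found in the given source lines, or ''."""
    source = "".join(lines)
    try:
        module = ast.parse(source)
    except SyntaxError:
        # Unparseable source: treat it as having no description.
        return ""
    return ast.get_docstring(module) or ""


script_lines = [
    '"""Trains a baseline model.\n',
    "\n",
    "Uses the default parameter set.\n",
    '"""\n',
    "import os\n",
]
description = opening_docstring_sketch(script_lines)
```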

curifactory.utils.human_readable_mem_usage(byte_count: int) → str

Takes the given byte count and returns a nicely formatted string that includes the suffix (KB/MB/GB).

Parameters

byte_count (int) – The number of bytes to convert into KB/MB/GB.
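The conversion can be sketched as repeated division by 1024. The base of 1024, the two decimal places, and the spacing are all assumptions made for illustration; the library's exact output format is not shown here. readable_bytes_sketch is a hypothetical name.

```python
def readable_bytes_sketch(byte_count: int) -> str:
    """Format a byte count with a B/KB/MB/GB suffix (assumes 1024-based units)."""
    value = float(byte_count)
    for suffix in ("B", "KB", "MB"):
        if value < 1024:
            return f"{value:.2f} {suffix}"
        value /= 1024
    # Anything that survives three divisions is reported in gigabytes.
    return f"{value:.2f} GB"
```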

curifactory.utils.human_readable_time(seconds: float) → str

Takes the given time in seconds and returns a nicely formatted string that includes the suffix.

Parameters

seconds (float) – The time in seconds to convert.
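A comparable sketch for durations is shown below. The thresholds, the suffix letters, and the two-decimal formatting are assumptions; the library's actual output format is not documented here. readable_time_sketch is a hypothetical name.

```python
def readable_time_sketch(seconds: float) -> str:
    """Format a duration in seconds, minutes, or hours (assumed s/m/h suffixes)."""
    if seconds < 60:
        return f"{seconds:.2f}s"
    if seconds < 3600:
        return f"{seconds / 60:.2f}m"
    return f"{seconds / 3600:.2f}h"
```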

curifactory.utils.init_logging(log_path: Optional[str] = None, level=20, log_errors: bool = False, include_process: bool = False, no_color: bool = False, quiet: bool = False, plain: bool = False, disable_non_cf_loggers: bool = True)

Sets up logging configuration, including the associated file output.

Parameters
  • log_path (str) – Folder to store output logs in. If None, only log to console.

  • level – The logging level to output. The default of 20 corresponds to logging.INFO.

  • log_errors (bool) – Whether to include error messages in the log output or not.

  • include_process (bool) – Whether to include the PID prefix value in the logger. This is mostly used with the --parallel flag, to help track which log message came from which process.

  • no_color (bool) – Suppress colors in console output.

  • quiet (bool) – Suppress all console log output.

  • plain (bool) – Output plain text log rather than rich output.

  • disable_non_cf_loggers (bool) – Hide loggers from other libraries. This is true by default because some libraries can generate a lot of noise.
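To make the interplay of log_path, level, and quiet concrete, here is a minimal sketch built only on the standard logging module. It is not the library's implementation: init_logging_sketch, the logger name, the sketch.log filename, and the format string are all hypothetical, and the remaining parameters are left out.

```python
import logging
import os
import tempfile


def init_logging_sketch(log_path=None, level=logging.INFO, quiet=False):
    """Configure a logger with optional console and file output."""
    logger = logging.getLogger("logging_sketch")
    logger.setLevel(level)
    logger.propagate = False
    # Drop handlers from earlier calls so repeated setup does not duplicate output.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    formatter = logging.Formatter("%(asctime)s [%(levelname)s] %(message)s")
    if not quiet:
        console = logging.StreamHandler()
        console.setFormatter(formatter)
        logger.addHandler(console)
    if log_path is not None:
        os.makedirs(log_path, exist_ok=True)
        file_handler = logging.FileHandler(os.path.join(log_path, "sketch.log"))
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger


# Demonstration: file output only, at the default INFO level (20).
with tempfile.TemporaryDirectory() as folder:
    logger = init_logging_sketch(log_path=folder, quiet=True)
    logger.info("visible at INFO")
    logger.debug("hidden below INFO")
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()
    with open(os.path.join(folder, "sketch.log")) as infile:
        logged_text = infile.read()
```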

curifactory.utils.preview_object(object: any) → str

Get a small string representation of an object that fits nicely in a single line and includes shape information if relevant.
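The idea of a single-line preview with shape information can be sketched as follows. The output layout, the 50-character limit, and the fallback to a length are assumptions chosen for illustration; preview_sketch is a hypothetical name and not the library's function.

```python
def preview_sketch(obj, max_length: int = 50) -> str:
    """Build a one-line preview: type name, truncated repr, and shape or length if present."""
    text = repr(obj).replace("\n", " ")
    if len(text) > max_length:
        text = text[: max_length - 3] + "..."
    preview = f"({type(obj).__name__}) {text}"
    shape = getattr(obj, "shape", None)
    if shape is not None:
        # Array-like objects (anything exposing a .shape attribute).
        preview += f" shape: {shape}"
    elif hasattr(obj, "__len__"):
        preview += f" len: {len(obj)}"
    return preview
```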

curifactory.utils.run_command(cmd: list)

Prints output from running a command as it occurs.

Parameters

cmd – Either a string command or array of strings, as one would pass to subprocess.run()
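Printing output "as it occurs" typically means reading the process's output line by line rather than waiting for it to finish. The sketch below shows that pattern with subprocess.Popen; it is an assumption about the approach, and both the name run_command_sketch and the returned exit code are hypothetical additions.

```python
import subprocess
import sys


def run_command_sketch(cmd) -> int:
    """Run a command, echoing each output line as soon as it arrives."""
    with subprocess.Popen(
        cmd,
        shell=isinstance(cmd, str),  # a single string is handed to the shell
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    ) as process:
        for line in process.stdout:
            print(line, end="")
    # Leaving the with-block waits for the process, so returncode is set.
    return process.returncode


exit_code = run_command_sketch([sys.executable, "-c", "print('step 1'); print('step 2')"])
```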

curifactory.utils.set_logging_prefix(prefix: str)

Set the prefix content of the logger, which is incorporated in the log formatter. Currently this is only used to include the PID when running in parallel.