Utils

Helper and utility functions for the library.

Data:

CONFIGURATION_FILE

The expected configuration filename.

EDITORS

The list of possible editor exec names to test for getting a valid text editor.

TIMESTAMP_FORMAT

The datetime format string used for timestamps in experiment run reference names.

Classes:

PrefixedLogFactory(original_factory, prefix)

Note that we have to use this to prevent weird recursion issues.

StreamToLogger(level)

Functions:

check_git_dirty_workingdir()

Checks if git working directory is dirty or not.

get_command_output(cmd[, silent])

Runs the command passed and returns the full string output of the command (minus the final newline).

get_conda_env()

Returns printed output from running conda env export --from-history command.

get_conda_list()

Returns printed output from running conda list command.

get_configuration()

Load the configuration file if available, with defaults for any keys not found.

get_current_commit()

Returns printed output from running git rev-parse HEAD command.

get_editor()

Returns a text editor to use for richer text entry, such as in providing experiment notes.

get_os()

Get the current OS name and version.

get_pip_freeze()

Returns printed output from running pip freeze command.

get_py_opening_comment(lines)

Parse the passed lines of a python script file for a top-of-file docstring.

human_readable_mem_usage(byte_count)

Takes the given byte count and returns a nicely formatted string that includes the suffix (KB/MB/GB).

human_readable_time(seconds)

Takes the given time in seconds and returns a nicely formatted string that includes the suffix.

init_logging([log_path, level, log_errors, ...])

Sets up logging configuration, including the associated file output.

preview_object(object)

Get a small string representation of an object that fits nicely in a single line and includes shape information if relevant.

run_command(cmd)

Prints output from running a command as it occurs.

set_logging_prefix(prefix)

Set the prefix content of the logger, which is incorporated in the log formatter.

curifactory.utils.CONFIGURATION_FILE = 'curifactory_config.json'

The expected configuration filename.

curifactory.utils.EDITORS = ['vim', 'nvim', 'emacs', 'nano', 'vi']

The list of possible editor exec names to test for getting a valid text editor.

class curifactory.utils.PrefixedLogFactory(original_factory, prefix)

Note that we have to use this to prevent weird recursion issues.

My understanding is that since logging configuration lives in the global context, each repeated setup (for example across many tests) sets old_factory to the previous new_factory, and you end up with a massive chain of nested factory functions. With the class-based approach used here, we can check whether the current factory is already an instance of this class and, if so, just update the prefix on it.

https://stackoverflow.com/questions/59585861/using-logrecordfactory-in-python-to-add-custom-fields-for-logging
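
The pattern described above can be sketched with the standard logging module. This is only an illustration under assumptions, not the library's implementation: PrefixingRecordFactory and install_prefix are hypothetical names, and attaching the prefix as a record attribute called prefix is an assumed detail.

```python
import logging


class PrefixingRecordFactory:
    """Hypothetical callable factory that tags every log record with a prefix."""

    def __init__(self, original_factory, prefix):
        self.original_factory = original_factory
        self.prefix = prefix

    def __call__(self, *args, **kwargs):
        record = self.original_factory(*args, **kwargs)
        record.prefix = self.prefix  # custom field a formatter can reference
        return record


def install_prefix(prefix):
    """Install the factory once; later calls only update the stored prefix."""
    current = logging.getLogRecordFactory()
    if isinstance(current, PrefixingRecordFactory):
        # Already installed: no new wrapper, so the call chain never grows.
        current.prefix = prefix
    else:
        logging.setLogRecordFactory(PrefixingRecordFactory(current, prefix))
```

Calling install_prefix repeatedly leaves a single wrapper around the original factory, which is the property the note above is after.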

class curifactory.utils.StreamToLogger(level)

Methods:

write(buf)

write(buf)

curifactory.utils.TIMESTAMP_FORMAT = '%Y-%m-%d-T%H%M%S'

The datetime format string used for timestamps in experiment run reference names.
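
As a quick illustration of what this format string produces, the snippet below applies it with the standard datetime module. The format value is copied from the entry above; the example date is arbitrary.

```python
from datetime import datetime

TIMESTAMP_FORMAT = "%Y-%m-%d-T%H%M%S"  # value copied from the entry above

# Format an arbitrary example datetime.
stamp = datetime(2024, 1, 2, 3, 4, 5).strftime(TIMESTAMP_FORMAT)
print(stamp)  # 2024-01-02-T030405

# The same format string parses the timestamp back into a datetime.
parsed = datetime.strptime(stamp, TIMESTAMP_FORMAT)
```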

curifactory.utils.check_git_dirty_workingdir()

Checks if git working directory is dirty or not. This is used to indicate potential reproducibility problems in the report and console output.

Return type:

bool

curifactory.utils.get_command_output(cmd, silent=False)

Runs the command passed and returns the full string output of the command (minus the final newline).

Parameters:
  • cmd (list[str]) – Either a string command or array of strings, as one would pass to subprocess.run()

  • silent (bool)

Return type:

str
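
A rough sketch of the behavior described above, using subprocess.run. The name command_output is hypothetical and this is not the library's code. The silent parameter is not documented here, so the sketch leaves it out. Merging stderr into stdout and sending string commands through the shell are both assumptions.

```python
import subprocess
import sys


def command_output(cmd):
    """Hypothetical sketch: run cmd and return its output minus the final newline."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: fold error output into the result
        text=True,
        shell=isinstance(cmd, str),  # assumption: plain strings go through the shell
    )
    output = result.stdout
    if output.endswith("\n"):
        output = output[:-1]  # drop only the final newline
    return output


# Use the running interpreter so the example does not depend on other tools.
text = command_output([sys.executable, "-c", "print('hello')"])
```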

curifactory.utils.get_conda_env()

Returns printed output from running conda env export --from-history command.

Return type:

str

curifactory.utils.get_conda_list()

Returns printed output from running conda list command.

Return type:

str

curifactory.utils.get_configuration()

Load the configuration file if available, with defaults for any keys not found. The config file should be “curifactory_config.json” in the project root.

The defaults are:

{
    "experiments_module_name": "experiments",
    "params_module_name": "params",
    "manager_cache_path": "data/",
    "cache_path": "data/cache",
    "runs_path": "data/runs",
    "logs_path": "logs/",
    "notebooks_path": "notebooks/",
    "reports_path": "reports/",
    "report_css_path": "reports/style.css",
}

Returns:

the dictionary of configuration keys/values.

Return type:

dict[str, str]
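
The "defaults for any keys not found" behavior can be sketched as a dictionary overlay. This is a minimal illustration, not the library's implementation: load_configuration is a hypothetical name, and the defaults are copied from the listing above.

```python
import json
import os
import tempfile

# Defaults copied from the listing above.
DEFAULTS = {
    "experiments_module_name": "experiments",
    "params_module_name": "params",
    "manager_cache_path": "data/",
    "cache_path": "data/cache",
    "runs_path": "data/runs",
    "logs_path": "logs/",
    "notebooks_path": "notebooks/",
    "reports_path": "reports/",
    "report_css_path": "reports/style.css",
}


def load_configuration(path="curifactory_config.json"):
    """Hypothetical sketch: start from the defaults, then overlay keys found on disk."""
    config = dict(DEFAULTS)
    if os.path.exists(path):
        with open(path) as infile:
            config.update(json.load(infile))
    return config


# Demonstration with a throwaway config file that overrides a single key.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "curifactory_config.json")
    with open(demo_path, "w") as outfile:
        json.dump({"logs_path": "my_logs/"}, outfile)
    merged = load_configuration(demo_path)
    missing = load_configuration(os.path.join(tmp, "absent.json"))
```

Here merged takes logs_path from the file and every other key from the defaults, while missing is simply a copy of the defaults.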

curifactory.utils.get_current_commit()

Returns printed output from running git rev-parse HEAD command.

Return type:

str

curifactory.utils.get_editor()

Returns a text editor to use for richer text entry, such as in providing experiment notes.

e.g. "vim"

Return type:

str
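
One plausible way to test a list of editor executable names is shutil.which, sketched below. The name find_editor is hypothetical, and the library may use a different check, or consult environment variables, which this sketch does not do.

```python
import shutil

EDITORS = ["vim", "nvim", "emacs", "nano", "vi"]  # copied from the EDITORS entry


def find_editor(candidates=EDITORS):
    """Hypothetical sketch: return the first candidate found on PATH, else None."""
    for name in candidates:
        if shutil.which(name) is not None:
            return name
    return None


editor = find_editor()  # for example "vim" when vim is installed
```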

curifactory.utils.get_os()

Get the current OS name and version.

Return type:

str

curifactory.utils.get_pip_freeze()

Returns printed output from running pip freeze command.

Return type:

str

curifactory.utils.get_py_opening_comment(lines)

Parse the passed lines of a python script file for a top-of-file docstring. This is used to get parameter and experiment file descriptions, so be sure to always comment your code!

Parameters:

lines (list[str])

Return type:

str
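
To illustrate the idea of pulling a top-of-file docstring out of a list of lines, the sketch below swaps in the standard ast module. This is a stand-in technique and may differ from how the library parses the lines; opening_docstring is a hypothetical name, and returning an empty string when no docstring exists is an assumption.

```python
import ast


def opening_docstring(lines):
    """Hypothetical sketch: return the module docstring found in lines, or ''."""
    # Normalize so the input works with or without trailing newlines.
    source = "\n".join(line.rstrip("\n") for line in lines)
    return ast.get_docstring(ast.parse(source)) or ""


example = ['"""Trains the baseline model."""\n', "import os\n"]
description = opening_docstring(example)
```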

curifactory.utils.human_readable_mem_usage(byte_count)

Takes the given byte count and returns a nicely formatted string that includes the suffix (KB/MB/GB).

Parameters:

byte_count (int) – The number of bytes to convert into KB/MB/GB.

Return type:

str
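
A minimal sketch of this kind of conversion follows. The name readable_bytes is hypothetical. The 1024-based units, two decimal places, and plain "B" suffix for small values are all assumptions, so the library's actual output format may differ.

```python
def readable_bytes(byte_count):
    """Hypothetical sketch: format a byte count with a B/KB/MB/GB suffix."""
    value = float(byte_count)
    for suffix in ("B", "KB", "MB"):
        if value < 1024:
            return f"{value:.2f}{suffix}"
        value /= 1024  # move up to the next unit
    return f"{value:.2f}GB"  # anything larger stays in GB


print(readable_bytes(1536))  # 1.50KB
```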

curifactory.utils.human_readable_time(seconds)

Takes the given time in seconds and returns a nicely formatted string that includes the suffix.

Parameters:

seconds (float) – The time in seconds to convert.

Return type:

str
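
The same idea applies to durations. In the sketch below, readable_time is a hypothetical name, and the s/m/h suffixes, thresholds, and two-decimal formatting are assumptions rather than the library's documented output.

```python
def readable_time(seconds):
    """Hypothetical sketch: format a duration with an s/m/h suffix."""
    if seconds < 60:
        return f"{seconds:.2f}s"
    if seconds < 3600:
        return f"{seconds / 60:.2f}m"
    return f"{seconds / 3600:.2f}h"


print(readable_time(90))  # 1.50m
```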

curifactory.utils.init_logging(log_path=None, level=20, log_errors=False, include_process=False, no_color=False, quiet=False, plain=False, disable_non_cf_loggers=True)

Sets up logging configuration, including the associated file output.

Parameters:
  • log_path (str) – Folder to store output logs in. If None, only log to console.

  • level – The logging level to output. The default of 20 corresponds to logging.INFO.

  • log_errors (bool) – Whether to include error messages in the log output or not.

  • include_process (bool) – Whether to include the PID prefix value in the logger. This is mostly used when the --parallel flag is set, to help track which log message came from which process.

  • no_color (bool) – Suppress colors in console output.

  • quiet (bool) – Suppress all console log output.

  • plain (bool) – Output plain text log rather than rich output.

  • disable_non_cf_loggers (bool) – Hide loggers from other libraries. This is true by default because some libraries can generate a lot of noise.

curifactory.utils.preview_object(object)

Get a small string representation of an object that fits nicely in a single line and includes shape information if relevant.

Parameters:

object (any)

Return type:

str
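
A single-line preview with shape information could look roughly like the sketch below. Everything here is an assumption for illustration: preview and its max_len parameter are hypothetical names, the output layout is invented, and the shape attribute check is a guess at what "shape information" covers (for example NumPy arrays).

```python
def preview(obj, max_len=60):
    """Hypothetical sketch: one-line summary with type and shape or length."""
    text = repr(obj).replace("\n", " ")
    if len(text) > max_len:
        text = text[: max_len - 3] + "..."  # truncate only the repr part
    type_name = type(obj).__name__
    shape = getattr(obj, "shape", None)
    if shape is not None:
        return f"({type_name}) shape: {shape} {text}"
    if hasattr(obj, "__len__"):
        return f"({type_name}) len: {len(obj)} {text}"
    return f"({type_name}) {text}"


print(preview([1, 2, 3]))  # (list) len: 3 [1, 2, 3]
```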

curifactory.utils.run_command(cmd)

Prints output from running a command as it occurs.

Parameters:

cmd (list[str]) – Either a string command or array of strings, as one would pass to subprocess.run()
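
Printing output "as it occurs" generally means reading the child process's output line by line rather than waiting for it to finish. The sketch below shows that with subprocess.Popen. The name stream_command is hypothetical, and returning the exit code is an addition for the example, not something documented above.

```python
import subprocess
import sys


def stream_command(cmd):
    """Hypothetical sketch: echo a command's output line by line while it runs."""
    with subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: interleave errors with normal output
        text=True,
        shell=isinstance(cmd, str),  # assumption: plain strings go through the shell
    ) as proc:
        for line in proc.stdout:
            print(line, end="")  # each line is printed as soon as it arrives
    return proc.returncode  # set once the with-block has waited for the process


exit_code = stream_command([sys.executable, "-c", "print('step 1'); print('step 2')"])
```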

curifactory.utils.set_logging_prefix(prefix)

Set the prefix content of the logger, which is incorporated in the log formatter. Currently this is only used to include the PID when running in parallel.

Parameters:

prefix (str)