🎯 doit interface


This package provides a functional interface for reducing boilerplate in the dodo.py file of the pydoit build system. In short, all tasks are created and managed using a Manager. Most features, such as grouping tasks, are exposed through Python context managers.

Basic usage

>>> import doit_interface as di


>>> # Get a default manager (or create your own to use as a context manager).
>>> manager = di.Manager.get_instance()

>>> # Create a single task.
>>> manager(basename="create_foo", actions=["touch foo"], targets=["foo"])
{'basename': 'create_foo', 'actions': ['touch foo'], 'targets': ['foo'], ...}

>>> # Group multiple tasks.
>>> with di.group_tasks("my_group") as my_group:
...     member = manager(basename="member")
>>> my_group
<doit_interface.contexts.group_tasks object at 0x...> named `my_group` with 1 task

Note

The default manager obtained by calling Manager.get_instance() has a number of default contexts enabled:

  1. SubprocessAction.use_as_default to use SubprocessAction by default for string actions.

  2. create_target_dirs to create target directories if they are missing.

  3. normalize_dependencies such that task objects can be used as file and task dependencies.

It also injects a default DOIT_CONFIG configuration variable if the filename is dodo.py.

If you want to override this default behavior, you can create a dedicated manager and call Manager.set_default_instance() or modify the Manager.context_stack of the default manager.

Features

Traceback for failed tasks

The DoitInterfaceReporter provides more verbose progress reports and points you to the location where a failing task was defined. This reporter is configured through DOIT_CONFIG by default if you use Manager.get_instance() to get a Manager.

>>> DOIT_CONFIG = {"reporter": DoitInterfaceReporter}
>>> manager(basename="false", actions=["false"])
{'basename': 'false', 'actions': ['false'], 'meta': {'filename': '...', 'lineno': 1}}
$ doit
EXECUTE: false
FAILED: false (declared at ...:1)
...
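The declaration site reported above can be recorded when a task is created. The following is a minimal sketch of that idea using the standard inspect module, not the package's actual implementation; the helpers declare_task and describe_failure are hypothetical names introduced for illustration only.

```python
import inspect


def declare_task(**task):
    """Return `task` annotated with the file and line of the calling statement."""
    # Index 0 is this function's own frame; index 1 is the caller that declared the task.
    caller = inspect.stack()[1]
    task["meta"] = {"filename": caller.filename, "lineno": caller.lineno}
    return task


def describe_failure(task):
    """Format a failure message that points back to the task declaration."""
    meta = task["meta"]
    return f"FAILED: {task['basename']} (declared at {meta['filename']}:{meta['lineno']})"


task = declare_task(basename="false", actions=["false"])
print(describe_failure(task))
```

Storing the location in a meta entry mirrors the task dictionary shown in the example above; how the reporter reads it back is an assumption of this sketch.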

Group tasks

Use group_tasks to group tasks so that all of them can be executed together. Tasks can be added to groups using a context manager (as shown below) or by calling the group to add an existing task. Groups can be nested arbitrarily.

>>> with group_tasks("vgg16") as vgg16:
...     train = manager(basename="train", actions=[...])
...     validate = manager(basename="validate", actions=[...])
>>> vgg16
<doit_interface.contexts.group_tasks object at 0x...> named `vgg16` with 2 tasks
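Conceptually, a group can be expressed in plain doit as a task without actions whose task dependencies are its members. The sketch below illustrates that pattern under this assumption; TaskGroup, add, and as_task are hypothetical names and do not describe how group_tasks is implemented.

```python
class TaskGroup:
    """Collect tasks and expose an aggregate task that depends on all of them."""

    def __init__(self, basename):
        self.basename = basename
        self.members = []

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        # Returning False means exceptions raised inside the block propagate.
        return False

    def add(self, task):
        """Register `task` as a member of the group and return it unchanged."""
        self.members.append(task)
        return task

    def as_task(self):
        """Build the aggregate task: no actions, one task dependency per member."""
        return {
            "basename": self.basename,
            "actions": None,
            "task_dep": [member["basename"] for member in self.members],
        }


with TaskGroup("vgg16") as group:
    group.add({"basename": "train", "actions": ["python train.py"]})
    group.add({"basename": "validate", "actions": ["python validate.py"]})

print(group.as_task()["task_dep"])  # ['train', 'validate']
```

Running the aggregate task then triggers every member through the dependency mechanism, which is why the group itself needs no actions.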

Automatically create target directories

Use create_target_dirs to automatically create directories for each of your targets. This can be particularly useful if you generate nested data structures, e.g., for machine learning results based on different architectures, seeds, optimizers, learning rates, etc.

>>> with create_target_dirs():
...     task = manager(basename="bar", targets=["foo/bar"], actions=[...])
>>> task["actions"]
[(<function create_folder at 0x...>, ['foo']), ...]
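As a rough illustration of the idea (not the package's implementation), the sketch below collects the parent directory of each target and prepends one directory-creation action per unique directory. The helper with_target_dirs is a hypothetical name, and os.makedirs stands in for the create_folder function shown in the output above.

```python
import os


def with_target_dirs(task):
    """Return a copy of `task` whose actions first create each target's parent directory."""
    # Collect unique, non-empty parent directories while preserving order.
    parents = []
    for target in task.get("targets", []):
        parent = os.path.dirname(target)
        if parent and parent not in parents:
            parents.append(parent)
    # Prepend one (callable, args, kwargs) action per directory.
    create = [(os.makedirs, [parent], {"exist_ok": True}) for parent in parents]
    return {**task, "actions": create + list(task.get("actions", []))}


task = with_target_dirs({"basename": "bar", "targets": ["foo/bar"], "actions": ["touch foo/bar"]})
print(task["actions"][0][1])  # ['foo']
```

Targets without a parent directory, such as out.txt, add no extra action in this sketch.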

Share default values across tasks

Use defaults to share default values across tasks, such as file_dep.

>>> with defaults(file_dep=["data.pt"]):
...     train = manager(basename="train", actions=[...])
...     validate = manager(basename="validate", actions=[...])
>>> train["file_dep"]
['data.pt']
>>> validate["file_dep"]
['data.pt']
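The effect of shared defaults can be pictured as a dictionary merge. This is a minimal sketch with a hypothetical helper apply_defaults; it assumes that values given explicitly on a task take precedence over the defaults, which the text above does not state.

```python
def apply_defaults(task, **defaults):
    """Fill in missing task properties from `defaults`; explicit values win."""
    # Later entries override earlier ones, so the task's own values come last.
    return {**defaults, **task}


shared = {"file_dep": ["data.pt"]}
train = apply_defaults({"basename": "train"}, **shared)
custom = apply_defaults({"basename": "custom", "file_dep": ["other.pt"]}, **shared)
print(train["file_dep"])   # ['data.pt']
print(custom["file_dep"])  # ['other.pt']
```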

Use tasks as file_dep or task_dep

normalize_dependencies normalizes file and task dependencies such that task objects can be used as dependencies (in addition to file and task names).

>>> with normalize_dependencies():
...     base_task = manager(basename="base", name="output", targets=["output.txt"])
...     file_dep_task = manager(basename="file_dep_task", file_dep=[base_task])
...     task_dep_task = manager(basename="task_dep_task", task_dep=[base_task])
>>> file_dep_task["file_dep"]
['output.txt']
>>> task_dep_task["task_dep"]
['base:output']
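The normalization in the example above can be sketched as follows, assuming tasks are plain dictionaries: a task used as a file dependency is replaced by its targets, and a task used as a task dependency is replaced by its qualified name. The helpers qualified_name and normalize are hypothetical and only illustrate the behavior shown.

```python
def qualified_name(task):
    """Return `basename:name` for subtasks and `basename` otherwise."""
    if task.get("name"):
        return f"{task['basename']}:{task['name']}"
    return task["basename"]


def normalize(task):
    """Replace task dictionaries in `file_dep` and `task_dep` by plain strings."""
    result = dict(task)
    if "file_dep" in task:
        file_dep = []
        for dep in task["file_dep"]:
            # A task used as a file dependency contributes all of its targets.
            file_dep.extend(dep.get("targets", []) if isinstance(dep, dict) else [dep])
        result["file_dep"] = file_dep
    if "task_dep" in task:
        result["task_dep"] = [
            qualified_name(dep) if isinstance(dep, dict) else dep
            for dep in task["task_dep"]
        ]
    return result


base = {"basename": "base", "name": "output", "targets": ["output.txt"]}
print(normalize({"basename": "a", "file_dep": [base, "extra.txt"]})["file_dep"])
print(normalize({"basename": "b", "task_dep": [base]})["task_dep"])
```

Plain strings pass through unchanged, so file names, task names, and task objects can be mixed in the same list.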

Add prefixes to paths or other attributes

Path prefixes can be added using the path_prefix context if file dependencies or targets share common directories. General prefixes are also available using prefix.

>>> with path_prefix(targets="outputs", file_dep="inputs"):
...     manager(basename="task", targets=["out.txt"], file_dep=["in1.txt", "in2.txt"])
{'basename': 'task', 'targets': ['outputs/out.txt'], 'file_dep': ['inputs/in1.txt', 'inputs/in2.txt'], ...}
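A minimal sketch of the path-prefixing step, assuming tasks are plain dictionaries and paths use forward slashes. The helper add_path_prefix is a hypothetical name and is not part of the package.

```python
import posixpath


def add_path_prefix(task, targets=None, file_dep=None):
    """Join a directory prefix onto every path in `targets` and `file_dep`."""
    result = dict(task)
    for key, prefix in {"targets": targets, "file_dep": file_dep}.items():
        # Only touch properties that have a prefix and are present on the task.
        if prefix is not None and key in task:
            result[key] = [posixpath.join(prefix, path) for path in task[key]]
    return result


task = add_path_prefix(
    {"basename": "task", "targets": ["out.txt"], "file_dep": ["in1.txt", "in2.txt"]},
    targets="outputs",
    file_dep="inputs",
)
print(task["targets"])   # ['outputs/out.txt']
print(task["file_dep"])  # ['inputs/in1.txt', 'inputs/in2.txt']
```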

Subprocess action

The SubprocessAction lets you spawn subprocesses akin to doit.action.CmdAction, with a few small differences. First, it does not capture the output of the subprocess, which is helpful for development but may add too much noise for deployment. Second, it supports Makefile-style variable substitutions and f-string substitutions for any attribute of the parent task. Third, it allows global environment variables to be set that are shared across all SubprocessActions, e.g., to limit the number of OpenMP threads. You can use it by default for string actions using the SubprocessAction.use_as_default context.
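The two kinds of substitution can be sketched as below, using $@ for the first target, $^ for the file dependencies, and $! for the current Python interpreter. The helper substitute is hypothetical and the task is assumed to be a plain dictionary; sorting the file dependencies is a choice made here for reproducible output, not documented behavior.

```python
import sys


def substitute(command, task):
    """Expand `$@`, `$^`, `$!`, and `{attribute}` placeholders in a shell command."""
    # Apply format-string substitution first, using the task's attributes as keys.
    command = command.format(**task)
    replacements = {
        "$@": task["targets"][0] if task.get("targets") else "",
        "$^": " ".join(sorted(task.get("file_dep", []))),
        "$!": sys.executable,
    }
    for variable, value in replacements.items():
        command = command.replace(variable, value)
    return command


task = {"name": "greet", "targets": ["out.txt"], "file_dep": ["b.txt", "a.txt"]}
print(substitute("echo {name} > $@", task))  # echo greet > out.txt
print(substitute("cat $^ > $@", task))       # cat a.txt b.txt > out.txt
```

Because this sketch relies on str.format, literal braces in a command would have to be doubled.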

Interface

class DoitInterfaceReporter(outstream, options)

Doit console reporter that includes a traceback for failed tasks.

add_success(task)

Called when task execution finishes successfully.

execute_task(task)

Called when task execution starts.

skip_uptodate(task)

Called when an up-to-date task is skipped.

class Manager(context_stack: Optional[list['contexts._BaseContext']] = None)

Task manager.

Parameters

context_stack – Stack of context managers that will be applied to all associated tasks.

context_stack

Stack of context managers that will be applied to all associated tasks.

Example

Get the default manager and create a single task.

>>> manager = Manager.get_instance()
>>> manager(basename="my_task", actions=[lambda: print("hello world")])
{'basename': 'my_task', 'actions': [<function <lambda> at 0x...>], ...}
>>> manager
<doit_interface.manager.Manager object at 0x...> with 1 task

clear()

Reset the state of the manager.

doit_main(DOIT_CONFIG=None, **kwargs) -> DoitMain

Doit interface object.

classmethod get_instance(strict: bool = False) -> Manager

Get the currently active manager instance.

If no manager is active, a global instance is returned that includes a number of default contexts. Should you require a manager without default contexts, create a new one and use it with a with statement or call set_default_instance().

Parameters

strict – Enforce that a specific manager is active rather than relying on a default.

run(args: Optional[list[str]] = None, **kwargs) -> int

Run doit as if called from the command line.

Parameters
  • args – Command line arguments.

  • **kwargs – Keyword arguments passed to doit_main().

Returns

status – Status code of the run (see doit.doit_cmd.DoitMain.run for details).

classmethod set_default_instance(instance: Manager) -> Manager

Set the default manager.

Parameters

instance – Instance to use by default.

Returns

instance – Input argument.

exception NoTasksError

No tasks have been discovered.

class SubprocessAction(args: Union[str, Iterable[str]], task: Optional[Task] = None, env: Optional[dict] = None, inherit_env: bool = True, check_targets: bool = True, **kwargs)

Launch a subprocess.

This action supports substitution for the following variables:

  • $@: first target of the corresponding task.

  • $^: unordered list of file dependencies.

  • $!: current python interpreter.

Substitution of the first file dependency $< is not currently supported because doit uses an unordered set for dependencies (see https://github.com/pydoit/doit/pull/430 for details).

Python format string substitution is also supported with keys matching the valid attributes of doit.task.Task.

Parameters
  • args – Sequence of program arguments or shell command.

  • env – Environment variables.

  • inherit_env – Inherit the environment from the parent process. The environment is updated with env if True and replaced by env if False.

  • check_targets – Check that targets are created.

  • **kwargs – Keyword arguments passed to subprocess.check_call().
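The inherit_env behavior described above can be sketched as a small helper that builds the environment mapping for the subprocess. The name build_env is hypothetical; the result would be passed as the env argument that subprocess.check_call accepts.

```python
import os


def build_env(env=None, inherit_env=True):
    """Build the environment mapping for a subprocess."""
    if inherit_env:
        # Start from the parent environment and overlay the overrides.
        merged = dict(os.environ)
        merged.update(env or {})
        return merged
    # Otherwise the subprocess only sees the variables given explicitly.
    return dict(env or {})


print(build_env({"OMP_NUM_THREADS": "1"}, inherit_env=False))  # {'OMP_NUM_THREADS': '1'}
```

With inherit_env=True, the same call returns the parent environment plus OMP_NUM_THREADS, which is usually what a command needs to find executables on the PATH.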

Example

>>> # Write "hello" to the first target of the task.
>>> SubprocessAction("echo hello > $@")
<doit_interface.actions.SubprocessAction object at 0x...>
>>> # Write the task name to the first target of the task.
>>> SubprocessAction("echo {name} > $@")
<doit_interface.actions.SubprocessAction object at 0x...>

classmethod get_global_env()

Get global environment variables for all SubprocessActions.

classmethod set_global_env(env)

Set global environment variables for all SubprocessActions.

class use_as_default(*, manager: Optional[Manager] = None)

Use the SubprocessAction as the default action for strings (with shell execution) and lists of strings (without shell execution).

class create_target_dirs(*, manager: Optional[Manager] = None)

Create parent directories for all targets.

Example

>>> with create_target_dirs():
...     manager(basename="task", targets=["missing/directories/output.txt"])
{'basename': 'task',
 'targets': ['missing/directories/output.txt'],
 'actions': [(<function create_folder at 0x...>, ['missing/directories'])],
 ...}

class defaults(*, manager: Optional[Manager] = None, **defaults)

Apply default task properties.

Parameters

**defaults – Default properties as keyword arguments.

Example

Ensure all tasks within the defaults context share the same basename.

>>> with defaults(basename="basename") as d:
...     manager(name="task1")
...     manager(name="task2")
{'basename': 'basename', 'name': 'task1', ...}
{'basename': 'basename', 'name': 'task2', ...}

dict2args(*args, **kwargs)

Convert a dictionary of values to named command line arguments.

Parameters
  • *args – Sequence of mappings to convert.

  • **kwargs – Keyword arguments to convert.

Returns

args – Sequence of named command line arguments.
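A conversion of this kind might look like the sketch below. The function to_named_args is a hypothetical stand-in written for illustration; it assumes every value is rendered with str and does not claim to match how dict2args treats special cases such as booleans or nested values.

```python
def to_named_args(*mappings, **kwargs):
    """Convert mappings and keyword arguments to `--key=value` strings."""
    merged = {}
    # Later mappings and keyword arguments override earlier entries with the same key.
    for mapping in mappings:
        merged.update(mapping)
    merged.update(kwargs)
    return [f"--{key}={value}" for key, value in merged.items()]


args = to_named_args({"hello": "world"}, foo="bar")
print(args)  # ['--hello=world', '--foo=bar']
print(["python", "train.py"] + args)
```

The resulting list can be appended to a list-style command, as in the last line, where train.py is a placeholder script name.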

Example

>>> dict2args({"hello": "world"}, foo="bar")
['--hello=world', '--foo=bar']

class group_tasks(basename: str, *, actions: Optional[list] = None, task_dep: Optional[list] = None, manager: Optional[Manager] = None, **kwargs)

Group of tasks.

Parameters
  • basename – Basename of the task aggregating all constituent tasks.

  • actions – Actions to be performed as part of this group.

  • task_dep – Further task dependencies in addition to the constituent tasks.

  • manager – Task manager (defaults to Manager.get_instance()).

Example

>>> with group_tasks("my_group") as group:
...     manager(basename="my_first_task")
...     manager(basename="my_second_task")
{'basename': 'my_first_task', ...}
{'basename': 'my_second_task', ...}
>>> group
<doit_interface.contexts.group_tasks object at 0x...> named `my_group` with 2 tasks

class normalize_dependencies(*, manager: Optional[Manager] = None)

Normalize task and file dependencies. For task dependencies (task_dep), tasks are replaced by fully qualified task names. For file dependencies (file_dep), tasks are replaced by their targets.

Example

>>> task1 = manager(basename="task1", name="name")
>>> task2 = manager(basename="task2", targets=["file2.txt"])
>>> with normalize_dependencies():
...     manager(basename="task3", task_dep=[task1], file_dep=[task2])
{'basename': 'task3', 'task_dep': ['task1:name'], 'file_dep': ['file2.txt'], ...}

class path_prefix(prefix: Optional[str] = None, *, targets: Optional[str] = None, file_dep: Optional[str] = None, manager: Optional[Manager] = None)

Add a prefix for targets and/or file dependencies.

Parameters
  • prefix – Prefix for both targets and file dependencies.

  • targets – Prefix for targets.

  • file_dep – Prefix for file dependencies.

Example

>>> with path_prefix(targets="outputs", file_dep="inputs"):
...     manager(basename="task", targets=["out.txt"], file_dep=["in.txt"])
{'basename': 'task', 'targets': ['outputs/out.txt'], 'file_dep': ['inputs/in.txt'], ...}

class prefix(*, manager: Optional[Manager] = None, op: Optional[Callable] = None, **kwargs)

Add a prefix for specified task properties.

Parameters
  • op – Operation used to join prefixes (defaults to addition).

  • **kwargs – Keyword arguments of different prefixes.
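The role of op can be illustrated with a small sketch. It assumes that op is called as op(prefix, value) on the whole property value and that properties missing from the task are left alone; neither detail is confirmed by the description above, and add_prefix is a hypothetical helper.

```python
import operator


def add_prefix(task, op=operator.add, **prefixes):
    """Combine each prefix with the matching task property using `op(prefix, value)`."""
    result = dict(task)
    for key, prefix in prefixes.items():
        # Properties that the task does not define are skipped.
        if key in task:
            result[key] = op(prefix, task[key])
    return result


print(add_prefix({"basename": "train"}, basename="vgg16_"))
# {'basename': 'vgg16_train'}
print(add_prefix({"basename": "train"}, op=lambda p, v: f"{p}-{v}", basename="vgg16"))
# {'basename': 'vgg16-train'}
```

Addition works for strings and lists alike, which is one reason it makes a convenient default for joining prefixes.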