🎯 doit interface
This package provides a functional interface for reducing boilerplate in the dodo.py
of the pydoit build system. In short, all tasks are created and managed using a Manager.
Most features, e.g., grouping tasks, are exposed as Python context managers.
Basic usage
>>> import doit_interface as di
>>> # Get a default manager (or create your own to use as a context manager).
>>> manager = di.Manager.get_instance()
>>> # Create a single task.
>>> manager(basename="create_foo", actions=["touch foo"], targets=["foo"])
{'basename': 'create_foo', 'actions': ['touch foo'], 'targets': ['foo'], ...}
>>> # Group multiple tasks.
>>> with di.group_tasks("my_group") as my_group:
...     member = manager(basename="member")
>>> my_group
<doit_interface.contexts.group_tasks object at 0x...> named `my_group` with 1 task
Note
The default manager obtained by calling Manager.get_instance() has a number of default contexts enabled:
- SubprocessAction.use_as_default to use SubprocessAction by default for string actions.
- create_target_dirs to create target directories if they are missing.
- normalize_dependencies such that task objects can be used as file and task dependencies.
It also injects a default DOIT_CONFIG configuration variable if the filename is dodo.py.
If you want to override this default behavior, you can create a dedicated manager and call Manager.set_default_instance() or modify the Manager.context_stack of the default manager.
Features
Traceback for failed tasks
The DoitInterfaceReporter provides more verbose progress reports and points you to the location where a failing task was defined. The DOIT_CONFIG shown below is used by default if you use Manager.get_instance() to get a Manager.
>>> DOIT_CONFIG = {"reporter": DoitInterfaceReporter}
>>> manager(basename="false", actions=["false"])
{'basename': 'false', 'actions': ['false'], 'meta': {'filename': '...', 'lineno': 1}}
$ doit
EXECUTE: false
FAILED: false (declared at ...:1)
...
Group tasks
Use group_tasks to group tasks so that all of them can be executed easily. Tasks can be added to groups using a context manager (as shown below) or by calling the group to add an existing task. Groups can be nested arbitrarily.
>>> with group_tasks("vgg16") as vgg16:
...     train = manager(basename="train", actions=[...])
...     validate = manager(basename="validate", actions=[...])
>>> vgg16
<doit_interface.contexts.group_tasks object at 0x...> named `vgg16` with 2 tasks
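In plain doit, a group is a task without actions whose task_dep lists its members. The sketch below illustrates that idea under the assumption that group_tasks produces something comparable; make_group is a hypothetical helper and not part of doit_interface, and it ignores sub-task names for brevity.

```python
def make_group(basename, members):
    """Build a doit-style group task (hypothetical helper for illustration)."""
    # A task without actions that depends on its members acts as a group:
    # running the group runs every member.
    return {
        "basename": basename,
        "actions": None,
        "task_dep": [member["basename"] for member in members],
    }


train = {"basename": "train", "actions": ["echo train"]}
validate = {"basename": "validate", "actions": ["echo validate"]}
vgg16 = make_group("vgg16", [train, validate])
```

With such a group in place, `doit vgg16` would execute both train and validate.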
Automatically create target directories
Use create_target_dirs to automatically create directories for each of your targets. This can be particularly useful if you generate nested data structures, e.g., for machine learning results based on different architectures, seeds, optimizers, learning rates, etc.
>>> with create_target_dirs():
...     task = manager(basename="bar", targets=["foo/bar"], actions=[...])
>>> task["actions"]
[(<function create_folder at 0x...>, ['foo']), ...]
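The effect can be approximated with the standard library alone. The following is a minimal sketch, assuming that one parent directory per target is created before the task's own actions run; target_dirs is a hypothetical helper, not the function used by doit_interface.

```python
import os
import tempfile


def target_dirs(targets):
    """Return the distinct parent directories of the targets, in order."""
    dirs = []
    for target in targets:
        parent = os.path.dirname(target)
        # Targets in the working directory have no parent to create.
        if parent and parent not in dirs:
            dirs.append(parent)
    return dirs


# Create the directories inside a temporary root to keep the demo side-effect free.
with tempfile.TemporaryDirectory() as root:
    for directory in target_dirs(["foo/bar", "foo/baz", "top.txt"]):
        os.makedirs(os.path.join(root, directory), exist_ok=True)
    created = sorted(os.listdir(root))
```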
Use tasks as file_dep or task_dep
normalize_dependencies normalizes file and task dependencies such that task objects can be used as dependencies (in addition to file and task names).
>>> with normalize_dependencies():
...     base_task = manager(basename="base", name="output", targets=["output.txt"])
...     file_dep_task = manager(basename="file_dep_task", file_dep=[base_task])
...     task_dep_task = manager(basename="task_dep_task", task_dep=[base_task])
>>> file_dep_task["file_dep"]
['output.txt']
>>> task_dep_task["task_dep"]
['base:output']
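The normalization can be pictured as two small conversions on task dictionaries. This is a sketch under the assumption that tasks are plain dictionaries as in the outputs above; normalize_task_dep and normalize_file_dep are hypothetical names chosen for illustration.

```python
def normalize_task_dep(deps):
    """Replace task dictionaries by fully qualified task names."""
    names = []
    for dep in deps:
        if isinstance(dep, dict):
            name = dep["basename"]
            # Sub-tasks are addressed as "basename:name".
            if dep.get("name"):
                name = f"{name}:{dep['name']}"
            names.append(name)
        else:
            names.append(dep)  # already a task name
    return names


def normalize_file_dep(deps):
    """Replace task dictionaries by the files they produce."""
    files = []
    for dep in deps:
        if isinstance(dep, dict):
            files.extend(dep.get("targets", []))
        else:
            files.append(dep)  # already a file name
    return files


base_task = {"basename": "base", "name": "output", "targets": ["output.txt"]}
```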
Add prefixes to paths or other attributes
Path prefixes can be added using the path_prefix context if file dependencies or targets share common directories. General prefixes are also available using prefix.
>>> with path_prefix(targets="outputs", file_dep="inputs"):
...     manager(basename="task", targets=["out.txt"], file_dep=["in1.txt", "in2.txt"])
{'basename': 'task', 'targets': ['outputs/out.txt'], 'file_dep': ['inputs/in1.txt', 'inputs/in2.txt'], ...}
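The transformation amounts to joining a directory onto every path of the selected attributes. A minimal sketch, assuming forward-slash paths as in the output above; add_path_prefix is a hypothetical helper that mirrors the documented parameters but is not the library's implementation.

```python
import posixpath


def add_path_prefix(task, prefix=None, *, targets=None, file_dep=None):
    """Return a copy of the task with prefixed targets and file dependencies."""
    # A specific prefix takes precedence over the shared one.
    prefixes = {"targets": targets or prefix, "file_dep": file_dep or prefix}
    result = dict(task)
    for key, value in prefixes.items():
        if value and key in result:
            result[key] = [posixpath.join(value, path) for path in result[key]]
    return result


task = add_path_prefix(
    {"basename": "task", "targets": ["out.txt"], "file_dep": ["in1.txt", "in2.txt"]},
    targets="outputs",
    file_dep="inputs",
)
```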
Subprocess action
The SubprocessAction lets you spawn subprocesses akin to doit.action.CmdAction, yet with a few small differences. First, it does not capture the output of the subprocess, which is helpful for development but may add too much noise for deployment. Second, it supports Makefile-style variable substitutions and f-string substitutions for any attribute of the parent task. Third, it allows global environment variables to be set that are shared across all SubprocessAction instances, e.g., to limit the number of OpenMP threads. You can use it by default for string actions using the SubprocessAction.use_as_default context.
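The substitution and environment handling can be sketched without the library. The code below is an illustration only: substitute and build_env are hypothetical helpers, the order in which format fields and Makefile-style variables are expanded is an assumption, and so is the way global variables combine with the inherited environment.

```python
import os
import sys


def substitute(command, task):
    """Expand format-string fields and Makefile-style variables in a command."""
    # Format fields such as {name} refer to attributes of the task.
    command = command.format(**task)
    replacements = {
        "$@": task["targets"][0] if task.get("targets") else "",
        # Dependencies are unordered; sort them here only to get stable output.
        "$^": " ".join(sorted(task.get("file_dep", []))),
        "$!": sys.executable,
    }
    for variable, value in replacements.items():
        command = command.replace(variable, value)
    return command


def build_env(global_env, env=None, inherit_env=True):
    """Combine the parent environment, global variables, and per-action variables."""
    merged = dict(os.environ) if inherit_env else {}
    merged.update(global_env)  # assumed to be shared by all actions
    merged.update(env or {})   # per-action values win
    return merged


task = {"name": "greet", "targets": ["out.txt"], "file_dep": ["b.txt", "a.txt"]}
echo_command = substitute("echo {name} > $@", task)
cat_command = substitute("cat $^ > $@", task)
env = build_env({"OMP_NUM_THREADS": "1"}, inherit_env=False)
```

The resulting command and environment could then be handed to subprocess.check_call with shell=True.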
Interface
- class DoitInterfaceReporter(outstream, options)
Doit console reporter that includes a traceback for failed tasks.
- add_success(task)
called when execution finishes successfully
- execute_task(task)
called when execution starts
- skip_uptodate(task)
skipped up-to-date task
- class Manager(context_stack: Optional[list['contexts._BaseContext']] = None)
Task manager.
- Parameters
context_stack – Stack of context managers that will be applied to all associated tasks.
- context_stack
Stack of context managers that will be applied to all associated tasks.
Example
Get the default manager and create a single task.
>>> manager = Manager.get_instance()
>>> manager(basename="my_task", actions=[lambda: print("hello world")])
{'basename': 'my_task', 'actions': [<function <lambda> at 0x...>], ...}
>>> manager
<doit_interface.manager.Manager object at 0x...> with 1 task
- clear()
Reset the state of the manager.
- doit_main(DOIT_CONFIG=None, **kwargs) → DoitMain
Doit interface object.
- classmethod get_instance(strict: bool = False) → Manager
Get the currently active manager instance.
If no manager is active, a global instance is returned that includes a number of default contexts. Should you require a manager without default contexts, create a new one and use it with a with statement or call set_default_instance().
- Parameters
strict – Enforce that a specific manager is active rather than relying on a default.
- exception NoTasksError
No tasks have been discovered.
- class SubprocessAction(args: Union[str, Iterable[str]], task: Optional[Task] = None, env: Optional[dict] = None, inherit_env: bool = True, check_targets: bool = True, **kwargs)
Launch a subprocess.
This action supports substitution for the following variables:
- $@: first target of the corresponding task.
- $^: unordered list of file dependencies.
- $!: current python interpreter.
Substitution of the first file dependency $< is not currently supported because doit uses an unordered set for dependencies (see https://github.com/pydoit/doit/pull/430 for details).
Python format string substitution is also supported with keys matching the valid attributes of doit.task.Task.
- Parameters
args – Sequence of program arguments or shell command.
env – Environment variables.
inherit_env – Inherit the environment from the parent process. The environment is updated with env if True and replaced by env if False.
check_targets – Check that targets are created.
**kwargs – Keyword arguments passed to subprocess.check_call().
Example
>>> # Write "hello" to the first target of the task.
>>> SubprocessAction("echo hello > $@")
<doit_interface.actions.SubprocessAction object at 0x...>
>>> # Write the task name to the first target of the task.
>>> SubprocessAction("echo {name} > $@")
<doit_interface.actions.SubprocessAction object at 0x...>
- classmethod get_global_env()
Get global environment variables for all SubprocessAction instances.
- classmethod set_global_env(env)
Set global environment variables for all SubprocessAction instances.
- class use_as_default(*, manager: Optional[Manager] = None)
Use the SubprocessAction as the default action for strings (with shell execution) and lists of strings (without shell execution).
- class create_target_dirs(*, manager: Optional[Manager] = None)
Create parent directories for all targets.
Example
>>> with create_target_dirs():
...     manager(basename="task", targets=["missing/directories/output.txt"])
{'basename': 'task', 'targets': ['missing/directories/output.txt'], 'actions': [(<function create_folder at 0x...>, ['missing/directories'])], ...}
- class defaults(*, manager: Optional[Manager] = None, **defaults)
Apply default task properties.
- Parameters
**defaults – Default properties as keyword arguments.
Example
Ensure all tasks within the defaults context share the same basename.
>>> with defaults(basename="basename") as d:
...     manager(name="task1")
...     manager(name="task2")
{'basename': 'basename', 'name': 'task1', ...}
{'basename': 'basename', 'name': 'task2', ...}
- dict2args(*args, **kwargs)
Convert a dictionary of values to named command line arguments.
- Parameters
*args – Sequence of mappings to convert.
**kwargs – Keyword arguments to convert.
- Returns
args – Sequence of named command line arguments.
Example
>>> dict2args({"hello": "world"}, foo="bar")
['--hello=world', '--foo=bar']
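The conversion shown in the example can be reproduced in a few lines. The function below is a sketch that matches the documented output for string values; how dict2args treats other value types is not covered here, and dict2args_sketch is a hypothetical name.

```python
def dict2args_sketch(*args, **kwargs):
    """Convert mappings and keyword arguments to ``--key=value`` arguments."""
    merged = {}
    for mapping in args:
        merged.update(mapping)
    merged.update(kwargs)
    return [f"--{key}={value}" for key, value in merged.items()]


# Build an argument list for a hypothetical training script.
command = ["python", "train.py", *dict2args_sketch({"hello": "world"}, foo="bar")]
```

Such a list of strings can be used directly as an action without shell execution.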
- class group_tasks(basename: str, *, actions: Optional[list] = None, task_dep: Optional[list] = None, manager: Optional[Manager] = None, **kwargs)
Group of tasks.
- Parameters
basename – Basename of the task aggregating all constituent tasks.
actions – Actions to be performed as part of this group.
task_dep – Further task dependencies in addition to the constituent tasks.
manager – Task manager (defaults to Manager.get_instance()).
Example
>>> with group_tasks("my_group") as group:
...     manager(basename="my_first_task")
...     manager(basename="my_second_task")
{'basename': 'my_first_task', ...}
{'basename': 'my_second_task', ...}
>>> group
<doit_interface.contexts.group_tasks object at 0x...> named `my_group` with 2 tasks
- class normalize_dependencies(*, manager: Optional[Manager] = None)
Normalize task and file dependencies. For task dependencies (task_dep), tasks are replaced by fully qualified task names. For file dependencies (file_dep), tasks are replaced by their targets.
Example
>>> task1 = manager(basename="task1", name="name")
>>> task2 = manager(basename="task2", targets=["file2.txt"])
>>> with normalize_dependencies():
...     manager(basename="task3", task_dep=[task1], file_dep=[task2])
{'basename': 'task3', 'task_dep': ['task1:name'], 'file_dep': ['file2.txt'], ...}
- class path_prefix(prefix: Optional[str] = None, *, targets: Optional[str] = None, file_dep: Optional[str] = None, manager: Optional[Manager] = None)
Add a prefix for targets and/or file dependencies.
- Parameters
prefix – Prefix for both targets and file dependencies.
targets – Prefix for targets.
file_dep – Prefix for file dependencies.
Example
>>> with path_prefix(targets="outputs", file_dep="inputs"):
...     manager(basename="task", targets=["out.txt"], file_dep=["in.txt"])
{'basename': 'task', 'targets': ['outputs/out.txt'], 'file_dep': ['inputs/in.txt'], ...}