ParslExecutor
- class coffea.processor.ParslExecutor(status: bool = True, unit: str = 'items', desc: str = 'Processing', compression: int | None = 1, function_name: str | None = None, tailtimeout: int | None = None, config: parsl.config.Config | None = None, recoverable: bool = False, merging: bool | Tuple[int, int, int] | None = False, jobs_executors: str | List = 'all', merges_executors: str | List = 'all')[source]
Bases: ExecutorBase
Execute using the parsl pyapp wrapper.
- Parameters:
items (list) – List of input arguments
function (callable) – A function to be called on each input, which returns an accumulator instance
accumulator (Accumulatable) – An accumulator to collect the output of the function
config (parsl.config.Config, optional) – A parsl DataFlow configuration object. Necessary if there is no active kernel.
Note: In general, it is safer to construct the DFK with parsl.load(config) prior to calling this function.
status (bool) – If true (default), enable progress bar
unit (str) – Label of progress bar unit
desc (str) – Label of progress bar description
compression (int, optional) – Compress accumulator outputs in flight with LZ4, at the level specified (default 1). Set to None for no compression.
recoverable (bool, optional) – Instead of raising an Exception right away, the exception is captured and returned up for custom parsing. Already completed items will be returned as well.
merging (bool | tuple(int, int, int), optional) – Enables submitting intermediate merge jobs to the executor. Format is (n_batches, min_batch_size, max_batch_size). Passing True uses the default (5, 4, 100): as results are returned, completed jobs are split into 5 batches of at least 4 and at most 100 items. Default is False – results get merged as they finish in the main process.
jobs_executors (list | "all", optional) – Labels of the executors (from dfk.config.executors) that will process main jobs. Default is 'all'. Recommended is ['jobs'], while passing label='jobs' to the primary executor (see the configuration sketch below).
merges_executors (list | "all", optional) – Labels of the executors (from dfk.config.executors) that will process merge jobs. Default is 'all'. Recommended is ['merges'], while passing label='merges' to the executor dedicated to merge jobs.
tailtimeout (int, optional) – Timeout requirement on job tails. Cancel all remaining jobs if none have finished within the timeout window.
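A minimal configuration sketch, assuming a local parsl setup with two labeled HighThroughputExecutor instances; the executor labels, the LocalProvider choice, and the explicit merging tuple are illustrative, not prescribed by coffea:

    import parsl
    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor
    from parsl.providers import LocalProvider

    from coffea.processor import ParslExecutor

    # Two labeled executors: 'jobs' for the main processing work,
    # 'merges' for the intermediate merge jobs.
    config = Config(
        executors=[
            HighThroughputExecutor(label="jobs", provider=LocalProvider()),
            HighThroughputExecutor(label="merges", provider=LocalProvider()),
        ]
    )

    # Safer to load the DFK up front than to pass config= below (see note above).
    parsl.load(config)

    my_executor = ParslExecutor(
        compression=1,            # LZ4-compress accumulator outputs in flight
        merging=(5, 4, 100),      # the documented default batching, made explicit
        jobs_executors=["jobs"],
        merges_executors=["merges"],
    )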
Methods Summary
__call__(items, function, accumulator): Call self as a function.
Methods Documentation
- __call__(items: Iterable, function: Callable, accumulator: Addable | MutableSet | MutableMapping)[source]
Call self as a function, executing function over items and collecting the outputs into accumulator.
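A toy direct-call sketch, assuming a DFK has been loaded as in the configuration example above; count_evens and the dict accumulator are hypothetical, and the exact return value (for example, whether a captured exception accompanies the output when recoverable=True) depends on the coffea version:

    def count_evens(x):
        # Each call returns a small dict; coffea accumulates
        # these into the provided dict accumulator.
        return {"evens": int(x % 2 == 0)}

    output = my_executor(range(100), count_evens, {})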