ParslExecutor

class coffea.processor.ParslExecutor(status: bool = True, unit: str = 'items', desc: str = 'Processing', compression: int | None = 1, function_name: str | None = None, tailtimeout: int | None = None, config: parsl.config.Config | None = None, recoverable: bool = False, merging: bool | Tuple[int, int, int] | None = False, jobs_executors: str | List = 'all', merges_executors: str | List = 'all')[source]

Bases: ExecutorBase

Execute using the parsl pyapp wrapper

Parameters:
  • items (list) – List of input arguments

  • function (callable) – A function to be called on each input, which returns an accumulator instance

  • accumulator (Accumulatable) – An accumulator to collect the output of the function

  • config (parsl.config.Config, optional) –

    A parsl DataFlow configuration object. Necessary if there is no active DataFlowKernel (DFK)

    Note

    In general, it is safer to construct the DFK with parsl.load(config) prior to calling this function (a runnable sketch follows this parameter list)

  • status (bool) – If true (default), enable progress bar

  • unit (str) – Label of progress bar unit

  • desc (str) – Label of progress bar description

  • compression (int, optional) – Compress accumulator outputs in flight with LZ4, at the level specified (default 1). Set to None for no compression.

  • recoverable (bool, optional) – Instead of raising an exception right away, the exception is captured and returned up the stack for custom handling. Already completed items are returned as well.

  • merging (bool | tuple(int, int, int), optional) – Enables submitting intermediate merge jobs to the executor. Format is (n_batches, min_batch_size, max_batch_size). Passing True uses the default (5, 4, 100): as results are returned, completed jobs are split into 5 merge batches of at least 4 and at most 100 items each. Default is False, in which case results are merged in the main process as they finish.

  • jobs_executors (list | "all", optional) – Labels of the executors (from dfk.config.executors) that will process main jobs. Default is 'all'. Recommended is ['jobs'], while passing label='jobs' to the primary executor (see the example below).

  • merges_executors (list | "all", optional) – Labels of the executors (from dfk.config.executors) that will process merge jobs. Default is 'all'. Recommended is ['merges'], while passing label='merges' to the executor dedicated to merge jobs.

  • tailtimeout (int, optional) – Timeout requirement on job tails. Cancel all remaining jobs if none have finished within the timeout window.
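
Examples

A minimal sketch of the recommended setup on a local machine. The executor labels ('jobs', 'merges'), the LocalProvider, and the parameter values are illustrative assumptions, not requirements:

    import parsl
    from parsl.config import Config
    from parsl.executors import HighThroughputExecutor
    from parsl.providers import LocalProvider

    from coffea import processor

    # Two labeled parsl executors: one pool for main jobs, one for merge jobs
    parsl_config = Config(
        executors=[
            HighThroughputExecutor(label="jobs", provider=LocalProvider()),
            HighThroughputExecutor(label="merges", provider=LocalProvider()),
        ]
    )
    parsl.load(parsl_config)  # construct the DFK before building the executor

    executor = processor.ParslExecutor(
        compression=1,                # LZ4-compress accumulators in flight
        merging=(5, 4, 100),          # 5 merge batches of 4-100 completed items
        jobs_executors=["jobs"],      # route main jobs to the 'jobs' pool
        merges_executors=["merges"],  # route merge jobs to the 'merges' pool
    )

The executor instance is then typically handed to a driver such as coffea.processor.Runner, which invokes the __call__ documented below.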

Attributes Summary

config

jobs_executors

merges_executors

merging

recoverable

tailtimeout

Methods Summary

__call__(items, function, accumulator)

Call self as a function.

Attributes Documentation

config: parsl.config.Config | None = None
jobs_executors: str | List = 'all'
merges_executors: str | List = 'all'
merging: bool | Tuple[int, int, int] | None = False
recoverable: bool = False
tailtimeout: int | None = None

Methods Documentation

__call__(items: Iterable, function: Callable, accumulator: Addable | MutableSet | MutableMapping)[source]

Call self as a function.
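
For illustration, a direct call with a toy work function; normally a coffea Runner drives this. count_evens and the dict accumulator are hypothetical stand-ins, and a parsl DFK is assumed to be loaded already (see the example above):

    from coffea.processor import ParslExecutor

    def count_evens(x):
        # Toy work function: returns a mergeable accumulator (here a dict)
        return {"evens": int(x % 2 == 0)}

    executor = ParslExecutor(compression=None)  # skip LZ4 for tiny payloads
    # Each item is dispatched to the function as a parsl job; the
    # per-item outputs are merged into the provided accumulator
    result = executor(
        items=list(range(10)),
        function=count_evens,
        accumulator={"evens": 0},
    )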