DaskExecutor
- class coffea.processor.DaskExecutor(status: bool = True, unit: str = 'items', desc: str = 'Processing', compression: int | None = 1, function_name: str | None = None, client: dask.distributed.Client | None = None, treereduction: int = 20, priority: int = 0, retries: int = 3, heavy_input: bytes | None = None, use_dataframes: bool = False, worker_affinity: bool = False)[source]
Bases: ExecutorBase
Execute using Dask futures
- Parameters:
items (list) – List of input arguments
function (callable) – A function to be called on each input, which returns an accumulator instance
accumulator (Accumulatable) – An accumulator to collect the output of the function
client (distributed.client.Client) – A dask distributed client instance
treereduction (int, optional) – Tree reduction factor for output accumulators (default: 20)
status (bool, optional) – If true (default), enable progress bar
compression (int, optional) – Compress accumulator outputs in flight with LZ4, at the level specified (default: 1). Set to None for no compression.
priority (int, optional) – Task priority (default: 0)
retries (int, optional) – Number of retries for failed tasks (default: 3)
heavy_input (serializable, optional) – Any value placed here will be broadcast to workers and joined to input items in a tuple (item, heavy_input) that is passed to function.
function_name (str, optional) – Name of the function being passed
use_dataframes (bool, optional) –
Retrieve output as a distributed Dask DataFrame (default: False). The outputs of individual tasks must be Pandas DataFrames.
Note
If heavy_input is set, function is assumed to be pure.
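The treereduction parameter controls how many partial accumulators are merged per reduction step. A minimal pure-Python sketch of this idea (not coffea's actual implementation, which runs distributed on Dask workers) looks like:

```python
from functools import reduce

def tree_reduce(accumulators, factor=20):
    """Illustrative sketch of tree reduction: repeatedly merge the list
    of addable accumulators in groups of `factor` until one remains."""
    while len(accumulators) > 1:
        accumulators = [
            reduce(lambda a, b: a + b, accumulators[i:i + factor])
            for i in range(0, len(accumulators), factor)
        ]
    return accumulators[0]

# With 100 integer "accumulators" and factor 20, the first pass yields
# 5 partial sums and the second pass combines those into one result.
total = tree_reduce(list(range(100)), factor=20)
print(total)  # 4950
```

A larger factor means fewer reduction rounds but more work per merge task; the default of 20 is a middle ground.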
Methods Summary
__call__(items, function, accumulator) – Call self as a function.
Methods Documentation
- __call__(items: Iterable, function: Callable, accumulator: Addable | MutableSet | MutableMapping)[source]
Call self as a function.
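Conceptually, the executor applies function to every item (pairing each item with heavy_input when that is set) and merges the returned accumulators by addition. A serial sketch of that contract, ignoring the distribution, retries, compression, and tree reduction that DaskExecutor adds, might look like (run_serially is a hypothetical name, not part of coffea):

```python
from collections import Counter

def run_serially(items, function, accumulator, heavy_input=None):
    """Serial sketch of the executor contract: call `function` on each
    item (as `(item, heavy_input)` when heavy_input is set) and sum the
    per-item results into `accumulator`."""
    for item in items:
        arg = (item, heavy_input) if heavy_input is not None else item
        accumulator = accumulator + function(arg)
    return accumulator

# Toy "processor": each item yields a Counter, and Counters merge by +,
# mimicking how accumulator outputs are combined.
out = run_serially(["a b", "b c"], lambda s: Counter(s.split()), Counter())
print(dict(out))  # {'a': 1, 'b': 2, 'c': 1}
```

In real use, function is a coffea processor's work function and the items are chunks of input data; the executor's job is only to schedule the calls and combine their outputs.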