FuturesExecutor

class coffea.processor.FuturesExecutor(status: bool = True, unit: str = 'items', desc: str = 'Processing', compression: int | None = 1, function_name: str | None = None, pool: ~typing.Callable[[...], ~concurrent.futures._base.Executor] | ~concurrent.futures._base.Executor = <class 'concurrent.futures.process.ProcessPoolExecutor'>, mergepool: ~typing.Callable[[...], ~concurrent.futures._base.Executor] | ~concurrent.futures._base.Executor | bool | None = None, recoverable: bool = False, merging: bool | ~typing.Tuple[int, int, int] = False, workers: int = 1, tailtimeout: int | None = None)[source]

Bases: ExecutorBase

Execute on multiple local cores using Python futures

Parameters:
  • items (list) – List of input arguments

  • function (callable) – A function to be called on each input, which returns an accumulator instance

  • accumulator (Accumulatable) – An accumulator to collect the output of the function

  • pool (concurrent.futures.Executor class or instance, optional) – The type of futures executor to use, defaults to ProcessPoolExecutor. You can pass an instance instead of a class to re-use an executor.

  • workers (int, optional) – Number of parallel processes for futures (default 1)

  • status (bool, optional) – If true (default), enable progress bar

  • desc (str, optional) – Label of progress description (default: ‘Processing’)

  • unit (str, optional) – Label of the progress bar unit (default: ‘items’)

  • compression (int, optional) – Compress accumulator outputs in flight with LZ4 at the specified level (default 1). Set to None for no compression.

  • recoverable (bool, optional) – Instead of raising an exception right away, capture it and return it for custom handling. Results from items that already completed are returned as well.

  • checkpoints (bool) – To do

  • merging (bool | tuple(int, int, int), optional) – Enables submitting intermediate merge jobs to the executor. Format is (n_batches, min_batch_size, max_batch_size). Passing True uses the default (5, 4, 100): as jobs complete, they are split into 5 merge batches of at least 4 and at most 100 items each. Default is False, in which case results are merged in the main process as they finish. See the usage sketch after this parameter list.

  • nparts (int, optional) – Number of merge jobs to create at a time. Can also be passed as the first element of merging, i.e. merging=(X, …, …).

  • minred (int, optional) – Minimum number of items to merge in one job. Can also be passed as the second element of merging, i.e. merging=(…, X, …).

  • maxred (int, optional) – Maximum number of items to merge in one job. Can also be passed as the third element of merging, i.e. merging=(…, …, X).

  • mergepool (concurrent.futures.Executor class or instance | int, optional) – Supply an additional executor to process merge jobs independently. An int will be interpreted as ProcessPoolExecutor(max_workers=int).

  • tailtimeout (int, optional) – Timeout requirement on job tails. Cancel all remaining jobs if none have finished in the timeout window.
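A minimal usage sketch. The chunk data and the count_entries function below are illustrative, not part of coffea; only FuturesExecutor and its constructor arguments come from the signature above, and the exact packaging of the return value may differ across coffea versions.

    from coffea.processor import FuturesExecutor

    def count_entries(chunk):
        # Hypothetical per-item work: must be a picklable top-level
        # function (it runs in a ProcessPoolExecutor) and return an
        # accumulator; a plain dict works, since MutableMappings are
        # merged key-by-key.
        return {"entries": len(chunk)}

    if __name__ == "__main__":  # required for process pools on spawn platforms
        executor = FuturesExecutor(
            workers=4,            # four local worker processes
            compression=1,        # LZ4-compress outputs in flight (the default)
            merging=(5, 4, 100),  # batch completed jobs into merge jobs of 4-100 items
        )
        items = [list(range(n)) for n in (10, 20, 30)]  # made-up inputs
        out = executor(items, count_entries, {})

With recoverable=True, the parameter description above says captured exceptions are returned alongside the results of the items that did complete, rather than raised immediately.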

Attributes Summary

mergepool

merging

recoverable

tailtimeout

workers

Methods Summary

__call__(items, function, accumulator)

Call self as a function.

Attributes Documentation

mergepool: Callable[[...], Executor] | Executor | bool | None = None
merging: bool | Tuple[int, int, int] = False
recoverable: bool = False
tailtimeout: int | None = None
workers: int = 1

Methods Documentation

__call__(items: Iterable, function: Callable, accumulator: Addable | MutableSet | MutableMapping)[source]

Call self as a function.
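As a sketch of the call signature: items, function, and accumulator are the positional arguments; the nested lists and the choice of len here are illustrative only.

    from coffea.processor import FuturesExecutor

    if __name__ == "__main__":
        items = [[1, 2], [3], [4, 5, 6]]
        # len is picklable and returns an int, which satisfies Addable,
        # so the merged accumulator is the total element count (6).
        # (Return-value packaging may vary by coffea version.)
        total = FuturesExecutor(workers=2)(items, len, 0)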