coffea.dataset_tools

Functions

preprocess(fileset[, step_size, ...])

Given a list of normalized file and object paths (as defined by uproot), determine the steps (entry ranges) for each file according to the supplied processing options.
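The idea of "steps" can be illustrated with a minimal, library-free sketch that splits a file's entries into consecutive ranges. The helper name `compute_steps` and its exact splitting rule are assumptions made for illustration; the boundaries coffea actually produces may differ, since they depend on the supplied processing options.

```python
def compute_steps(num_entries, step_size):
    """Split [0, num_entries) into consecutive [start, stop] ranges.

    Hypothetical helper, not part of coffea: each range holds at most
    step_size entries, and the last range may be shorter.
    """
    if step_size <= 0:
        raise ValueError("step_size must be positive")
    return [
        [start, min(start + step_size, num_entries)]
        for start in range(0, num_entries, step_size)
    ]


# A file with 250 entries and a step size of 100 yields three ranges.
print(compute_steps(250, 100))  # [[0, 100], [100, 200], [200, 250]]
```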

apply_to_dataset(data_manipulation, dataset)

Apply the supplied function or processor to the supplied dataset.

apply_to_fileset(data_manipulation, fileset)

Apply the supplied function or processor to the supplied fileset (set of datasets).
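The relationship between the two functions can be sketched as a loop over datasets, with the result keyed by dataset name. This is a simplified illustration only: `apply_to_fileset_sketch` is a hypothetical name, and the callable here receives the raw dataset entry, whereas the real functions are understood to operate on the events loaded from each dataset's files.

```python
def apply_to_fileset_sketch(data_manipulation, fileset):
    """Apply data_manipulation to every dataset and key results by name.

    Hypothetical stand-in: the dataset entry itself is passed to the
    callable, which is an assumption made to keep the sketch runnable
    without any input files.
    """
    return {
        name: data_manipulation(dataset)
        for name, dataset in fileset.items()
    }


# Toy fileset in the nested-dict layout assumed throughout these sketches.
fileset = {
    "ZJets": {"files": {"a.root": "Events", "b.root": "Events"}},
    "Data": {"files": {"c.root": "Events"}},
}

# Count the files in each dataset.
counts = apply_to_fileset_sketch(lambda ds: len(ds["files"]), fileset)
print(counts)  # {'ZJets': 2, 'Data': 1}
```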

max_chunks(fileset[, maxchunks])

Modify the input fileset so that only the first "maxchunks" chunks of each file will be processed.

slice_chunks(fileset[, theslice])

Modify the input fileset so that only the chunks of each file specified by the input slice are processed.
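Chunk selection for both max_chunks and slice_chunks can be sketched as an ordinary Python slice applied to each file's list of steps. The function names below and the fileset layout (a "steps" list stored under each file) are assumptions for illustration, not the library's implementation; the sketch returns a copy rather than editing its input.

```python
import copy


def slice_chunks_sketch(fileset, theslice=slice(None)):
    """Keep only the steps of each file selected by theslice (hypothetical)."""
    out = copy.deepcopy(fileset)  # leave the caller's fileset untouched
    for dataset in out.values():
        for info in dataset["files"].values():
            info["steps"] = info["steps"][theslice]
    return out


def max_chunks_sketch(fileset, maxchunks=None):
    """Keep the first maxchunks steps of each file; None keeps everything."""
    return slice_chunks_sketch(fileset, slice(maxchunks))


fileset = {
    "ZJets": {
        "files": {
            "a.root": {
                "object_path": "Events",
                "steps": [[0, 100], [100, 200], [200, 250]],
            }
        }
    }
}

first_two = max_chunks_sketch(fileset, 2)
print(first_two["ZJets"]["files"]["a.root"]["steps"])  # [[0, 100], [100, 200]]
```

Expressing "first N chunks" as `slice(N)` means the two operations share a single code path in this sketch.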

filter_files(fileset[, thefilter])

Modify the input fileset so that only the files of each dataset that pass the filter remain.
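File filtering can be sketched as a predicate applied to each file entry of each dataset. Everything below is an assumption made for illustration: the names `has_entries` and `filter_files_sketch`, the convention that the predicate receives a `(path, info)` pair, and the "num_entries" key used by the example predicate.

```python
def has_entries(path_and_info):
    """Example predicate (hypothetical): keep files with a known, nonzero entry count."""
    _path, info = path_and_info
    num_entries = info.get("num_entries")
    return num_entries is not None and num_entries > 0


def filter_files_sketch(fileset, thefilter=has_entries):
    """Return a new fileset holding only the files that pass thefilter."""
    out = {}
    for name, dataset in fileset.items():
        kept = dict(filter(thefilter, dataset["files"].items()))
        out[name] = {**dataset, "files": kept}
    return out


fileset = {
    "ZJets": {
        "files": {
            "a.root": {"object_path": "Events", "num_entries": 250},
            "b.root": {"object_path": "Events", "num_entries": 0},
            "c.root": {"object_path": "Events", "num_entries": None},
        }
    }
}

print(list(filter_files_sketch(fileset)["ZJets"]["files"]))  # ['a.root']
```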

max_files(fileset[, maxfiles])

Modify the input fileset so that only the first "maxfiles" files of each dataset will be processed.

slice_files(fileset[, theslice])

Modify the input fileset so that only the files of each dataset specified by the input slice are processed.
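File selection for both max_files and slice_files can be sketched the same way as chunk selection, with the slice applied to each dataset's files in insertion order. The function names and the nested-dict fileset layout are assumptions for illustration only.

```python
def slice_files_sketch(fileset, theslice=slice(None)):
    """Keep only the files of each dataset selected by theslice (hypothetical)."""
    out = {}
    for name, dataset in fileset.items():
        # dicts preserve insertion order, so slicing the items is well defined
        kept = list(dataset["files"].items())[theslice]
        out[name] = {**dataset, "files": dict(kept)}
    return out


def max_files_sketch(fileset, maxfiles=None):
    """Keep the first maxfiles files of each dataset; None keeps everything."""
    return slice_files_sketch(fileset, slice(maxfiles))


fileset = {
    "ZJets": {"files": {"a.root": "Events", "b.root": "Events", "c.root": "Events"}},
    "Data": {"files": {"d.root": "Events"}},
}

limited = max_files_sketch(fileset, 2)
print({name: list(ds["files"]) for name, ds in limited.items()})
# {'ZJets': ['a.root', 'b.root'], 'Data': ['d.root']}
```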

get_failed_steps_for_dataset(dataset, report)

Modify an input dataset to only contain the files and row-ranges for failed processing jobs as specified in the supplied report.

get_failed_steps_for_fileset(fileset, ...)

Modify an input fileset to only contain the files and row-ranges for failed processing jobs as specified in the supplied report.