PackedSelection
- class coffea.analysis_tools.PackedSelection(dtype='uint32')[source]
Bases: object
Store several boolean arrays in a compact manner
This class can store several boolean arrays in a memory-efficient manner and evaluate arbitrary combinations of boolean requirements in a CPU-efficient way. Supported inputs are 1D numpy or awkward arrays.
- Parameters:
dtype (numpy.dtype or str) – internal bit width of the packed array, which governs the maximum number of selections storable in this object. The default value is uint32, which allows up to 32 booleans to be stored; if a smaller or larger number of selections needs to be stored, one can choose uint16 or uint64 instead.
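To make the link between bit width and capacity concrete, here is a minimal plain-Python sketch of the packing idea, assuming one bit per selection in a single integer per event. It does not use coffea and is not its implementation; the helper pack and its max_bits argument are hypothetical names introduced for illustration.

```python
# Hypothetical illustration only: pack named boolean columns into one
# integer per event, using one bit per selection.
def pack(columns, max_bits=32):
    """columns maps a selection name to a list of bools (all the same length)."""
    names = list(columns)
    if len(names) > max_bits:
        raise ValueError(f"only {max_bits} selections fit in a {max_bits}-bit word")
    n_events = len(columns[names[0]]) if names else 0
    packed = [0] * n_events
    for bit, name in enumerate(names):
        for i, passed in enumerate(columns[name]):
            if passed:
                packed[i] |= 1 << bit  # set this selection's bit for event i
    return names, packed


names, packed = pack({"cut1": [True, False, True], "cut2": [True, True, False]})
print(names, packed)
```

Under this assumption a 16-bit word holds 16 selections, a 32-bit word 32, and a 64-bit word 64, which is the trade-off the dtype parameter exposes.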
Attributes Summary
names – Current list of mask names available
Methods Summary
add(name, selection[, fill_value]) – Add a new boolean array
add_multiple(selections[, fill_value]) – Add multiple boolean arrays at once, see add for details
all(*names) – Shorthand for require, where all the values are True.
allfalse(*names) – Shorthand for require, where all the values are False.
any(*names) – Return a mask vector corresponding to an inclusive OR of requirements
cutflow(*names) – Compute the cutflow for a set of selections
nminusone(*names) – Compute the "N-1" style selection for a set of selections
require(**names) – Return a mask vector corresponding to specific requirements
Attributes Documentation
- delayed_mode
- maxitems
- names
Current list of mask names available
Methods Documentation
- add(name, selection, fill_value=False)[source]
Add a new boolean array
- Parameters:
name (str) – name of the selection
selection (numpy.ndarray or awkward.Array) – a flat array of type bool or ?bool. If this is not the first selection added, it must also have the same shape as previously added selections. If the array is option-type, null entries will be filled with fill_value.
fill_value (bool, optional) – All masked entries will be filled as specified (default: False)
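The handling of option-type input can be pictured with a short plain-Python sketch, in which None stands in for a null entry. The helper fill_missing is a hypothetical name for illustration and is not part of coffea.

```python
# Hypothetical illustration only: None plays the role of a null (masked) entry.
def fill_missing(selection, fill_value=False):
    """Return a plain list of bools with every None replaced by fill_value."""
    return [fill_value if entry is None else bool(entry) for entry in selection]


raw = [True, None, False]
filled_default = fill_missing(raw)               # nulls become False
filled_true = fill_missing(raw, fill_value=True)  # nulls become True
print(filled_default, filled_true)
```

With the default fill_value, an event whose selection value is unknown is treated as failing that selection.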
- add_multiple(selections, fill_value=False)[source]
Add multiple boolean arrays at once; see add for details.
- all(*names)[source]
Shorthand for require, where all the values are True. If no arguments are given, all the added selections are required to be True.
- allfalse(*names)[source]
Shorthand for require, where all the values are False. If no arguments are given, all the added selections are required to be False.
- any(*names)[source]
Return a mask vector corresponding to an inclusive OR of requirements
- Parameters:
*names (args) – The named selections to allow
Examples
If
>>> selection.names
['cut1', 'cut2', 'cut3']
then
>>> selection.any("cut1", "cut2")
array([True, False, True, ...])
returns a boolean array where an entry is True if the corresponding entries satisfy cut1 == True or cut2 == True, with cut3 arbitrary.
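The inclusive-OR behaviour can be sketched in plain Python, assuming each event's selections are packed as bits of one integer (bit 0 for the first name, bit 1 for the second, and so on). This is an illustration of the semantics, not coffea's code; any_of and the sample data are hypothetical.

```python
# Hypothetical illustration only: inclusive OR over selected bits.
def any_of(packed, names, *wanted):
    """True for each event where at least one of the wanted selections passed."""
    consider = 0
    for name in wanted:
        consider |= 1 << names.index(name)  # collect the bits we care about
    return [(word & consider) != 0 for word in packed]


names = ["cut1", "cut2", "cut3"]
# Three events: (cut1, cut2, cut3) = (T, F, T), (F, F, T), (F, T, F)
packed = [0b101, 0b100, 0b010]
mask = any_of(packed, names, "cut1", "cut2")
print(mask)
```

The second event fails both cut1 and cut2, so it is rejected even though it passes cut3, which is not among the names given.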
- cutflow(*names)[source]
Compute the cutflow for a set of selections
Returns an object that can provide a list of event counts, where each entry is the number of events that pass the current selection and all previous ones as the named selections are applied consecutively. The first element of the list is the total number of events before any selection is applied. The last element is the number of events that pass after all the selections are applied. The object can also provide a cutflow histogram as a hist.Hist object whose bin heights are the entries of the cutflow list. If the PackedSelection is in delayed mode, the elements of the list will be dask_awkward Arrays that can be computed whenever the user wants. If the histogram is requested, those delayed arrays will be computed in the process in order to set the bin heights.
- Parameters:
*names (args) – The named selections to use; they must be a subset of the selections already added
- Returns:
res – A wrapper class for the results, see the documentation for that class for more details
- Return type:
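The cumulative counting described for cutflow can be sketched in plain Python. This shows only how the list of event counts is built, under the assumption that each selection is a list of bools; it does not reproduce the wrapper object, the histogram, or delayed mode, and cutflow_counts is a hypothetical helper name.

```python
# Hypothetical illustration only: cumulative event counts for a cutflow.
def cutflow_counts(masks):
    """masks is a list of (name, list of bools) in the order the cuts are applied."""
    n_events = len(masks[0][1]) if masks else 0
    passing = [True] * n_events
    counts = [n_events]  # first entry: events before any selection
    for _name, mask in masks:
        # An event survives only if it passed every cut so far and this one.
        passing = [p and m for p, m in zip(passing, mask)]
        counts.append(sum(passing))
    return counts


masks = [
    ("cut1", [True, True, False, True]),
    ("cut2", [True, False, True, True]),
    ("cut3", [False, True, True, True]),
]
counts = cutflow_counts(masks)
print(counts)
```

Because each step can only remove events, the resulting list never increases from one entry to the next.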
- nminusone(*names)[source]
Compute the “N-1” style selection for a set of selections
Returns an object that can provide a list of event counts, where each entry is the number of events that pass all the named selections except one, ignoring each selection in turn. The first element of the list is the total number of events before any selection is applied. The last element is the number of events that pass if all selections are applied. The object also provides a list of boolean mask vectors indicating which events pass each N-1 selection. It can also provide a histogram as a hist.Hist object whose bin heights are the entries of the N-1 event-count list. If the PackedSelection is in delayed mode, the elements of those lists will be dask_awkward Arrays that can be computed whenever the user wants. If the histogram is requested, the delayed arrays of the event-count list will be computed in the process in order to set the bin heights.
- Parameters:
*names (args) – The named selections to use; they must be a subset of the selections already added
- Returns:
res – A wrapper class for the results, see the documentation for that class for more details
- Return type:
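The N-1 counting can be sketched in plain Python in the same spirit. The list layout follows the description above (total first, one entry per ignored selection, all selections last). The wrapper object, the masks it returns, the histogram, and delayed mode are not reproduced, and nminusone_counts is a hypothetical helper name.

```python
# Hypothetical illustration only: event counts for an N-1 style selection.
def nminusone_counts(masks):
    """masks is a list of (name, list of bools); returns the list of counts."""
    n_events = len(masks[0][1]) if masks else 0

    def count_passing(mask_list):
        # An event passes if it passes every mask in mask_list.
        return sum(all(mask[i] for mask in mask_list) for i in range(n_events))

    counts = [n_events]  # first entry: events before any selection
    for skipped in range(len(masks)):
        others = [mask for i, (_name, mask) in enumerate(masks) if i != skipped]
        counts.append(count_passing(others))
    counts.append(count_passing([mask for _name, mask in masks]))  # all cuts
    return counts


masks = [
    ("cut1", [True, True, False, True]),
    ("cut2", [True, False, True, True]),
    ("cut3", [False, True, True, True]),
]
counts = nminusone_counts(masks)
print(counts)
```

Comparing each middle entry with the last one shows how many events a single selection removes on its own once all the others are applied.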
- require(**names)
Return a mask vector corresponding to specific requirements
Specify an exact requirement on an arbitrary subset of the masks
- Parameters:
**names (kwargs) – The selections to place a requirement on, each given in the form arg=True or arg=False.
Examples
If
>>> selection.names
['cut1', 'cut2', 'cut3']
then
>>> selection.require(cut1=True, cut2=False)
array([True, False, True, ...])
returns a boolean array where an entry is True if the corresponding entries satisfy cut1 == True and cut2 == False, with cut3 arbitrary.
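An exact requirement on a subset of masks can be sketched in plain Python, again assuming one bit per selection in a single integer per event. This illustrates the semantics only and is not coffea's code; the function and the sample data below are hypothetical.

```python
# Hypothetical illustration only: exact requirement on a subset of bits.
def require(packed, names, **requirements):
    """True for each event whose named selections match the requested values."""
    consider = 0  # bits that are constrained at all
    required = 0  # bits that must be set among the constrained ones
    for name, value in requirements.items():
        bit = 1 << names.index(name)
        consider |= bit
        if value:
            required |= bit
    return [(word & consider) == required for word in packed]


names = ["cut1", "cut2", "cut3"]
# Three events: (cut1, cut2, cut3) = (T, F, T), (T, T, F), (T, F, F)
packed = [0b101, 0b011, 0b001]
mask = require(packed, names, cut1=True, cut2=False)
all_mask = require(packed, names, cut1=True, cut2=True)        # like all("cut1", "cut2")
allfalse_mask = require(packed, names, cut1=False, cut2=False)  # like allfalse("cut1", "cut2")
print(mask, all_mask, allfalse_mask)
```

Bits outside the constrained set are masked away before the comparison, which is why cut3 has no influence on the result.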