Hist

class coffea.hist.Hist(label, *axes, **kwargs)[source]

Bases: AccumulatorABC

Specify a multidimensional histogram.

Parameters:

label (str) – A description of the meaning of the sum of weights
*axes – positional list of Cat or Bin objects, denoting the axes of the histogram
axes (collections.abc.Sequence) – list of Cat or Bin objects, denoting the axes of the histogram (overridden by *axes)
dtype (str) – Underlying numpy dtype to use for storing sum of weights

Examples

Creating a histogram with a sparse axis, and two dense axes:

h = coffea.hist.Hist("Observed bird count",
                     coffea.hist.Cat("species", "Bird species"),
                     coffea.hist.Bin("x", "x coordinate [m]", 20, -5, 5),
                     coffea.hist.Bin("y", "y coordinate [m]", 20, -5, 5),
                     )

# or

h = coffea.hist.Hist(label="Observed bird count",
                     axes=(coffea.hist.Cat("species", "Bird species"),
                           coffea.hist.Bin("x", "x coordinate [m]", 20, -5, 5),
                           coffea.hist.Bin("y", "y coordinate [m]", 20, -5, 5),
                          )
                     )

# or

h = coffea.hist.Hist(axes=[coffea.hist.Cat("species", "Bird species"),
                           coffea.hist.Bin("x", "x coordinate [m]", 20, -5, 5),
                           coffea.hist.Bin("y", "y coordinate [m]", 20, -5, 5),
                          ],
                     label="Observed bird count",
                     )

which produces:

>>> h
<Hist (species,x,y) instance at 0x10d84b550>

Attributes Summary

`DEFAULT_DTYPE`	Default numpy dtype to store sum of weights
`fields`	This is a stub for histbook compatibility
`label`	A label describing the meaning of the sum of weights

Methods Summary

`add`(other)	Add another histogram into this one, in-place
`axes`()	Get all axes in this histogram
`axis`(axis_name)	Get an `Axis` object
`clear`()	Clear all content in this histogram
`compatible`(other)	Checks if this histogram is compatible with another, i.e. they have identical binning.
`copy`([content])	Create a deep copy
`dense_axes`()	All dense axes
`dense_dim`()	Dense dimension of this histogram (number of non-sparse axes)
`dim`()	Dimension of this histogram (number of axes)
`fill`(**values)	Fill sum of weights from columns
`group`(old_axes, new_axis, mapping[, overflow])	Group a set of slices on old axes into a single new axis
`identifiers`(axis[, overflow])	Return a list of identifiers for an axis
`identity`()	The identity (zero value) of this accumulator
`integrate`(axis_name[, int_range, overflow])	Integrates current histogram along one dimension
`project`(*axes, **kwargs)	Project histogram onto a subset of its axes
`rebin`(old_axis, new_axis)	Rebin a dense axis
`remove`(bins, axis)	Remove bins from a sparse axis
`scale`(factor[, axis])	Scale histogram in-place by factor
`sparse_axes`()	All sparse axes
`sparse_dim`()	Sparse dimension of this histogram (number of sparse axes)
`sparse_nbins`()	Total number of sparse bins
`sum`(*axes, **kwargs)	Integrates out a set of axes, producing a new histogram
`to_boost`()	Convert this coffea Hist object to a boost_histogram object
`to_hist`()	Convert this coffea.hist histogram to a hist object
`values`([sumw2, overflow])	Extract the sum of weights arrays from this histogram

Attributes Documentation

DEFAULT_DTYPE = 'd': Default numpy dtype to store sum of weights

fields: This is a stub for histbook compatibility

label: A label describing the meaning of the sum of weights

Methods Documentation

add(other)[source]: Add another histogram into this one, in-place

axes()[source]: Get all axes in this histogram

axis(axis_name)[source]: Get an Axis object

clear()[source]: Clear all content in this histogram

compatible(other)[source]: Checks if this histogram is compatible with another, i.e. they have identical binning

copy(content=True)[source]

Create a deep copy

Parameters: content (bool) – If False, only the histogram definition is copied, and the sum of weights is reset to zero

dense_axes()[source]: All dense axes

dense_dim()[source]: Dense dimension of this histogram (number of non-sparse axes)

dim()[source]: Dimension of this histogram (number of axes)

fill(**values)[source]

Fill sum of weights from columns

Parameters: **values – Keyword arguments, one for each axis name, of either flat numpy arrays (for dense dimensions) or literals (for sparse dimensions) which will be used to fill bins at the corresponding indices.

Note

The reserved keyword weight, if specified, will increment sum of weights by the given column values, which must be broadcastable to the same dimension as all other columns. Upon first use, this will trigger the storage of the sum of squared weights.

Examples

Filling the histogram from the Hist example:

>>> import numpy
>>> h.fill(species='ducks', x=numpy.random.normal(size=10), y=numpy.random.normal(size=10), weight=numpy.ones(10) * 3)
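To make the weight bookkeeping in the note above concrete, here is a pure-Python sketch of a weighted fill on a single dense axis. It is not coffea's implementation: the function `fill_1d` and its list-based storage are hypothetical, and values outside the axis range are simply skipped rather than routed to overflow bins.

```python
import bisect

def fill_1d(edges, sumw, sumw2, xs, weights):
    """Accumulate weights and squared weights into the bins defined by edges."""
    for x, w in zip(xs, weights):
        # bisect_right returns the index of the first edge greater than x,
        # so the bin containing x is one position to the left.
        i = bisect.bisect_right(edges, x) - 1
        if 0 <= i < len(sumw):  # hypothetical simplification: ignore out-of-range values
            sumw[i] += w
            sumw2[i] += w * w

edges = [-5.0, 0.0, 5.0]   # two bins: [-5, 0) and [0, 5)
sumw = [0.0, 0.0]          # sum of weights per bin
sumw2 = [0.0, 0.0]         # sum of squared weights per bin
fill_1d(edges, sumw, sumw2, xs=[-1.0, 2.0, 3.0], weights=[3.0, 3.0, 3.0])
```

With a constant weight of 3, each entry adds 3 to the sum of weights and 9 to the sum of squared weights of its bin.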

group(old_axes, new_axis, mapping, overflow='none')[source]

Group a set of slices on old axes into a single new axis

Parameters:

old_axes – Axis or tuple of axes which are being grouped
new_axis – A new sparse dimension definition, e.g. a Cat instance
mapping (dict) – A mapping {'new_bin': (slice, ...), ...} where each slice is on the axes being re-binned. In the case of a single axis for old_axes, {'new_bin': slice, ...} is admissible.
overflow (str) – See sum description for meaning of allowed values. Default is to not include overflow bins.

Returns a new histogram object
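The effect of a grouping can be pictured with a small pure-Python model. This is only a sketch of the idea, not coffea's code: `group_sparse` is a hypothetical helper, sparse bins are keyed by plain strings instead of identifier tuples, and each bin holds a short list in place of a numpy array.

```python
def group_sparse(values, mapping):
    """Sum the per-bin arrays of the old sparse bins into each new bin."""
    grouped = {}
    for new_bin, old_bins in mapping.items():
        # Collect the arrays of the old bins that actually exist.
        present = [values[b] for b in old_bins if b in values]
        if present:
            # Element-wise sum across the collected arrays.
            grouped[new_bin] = [sum(column) for column in zip(*present)]
    return grouped

values = {"ducks": [1.0, 2.0], "geese": [0.5, 0.5], "crows": [4.0, 0.0]}
mapping = {"waterfowl": ["ducks", "geese"], "corvids": ["crows"]}
grouped = group_sparse(values, mapping)
```

Here `"waterfowl"` receives the element-wise sum of the `"ducks"` and `"geese"` contents, while `"corvids"` is a straight copy of `"crows"`.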

identifiers(axis, overflow='none')[source]

Return a list of identifiers for an axis

Parameters:

axis – Axis name or Axis object
overflow – See sum description for meaning of allowed values

identity()[source]: The identity (zero value) of this accumulator

integrate(axis_name, int_range=slice(None, None, None), overflow='none')[source]

Integrates current histogram along one dimension

Parameters:

axis_name (str or Axis) – Which dimension to reduce on
int_range (slice) – Any slice, list, string, or other object that the axis will understand. Default is to integrate over the whole range.
overflow (str) – See sum description for meaning of allowed values. Default is to not include overflow bins.

project(*axes, **kwargs)[source]

Project histogram onto a subset of its axes

Parameters:

*axes (str or Axis) – Positional list of axes to project on to
overflow (str) – Controls behavior of integration over remaining axes. See sum description for meaning of allowed values. Default is to not include overflow bins.

rebin(old_axis, new_axis)[source]

Rebin a dense axis

This function constructs the mapping from the old axis to the new axis and returns a new histogram in which the sum of weights is rebinned along that dimension.

Note

No interpolation is performed, so the user must be sure the old and new axes have compatible bin boundaries, e.g. that they evenly divide each other.

Parameters:

old_axis (str or Axis) – Axis to rebin
new_axis (str or Axis or int) – A DenseAxis object defining the new axis (e.g. a Bin instance). If a number N is supplied, the old axis edges are downsampled by N, resulting in a histogram with old_nbins // N bins.

Returns a new Hist object.
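The integer form of `new_axis` can be illustrated without coffea. The sketch below assumes that "downsampling by N" means keeping every N-th edge and summing the N old bins between consecutive kept edges. The helpers `downsample_edges` and `rebin_counts` are hypothetical, and old bins left over when N does not divide the bin count are simply dropped here.

```python
def downsample_edges(edges, n):
    """Keep every n-th bin edge, starting from the first."""
    return edges[::n]

def rebin_counts(counts, n):
    """Sum each run of n consecutive old bins into one new bin."""
    nbins_new = len(counts) // n
    return [sum(counts[i * n:(i + 1) * n]) for i in range(nbins_new)]

# 20 bins from -5 to 5, as in the Hist example (21 edges).
edges = [-5.0 + 0.5 * i for i in range(21)]
counts = list(range(20))  # stand-in bin contents 0, 1, ..., 19

new_edges = downsample_edges(edges, 4)   # 6 edges, i.e. 20 // 4 = 5 bins
new_counts = rebin_counts(counts, 4)
```

Because no interpolation takes place, the new edges are always a subset of the old ones, which is why the old and new boundaries must line up.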

remove(bins, axis)[source]

Remove bins from a sparse axis

Parameters:

bins (iterable) – A list of bin identifiers to remove
axis (str or Axis) – Axis name or SparseAxis instance

Returns a copy of the histogram with the specified bins removed; this is not an in-place operation.

scale(factor, axis=None)[source]

Scale histogram in-place by factor

Parameters:

factor (float or dict) – A number or mapping of identifier to number
axis (optional) – Which (sparse) axis the dict applies to; may be a tuple of axes. The dict keys must follow the same structure.

Examples

This function is useful to quickly reweight according to some weight mapping along a sparse axis, such as the species axis in the Hist example:

>>> h.scale({'ducks': 0.3, 'geese': 1.2}, axis='species')
>>> h.scale({('ducks',): 0.5}, axis=('species',))
>>> h.scale({('geese', 'honk'): 5.0}, axis=('species', 'vocalization'))
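As a rough mental model of the dict form, the sketch below scales a plain dictionary keyed by sparse identifier tuples. It is an illustration under stated assumptions rather than coffea's implementation: `scale_sparse` is a hypothetical helper, bins without an entry in the factor mapping are assumed to be left unchanged, and squared weights are assumed to scale by the square of the factor, as in ordinary error propagation.

```python
def scale_sparse(sumw, sumw2, factors):
    """Scale sum of weights by factor and sum of squared weights by factor**2, in place."""
    for key, factor in factors.items():
        if key in sumw:  # assumption: keys not present in the histogram are ignored
            sumw[key] = [v * factor for v in sumw[key]]
            sumw2[key] = [v * factor ** 2 for v in sumw2[key]]

sumw = {("ducks",): [10.0, 20.0], ("geese",): [5.0, 5.0], ("crows",): [1.0, 1.0]}
sumw2 = {("ducks",): [10.0, 20.0], ("geese",): [5.0, 5.0], ("crows",): [1.0, 1.0]}
scale_sparse(sumw, sumw2, {("ducks",): 0.5, ("geese",): 2.0})
```

After the call, the `("crows",)` entry is untouched because no factor was supplied for it.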

sparse_axes()[source]: All sparse axes

sparse_dim()[source]: Sparse dimension of this histogram (number of sparse axes)

sparse_nbins()[source]: Total number of sparse bins

sum(*axes, **kwargs)[source]

Integrates out a set of axes, producing a new histogram

Parameters:

*axes – Positional list of axes to integrate out (either a string or an Axis object)
overflow ({'none', 'under', 'over', 'all', 'allnan'}, optional) – How to treat the overflow bins in the sum. Only applies to dense axes. ‘all’ includes both under- and over-flow but not nan-flow bins. Default is ‘none’.
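The overflow options can be pictured as slices over a dense axis's storage. The sketch below assumes a layout of one underflow slot, the regular bins, one overflow slot, and one nan-flow slot, in that order. Both the layout and the names `OVERFLOW_SLICES` and `select_bins` are assumptions made for illustration, not a statement of coffea's internals.

```python
# Assumed storage layout for a dense axis with n regular bins:
# [underflow, bin_0, ..., bin_{n-1}, overflow, nanflow]
OVERFLOW_SLICES = {
    "none": slice(1, -2),       # regular bins only
    "under": slice(None, -2),   # underflow + regular bins
    "over": slice(1, -1),       # regular bins + overflow
    "all": slice(None, -1),     # underflow + regular bins + overflow
    "allnan": slice(None),      # everything, including nan-flow
}

def select_bins(storage, overflow="none"):
    """Return the part of the storage covered by the given overflow option."""
    return storage[OVERFLOW_SLICES[overflow]]

storage = ["under", "b0", "b1", "b2", "over", "nan"]
selected = {name: select_bins(storage, name) for name in OVERFLOW_SLICES}
```

Under this model, the five options add between zero and three extra slots to the regular bins.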

to_boost()[source]: Convert this coffea Hist object to a boost_histogram object

to_hist()[source]: Convert this coffea.hist histogram to a hist object

values(sumw2=False, overflow='none')[source]

Extract the sum of weights arrays from this histogram

Parameters:

sumw2 (bool) – If True, each value in the returned mapping is a tuple of arrays (sum of weights, sum of squared weights)
overflow – See sum description for meaning of allowed values

Returns a mapping {(sparse identifier, ...): numpy.array(...), ...} where each array has dimension dense_dim and shape matching the number of bins per axis, plus 0-3 overflow bins depending on the overflow argument.