Hist

class coffea.hist.Hist(label, *axes, **kwargs)[source]

Bases: AccumulatorABC

Specify a multidimensional histogram.

Parameters:
  • label (str) – A description of the meaning of the sum of weights

  • *axes – positional list of Cat or Bin objects, denoting the axes of the histogram

  • axes (collections.abc.Sequence) – list of Cat or Bin objects, denoting the axes of the histogram (overridden by *axes)

  • dtype (str) – Underlying numpy dtype to use for storing sum of weights

Examples

Creating a histogram with a sparse axis, and two dense axes:

h = coffea.hist.Hist("Observed bird count",
                     coffea.hist.Cat("species", "Bird species"),
                     coffea.hist.Bin("x", "x coordinate [m]", 20, -5, 5),
                     coffea.hist.Bin("y", "y coordinate [m]", 20, -5, 5),
                     )

# or

h = coffea.hist.Hist(label="Observed bird count",
                     axes=(coffea.hist.Cat("species", "Bird species"),
                           coffea.hist.Bin("x", "x coordinate [m]", 20, -5, 5),
                           coffea.hist.Bin("y", "y coordinate [m]", 20, -5, 5),
                          )
                     )

# or

h = coffea.hist.Hist(axes=[coffea.hist.Cat("species", "Bird species"),
                           coffea.hist.Bin("x", "x coordinate [m]", 20, -5, 5),
                           coffea.hist.Bin("y", "y coordinate [m]", 20, -5, 5),
                          ],
                     label="Observed bird count",
                     )

which produces:

>>> h
<Hist (species,x,y) instance at 0x10d84b550>

Attributes Summary

DEFAULT_DTYPE

Default numpy dtype to store sum of weights

fields

This is a stub for histbook compatibility

label

A label describing the meaning of the sum of weights

Methods Summary

add(other)

Add another histogram into this one, in-place

axes()

Get all axes in this histogram

axis(axis_name)

Get an Axis object

clear()

Clear all content in this histogram

compatible(other)

Checks if this histogram is compatible with another, i.e. they have identical binning.

copy([content])

Create a deep copy

dense_axes()

All dense axes

dense_dim()

Dense dimension of this histogram (number of non-sparse axes)

dim()

Dimension of this histogram (number of axes)

fill(**values)

Fill sum of weights from columns

group(old_axes, new_axis, mapping[, overflow])

Group a set of slices on old axes into a single new axis

identifiers(axis[, overflow])

Return a list of identifiers for an axis

identity()

The identity (zero value) of this accumulator

integrate(axis_name[, int_range, overflow])

Integrates current histogram along one dimension

project(*axes, **kwargs)

Project histogram onto a subset of its axes

rebin(old_axis, new_axis)

Rebin a dense axis

remove(bins, axis)

Remove bins from a sparse axis

scale(factor[, axis])

Scale histogram in-place by factor

sparse_axes()

All sparse axes

sparse_dim()

Sparse dimension of this histogram (number of sparse axes)

sparse_nbins()

Total number of sparse bins

sum(*axes, **kwargs)

Integrates out a set of axes, producing a new histogram

to_boost()

Convert this coffea Hist object to a boost_histogram object

to_hist()

Convert this coffea.hist histogram to a hist object

values([sumw2, overflow])

Extract the sum of weights arrays from this histogram

Attributes Documentation

DEFAULT_DTYPE = 'd'

Default numpy dtype to store sum of weights

fields

This is a stub for histbook compatibility

label

A label describing the meaning of the sum of weights

Methods Documentation

add(other)[source]

Add another histogram into this one, in-place

axes()[source]

Get all axes in this histogram

axis(axis_name)[source]

Get an Axis object

clear()[source]

Clear all content in this histogram

compatible(other)[source]

Checks if this histogram is compatible with another, i.e. they have identical binning

copy(content=True)[source]

Create a deep copy

Parameters:

content (bool) – If set false, only the histogram definition is copied, resetting the sum of weights to zero

dense_axes()[source]

All dense axes

dense_dim()[source]

Dense dimension of this histogram (number of non-sparse axes)

dim()[source]

Dimension of this histogram (number of axes)

fill(**values)[source]

Fill sum of weights from columns

Parameters:

**values – Keyword arguments, one for each axis name, of either flat numpy arrays (for dense dimensions) or literals (for sparse dimensions) which will be used to fill bins at the corresponding indices.

Note

The reserved keyword weight, if specified, will increment sum of weights by the given column values, which must be broadcastable to the same dimension as all other columns. Upon first use, this will trigger the storage of the sum of squared weights.
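To make the note concrete, here is a minimal plain-Python sketch of what a weighted fill accumulates per bin: the sum of weights and, alongside it, the sum of squared weights. It is an illustration only, not coffea's implementation. `fill_1d` and its arguments are hypothetical names, and values outside the axis range are simply dropped rather than routed to overflow bins.

```python
from bisect import bisect_right


def fill_1d(edges, values, weights):
    """Accumulate sum of weights and sum of squared weights per bin.

    Hypothetical helper for illustration; out-of-range values are dropped.
    """
    nbins = len(edges) - 1
    sumw = [0.0] * nbins
    sumw2 = [0.0] * nbins
    for x, w in zip(values, weights):
        if not (edges[0] <= x < edges[-1]):
            continue  # a real histogram would use under/overflow bins here
        i = bisect_right(edges, x) - 1  # index of the bin containing x
        sumw[i] += w
        sumw2[i] += w * w
    return sumw, sumw2


edges = [0.0, 1.0, 2.0, 3.0]
sumw, sumw2 = fill_1d(edges, values=[0.5, 0.7, 2.5, 9.0], weights=[3.0, 3.0, 2.0, 1.0])
print(sumw)   # [6.0, 0.0, 2.0]
print(sumw2)  # [18.0, 0.0, 4.0]
```

The square root of each `sumw2` entry is the usual statistical uncertainty estimate on the weighted bin content, which is why the squared weights are tracked once weights are in use.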

Examples

Filling the histogram from the Hist example:

>>> h.fill(species='ducks', x=numpy.random.normal(size=10), y=numpy.random.normal(size=10), weight=numpy.ones(10) * 3)

group(old_axes, new_axis, mapping, overflow='none')[source]

Group a set of slices on old axes into a single new axis

Parameters:
  • old_axes – Axis or tuple of axes which are being grouped

  • new_axis – A new sparse dimension definition, e.g. a Cat instance

  • mapping (dict) – A mapping {'new_bin': (slice, ...), ...} where each slice is on the axes being re-binned. In the case of a single axis for old_axes, {'new_bin': slice, ...} is admissible.

  • overflow (str) – See sum description for meaning of allowed values. Default is to not include overflow bins.

Returns a new histogram object
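As a rough picture of what grouping does on a single sparse axis, the sketch below merges per-category bin contents according to a mapping from new identifiers to old ones. It is a conceptual stand-in written with plain dicts and lists, not coffea's implementation. `group_sparse` is a hypothetical helper, and the mapping values are assumed to be lists of old identifiers.

```python
def group_sparse(contents, mapping):
    """Merge per-category bin contents into new categories.

    contents: {old_identifier: [bin contents]}
    mapping:  {new_identifier: [old_identifier, ...]}
    Hypothetical helper; returns a new dict and leaves `contents` untouched.
    """
    grouped = {}
    for new_id, old_ids in mapping.items():
        present = [contents[old] for old in old_ids if old in contents]
        if present:
            # element-wise sum across the selected categories
            grouped[new_id] = [sum(column) for column in zip(*present)]
    return grouped


contents = {"ducks": [1.0, 2.0], "geese": [0.5, 0.5], "crows": [4.0, 0.0]}
mapping = {"waterfowl": ["ducks", "geese"], "corvids": ["crows"]}
grouped = group_sparse(contents, mapping)
print(grouped)
# {'waterfowl': [1.5, 2.5], 'corvids': [4.0, 0.0]}
```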

identifiers(axis, overflow='none')[source]

Return a list of identifiers for an axis

Parameters:
  • axis – Axis name or Axis object

  • overflow – See sum description for meaning of allowed values

identity()[source]

The identity (zero value) of this accumulator

integrate(axis_name, int_range=slice(None, None, None), overflow='none')[source]

Integrates current histogram along one dimension

Parameters:
  • axis_name (str or Axis) – Which dimension to reduce on

  • int_range (slice) – Any slice, list, string, or other object that the axis will understand. Default is to integrate over the whole range.

  • overflow (str) – See sum description for meaning of allowed values. Default is to not include overflow bins.

project(*axes, **kwargs)[source]

Project histogram onto a subset of its axes

Parameters:
  • *axes (str or Axis) – Positional list of axes to project on to

  • overflow (str) – Controls behavior of integration over remaining axes. See sum description for meaning of allowed values. Default is to not include overflow bins.

rebin(old_axis, new_axis)[source]

Rebin a dense axis

This function constructs the mapping from the old axis to the new axis and builds a new histogram, rebinning the sum of weights along that dimension.

Note

No interpolation is performed, so the user must be sure the old and new axes have compatible bin boundaries, e.g. that they evenly divide each other.

Parameters:
  • old_axis (str or Axis) – Axis to rebin

  • new_axis (str or Axis or int) – A DenseAxis object defining the new axis (e.g. a Bin instance). If a number N is supplied, the old axis edges are downsampled by N, resulting in a histogram with old_nbins // N bins.

Returns a new Hist object.
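The integer case can be pictured as merging every N adjacent bins, as in the plain-Python sketch below. This is an illustration of the idea rather than coffea's code. `rebin_by_factor` is a hypothetical helper, and it assumes the number of bins is divisible by N so that the downsampled edges line up exactly.

```python
def rebin_by_factor(edges, sumw, n):
    """Merge every `n` adjacent bins of a 1D histogram.

    Hypothetical helper; assumes len(sumw) is divisible by n.
    """
    if len(sumw) % n != 0:
        raise ValueError("number of bins must be divisible by the rebin factor")
    new_edges = edges[::n]  # keep every n-th edge
    new_sumw = [sum(sumw[i:i + n]) for i in range(0, len(sumw), n)]
    return new_edges, new_sumw


edges = [0.0, 1.0, 2.0, 3.0, 4.0]
sumw = [1.0, 2.0, 3.0, 4.0]
new_edges, new_sumw = rebin_by_factor(edges, sumw, 2)
print(new_edges)  # [0.0, 2.0, 4.0]
print(new_sumw)   # [3.0, 7.0]
```

Because contents are only summed, never split, the new edges must be a subset of the old ones, which is the compatibility requirement stated in the note above.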

remove(bins, axis)[source]

Remove bins from a sparse axis

Parameters:
  • bins (iterable) – A list of bin identifiers to remove

  • axis (str or Axis) – Axis name or SparseAxis instance

Returns a copy of the histogram with the specified bins removed; this is not an in-place operation.

scale(factor, axis=None)[source]

Scale histogram in-place by factor

Parameters:
  • factor (float or dict) – A number or mapping of identifier to number

  • axis (optional) – Which (sparse) axis the dict applies to; may be a tuple of axes. The dict keys must follow the same structure.

Examples

This function is useful to quickly reweight according to some weight mapping along a sparse axis, such as the species axis in the Hist example:

>>> h.scale({'ducks': 0.3, 'geese': 1.2}, axis='species')
>>> h.scale({('ducks',): 0.5}, axis=('species',))
>>> h.scale({('geese', 'honk'): 5.0}, axis=('species', 'vocalization'))

sparse_axes()[source]

All sparse axes

sparse_dim()[source]

Sparse dimension of this histogram (number of sparse axes)

sparse_nbins()[source]

Total number of sparse bins

sum(*axes, **kwargs)[source]

Integrates out a set of axes, producing a new histogram

Parameters:
  • *axes – Positional list of axes to integrate out (either a string or an Axis object)

  • overflow ({'none', 'under', 'over', 'all', 'allnan'}, optional) – How to treat the overflow bins in the sum. Only applies to dense axes. ‘all’ includes both under- and over-flow but not nan-flow bins. Default is ‘none’.
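One way to read the overflow options is as slices over the storage of a single dense axis. The sketch below assumes a layout of one underflow bin, the regular bins, one overflow bin, and one nan-flow bin, in that order. That layout and the names `OVERFLOW_SLICES` and `total` are assumptions made for illustration, not a statement of coffea's internals.

```python
# Assumed storage layout for one dense axis (illustrative assumption):
# [underflow, bin_0, ..., bin_{n-1}, overflow, nanflow]
OVERFLOW_SLICES = {
    "none": slice(1, -2),      # regular bins only
    "under": slice(None, -2),  # underflow + regular bins
    "over": slice(1, -1),      # regular bins + overflow
    "all": slice(None, -1),    # underflow + regular bins + overflow
    "allnan": slice(None),     # everything, including nan-flow
}


def total(storage, overflow="none"):
    """Sum the part of `storage` selected by the overflow option."""
    return sum(storage[OVERFLOW_SLICES[overflow]])


storage = [10.0, 1.0, 2.0, 3.0, 20.0, 100.0]
for option in OVERFLOW_SLICES:
    print(option, total(storage, option))
# none 6.0
# under 16.0
# over 26.0
# all 36.0
# allnan 136.0
```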

to_boost()[source]

Convert this coffea Hist object to a boost_histogram object

to_hist()[source]

Convert this coffea.hist histogram to a hist object

values(sumw2=False, overflow='none')[source]

Extract the sum of weights arrays from this histogram

Parameters:
  • sumw2 (bool) – If True, each value in the returned mapping is a tuple of arrays (sum of weights, sum of squared weights)

  • overflow – See sum description for meaning of allowed values

Returns a mapping {(sparse identifier, ...): numpy.array(...), ...} where each array has dimension dense_dim and shape matching the number of bins per axis, plus 0-3 overflow bins depending on the overflow argument.
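A mapping of this shape is straightforward to post-process. The sketch below uses a hand-written stand-in for the result of a call with sumw2=True and derives a per-bin uncertainty for each sparse key. Plain lists stand in for numpy arrays, the numbers are made up for illustration, and `bin_errors` is a hypothetical helper.

```python
import math

# Stand-in for a mapping {(sparse identifier, ...): (sumw, sumw2)};
# lists replace numpy arrays and the contents are illustrative only.
values = {
    ("ducks",): ([6.0, 0.0, 2.0], [18.0, 0.0, 4.0]),
    ("geese",): ([1.0, 1.0, 0.0], [1.0, 1.0, 0.0]),
}


def bin_errors(values):
    """Per-bin uncertainty sqrt(sum of squared weights) for each sparse key."""
    return {
        key: [math.sqrt(w2) for w2 in sumw2]
        for key, (_sumw, sumw2) in values.items()
    }


errors = bin_errors(values)
print(errors[("geese",)])  # [1.0, 1.0, 0.0]
```

Note that the keys are tuples even when the histogram has a single sparse axis, matching the {(sparse identifier, ...): ...} form described above.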