numpy.histogram

numpy.histogram(a, bins=10, range=None, density=None, weights=None)

Compute the histogram of a dataset.

Parameters:

a : array_like

Input data. The histogram is computed over the flattened array.

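bins : int or sequence of scalars or str, optional

If bins is an int, it defines the number of equal-width bins in the given range (10 by default). If bins is a sequence, it defines a monotonically increasing array of bin edges, including the rightmost edge, which allows non-uniform bin widths.

If bins is a string, it names the method used to calculate the optimal bin width, as defined by histogram_bin_edges.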
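As a rough illustration of the three forms that bins can take, the sketch below bins one small array with an int, an explicit edge sequence, and the 'auto' estimator name. The values in data are made up for this example and do not come from the NumPy documentation.

```python
import numpy as np

# Arbitrary illustrative data (an assumption for this sketch)
data = np.array([1.0, 2.0, 2.5, 3.0, 7.5, 8.0, 9.5, 10.0])

# int: three equal-width bins spanning [data.min(), data.max()] = [1, 10]
hist_int, edges_int = np.histogram(data, bins=3)

# sequence: explicit, non-uniform, monotonically increasing bin edges
hist_seq, edges_seq = np.histogram(data, bins=[0, 2, 5, 10])

# str: the name of a bin-width estimator; the resulting edges match
# what histogram_bin_edges returns for the same data and method
hist_auto, edges_auto = np.histogram(data, bins="auto")
edges_ref = np.histogram_bin_edges(data, bins="auto")

print(hist_int)  # [4 0 4]
print(hist_seq)  # [1 3 4]
```

When only the edges are needed, calling histogram_bin_edges directly avoids computing the counts.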

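range : (float, float), optional

The lower and upper range of the bins. If not provided, range is simply (a.min(), a.max()). Values outside the range are ignored. The first element of the range must be less than or equal to the second. range also affects the automatic bin computation: the bin width is computed to be optimal for the actual data within range, but the bins fill the entire range, including portions that contain no data.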
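The effect of range can be sketched with a small array that contains values on both sides of the requested interval. The data below are made up for illustration.

```python
import numpy as np

# Arbitrary illustrative data; -5.0 and 100.0 lie outside both ranges used below
data = np.array([-5.0, 0.5, 1.5, 2.5, 3.5, 100.0])

# Values outside [0, 4] are ignored, so only four samples are counted
hist, edges = np.histogram(data, bins=4, range=(0, 4))
print(hist)   # [1 1 1 1]
print(edges)  # [0. 1. 2. 3. 4.]

# A range wider than the in-range data still gets bins over its full extent;
# the bins without data simply have a count of zero
hist_wide, edges_wide = np.histogram(data, bins=5, range=(0, 10))
print(hist_wide)  # [2 2 0 0 0]
```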

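weights : array_like, optional

An array of weights, of the same shape as a. Each value in a contributes only its associated weight to the bin count (instead of 1). If density is True, the weights are normalized so that the integral of the density over the range remains 1. Note that the dtype of weights also becomes the dtype of the returned accumulator (hist), so it must be large enough to hold the accumulated values.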
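A minimal sketch of weighted counting follows. The data and weights are made-up values; the point is that each bin accumulates the weights of its samples and that hist takes its dtype from weights.

```python
import numpy as np

# Arbitrary illustrative samples and per-sample weights (assumptions)
data = np.array([0.5, 0.5, 1.5, 2.5])
weights = np.array([1.0, 2.0, 0.5, 4.0])
edges = [0, 1, 2, 3]

# Each bin sums the weights of the samples that fall into it
hist, _ = np.histogram(data, bins=edges, weights=weights)
print(hist)        # [3.  0.5 4. ]
print(hist.dtype)  # float64

# With density=True the weighted histogram still integrates to 1
dens, bin_edges = np.histogram(data, bins=edges, weights=weights, density=True)
integral = np.sum(dens * np.diff(bin_edges))
```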

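density : bool, optional

If False, the result contains the number of samples in each bin. If True, the result is the value of the probability *density* function at the bin, normalized such that the *integral* over the range is 1. Note that the sum of the histogram values will not equal 1 unless bins of unit width are chosen; it is not a probability *mass* function.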
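The normalization can be checked by hand for non-uniform bins. The sketch below, using made-up data and edges, divides each count by the total number of samples times the bin width and compares the result with density=True.

```python
import numpy as np

# Arbitrary illustrative data and non-uniform bin edges (assumptions)
data = np.array([0.1, 0.2, 0.3, 0.4, 1.5, 3.0])
edges = [0.0, 0.5, 2.0, 4.0]

counts, _ = np.histogram(data, bins=edges)
density, _ = np.histogram(data, bins=edges, density=True)

# density_i = count_i / (N * width_i)
widths = np.diff(edges)
manual = counts / (counts.sum() * widths)

# The values do not sum to 1, but their integral over the range does
total = density.sum()
integral = np.sum(density * widths)
```

Here counts is [4, 1, 1], so the first bin has density 4 / (6 * 0.5), which is greater than 1; this is expected for a density over a bin narrower than 1.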

Returns:

hist : array

The values of the histogram. See density and weights for a description of the possible semantics. If weights are given, hist.dtype is taken from weights.

bin_edges : array of dtype float

The bin edges (length(hist)+1).

See also

histogramdd, bincount, searchsorted, digitize, histogram_bin_edges

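Notes

All but the last (rightmost) bin are half-open. In other words, if bins is

[1, 2, 3, 4]

then the first bin is [1, 2) (including 1, but excluding 2) and the second is [2, 3). The last bin, however, is [3, 4], which *includes* 4.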
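The edge convention can be confirmed with samples placed exactly on the edges. The comparison with np.digitize is added here only for contrast; with its default right=False it treats every bin as half-open, so a value equal to the last edge falls past the final bin.

```python
import numpy as np

edges = [1, 2, 3, 4]
samples = [1, 2, 3, 4]  # one sample on each edge

# 1 -> [1, 2), 2 -> [2, 3), and both 3 and 4 -> [3, 4]
hist, _ = np.histogram(samples, bins=edges)
print(hist)  # [1 1 2]

# digitize returns i when edges[i-1] <= x < edges[i]; 4 is not below any
# edge, so it gets index len(edges) and lies outside the last bin
idx = np.digitize(samples, edges)
print(idx)  # [1 2 3 4]
```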

Examples

>>> import numpy as np
>>> np.histogram([1, 2, 1], bins=[0, 1, 2, 3])
(array([0, 2, 1]), array([0, 1, 2, 3]))
>>> np.histogram(np.arange(4), bins=np.arange(5), density=True)
(array([0.25, 0.25, 0.25, 0.25]), array([0, 1, 2, 3, 4]))
>>> np.histogram([[1, 2, 1], [1, 0, 1]], bins=[0,1,2,3])
(array([1, 4, 1]), array([0, 1, 2, 3]))

>>> a = np.arange(5)
>>> hist, bin_edges = np.histogram(a, density=True)
>>> hist
array([0.5, 0. , 0.5, 0. , 0. , 0.5, 0. , 0.5, 0. , 0.5])
>>> hist.sum()
2.4999999999999996
>>> np.sum(hist * np.diff(bin_edges))
1.0

Example of the automated bin selection methods, using bimodal random data with 2000 points.

import matplotlib.pyplot as plt
import numpy as np

rng = np.random.RandomState(10)  # deterministic random data
a = np.hstack((rng.normal(size=1000),
               rng.normal(loc=5, scale=2, size=1000)))
plt.hist(a, bins='auto')  # arguments are passed to np.histogram
plt.title("Histogram with 'auto' bins")
plt.show()