Array iterator API#
Array iterator#
The array iterator encapsulates many of the key features in ufuncs, allowing user code to support features like output parameters, preservation of memory layouts, and buffering of data with the wrong alignment or type, without requiring difficult coding.
This page documents the API for the iterator. The iterator is named NpyIter and functions are named NpyIter_*.
There is an introductory guide to array iteration which may be of interest for those using this C API. In many instances, testing out ideas by creating the iterator in Python is a good idea before writing the C iteration code.
Iteration example#
The best way to become familiar with the iterator is to look at its usage within the NumPy codebase itself. For example, here is a slightly tweaked version of the code for PyArray_CountNonzero, which counts the number of non-zero elements in an array.
npy_intp PyArray_CountNonzero(PyArrayObject* self)
{
/* Nonzero boolean function */
PyArray_NonzeroFunc* nonzero = PyArray_DESCR(self)->f->nonzero;
NpyIter* iter;
NpyIter_IterNextFunc *iternext;
char** dataptr;
npy_intp nonzero_count;
npy_intp* strideptr,* innersizeptr;
/* Handle zero-sized arrays specially */
if (PyArray_SIZE(self) == 0) {
return 0;
}
/*
* Create and use an iterator to count the nonzeros.
* flag NPY_ITER_READONLY
* - The array is never written to.
* flag NPY_ITER_EXTERNAL_LOOP
* - Inner loop is done outside the iterator for efficiency.
* flag NPY_ITER_REFS_OK
* - Reference types are acceptable.
* order NPY_KEEPORDER
* - Visit elements in memory order, regardless of strides.
* This is good for performance when the specific order
* elements are visited is unimportant.
* casting NPY_NO_CASTING
* - No casting is required for this operation.
*/
iter = NpyIter_New(self, NPY_ITER_READONLY|
NPY_ITER_EXTERNAL_LOOP|
NPY_ITER_REFS_OK,
NPY_KEEPORDER, NPY_NO_CASTING,
NULL);
if (iter == NULL) {
return -1;
}
/*
* The iternext function gets stored in a local variable
* so it can be called repeatedly in an efficient manner.
*/
iternext = NpyIter_GetIterNext(iter, NULL);
if (iternext == NULL) {
NpyIter_Deallocate(iter);
return -1;
}
/* The location of the data pointer which the iterator may update */
dataptr = NpyIter_GetDataPtrArray(iter);
/* The location of the stride which the iterator may update */
strideptr = NpyIter_GetInnerStrideArray(iter);
/* The location of the inner loop size which the iterator may update */
innersizeptr = NpyIter_GetInnerLoopSizePtr(iter);
nonzero_count = 0;
do {
/* Get the inner loop data/stride/count values */
char* data = *dataptr;
npy_intp stride = *strideptr;
npy_intp count = *innersizeptr;
/* This is a typical inner loop for NPY_ITER_EXTERNAL_LOOP */
while (count--) {
if (nonzero(data, self)) {
++nonzero_count;
}
data += stride;
}
/* Increment the iterator to the next inner loop */
} while(iternext(iter));
NpyIter_Deallocate(iter);
return nonzero_count;
}
Multi-iteration example#
Here is a copy function using the iterator. The order parameter controls the memory layout of the allocated result; typically NPY_KEEPORDER is desired.
PyObject *CopyArray(PyObject *arr, NPY_ORDER order)
{
NpyIter *iter;
NpyIter_IterNextFunc *iternext;
PyObject *op[2], *ret;
npy_uint32 flags;
npy_uint32 op_flags[2];
npy_intp itemsize, *innersizeptr, innerstride;
char **dataptrarray;
/*
* No inner iteration - inner loop is handled by CopyArray code
*/
flags = NPY_ITER_EXTERNAL_LOOP;
/*
* Tell the constructor to automatically allocate the output.
* The data type of the output will match that of the input.
*/
op[0] = arr;
op[1] = NULL;
op_flags[0] = NPY_ITER_READONLY;
op_flags[1] = NPY_ITER_WRITEONLY | NPY_ITER_ALLOCATE;
/* Construct the iterator */
iter = NpyIter_MultiNew(2, op, flags, order, NPY_NO_CASTING,
op_flags, NULL);
if (iter == NULL) {
return NULL;
}
/*
* Make a copy of the iternext function pointer and
* a few other variables the inner loop needs.
*/
iternext = NpyIter_GetIterNext(iter, NULL);
if (iternext == NULL) {
NpyIter_Deallocate(iter);
return NULL;
}
innerstride = NpyIter_GetInnerStrideArray(iter)[0];
itemsize = NpyIter_GetDescrArray(iter)[0]->elsize;
/*
* The inner loop size and data pointers may change during the
* loop, so just cache the addresses.
*/
innersizeptr = NpyIter_GetInnerLoopSizePtr(iter);
dataptrarray = NpyIter_GetDataPtrArray(iter);
/*
* Note that because the iterator allocated the output,
* it matches the iteration order and is packed tightly,
* so we don't need to check it like the input.
*/
if (innerstride == itemsize) {
do {
memcpy(dataptrarray[1], dataptrarray[0],
itemsize * (*innersizeptr));
} while (iternext(iter));
} else {
/* For efficiency, should specialize this based on item size... */
npy_intp i;
do {
npy_intp size = *innersizeptr;
char *src = dataptrarray[0], *dst = dataptrarray[1];
for(i = 0; i < size; i++, src += innerstride, dst += itemsize) {
memcpy(dst, src, itemsize);
}
} while (iternext(iter));
}
/* Get the result from the iterator object array */
ret = NpyIter_GetOperandArray(iter)[1];
Py_INCREF(ret);
if (NpyIter_Deallocate(iter) != NPY_SUCCEED) {
Py_DECREF(ret);
return NULL;
}
return ret;
}
Multi index tracking example#
This example shows you how to work with the NPY_ITER_MULTI_INDEX flag. For simplicity, we assume the argument is a two-dimensional array.
int PrintMultiIndex(PyArrayObject *arr) {
NpyIter *iter;
NpyIter_IterNextFunc *iternext;
npy_intp multi_index[2];
iter = NpyIter_New(
arr, NPY_ITER_READONLY | NPY_ITER_MULTI_INDEX | NPY_ITER_REFS_OK,
NPY_KEEPORDER, NPY_NO_CASTING, NULL);
if (iter == NULL) {
return -1;
}
if (NpyIter_GetNDim(iter) != 2) {
NpyIter_Deallocate(iter);
PyErr_SetString(PyExc_ValueError, "Array must be 2-D");
return -1;
}
if (NpyIter_GetIterSize(iter) != 0) {
iternext = NpyIter_GetIterNext(iter, NULL);
if (iternext == NULL) {
NpyIter_Deallocate(iter);
return -1;
}
NpyIter_GetMultiIndexFunc *get_multi_index =
NpyIter_GetGetMultiIndex(iter, NULL);
if (get_multi_index == NULL) {
NpyIter_Deallocate(iter);
return -1;
}
do {
get_multi_index(iter, multi_index);
printf("multi_index is [%" NPY_INTP_FMT ", %" NPY_INTP_FMT "]\n",
multi_index[0], multi_index[1]);
} while (iternext(iter));
}
if (!NpyIter_Deallocate(iter)) {
return -1;
}
return 0;
}
When called with a 2x3 array, the above example prints
multi_index is [0, 0]
multi_index is [0, 1]
multi_index is [0, 2]
multi_index is [1, 0]
multi_index is [1, 1]
multi_index is [1, 2]
Iterator data types#
The iterator layout is an internal detail, and user code only sees an incomplete struct.
-
type NpyIter#
This is an opaque pointer type for the iterator. Access to its contents can only be done through the iterator API.
-
type NpyIter_Type#
This is the type which exposes the iterator to Python. Currently, no API is exposed which provides access to the values of a Python-created iterator. If an iterator is created in Python, it must be used in Python and vice versa. Such an API will likely be created in a future version.
-
type NpyIter_IterNextFunc#
This is a function pointer for the iteration loop, returned by NpyIter_GetIterNext.
-
type NpyIter_GetMultiIndexFunc#
This is a function pointer for getting the current iterator multi-index, returned by NpyIter_GetGetMultiIndex.
Construction and destruction#
-
NpyIter *NpyIter_New(PyArrayObject *op, npy_uint32 flags, NPY_ORDER order, NPY_CASTING casting, PyArray_Descr *dtype)#
Creates an iterator for the given numpy array object op.
Flags that may be passed in flags are any combination of the global and per-operand flags documented in NpyIter_MultiNew, except for NPY_ITER_ALLOCATE.
Any of the NPY_ORDER enum values may be passed to order. For efficient iteration, NPY_KEEPORDER is the best option, and the other orders enforce the particular iteration pattern.
Any of the NPY_CASTING enum values may be passed to casting. The values include NPY_NO_CASTING, NPY_EQUIV_CASTING, NPY_SAFE_CASTING, NPY_SAME_KIND_CASTING, and NPY_UNSAFE_CASTING. To allow the casts to occur, copying or buffering must also be enabled.
If dtype isn’t NULL, then it requires that data type. If copying is allowed, it will make a temporary copy if the data is castable. If NPY_ITER_UPDATEIFCOPY is enabled, it will also copy the data back with another cast upon iterator destruction.
Returns NULL if there is an error, otherwise returns the allocated iterator.
To make an iterator similar to the old iterator, this should work.
iter = NpyIter_New(op, NPY_ITER_READWRITE,
                   NPY_CORDER, NPY_NO_CASTING, NULL);
If you want to edit an array with aligned double code, but the order doesn’t matter, you would use this.
dtype = PyArray_DescrFromType(NPY_DOUBLE);
iter = NpyIter_New(op, NPY_ITER_READWRITE|
                   NPY_ITER_BUFFERED|
                   NPY_ITER_NBO|
                   NPY_ITER_ALIGNED,
                   NPY_KEEPORDER,
                   NPY_SAME_KIND_CASTING,
                   dtype);
Py_DECREF(dtype);
-
NpyIter *NpyIter_MultiNew(npy_intp nop, PyArrayObject **op, npy_uint32 flags, NPY_ORDER order, NPY_CASTING casting, npy_uint32 *op_flags, PyArray_Descr **op_dtypes)#
Creates an iterator for broadcasting the nop array objects provided in op, using regular NumPy broadcasting rules.
Any of the NPY_ORDER enum values may be passed to order. For efficient iteration, NPY_KEEPORDER is the best option, and the other orders enforce the particular iteration pattern. When using NPY_KEEPORDER, if you also want to ensure that the iteration is not reversed along an axis, you should pass the flag NPY_ITER_DONT_NEGATE_STRIDES.
Any of the NPY_CASTING enum values may be passed to casting. The values include NPY_NO_CASTING, NPY_EQUIV_CASTING, NPY_SAFE_CASTING, NPY_SAME_KIND_CASTING, and NPY_UNSAFE_CASTING. To allow the casts to occur, copying or buffering must also be enabled.
If op_dtypes isn’t NULL, it specifies a data type or NULL for each op[i].
Returns NULL if there is an error, otherwise returns the allocated iterator.
Flags that may be passed in flags, applying to the whole iterator, are:
-
NPY_ITER_C_INDEX#
Causes the iterator to track a raveled flat index matching C order. This option cannot be used with NPY_ITER_F_INDEX.
-
NPY_ITER_F_INDEX#
Causes the iterator to track a raveled flat index matching Fortran order. This option cannot be used with NPY_ITER_C_INDEX.
-
NPY_ITER_MULTI_INDEX#
Causes the iterator to track a multi-index. This prevents the iterator from coalescing axes to produce bigger inner loops. If the loop is also not buffered and no index is being tracked (NpyIter_RemoveAxis can be called), then the iterator size can be -1 to indicate that the iterator is too large. This can happen due to complex broadcasting and will result in errors being created when setting the iterator range, removing the multi-index, or getting the next function. However, it is possible to remove axes again and use the iterator normally if the size is small enough after removal.
-
NPY_ITER_EXTERNAL_LOOP#
Causes the iterator to skip iteration of the innermost loop, requiring the user of the iterator to handle it.
This flag is incompatible with NPY_ITER_C_INDEX, NPY_ITER_F_INDEX, and NPY_ITER_MULTI_INDEX.
-
NPY_ITER_DONT_NEGATE_STRIDES#
This only affects the iterator when NPY_KEEPORDER is specified for the order parameter. By default with NPY_KEEPORDER, the iterator reverses axes which have negative strides, so that memory is traversed in a forward direction. This flag disables that step. Use it if you want to use the underlying memory-ordering of the axes, but don’t want an axis reversed. This is the behavior of numpy.ravel(a, order='K'), for instance.
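The axis reversal that this flag switches off can be pictured with a small sketch in plain C. It is only an illustration of the idea, not NumPy's implementation: flip_negative_axis is a hypothetical helper, and ptrdiff_t stands in for npy_intp so the snippet needs no NumPy headers.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of NumPy): rewrite one axis so that it is
 * walked with a non-negative stride. The same n elements are visited, but
 * starting from the lowest address, so memory is traversed forward.
 */
static void flip_negative_axis(char **base, ptrdiff_t *stride, ptrdiff_t n)
{
    if (*stride < 0 && n > 0) {
        /* Jump to the last element along the axis: the lowest address. */
        *base += (n - 1) * (*stride);
        *stride = -(*stride);
    }
}
```

With NPY_ITER_DONT_NEGATE_STRIDES the iterator keeps the original base pointer and negative stride instead, so the axis is visited in its logical order.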
-
NPY_ITER_COMMON_DTYPE#
Causes the iterator to convert all the operands to a common data type, calculated based on the ufunc type promotion rules. Copying or buffering must be enabled.
If the common data type is known ahead of time, don’t use this flag. Instead, set the requested dtype for all the operands.
-
NPY_ITER_REFS_OK#
Indicates that arrays with reference types (object arrays or structured arrays containing an object type) may be accepted and used in the iterator. If this flag is enabled, the caller must be sure to check whether NpyIter_IterationNeedsAPI(iter) is true, in which case it may not release the GIL during iteration. If you are working with known dtypes, NpyIter_GetTransferFlags is a faster and more precise way to check whether the iterator needs the API due to buffering.
-
NPY_ITER_ZEROSIZE_OK#
Indicates that arrays with a size of zero should be permitted. Since the typical iteration loop does not naturally work with zero-sized arrays, you must check that the IterSize is larger than zero before entering the iteration loop. Currently only the operands are checked, not a forced shape.
-
NPY_ITER_REDUCE_OK#
Permits writeable operands with a dimension with zero stride and size greater than one. Note that such operands must be read/write.
When buffering is enabled, this also switches to a special buffering mode which reduces the loop length as necessary to not trample on values being reduced.
Note that if you want to do a reduction on an automatically allocated output, you must use NpyIter_GetOperandArray to get its reference, then set every value to the reduction unit before doing the iteration loop. In the case of a buffered reduction, this means you must also specify the flag NPY_ITER_DELAY_BUFALLOC, then reset the iterator after initializing the allocated operand to prepare the buffers.
-
NPY_ITER_RANGED#
Enables support for iteration of sub-ranges of the full iterindex range [0, NpyIter_GetIterSize(iter)). Use the function NpyIter_ResetToIterIndexRange to specify a range for iteration.
This flag can only be used with NPY_ITER_EXTERNAL_LOOP when NPY_ITER_BUFFERED is enabled. This is because without buffering, the inner loop is always the size of the innermost iteration dimension, and allowing it to get cut up would require special handling, effectively making it more like the buffered version.
-
NPY_ITER_BUFFERED#
Causes the iterator to store buffering data, and use buffering to satisfy data type, alignment, and byte-order requirements. To buffer an operand, do not specify the NPY_ITER_COPY or NPY_ITER_UPDATEIFCOPY flags, because they will override buffering. Buffering is especially useful for Python code using the iterator, allowing for larger chunks of data at once to amortize the Python interpreter overhead.
If used with NPY_ITER_EXTERNAL_LOOP, the inner loop for the caller may get larger chunks than would be possible without buffering, because of how the strides are laid out.
Note that if an operand is given the flag NPY_ITER_COPY or NPY_ITER_UPDATEIFCOPY, a copy will be made in preference to buffering. Buffering will still occur when the array was broadcast so elements need to be duplicated to get a constant stride.
In normal buffering, the size of each inner loop is equal to the buffer size, or possibly larger if NPY_ITER_GROWINNER is specified. If NPY_ITER_REDUCE_OK is enabled and a reduction occurs, the inner loops may become smaller depending on the structure of the reduction.
-
NPY_ITER_GROWINNER#
When buffering is enabled, this allows the size of the inner loop to grow when buffering isn’t necessary. This option is best used if you’re doing a straight pass through all the data, rather than anything with small cache-friendly arrays of temporary values for each inner loop.
-
NPY_ITER_DELAY_BUFALLOC#
When buffering is enabled, this delays allocation of the buffers until NpyIter_Reset or another reset function is called. This flag exists to avoid wasteful copying of buffer data when making multiple copies of a buffered iterator for multi-threaded iteration.
Another use of this flag is for setting up reduction operations. After the iterator is created, and a reduction output is allocated automatically by the iterator (be sure to use READWRITE access), its value may be initialized to the reduction unit. Use NpyIter_GetOperandArray to get the object. Then, call NpyIter_Reset to allocate and fill the buffers with their initial values.
-
NPY_ITER_COPY_IF_OVERLAP#
If any write operand has overlap with any read operand, eliminate all overlap by making temporary copies (enabling UPDATEIFCOPY for write operands, if necessary). A pair of operands has overlap if there is a memory address that contains data common to both arrays.
Because exact overlap detection has exponential runtime in the number of dimensions, the decision is made based on heuristics, which has false positives (needless copies in unusual cases) but has no false negatives.
If any read/write overlap exists, this flag ensures the result of the operation is the same as if all operands were copied. In cases where copies would need to be made, the result of the computation may be undefined without this flag!
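To make "false positives but no false negatives" concrete, here is a deliberately simple conservative test in plain C. It is not the heuristic NumPy uses; it only compares the byte extents of two one-dimensional strided views. The view1d struct and both functions are hypothetical names introduced for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a 1-D strided view (not a NumPy type). */
typedef struct {
    const char *base;   /* address of element 0 */
    ptrdiff_t stride;   /* bytes between elements, may be negative */
    ptrdiff_t n;        /* number of elements, assumed > 0 */
    size_t itemsize;    /* bytes per element */
} view1d;

/* Compute the half-open byte range [*lo, *hi) touched by the view. */
static void byte_extent(const view1d *v, uintptr_t *lo, uintptr_t *hi)
{
    uintptr_t first = (uintptr_t)v->base;
    uintptr_t last = (uintptr_t)(v->base + (v->n - 1) * v->stride);
    *lo = first < last ? first : last;
    *hi = (first < last ? last : first) + v->itemsize;
}

/*
 * Returns 1 if the byte extents intersect. Views that truly share memory
 * always have intersecting extents, so there are no false negatives;
 * interleaved views can be reported even though they share no byte.
 */
static int may_overlap(const view1d *a, const view1d *b)
{
    uintptr_t a_lo, a_hi, b_lo, b_hi;
    byte_extent(a, &a_lo, &a_hi);
    byte_extent(b, &b_lo, &b_hi);
    return a_lo < b_hi && b_lo < a_hi;
}
```

Two views over the even and the odd bytes of one buffer are flagged by this check although they never alias, which is the kind of needless copy the text accepts in exchange for safety.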
Flags that may be passed in op_flags[i], where 0 <= i < nop:
-
NPY_ITER_READWRITE#
-
NPY_ITER_READONLY#
-
NPY_ITER_WRITEONLY#
Indicate how the user of the iterator will read or write to op[i]. Exactly one of these flags must be specified per operand. Using NPY_ITER_READWRITE or NPY_ITER_WRITEONLY for a user-provided operand may trigger WRITEBACKIFCOPY semantics. The data will be written back to the original array when NpyIter_Deallocate is called.
-
NPY_ITER_COPY#
Allow a copy of op[i] to be made if it does not meet the data type or alignment requirements as specified by the constructor flags and parameters.
-
NPY_ITER_UPDATEIFCOPY#
Triggers NPY_ITER_COPY, and when an array operand is flagged for writing and is copied, causes the data in a copy to be copied back to op[i] when NpyIter_Deallocate is called.
If the operand is flagged as write-only and a copy is needed, an uninitialized temporary array will be created and then copied back to op[i] on calling NpyIter_Deallocate, instead of doing the unnecessary copy operation.
-
NPY_ITER_NBO#
-
NPY_ITER_ALIGNED#
-
NPY_ITER_CONTIG#
Causes the iterator to provide data for op[i] that is in native byte order, aligned according to the dtype requirements, contiguous, or any combination.
By default, the iterator produces pointers into the arrays provided, which may be aligned or unaligned, and with any byte order. If copying or buffering is not enabled and the operand data doesn’t satisfy the constraints, an error will be raised.
The contiguous constraint applies only to the inner loop; successive inner loops may have arbitrary pointer changes.
If the requested data type is in non-native byte order, the NBO flag overrides it and the requested data type is converted to be in native byte order.
-
NPY_ITER_ALLOCATE#
This is for output arrays, and requires that the flag NPY_ITER_WRITEONLY or NPY_ITER_READWRITE be set. If op[i] is NULL, creates a new array with the final broadcast dimensions, and a layout matching the iteration order of the iterator.
When op[i] is NULL, the requested data type op_dtypes[i] may be NULL as well, in which case it is automatically generated from the dtypes of the arrays which are flagged as readable. The rules for generating the dtype are the same as for UFuncs. Of special note is handling of byte order in the selected dtype. If there is exactly one input, the input’s dtype is used as is. Otherwise, if more than one input dtype is combined, the output will be in native byte order.
After being allocated with this flag, the caller may retrieve the new array by calling NpyIter_GetOperandArray and getting the i-th object in the returned C array. The caller must call Py_INCREF on it to claim a reference to the array.
-
NPY_ITER_NO_SUBTYPE#
For use with NPY_ITER_ALLOCATE, this flag disables allocating an array subtype for the output, forcing it to be a straight ndarray.
TODO: Maybe it would be better to introduce a function NpyIter_GetWrappedOutput and remove this flag?
-
NPY_ITER_NO_BROADCAST#
Ensures that the input or output matches the iteration dimensions exactly.
-
NPY_ITER_ARRAYMASK#
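Indicates that this operand is the mask to use for selecting elements when writing to operands which have the NPY_ITER_WRITEMASKED flag applied to them. Only one operand may have the NPY_ITER_ARRAYMASK flag applied to it.
The data type of an operand with this flag should be either NPY_BOOL, NPY_MASK, or a struct dtype whose fields are all valid mask dtypes. In the latter case, it must match up with a struct operand being WRITEMASKED, as it is specifying a mask for each field of that array.
This flag only affects writing from the buffer back to the array. This means that if the operand is also NPY_ITER_READWRITE or NPY_ITER_WRITEONLY, code doing iteration can write to this operand to control which elements will be untouched and which ones will be modified. This is useful when the mask should be a combination of input masks.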
-
NPY_ITER_WRITEMASKED#
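This array is the mask for all writemasked operands. Code uses the writemasked flag, which indicates that only elements where the chosen ARRAYMASK operand is True will be written to. In general, the iterator does not enforce this; it is up to the code doing the iteration to follow that promise.
When the writemasked flag is used and this operand is buffered, this changes how data is copied from the buffer into the array. A masked copying routine is used, which only copies the elements in the buffer for which the corresponding element in the ARRAYMASK operand is true.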
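The effect of a masked copy from a buffer back into an array can be sketched in a few lines of plain C. This is an illustrative stand-in rather than the routine the iterator uses internally; masked_copy is a hypothetical name, the buffer is assumed to be packed, and the mask is assumed to hold one byte per element.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical masked copy (not a NumPy function): write n items of
 * itemsize bytes from a packed buffer into strided array storage,
 * leaving untouched every item whose mask byte is zero.
 */
static void masked_copy(char *dst, ptrdiff_t dst_stride,
                        const char *buf, size_t itemsize,
                        const unsigned char *mask, ptrdiff_t n)
{
    ptrdiff_t i;
    for (i = 0; i < n; ++i) {
        if (mask[i]) {
            memcpy(dst + i * dst_stride, buf + (size_t)i * itemsize,
                   itemsize);
        }
    }
}
```

Elements whose mask entry is false keep whatever value the destination array already held, which is why the iterating code can compute garbage for those positions without harm.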
-
NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE#
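In memory overlap checks, assume that operands with NPY_ITER_OVERLAP_ASSUME_ELEMENTWISE enabled are accessed only in the iterator order.
This enables the iterator to reason about data dependency, possibly avoiding unnecessary copies.
This flag has effect only if NPY_ITER_COPY_IF_OVERLAP is enabled on the iterator.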
-
NpyIter *NpyIter_AdvancedNew(npy_intp nop, PyArrayObject **op, npy_uint32 flags, NPY_ORDER order, NPY_CASTING casting, npy_uint32 *op_flags, PyArray_Descr **op_dtypes, int oa_ndim, int **op_axes, npy_intp const *itershape, npy_intp buffersize)#
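Extends NpyIter_MultiNew with several advanced options providing more control over broadcasting and buffering.
If -1/NULL values are passed to oa_ndim, op_axes, itershape, and buffersize, it is equivalent to NpyIter_MultiNew.
The parameter oa_ndim, when not zero or -1, specifies the number of dimensions that will be iterated with customized broadcasting. If it is provided, op_axes must also be provided, and itershape may be provided. The op_axes parameter lets you control in detail how the axes of the operand arrays get matched together and iterated. In op_axes, you must provide an array of nop pointers to oa_ndim-sized arrays of type int. If an entry in op_axes is NULL, normal broadcasting rules will apply. op_axes[j][i] stores either a valid axis of op[j], or -1, which means newaxis. Within each op_axes[j] array, axes may not be repeated. The following example shows how normal broadcasting applies to a 3-D array, a 2-D array, a 1-D array and a scalar.
int oa_ndim = 3;                     /* # iteration axes */
int op0_axes[] = {0, 1, 2};          /* 3-D operand */
int op1_axes[] = {-1, 0, 1};         /* 2-D operand */
int op2_axes[] = {-1, -1, 0};        /* 1-D operand */
int op3_axes[] = {-1, -1, -1};       /* 0-D (scalar) operand */
int* op_axes[] = {op0_axes, op1_axes, op2_axes, op3_axes};
Note: Before NumPy 1.8, oa_ndim == 0 was used to signal that op_axes and itershape are unused. This is deprecated and should be replaced with -1. Better backward compatibility may be achieved by using NpyIter_MultiNew for this case.
The itershape parameter allows you to force the iterator to have a specific iteration shape. It is an array of length oa_ndim. When an entry is negative, its value is determined from the operands. This parameter allows automatically allocated outputs to get additional dimensions which don’t match up with any dimension of an input.
If buffersize is zero, a default buffer size is used; otherwise it specifies how big a buffer to use. Buffers which are powers of 2, such as 4096 or 8192, are recommended.
Returns NULL if there is an error, otherwise returns the allocated iterator.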
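One way to read an op_axes row is as a recipe for the stride that an operand contributes to each iteration axis. The sketch below shows that reading in plain C under simplifying assumptions: strides_from_op_axes is a hypothetical helper, ptrdiff_t replaces npy_intp, and broadcasting of size-1 operand axes and all validation are left out.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of NumPy): derive the byte stride an
 * operand contributes to each of the oa_ndim iteration axes. An entry of
 * -1 in axes[] means newaxis, modelled here as stride 0 so the same data
 * is reused along that iteration axis.
 */
static void strides_from_op_axes(const int *axes, int oa_ndim,
                                 const ptrdiff_t *op_strides,
                                 ptrdiff_t *out_strides)
{
    int i;
    for (i = 0; i < oa_ndim; ++i) {
        out_strides[i] = (axes[i] < 0) ? 0 : op_strides[axes[i]];
    }
}
```

For a C-contiguous 4x5 array of 8-byte items (strides 40 and 8) mapped with {-1, 0, 1}, the result is {0, 40, 8}: the operand is repeated along the first iteration axis and walked normally along the other two.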
-
NpyIter *NpyIter_Copy(NpyIter *iter)#
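Makes a copy of the given iterator. This function is provided primarily to enable multi-threaded iteration of the data.
TODO: Move this to a section about multithreaded iteration.
The recommended approach to multithreaded iteration is to first create an iterator with the flags NPY_ITER_EXTERNAL_LOOP, NPY_ITER_RANGED, NPY_ITER_BUFFERED, NPY_ITER_DELAY_BUFALLOC, and possibly NPY_ITER_GROWINNER. Create a copy of this iterator for each thread (minus one for the first iterator). Then, take the iteration index range [0, NpyIter_GetIterSize(iter)) and split it up into tasks, for example using a TBB parallel_for loop. When a thread gets a task to execute, it then uses its copy of the iterator by calling NpyIter_ResetToIterIndexRange and iterating over the full range.
When using the iterator in multi-threaded code or in code not holding the Python GIL, care must be taken to only call functions which are safe in that context. NpyIter_Copy cannot be safely called without the Python GIL, because it increments Python references. The Reset* and some other functions may be safely called by passing in the errmsg parameter as non-NULL, so that the functions will pass back errors through it instead of setting a Python exception.
NpyIter_Deallocate must be called for each copy.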
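Splitting the iteration index range into tasks is ordinary integer arithmetic. The following plain C sketch shows one possible even split; task_range is a hypothetical helper, ptrdiff_t stands in for npy_intp, and it assumes itersize >= 0 and ntasks > 0. Each resulting pair would be the istart and iend passed to NpyIter_ResetToIterIndexRange on a thread's own iterator copy.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of NumPy): compute the half-open
 * sub-range [*istart, *iend) of [0, itersize) for task number `task`
 * out of `ntasks`. The first (itersize % ntasks) tasks receive one
 * extra element, so the ranges tile the whole range without gaps.
 */
static void task_range(ptrdiff_t itersize, ptrdiff_t ntasks, ptrdiff_t task,
                       ptrdiff_t *istart, ptrdiff_t *iend)
{
    ptrdiff_t base = itersize / ntasks;
    ptrdiff_t extra = itersize % ntasks;
    *istart = task * base + (task < extra ? task : extra);
    *iend = *istart + base + (task < extra ? 1 : 0);
}
```

A work-stealing scheduler would typically use many more tasks than threads; the arithmetic stays the same.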
-
int NpyIter_RemoveAxis(NpyIter *iter, int axis)#
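Removes an axis from iteration. This requires that NPY_ITER_MULTI_INDEX was set for iterator creation, and does not work if buffering is enabled or an index is being tracked. This function also resets the iterator to its initial state.
This is useful for setting up an accumulation loop, for example. The iterator can first be created with all the dimensions, including the accumulation axis, so that the output gets created correctly. Then, the accumulation axis can be removed, and the calculation done in a nested fashion.
WARNING: This function may change the internal memory layout of the iterator. Any cached functions or pointers from the iterator must be retrieved again! The iterator range will be reset as well.
Returns NPY_SUCCEED or NPY_FAIL.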
-
int NpyIter_RemoveMultiIndex(NpyIter *iter)#
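If the iterator is tracking a multi-index, this strips support for it, and does further iterator optimizations that are possible if multi-indices are not needed. This function also resets the iterator to its initial state.
WARNING: This function may change the internal memory layout of the iterator. Any cached functions or pointers from the iterator must be retrieved again!
After calling this function, NpyIter_HasMultiIndex(iter) will return false.
Returns NPY_SUCCEED or NPY_FAIL.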
-
int NpyIter_EnableExternalLoop(NpyIter *iter)#
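If NpyIter_RemoveMultiIndex was called, you may want to enable the flag NPY_ITER_EXTERNAL_LOOP. This flag is not permitted together with NPY_ITER_MULTI_INDEX, so this function is provided to enable the feature after NpyIter_RemoveMultiIndex is called. This function also resets the iterator to its initial state.
WARNING: This function changes the internal logic of the iterator. Any cached functions or pointers from the iterator must be retrieved again!
Returns NPY_SUCCEED or NPY_FAIL.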
-
NPY_ARRAYMETHOD_FLAGS NpyIter_GetTransferFlags(NpyIter *iter)#
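New in version 2.3.
Fetches the NPY_METH_RUNTIME_FLAGS, which provide information on whether buffering needs the Python GIL (NPY_METH_REQUIRES_PYAPI) or whether floating point errors may be set (NPY_METH_NO_FLOATINGPOINT_ERRORS).
Prior to NumPy 2.3, the public function available was NpyIter_IterationNeedsAPI, which is still available and additionally checks for object (or similar) dtypes, not exclusively for the needs of buffering/iteration itself. In general, this function (NpyIter_GetTransferFlags) should be preferred.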
-
int NpyIter_Reset(NpyIter *iter, char **errmsg)#
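Resets the iterator back to its initial state, at the beginning of the iteration range.
Returns NPY_SUCCEED or NPY_FAIL. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.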
-
int NpyIter_ResetToIterIndexRange(NpyIter *iter, npy_intp istart, npy_intp iend, char **errmsg)#
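Resets the iterator and restricts it to the iterindex range [istart, iend). See NpyIter_Copy for an explanation of how to use this for multi-threaded iteration. This requires that the flag NPY_ITER_RANGED was passed to the iterator constructor.
If you want to reset both the iterindex range and the base pointers at the same time, you can do the following to avoid extra buffer copying (be sure to add the return code error checks when you copy this code).
/* Set to a trivial empty range */
NpyIter_ResetToIterIndexRange(iter, 0, 0, NULL);
/* Set the base pointers */
NpyIter_ResetBasePointers(iter, baseptrs, NULL);
/* Set to the desired range */
NpyIter_ResetToIterIndexRange(iter, istart, iend, NULL);
Returns NPY_SUCCEED or NPY_FAIL. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.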
-
int NpyIter_ResetBasePointers(NpyIter *iter, char **baseptrs, char **errmsg)#
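Resets the iterator back to its initial state, but using the values in baseptrs for the data instead of the pointers from the arrays being iterated. This function is intended to be used, together with the op_axes parameter, by nested iteration code with two or more iterators.
Returns NPY_SUCCEED or NPY_FAIL. If errmsg is non-NULL, no Python exception is set when NPY_FAIL is returned. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.
TODO: Move the following into a special section on nested iterators.
Creating iterators for nested iteration requires some care. All the iterator operands must match exactly, or the calls to NpyIter_ResetBasePointers will be invalid. This means that automatic copies and output allocation should not be used haphazardly. It is still possible to use the automatic data conversion and casting features of the iterator by creating one of the iterators with all the conversion parameters enabled, then grabbing the allocated operands with the NpyIter_GetOperandArray function and passing them into the constructors for the rest of the iterators.
WARNING: When creating iterators for nested iteration, the code must not use a dimension more than once in the different iterators. If this is done, nested iteration will produce out-of-bounds pointers during iteration.
WARNING: When creating iterators for nested iteration, buffering can only be applied to the innermost iterator. If a buffered iterator is used as the source for baseptrs, it will point into a small buffer instead of the array and the inner iteration will be invalid.
The pattern for using nested iterators is as follows.
NpyIter *iter1, *iter2;
NpyIter_IterNextFunc *iternext1, *iternext2;
char **dataptrs1;
/*
 * With the exact same operands, no copies allowed, and
 * no axis in op_axes used both in iter1 and iter2.
 * Buffering may be enabled for iter2, but not for iter1.
 */
iter1 = ...; iter2 = ...;
iternext1 = NpyIter_GetIterNext(iter1, NULL);
iternext2 = NpyIter_GetIterNext(iter2, NULL);
dataptrs1 = NpyIter_GetDataPtrArray(iter1);
do {
    NpyIter_ResetBasePointers(iter2, dataptrs1, NULL);
    do {
        /* Use the iter2 values */
    } while (iternext2(iter2));
} while (iternext1(iter1));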
-
int NpyIter_GotoMultiIndex(NpyIter *iter, npy_intp const *multi_index)#
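Adjusts the iterator to point to the ndim indices pointed to by multi_index. Returns an error if a multi-index is not being tracked, the indices are out of bounds, or inner loop iteration is disabled.
Returns NPY_SUCCEED or NPY_FAIL.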
-
int NpyIter_GotoIndex(NpyIter *iter, npy_intp index)#
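Adjusts the iterator to point to the index specified. If the iterator was constructed with the flag NPY_ITER_C_INDEX, index is the C-order index, and if the iterator was constructed with the flag NPY_ITER_F_INDEX, index is the Fortran-order index. Returns an error if there is no index being tracked, the index is out of bounds, or inner loop iteration is disabled.
Returns NPY_SUCCEED or NPY_FAIL.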
-
npy_intp NpyIter_GetIterSize(NpyIter *iter)#
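Returns the number of elements being iterated. This is the product of all the dimensions in the shape. When a multi-index is being tracked (and NpyIter_RemoveAxis may be called), the size may be -1 to indicate that the iterator is too large. Such an iterator is invalid, but may become valid after NpyIter_RemoveAxis is called. It is not necessary to check for this case.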
-
void NpyIter_GetIterIndexRange(NpyIter *iter, npy_intp *istart, npy_intp *iend)#
Gets the iterindex sub-range that is being iterated. If NPY_ITER_RANGED was not specified, this always returns the range [0, NpyIter_GetIterSize(iter)).
-
int NpyIter_GotoIterIndex(NpyIter *iter, npy_intp iterindex)#
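Adjusts the iterator to point to the iterindex specified. The IterIndex is an index matching the iteration order of the iterator. Returns an error if the iterindex is out of bounds, buffering is enabled, or inner loop iteration is disabled.
Returns NPY_SUCCEED or NPY_FAIL.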
-
npy_bool NpyIter_HasDelayedBufAlloc(NpyIter *iter)#
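Returns 1 if the flag NPY_ITER_DELAY_BUFALLOC was passed to the iterator constructor, and no call to one of the Reset functions has been done yet, 0 otherwise.
-
npy_bool NpyIter_HasExternalLoop(NpyIter *iter)#
Returns 1 if the caller needs to handle the inner-most 1-dimensional loop, or 0 if the iterator handles all looping. This is controlled by the constructor flag NPY_ITER_EXTERNAL_LOOP or NpyIter_EnableExternalLoop.
-
npy_bool NpyIter_HasMultiIndex(NpyIter *iter)#
Returns 1 if the iterator was created with the NPY_ITER_MULTI_INDEX flag, 0 otherwise.
-
npy_bool NpyIter_HasIndex(NpyIter *iter)#
Returns 1 if the iterator was created with the NPY_ITER_C_INDEX or NPY_ITER_F_INDEX flag, 0 otherwise.
-
npy_bool NpyIter_IsBuffered(NpyIter *iter)#
Returns 1 if the iterator was created with the NPY_ITER_BUFFERED flag, 0 otherwise.
-
npy_bool NpyIter_IsGrowInner(NpyIter *iter)#
Returns 1 if the iterator was created with the NPY_ITER_GROWINNER flag, 0 otherwise.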
-
npy_intp *NpyIter_GetAxisStrideArray(NpyIter *iter, int axis)#
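Gets the array of strides for the specified axis. Requires that the iterator be tracking a multi-index, and that buffering not be enabled.
This may be used when you want to match up operand axes in some fashion, then remove them with NpyIter_RemoveAxis to handle their processing manually. By calling this function before removing the axes, you can get the strides for the manual processing.
Returns NULL on error.
-
int NpyIter_GetShape(NpyIter *iter, npy_intp *outshape)#
Returns the broadcast shape of the iterator in outshape. This can only be called on an iterator which is tracking a multi-index.
Returns NPY_SUCCEED or NPY_FAIL.
-
PyArray_Descr **NpyIter_GetDescrArray(NpyIter *iter)#
This gives back a pointer to the nop data type Descrs for the objects being iterated. The result points into iter, so the caller does not gain any references to the Descrs.
This pointer may be cached before the iteration loop; calling iternext will not change it.
-
PyObject **NpyIter_GetOperandArray(NpyIter *iter)#
This gives back a pointer to the nop operand PyObjects that are being iterated. The result points into iter, so the caller does not gain any references to the PyObjects.
-
PyObject *NpyIter_GetIterView(NpyIter *iter, npy_intp i)#
This gives back a reference to a new ndarray view, which is a view into the i-th object in the array NpyIter_GetOperandArray, whose dimensions and strides match the internal optimized iteration pattern. A C-order iteration of this view is equivalent to the iterator’s iteration order.
For example, if an iterator was created with a single array as its input, and it was possible to rearrange all its axes and then collapse it into a single strided iteration, this would return a view that is a one-dimensional array.
-
void NpyIter_GetReadFlags(NpyIter *iter, char *outreadflags)#
Fills nop flags. Sets outreadflags[i] to 1 if op[i] can be read from, and to 0 if not.
-
void NpyIter_GetWriteFlags(NpyIter *iter, char *outwriteflags)#
Fills nop flags. Sets outwriteflags[i] to 1 if op[i] can be written to, and to 0 if not.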
-
int NpyIter_CreateCompatibleStrides(NpyIter *iter, npy_intp itemsize, npy_intp *outstrides)#
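Builds a set of strides which are the same as the strides of an output array created using the NPY_ITER_ALLOCATE flag, where NULL was passed for op_axes. This is for data packed contiguously, but not necessarily in C or Fortran order. This should be used together with NpyIter_GetShape and NpyIter_GetNDim, with the flag NPY_ITER_MULTI_INDEX passed into the constructor.
A use case for this function is to match the shape and layout of the iterator and tack on one or more dimensions. For example, in order to generate a vector per input value for a numerical gradient, you pass in ndim*itemsize for itemsize, then add another dimension to the end with size ndim and stride itemsize. To do the Hessian matrix, you do the same thing but add two dimensions, or take advantage of the symmetry and pack it into one dimension with a particular encoding.
This function may only be called if the iterator is tracking a multi-index and if NPY_ITER_DONT_NEGATE_STRIDES was used to prevent an axis from being iterated in reverse order.
If an array is created with this method, simply adding 'itemsize' for each iteration will traverse the new array matching the iterator.
Returns NPY_SUCCEED or NPY_FAIL.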
-
npy_bool NpyIter_IsFirstVisit(NpyIter *iter, int iop)#
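Checks whether the elements of the specified reduction operand which the iterator points at are being seen for the first time. The function returns a reasonable answer for reduction operands and when buffering is disabled. The answer may be incorrect for buffered non-reduction operands.
This function is intended to be used in EXTERNAL_LOOP mode only, and will produce some wrong answers when that mode is not enabled.
If this function returns true, the caller should also check the inner loop stride of the operand, because if that stride is 0, then only the first element of the innermost external loop is being visited for the first time.
WARNING: For performance reasons, 'iop' is not bounds-checked, it is not confirmed that 'iop' is actually a reduction operand, and it is not confirmed that EXTERNAL_LOOP mode is enabled. These checks are the responsibility of the caller, and should be done outside of any inner loops.
Functions for iteration#
-
NpyIter_IterNextFunc *NpyIter_GetIterNext(NpyIter *iter, char **errmsg)#
Returns a function pointer for iteration. A specialized version of the function pointer may be calculated by this function instead of being stored in the iterator structure. Thus, to get good performance, it is required that the function pointer be saved in a variable rather than retrieved for each loop iteration.
Returns NULL if there is an error. If errmsg is non-NULL, no Python exception is set on failure. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.
The typical looping construct is as follows.
NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
char** dataptr = NpyIter_GetDataPtrArray(iter);
do {
    /* use the addresses dataptr[0], ... dataptr[nop-1] */
} while(iternext(iter));
When NPY_ITER_EXTERNAL_LOOP is specified, the typical inner loop construct is as follows.
NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
char** dataptr = NpyIter_GetDataPtrArray(iter);
npy_intp* stride = NpyIter_GetInnerStrideArray(iter);
npy_intp* size_ptr = NpyIter_GetInnerLoopSizePtr(iter), size;
npy_intp iop, nop = NpyIter_GetNOp(iter);
do {
    size = *size_ptr;
    while (size--) {
        /* use the addresses dataptr[0], ... dataptr[nop-1] */
        for (iop = 0; iop < nop; ++iop) {
            dataptr[iop] += stride[iop];
        }
    }
} while (iternext(iter));
Observe that we are using the dataptr array inside the iterator, not copying the values to a local temporary. This is possible because when iternext() is called, these pointers will be overwritten with fresh values, not incrementally updated.
If a compile-time fixed buffer is being used (both flags NPY_ITER_BUFFERED and NPY_ITER_EXTERNAL_LOOP), the inner size may be used as a signal as well. The size is guaranteed to become zero when iternext() returns false, enabling the following loop construct. Note that if you use this construct, you should not pass NPY_ITER_GROWINNER as a flag, because it will cause larger sizes under some circumstances.
/* The constructor should have buffersize passed as this value */
#define FIXED_BUFFER_SIZE 1024
NpyIter_IterNextFunc *iternext = NpyIter_GetIterNext(iter, NULL);
char **dataptr = NpyIter_GetDataPtrArray(iter);
npy_intp *stride = NpyIter_GetInnerStrideArray(iter);
npy_intp *size_ptr = NpyIter_GetInnerLoopSizePtr(iter), size;
npy_intp i, iop, nop = NpyIter_GetNOp(iter);
/* One loop with a fixed inner size */
size = *size_ptr;
while (size == FIXED_BUFFER_SIZE) {
    /*
     * This loop could be manually unrolled by a factor
     * which divides into FIXED_BUFFER_SIZE
     */
    for (i = 0; i < FIXED_BUFFER_SIZE; ++i) {
        /* use the addresses dataptr[0], ... dataptr[nop-1] */
        for (iop = 0; iop < nop; ++iop) {
            dataptr[iop] += stride[iop];
        }
    }
    iternext(iter);
    size = *size_ptr;
}
/* Finish-up loop with variable inner size */
if (size > 0) do {
    size = *size_ptr;
    while (size--) {
        /* use the addresses dataptr[0], ... dataptr[nop-1] */
        for (iop = 0; iop < nop; ++iop) {
            dataptr[iop] += stride[iop];
        }
    }
} while (iternext(iter));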
-
NpyIter_GetMultiIndexFunc *NpyIter_GetGetMultiIndex(NpyIter *iter, char **errmsg)#
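Returns a function pointer for getting the current multi-index of the iterator. Returns NULL if the iterator is not tracking a multi-index. It is recommended that this function pointer be cached in a local variable before the iteration loop.
Returns NULL if there is an error. If errmsg is non-NULL, no Python exception is set on failure. Instead, *errmsg is set to an error message. When errmsg is non-NULL, the function may be safely called without holding the Python GIL.
-
char **NpyIter_GetDataPtrArray(NpyIter *iter)#
This gives back a pointer to the nop data pointers. If NPY_ITER_EXTERNAL_LOOP was not specified, each data pointer points to the current data item of the iterator. If no inner iteration was specified, it points to the first data item of the inner loop.
This pointer may be cached before the iteration loop; calling iternext will not change it. This function may be safely called without holding the Python GIL.
-
char **NpyIter_GetInitialDataPtrArray(NpyIter *iter)#
Gets the array of data pointers directly into the arrays (never into the buffers), corresponding to iteration index 0.
These pointers are different from the pointers accepted by NpyIter_ResetBasePointers, because the direction along some axes may have been reversed.
This function may be safely called without holding the Python GIL.
-
npy_intp *NpyIter_GetIndexPtr(NpyIter *iter)#
This gives back a pointer to the index being tracked, or NULL if no index is being tracked. It is only usable if one of the flags NPY_ITER_C_INDEX or NPY_ITER_F_INDEX was specified during construction.
When the flag NPY_ITER_EXTERNAL_LOOP is used, the code needs to know the parameters for doing the inner loop. These functions provide that information.
-
npy_intp *NpyIter_GetInnerStrideArray(NpyIter *iter)#
Returns a pointer to an array of the nop strides, one for each iterated object, to be used by the inner loop.
This pointer may be cached before the iteration loop; calling iternext will not change it. This function may be safely called without holding the Python GIL.
WARNING: While the pointer may be cached, its values may change if the iterator is buffered.
-
npy_intp *NpyIter_GetInnerLoopSizePtr(NpyIter *iter)#
Returns a pointer to the number of iterations the inner loop should execute.
This address may be cached before the iteration loop; calling iternext will not change it. The value itself may change during iteration, in particular if buffering is enabled. This function may be safely called without holding the Python GIL.
-
void NpyIter_GetInnerFixedStrideArray(NpyIter *iter, npy_intp *out_strides)#
Gets an array of strides which are fixed, or will not change during the entire iteration. For strides that may change, the value NPY_MAX_INTP is placed in the stride.
Once the iterator is prepared for iteration (after a reset if NPY_ITER_DELAY_BUFALLOC was used), call this to get the strides which may be used to select a fast inner loop function. For example, if the stride is 0, that means the inner loop can always load its value into a variable once, then use the variable throughout the loop, or if the stride equals the itemsize, a contiguous version for that operand may be used.
This function may be safely called without holding the Python GIL.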
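Selecting an inner loop from a fixed stride might look like the following plain C sketch for a sum over aligned doubles. All names here (sum_loop, choose_sum_loop and the three loop functions) are hypothetical, and ptrdiff_t stands in for npy_intp. Any stride that is neither 0 nor the item size, including the "may change" marker value, falls through to the generic strided loop, which always uses the stride it is handed.

```c
#include <stddef.h>

/* Hypothetical inner-loop signature: sum n doubles spaced stride bytes. */
typedef double (*sum_loop)(const char *data, ptrdiff_t stride, ptrdiff_t n);

/* Stride 0: every element is the same value, so load it once. */
static double sum_broadcast(const char *data, ptrdiff_t stride, ptrdiff_t n)
{
    (void)stride;
    return n > 0 ? (double)n * (*(const double *)data) : 0.0;
}

/* Stride == sizeof(double): plain array indexing. */
static double sum_contig(const char *data, ptrdiff_t stride, ptrdiff_t n)
{
    const double *p = (const double *)data;
    double total = 0.0;
    ptrdiff_t i;
    (void)stride;
    for (i = 0; i < n; ++i) {
        total += p[i];
    }
    return total;
}

/* Generic fallback: advance the byte pointer by the given stride. */
static double sum_strided(const char *data, ptrdiff_t stride, ptrdiff_t n)
{
    double total = 0.0;
    while (n-- > 0) {
        total += *(const double *)data;
        data += stride;
    }
    return total;
}

/* Pick a loop once, before iterating, from the fixed stride. */
static sum_loop choose_sum_loop(ptrdiff_t fixed_stride)
{
    if (fixed_stride == 0) {
        return sum_broadcast;
    }
    if (fixed_stride == (ptrdiff_t)sizeof(double)) {
        return sum_contig;
    }
    return sum_strided;
}
```

The choice is made once outside the iteration loop, so the per-chunk cost is a single indirect call.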
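Converting from previous NumPy iterators#
The old iterator API includes functions like PyArrayIter_Check, PyArray_Iter* and PyArray_ITER_*. The multi-iterator array includes PyArray_MultiIter*, PyArray_Broadcast, and PyArray_RemoveSmallest. The new iterator design replaces all of this functionality with a single object and associated API. One goal of the new API is that all uses of the existing iterator should be replaceable with the new iterator without significant effort. In 1.6, the major exception to this is the neighborhood iterator, which does not have corresponding features in this iterator.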
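Here is a conversion table for which functions to use with the new iterator:
Iterator Functions |
|
|
|
NOT SUPPORTED (Use the support for multiple operands instead.) |
|
Will need to add this in Python exposure |
|
From |
|
Return value of iternext function pointer |
|
Multi-iterator Functions |
|
From |
|
NOT SUPPORTED (always lock-step iteration) |
|
Return value of iternext function pointer |
|
Handled by |
|
Iterator flag |
|
Other Functions |
|
Iterator flag |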