vectorized_outer_reduction Function - pytorch Architecture
Architecture documentation for the vectorized_outer_reduction function template in Reduce.h from the pytorch codebase.
Source Code
aten/src/ATen/native/cpu/Reduce.h lines 94–113
template <typename func_t, typename vec_func_t>
inline void vectorized_outer_reduction(char** data, int64_t inner_stride, int64_t size0, int64_t size1, func_t op, vec_func_t vop) {
  VEC_LOOP_HEADER(func_t, data)

  // reduce down each column of 4 * Vec::size() elements.
  constexpr int64_t vector_stride = 4 * Vec::size() * sizeof(scalar_t);
  int64_t outer_stride[2] = { vector_stride, vector_stride };
  UNARY_OUTER_LOOP(data, outer_stride, size1 / (4 * Vec::size()), [&] {
    vectorized_reduction(data, size0, inner_stride, op, vop, /*reduce=*/false);
  });

  // reduce down the remaining columns
  int64_t step[] = { sizeof(scalar_t), sizeof(scalar_t) };
  int64_t remaining = size1 % (4 * Vec::size());
  UNARY_OUTER_LOOP(data, step, remaining, [&] {
    char* ptrs[3] = { data[0], data[0], data[1] };
    int64_t strides[] = { 0, 0, inner_stride };
    basic_loop(ptrs, strides, 0, size0, op);
  });
}
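The two-phase structure above (wide column blocks first, then a scalar pass over the leftover columns) can be illustrated with a simplified standalone sketch. This is not the ATen implementation: the name outer_reduce_sketch, the constant kBlock (standing in for 4 * Vec::size()), the contiguous row-major layout, and the choice to accumulate into a pre-initialized output buffer are all assumptions made for illustration. Plain loops replace the Vec type and the UNARY_OUTER_LOOP macro.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for 4 * Vec::size(); the real value depends on the
// scalar type and the SIMD width of the build.
constexpr int64_t kBlock = 16;

// Reduce a row-major size0 x size1 matrix down each column with `op`,
// folding the result into out[0 .. size1). `out` is assumed to already hold
// an initial value (for example, the identity of `op`).
template <typename T, typename Op>
void outer_reduce_sketch(const T* in, T* out, int64_t size0, int64_t size1, Op op) {
  assert(size0 >= 0 && size1 >= 0);
  const int64_t full_blocks = size1 / kBlock;

  // Phase 1: handle kBlock adjacent columns at a time, keeping one
  // accumulator per column (the role the Vec accumulators play in the
  // vectorized path).
  for (int64_t b = 0; b < full_blocks; ++b) {
    const int64_t col0 = b * kBlock;
    T acc[kBlock];
    for (int64_t j = 0; j < kBlock; ++j) acc[j] = out[col0 + j];
    for (int64_t i = 0; i < size0; ++i) {
      const T* row = in + i * size1 + col0;
      for (int64_t j = 0; j < kBlock; ++j) acc[j] = op(acc[j], row[j]);
    }
    for (int64_t j = 0; j < kBlock; ++j) out[col0 + j] = acc[j];
  }

  // Phase 2: the size1 % kBlock leftover columns, one column at a time.
  for (int64_t c = full_blocks * kBlock; c < size1; ++c) {
    T acc = out[c];
    for (int64_t i = 0; i < size0; ++i) acc = op(acc, in[i * size1 + c]);
    out[c] = acc;
  }
}
```

With a 3 x 19 input and kBlock = 16, the first phase covers columns 0 to 15 as a single block and the second phase covers columns 16 to 18 individually, mirroring the size1 / (4 * Vec::size()) and size1 % (4 * Vec::size()) split in the source.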