scalar_outer_sum Function - pytorch Architecture
Architecture documentation for the scalar_outer_sum function template in SumKernel.cpp from the pytorch codebase.
Source Code
aten/src/ATen/native/cpu/SumKernel.cpp lines 513–533
template <typename acc_t, typename LoadPolicy, typename StorePolicy>
void scalar_outer_sum(
    // NOLINTNEXTLINE(modernize-avoid-c-arrays,cppcoreguidelines-avoid-c-arrays)
    char * C10_RESTRICT data[2], int64_t in_strides[2], int64_t out_stride,
    int64_t size0, int64_t size1) {
  constexpr int64_t nrows = 4;
  int64_t j = 0;
  for (; j + (nrows - 1) < size1; j += nrows) {
    const auto *row_in = data[1] + j * in_strides[1];
    auto sums = multi_row_sum<acc_t, nrows, LoadPolicy>(
        row_in, in_strides[0], in_strides[1], size0);
    store<StorePolicy>(data[0], out_stride, j, sums);
  }

  for (; j < size1; ++j) {
    const auto *row_in = data[1] + j * in_strides[1];
    auto ans = row_sum<acc_t, LoadPolicy>(
        row_in, in_strides[0], size0);
    store<StorePolicy>(data[0], out_stride, j, ans);
  }
}
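The listing above walks the outer dimension (size1) in blocks of four, summing each block over the inner dimension (size0), and then finishes any leftover positions one at a time. The following is a minimal standalone sketch of that blocking pattern, not the PyTorch implementation. It assumes plain float input indexed by element strides instead of byte strides, accumulates in double, and replaces multi_row_sum, row_sum, LoadPolicy, and StorePolicy with inline loops. The name outer_sum_sketch and its signature are hypothetical.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical simplification of the blocking pattern in scalar_outer_sum.
// Element i of outer position j is read from in[i * stride0 + j * stride1];
// the result holds one sum per outer position.
std::vector<double> outer_sum_sketch(const std::vector<float>& in,
                                     int64_t stride0, int64_t stride1,
                                     int64_t size0, int64_t size1) {
  constexpr int64_t nrows = 4;
  std::vector<double> out(static_cast<std::size_t>(size1), 0.0);
  int64_t j = 0;

  // Main loop: accumulate nrows outer positions in one pass over size0.
  for (; j + (nrows - 1) < size1; j += nrows) {
    double sums[nrows] = {};
    for (int64_t i = 0; i < size0; ++i) {
      for (int64_t k = 0; k < nrows; ++k) {
        sums[k] += in[i * stride0 + (j + k) * stride1];
      }
    }
    for (int64_t k = 0; k < nrows; ++k) {
      out[j + k] = sums[k];
    }
  }

  // Tail loop: the remaining (size1 % nrows) positions, one at a time.
  for (; j < size1; ++j) {
    double ans = 0.0;
    for (int64_t i = 0; i < size0; ++i) {
      ans += in[i * stride0 + j * stride1];
    }
    out[j] = ans;
  }
  return out;
}
```

For a 2 x 6 row-major input reduced over its rows, a call such as outer_sum_sketch(in, 6, 1, 2, 6) sends columns 0 to 3 through the blocked loop and columns 4 and 5 through the tail loop.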