row_sum Function — pytorch Architecture
Architecture documentation for the row_sum function template in SumKernel.cpp from the pytorch codebase.
Source Code
aten/src/ATen/native/cpu/SumKernel.cpp lines 412–431
template <typename scalar_t, typename LoadPolicy>
scalar_t row_sum(const char * C10_RESTRICT in_data,
                 const int64_t in_stride, const int64_t size) {
  constexpr int64_t ilp_factor = 4;

  // Interpret row as a (-1, ilp_factor) shaped array to find partial sums
  const int64_t size_ilp = size / ilp_factor;
  auto partial_sums = multi_row_sum<scalar_t, ilp_factor, LoadPolicy>(
      in_data, in_stride * ilp_factor, in_stride, size_ilp);

  // Add the tail elements that do not fill a complete group of ilp_factor
  for (int64_t i = size_ilp * ilp_factor; i < size; ++i) {
    partial_sums[0] += LoadPolicy::load(in_data, in_stride, i);
  }

  // Combine the ilp_factor partial sums into a single result
  for (const auto k : c10::irange(1, ilp_factor)) {
    partial_sums[0] += partial_sums[k];
  }

  return partial_sums[0];
}
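The listing above splits a row into groups of ilp_factor elements, keeps one independent accumulator per position within a group, adds any leftover tail elements, and then folds the accumulators together. The following is a minimal standalone sketch of that pattern, not the PyTorch implementation. The names StridedLoad, simple_multi_row_sum, and row_sum_sketch are hypothetical stand-ins introduced here for illustration, and simple_multi_row_sum is assumed to be a plain accumulation loop, whereas the real multi_row_sum may accumulate differently.

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for LoadPolicy: reads element `index` from a byte
// buffer whose consecutive elements are `stride` bytes apart.
template <typename scalar_t>
struct StridedLoad {
  static scalar_t load(const char* data, int64_t stride, int64_t index) {
    scalar_t value;
    std::memcpy(&value, data + index * stride, sizeof(scalar_t));
    return value;
  }
};

// Simplified stand-in for multi_row_sum (assumption: plain accumulation).
// Treats the input as `size` rows of `nrows` elements and returns one
// running sum per column, so the nrows additions in each iteration are
// independent of one another.
template <typename scalar_t, int64_t nrows, typename LoadPolicy>
std::array<scalar_t, nrows> simple_multi_row_sum(
    const char* in_data, int64_t row_stride, int64_t col_stride,
    int64_t size) {
  std::array<scalar_t, nrows> acc{};  // value-initialized to zero
  for (int64_t i = 0; i < size; ++i) {
    const char* row = in_data + i * row_stride;
    for (int64_t k = 0; k < nrows; ++k) {
      acc[k] += LoadPolicy::load(row, col_stride, k);
    }
  }
  return acc;
}

// Sketch of the row_sum structure: grouped partial sums, tail, final fold.
template <typename scalar_t, typename LoadPolicy = StridedLoad<scalar_t>>
scalar_t row_sum_sketch(const char* in_data, int64_t in_stride,
                        int64_t size) {
  constexpr int64_t ilp_factor = 4;
  const int64_t size_ilp = size / ilp_factor;  // number of complete groups

  auto partial_sums = simple_multi_row_sum<scalar_t, ilp_factor, LoadPolicy>(
      in_data, in_stride * ilp_factor, in_stride, size_ilp);

  // Tail: elements past the last complete group.
  for (int64_t i = size_ilp * ilp_factor; i < size; ++i) {
    partial_sums[0] += LoadPolicy::load(in_data, in_stride, i);
  }

  // Fold the remaining accumulators into the first one.
  for (int64_t k = 1; k < ilp_factor; ++k) {
    partial_sums[0] += partial_sums[k];
  }
  return partial_sums[0];
}
```

To use the sketch, pass a byte pointer to the first element, the distance between consecutive elements in bytes, and the element count, for example row_sum_sketch<int>(reinterpret_cast<const char*>(values), sizeof(int), 10) for a contiguous array of ten ints. A larger stride skips elements, so a stride of 2 * sizeof(int) sums every other value. For floating-point types the grouping changes the order of additions, so the result may differ slightly from a strict left-to-right sum.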