load_reduce_vec Class — pytorch Architecture
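Architecture documentation for load_reduce_vec in SumKernel.cpp from the pytorch codebase. Although this page labels it a class, load_reduce_vec is a function template: it loads one Vectorized<scalar_t> from memory and reduces neighboring elements into a Vectorized<acc_t> that has the same number of lanes or fewer.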
Source Code
aten/src/ATen/native/cpu/SumKernel.cpp lines 18–37
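// Loads one vector of scalar_t and folds each group of vstride neighboring
// elements into a single lane of a Vectorized<acc_t>.
template <typename acc_t, typename scalar_t, typename F>
Vectorized<acc_t> load_reduce_vec(const scalar_t* data, F reduce, acc_t ident) {
  using vec_t = Vectorized<scalar_t>;
  using vacc_t = Vectorized<acc_t>;
  // The accumulator vector must not have more lanes than the input vector.
  static_assert(vacc_t::size() <= vec_t::size());
  const auto val = vec_t::loadu(data);
  // Spill the loaded vector to an aligned buffer so lanes can be read as scalars.
  alignas(64) std::array<scalar_t, vec_t::size()> values;
  val.store(values.data());

  // Number of input elements reduced into each accumulator lane.
  constexpr int vstride = vec_t::size() / vacc_t::size();
  alignas(64) std::array<acc_t, vacc_t::size()> acc;
  acc.fill(ident);
  for (const auto k : c10::irange(vstride)) {
    for (const auto i : c10::irange(vacc_t::size())) {
      // Lane i accumulates values[i * vstride] .. values[i * vstride + vstride - 1].
      acc[i] = reduce(acc[i], values[i * vstride + k]);
    }
  }

  return vacc_t::loadu(acc.data());
}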
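The indexing in the loop above can be easier to follow with concrete numbers. The following is a minimal sketch, not the ATen implementation: it swaps Vectorized<T> for plain std::array so that it compiles without PyTorch. The names load_reduce_array and sum_neighbor_pairs are hypothetical, and the template parameters N_ACC and N_IN stand in for vacc_t::size() and vec_t::size(). The extra divisibility check is an assumption of the sketch and does not appear in the original.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for load_reduce_vec: std::array replaces Vectorized<T>.
// N_ACC plays the role of vacc_t::size(), N_IN the role of vec_t::size().
template <typename acc_t, std::size_t N_ACC, std::size_t N_IN,
          typename scalar_t, typename F>
std::array<acc_t, N_ACC> load_reduce_array(const scalar_t* data, F reduce,
                                           acc_t ident) {
  static_assert(N_ACC <= N_IN, "accumulator must not have more lanes than input");
  // Assumption of this sketch only: the lane counts divide evenly.
  static_assert(N_IN % N_ACC == 0, "lane counts must divide evenly");
  assert(data != nullptr);

  // Number of input elements folded into each accumulator lane.
  constexpr std::size_t vstride = N_IN / N_ACC;

  std::array<acc_t, N_ACC> acc;
  acc.fill(ident);
  for (std::size_t k = 0; k < vstride; ++k) {
    for (std::size_t i = 0; i < N_ACC; ++i) {
      // Same indexing as the original: lane i consumes a contiguous group
      // of vstride neighboring inputs starting at i * vstride.
      acc[i] = reduce(acc[i], data[i * vstride + k]);
    }
  }
  return acc;
}

// Sums 8 narrow inputs into 4 wider accumulators, so vstride == 2 and each
// output lane holds the sum of one adjacent pair.
inline std::array<int, 4> sum_neighbor_pairs() {
  const std::array<short, 8> in = {1, 2, 3, 4, 5, 6, 7, 8};
  return load_reduce_array<int, 4, 8>(
      in.data(), [](int a, short b) { return a + b; }, 0);
}
```

With these inputs, sum_neighbor_pairs() returns {3, 7, 11, 15}: lane 0 holds 1 + 2, lane 1 holds 3 + 4, and so on. When the two lane counts are equal, vstride is 1 and the function only converts each element to the accumulator type through reduce(ident, value).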