reduce_all_impl_two_outputs Function Template - pytorch Architecture
Architecture documentation for the reduce_all_impl_two_outputs function template in ReduceAllOpsKernel.cpp from the pytorch codebase.
Source Code
aten/src/ATen/native/cpu/ReduceAllOpsKernel.cpp lines 116–139
template <typename scalar_t, typename func_t1, typename func_t2>
inline void reduce_all_impl_two_outputs(
    Tensor& output1,
    Tensor& output2,
    const Tensor& input,
    const std::pair<scalar_t, scalar_t>& ident_v,
    func_t1 reduce_chunk_func,
    func_t2 reduce_acc_func) {
  using scalar_t_pair = std::pair<scalar_t, scalar_t>;
  const int64_t input_numel = input.numel();
  auto input_data = input.const_data_ptr<scalar_t>();
  scalar_t_pair result = at::parallel_reduce(0, input_numel, internal::GRAIN_SIZE, ident_v,
    [&](int64_t start, int64_t end, const scalar_t_pair& ident) -> scalar_t_pair {
      scalar_t_pair partial_out(ident);
      for (const auto i : c10::irange(start, end)) {
        partial_out = reduce_chunk_func(partial_out, input_data[i]);
      }
      return partial_out;
    },
    reduce_acc_func
  );
  output1.fill_(result.first);
  output2.fill_(result.second);
}
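The listing above folds every element of the input into a pair of running values, chunk by chunk, and then merges the per-chunk pairs into one result that fills the two output tensors. The following is a minimal standalone sketch of that pattern, not the PyTorch implementation. It assumes a min/max reduction as the example use case, replaces `at::parallel_reduce` with a hypothetical sequential helper named `chunked_reduce`, and uses a `std::vector` in place of a `Tensor`. The names `chunked_reduce`, `min_max_all`, `fold_element`, and `merge` are illustrative and do not appear in the source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical stand-in for at::parallel_reduce: walks [begin, end) in chunks
// of at most grain_size elements (sequentially here, not in parallel) and
// merges each chunk's partial result into the accumulator with `combine`.
template <typename T, typename ChunkFn, typename CombineFn>
T chunked_reduce(std::size_t begin, std::size_t end, std::size_t grain_size,
                 const T& ident, ChunkFn reduce_chunk, CombineFn combine) {
  assert(grain_size > 0);
  T acc = ident;
  for (std::size_t start = begin; start < end; start += grain_size) {
    const std::size_t stop = std::min(end, start + grain_size);
    acc = combine(acc, reduce_chunk(start, stop, ident));
  }
  return acc;
}

// Hypothetical helper: computes (min, max) of `values` in a single pass,
// mirroring the two-output structure of the listing above.
template <typename T>
std::pair<T, T> min_max_all(const std::vector<T>& values,
                            std::size_t grain_size) {
  using Pair = std::pair<T, T>;
  // Identity pair: any real element replaces both halves.
  const Pair ident(std::numeric_limits<T>::max(),
                   std::numeric_limits<T>::lowest());
  // Plays the role of reduce_chunk_func: fold one element into a partial pair.
  auto fold_element = [](const Pair& a, T x) {
    return Pair(std::min(a.first, x), std::max(a.second, x));
  };
  // Plays the role of reduce_acc_func: merge two partial pairs.
  auto merge = [](const Pair& a, const Pair& b) {
    return Pair(std::min(a.first, b.first), std::max(a.second, b.second));
  };
  return chunked_reduce(
      std::size_t{0}, values.size(), grain_size, ident,
      [&](std::size_t start, std::size_t stop, const Pair& id) {
        Pair partial(id);
        for (std::size_t i = start; i < stop; ++i) {
          partial = fold_element(partial, values[i]);
        }
        return partial;
      },
      merge);
}
```

With the values 3, -1, 7, 2 and a grain size of 2, the two chunks produce the partial pairs (-1, 3) and (2, 7), which merge to (-1, 7). The result should not depend on the grain size as long as the merge function is associative and the identity pair is neutral for it, which is the property a chunked reduction of this shape relies on. For an empty input, this sketch returns the identity pair unchanged.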