reduce_all_impl_vec_two_outputs Class — pytorch Architecture

Architecture documentation for the reduce_all_impl_vec_two_outputs class in ReduceAllOpsKernel.cpp from the pytorch codebase.

Class cpp

Entity Profile

Source Code

aten/src/ATen/native/cpu/ReduceAllOpsKernel.cpp lines 141–168

template <typename scalar_t, typename func_t, typename vec_func_t1, typename vec_func_t2>
inline void reduce_all_impl_vec_two_outputs(
    Tensor& output1,
    Tensor& output2,
    const Tensor& input,
    const std::pair<scalar_t, scalar_t>& ident_v,
    func_t reduce_acc_func,
    vec_func_t1 reduce_chunk_func1,
    vec_func_t2 reduce_chunk_func2) {
  using Vec = Vectorized<opmath_type<scalar_t>>;
  using scalar_t_pair = std::pair<scalar_t, scalar_t>;
  const int64_t input_numel = input.numel();
  auto input_data = input.const_data_ptr<scalar_t>();
  // NOTE: parallel_reduce not support bool type
  std::pair<scalar_t, scalar_t> result = at::parallel_reduce(0, input_numel, internal::GRAIN_SIZE, ident_v,
    [&](int64_t start, int64_t end, const scalar_t_pair& /* ident */) -> scalar_t_pair {
    scalar_t_pair partial_out = vec::reduce2_all<scalar_t>(
        [=](Vec x, Vec y) { return reduce_chunk_func1(x, y); },
        [=](Vec x, Vec y) { return reduce_chunk_func2(x, y); },
        input_data + start,
        end - start);
      return partial_out;
    },
    reduce_acc_func
  );
  output1.fill_(result.first);
  output2.fill_(result.second);
}

Source

View on GitHub

Analyze Your Own Codebase

Get architecture documentation, dependency graphs, and domain analysis for your codebase in minutes.

Try Supermodel Free