binary_kernel_reduce_lastdim Class — pytorch Architecture

Architecture documentation for the binary_kernel_reduce_lastdim class in Reduce.h from the pytorch codebase.

Source Code

aten/src/ATen/native/cpu/Reduce.h lines 290–308

template <typename reduce_func_t>
void binary_kernel_reduce_lastdim(TensorIteratorBase& iter, reduce_func_t reduce_op) {
  auto shape = iter.shape();
  int64_t dim_size = shape[0];
  int64_t grain_size = std::max((int64_t) 1, at::internal::GRAIN_SIZE / dim_size);
  TensorIterator sub_iter(iter);
  // create sub iterator to parallel on all non-reduce-dims
  sub_iter.narrow(0, 0, 1);
  auto loop = [&](char** data, const int64_t* strides, int64_t size) {
    char* out = data[0];
    char* in = data[1];
    for (int64_t i = 0; i < size; ++i) {
      reduce_op(out, in, dim_size);
      out += strides[0];
      in += strides[1];
    }
  };
  sub_iter.for_each(loop, grain_size);
}
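
The function above narrows a copy of the iterator to one element along dimension 0 (the reduced dimension), then calls reduce_op once per remaining output element, advancing raw byte pointers by the iterator's strides. A minimal standalone sketch of that pattern, without ATen, might look like the following. The names kGrainSize, rows_per_task, reduce_rows, sum_row, and sum_last_dim are hypothetical and used only for illustration. The sketch runs serially (the chunking that for_each performs with grain_size is not reproduced) and assumes each input row is contiguous in memory, which the source does not state explicitly.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for at::internal::GRAIN_SIZE; the value is an
// assumption chosen for illustration only.
constexpr int64_t kGrainSize = 32768;

// Mirrors the grain-size computation above: the longer each reduced row is,
// the fewer rows are grouped into one unit of work (never fewer than 1).
inline int64_t rows_per_task(int64_t dim_size) {
  assert(dim_size > 0);
  return std::max<int64_t>(1, kGrainSize / dim_size);
}

// Mirrors the inner loop above: data[0] is the output pointer, data[1] the
// input pointer, and strides are in bytes. reduce_op is called once per row.
template <typename reduce_func_t>
void reduce_rows(char** data, const int64_t* strides, int64_t size,
                 int64_t dim_size, reduce_func_t reduce_op) {
  char* out = data[0];
  char* in = data[1];
  for (int64_t i = 0; i < size; ++i) {
    reduce_op(out, in, dim_size);
    out += strides[0];
    in += strides[1];
  }
}

// Example reduce_op: sums n contiguous floats starting at `in` and writes
// the result to `out`.
inline void sum_row(char* out, char* in, int64_t n) {
  const float* src = reinterpret_cast<const float*>(in);
  float acc = 0.0f;
  for (int64_t j = 0; j < n; ++j) {
    acc += src[j];
  }
  *reinterpret_cast<float*>(out) = acc;
}

// Sums each row of a row-major rows x cols matrix, using byte strides the
// same way the loop above does.
inline std::vector<float> sum_last_dim(std::vector<float> input,
                                       int64_t rows, int64_t cols) {
  std::vector<float> output(static_cast<size_t>(rows), 0.0f);
  char* data[2] = {reinterpret_cast<char*>(output.data()),
                   reinterpret_cast<char*>(input.data())};
  const int64_t elem = static_cast<int64_t>(sizeof(float));
  const int64_t strides[2] = {elem, cols * elem};
  reduce_rows(data, strides, rows, cols, sum_row);
  return output;
}
```

Calling sum_last_dim on a 2 x 3 matrix invokes sum_row twice, once per output element, with the input pointer stepping forward by one full row each time. Dividing the grain size by dim_size keeps the amount of work per task roughly constant regardless of row length.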
