scalar_outer_sum Class — pytorch Architecture

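Architecture documentation for the scalar_outer_sum function template in SumKernel.cpp from the PyTorch codebase. For each of the size1 output elements it reduces size0 input elements: a main loop handles four outputs per iteration through multi_row_sum, and a tail loop finishes any remaining outputs one at a time through row_sum.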

Source Code

aten/src/ATen/native/cpu/SumKernel.cpp lines 513–533

template <typename acc_t, typename LoadPolicy, typename StorePolicy>
void scalar_outer_sum(
    // NOLINTNEXTLINE(modernize-avoid-c-arrays,cppcoreguidelines-avoid-c-arrays)
    char * C10_RESTRICT data[2], int64_t in_strides[2], int64_t out_stride,
    int64_t size0, int64_t size1) {
  constexpr int64_t nrows = 4;
  int64_t j = 0;
  for (; j + (nrows - 1) < size1; j += nrows) {
    const auto *row_in = data[1] + j * in_strides[1];
    auto sums = multi_row_sum<acc_t, nrows, LoadPolicy>(
        row_in, in_strides[0], in_strides[1], size0);
    store<StorePolicy>(data[0], out_stride, j, sums);
  }

  for (; j < size1; ++j) {
    const auto *row_in = data[1] + j * in_strides[1];
    auto ans = row_sum<acc_t, LoadPolicy>(
        row_in, in_strides[0], size0);
    store<StorePolicy>(data[0], out_stride, j, ans);
  }
}
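
The blocked-loop structure above can be illustrated with a simplified, standalone sketch. It is not the PyTorch implementation. It assumes float input, double accumulation, and strides counted in elements rather than bytes, and it replaces the LoadPolicy/StorePolicy machinery with direct indexing. The names row_sum_sketch, multi_row_sum_sketch, and outer_sum_sketch are hypothetical stand-ins, and the real multi_row_sum and row_sum may accumulate differently.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for row_sum: reduce size0 elements spaced stride0 apart.
inline double row_sum_sketch(const float* in, std::int64_t stride0,
                             std::int64_t size0) {
  double acc = 0.0;
  for (std::int64_t i = 0; i < size0; ++i) {
    acc += in[i * stride0];
  }
  return acc;
}

// Hypothetical stand-in for multi_row_sum: NRows independent accumulators,
// one per adjacent output position (positions are stride1 elements apart).
template <std::size_t NRows>
std::array<double, NRows> multi_row_sum_sketch(const float* in,
                                               std::int64_t stride0,
                                               std::int64_t stride1,
                                               std::int64_t size0) {
  std::array<double, NRows> acc{};  // value-initialized to zeros
  for (std::int64_t i = 0; i < size0; ++i) {
    for (std::size_t k = 0; k < NRows; ++k) {
      acc[k] += in[i * stride0 + static_cast<std::int64_t>(k) * stride1];
    }
  }
  return acc;
}

// Same loop shape as scalar_outer_sum: blocks of four, then a scalar tail.
inline void outer_sum_sketch(float* out, const float* in,
                             std::int64_t stride0, std::int64_t stride1,
                             std::int64_t out_stride,
                             std::int64_t size0, std::int64_t size1) {
  assert(size0 >= 0 && size1 >= 0);
  constexpr std::int64_t nrows = 4;
  std::int64_t j = 0;
  // Main loop: runs only while a full block of nrows outputs remains.
  for (; j + (nrows - 1) < size1; j += nrows) {
    const auto sums = multi_row_sum_sketch<4>(in + j * stride1, stride0,
                                              stride1, size0);
    for (std::int64_t k = 0; k < nrows; ++k) {
      out[(j + k) * out_stride] =
          static_cast<float>(sums[static_cast<std::size_t>(k)]);
    }
  }
  // Tail loop: at most nrows - 1 leftover outputs, one at a time.
  for (; j < size1; ++j) {
    out[j * out_stride] =
        static_cast<float>(row_sum_sketch(in + j * stride1, stride0, size0));
  }
}
```

For a row-major 3 x 6 input reduced over its rows, a call would pass stride0 = 6, stride1 = 1, size0 = 3, and size1 = 6. The main loop then covers outputs 0 to 3 and the tail loop covers outputs 4 and 5. Keeping several independent accumulators per iteration is a common way to give the compiler and CPU more independent additions to overlap, which is presumably the motivation for the four-wide main loop in the original.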
