_compute_indices_min_size_weights Class — pytorch Architecture

Architecture documentation for the _compute_indices_min_size_weights function template in UpSampleKernel.cpp from the PyTorch codebase.

Source Code

aten/src/ATen/native/cpu/UpSampleKernel.cpp lines 791–845

  template <typename scalar_t, typename aa_filter_fn_t>
  static inline scalar_t _compute_indices_min_size_weights(
    const int64_t i, const int64_t input_size, const scalar_t scale,
    scalar_t* wt_ptr, const int64_t max_interp_size, aa_filter_fn_t filter_fn,
    bool align_corners, int64_t& index_min, int64_t& index_size
  ) {
    // Notes. We do not use opmath_t in this method as f16 and other smaller float types are not routed here.
    // Typical usage of this method is with scalar_t = double when computing indices and weights for uint8 input
    // The code below partly adapts indices and lambda computation from compute_indices_weights method and
    // index_min/index_size from _compute_indices_min_size_weights_aa

    bool cubic = max_interp_size > 2;
    const auto real_input_index = area_pixel_compute_source_index<scalar_t>(
        scale, i, align_corners, /*cubic=*/cubic);

    scalar_t lambda;
    int64_t input_index = 0;
    guard_index_and_lambda(real_input_index, input_size, input_index, lambda);

    const auto support = static_cast<int64_t>(max_interp_size * 0.5);
    const auto unbound_index_min = input_index - support + 1;
    const auto unbound_index_max = input_index + support + 1;
    index_min = std::max(unbound_index_min, static_cast<int64_t>(0));
    index_size = std::min(unbound_index_max, input_size) - index_min;
    // In rare cases, floating-point precision can make index_size exceed max_interp_size by one,
    // so we clamp the value.
    index_size = std::clamp(index_size, static_cast<int64_t>(0), max_interp_size);

    // Below, the weights are computed with filter_fn, accumulating the values whose indices fall out of bounds.
    // For example, in bicubic mode for output index i = 0, we have input_index = -1,
    // so unbound_index_min = -2 and unbound_index_max = 2 (exclusive) => unbounded input indices are [-2, -1, 0, 1]
    // and the valid input indices are [0, 1].
    // For the unbounded input indices we compute four non-zero weight values [w0, w1, w2, w3]. As only two weights can
    // be used with the valid input indices, we accumulate the values as follows: [w0 + w1 + w2, w3, 0.0, 0.0].
    // This is equivalent to the float path, which would compute indices as [0, 0, 0, 1] and weights as [w0, w1, w2, w3].
    // A similar accumulation is done for unbounded indices larger than the input size.
    auto w_index = 0;
    scalar_t wt_max = 0.0;
    for (const auto j : c10::irange(max_interp_size)) {
      // initialize weights value as we will accumulate below
      wt_ptr[j] = 0.0;

      scalar_t w = filter_fn(static_cast<scalar_t>(j + 1 - support) - lambda);
      if (unbound_index_min + j <= 0) {
        w_index = 0;
      } else if (unbound_index_min + j >= input_size - 1) {
        w_index = index_size - 1;
      }
      wt_ptr[w_index] += w;
      wt_max = std::max(wt_max, wt_ptr[w_index]);
      w_index++;
    }

    return wt_max;
  }
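
The boundary accumulation described in the comments can be tried outside of ATen with a small standalone sketch. This is not the PyTorch implementation: `cubic_filter`, `Window`, and `compute_window` are hypothetical names introduced here, the sketch assumes `align_corners = false` and a cubic window, and the kernel coefficient `a = -0.75` is an assumption made for illustration. The source-index and lambda computations are inlined approximations of what `area_pixel_compute_source_index` and `guard_index_and_lambda` appear to do.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the filter_fn argument: a cubic convolution kernel.
// The coefficient a = -0.75 is an assumption made for this sketch.
inline double cubic_filter(double x) {
  constexpr double a = -0.75;
  x = std::abs(x);
  if (x < 1.0) return ((a + 2.0) * x - (a + 3.0)) * x * x + 1.0;
  if (x < 2.0) return (((x - 5.0) * x + 8.0) * x - 4.0) * a;
  return 0.0;
}

// Hypothetical result type: the clipped input window and its weights.
struct Window {
  int64_t index_min;
  int64_t index_size;
  std::vector<double> weights;  // length max_interp_size, unused tail stays 0
};

// Simplified sketch of the index/weight computation (align_corners = false only).
inline Window compute_window(int64_t i, int64_t input_size, double scale,
                             int64_t max_interp_size = 4) {
  assert(input_size > 0 && max_interp_size > 0);

  // Source position of output pixel i, then its integer part and fraction.
  const double real_index = scale * (static_cast<double>(i) + 0.5) - 0.5;
  const int64_t input_index =
      std::min(static_cast<int64_t>(std::floor(real_index)), input_size - 1);
  const double lambda =
      std::clamp(real_index - static_cast<double>(input_index), 0.0, 1.0);

  const int64_t support = max_interp_size / 2;
  const int64_t unbound_min = input_index - support + 1;
  const int64_t unbound_max = input_index + support + 1;  // exclusive

  Window out;
  out.index_min = std::max<int64_t>(unbound_min, 0);
  out.index_size = std::clamp<int64_t>(
      std::min(unbound_max, input_size) - out.index_min, 0, max_interp_size);
  out.weights.assign(static_cast<size_t>(max_interp_size), 0.0);

  int64_t w_index = 0;
  for (int64_t j = 0; j < max_interp_size; ++j) {
    const double w = cubic_filter(static_cast<double>(j + 1 - support) - lambda);
    if (unbound_min + j <= 0) {
      w_index = 0;  // taps left of the image fold into the first valid slot
    } else if (unbound_min + j >= input_size - 1) {
      w_index = out.index_size - 1;  // taps right of the image fold into the last slot
    }
    out.weights[static_cast<size_t>(w_index)] += w;
    ++w_index;
  }
  return out;
}
```

Under these assumptions, `compute_window(0, 8, 0.5)` reproduces the case from the comments: `index_min` is 0, `index_size` is 2, and the weights come out as roughly `[1.105, -0.105, 0, 0]`, so the three left-hand taps are folded into the first slot while the total still sums to 1.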
