_compute_indices_min_size_weights Function - PyTorch Architecture
Architecture documentation for the _compute_indices_min_size_weights function template in UpSampleKernel.cpp from the PyTorch codebase.
Source Code
aten/src/ATen/native/cpu/UpSampleKernel.cpp lines 791–845
template <typename scalar_t, typename aa_filter_fn_t>
static inline scalar_t _compute_indices_min_size_weights(
    const int64_t i, const int64_t input_size, const scalar_t scale,
    scalar_t* wt_ptr, const int64_t max_interp_size, aa_filter_fn_t filter_fn,
    bool align_corners, int64_t& index_min, int64_t& index_size
) {
  // Note: we do not use opmath_t in this method, as f16 and other smaller float types are not routed here.
  // Typical usage of this method is with scalar_t = double when computing indices and weights for uint8 input.
  // The code below partly adapts the index and lambda computation from the compute_indices_weights method and
  // the index_min/index_size computation from _compute_indices_min_size_weights_aa.

  bool cubic = max_interp_size > 2;
  const auto real_input_index = area_pixel_compute_source_index<scalar_t>(
      scale, i, align_corners, /*cubic=*/cubic);

  scalar_t lambda;
  int64_t input_index = 0;
  guard_index_and_lambda(real_input_index, input_size, input_index, lambda);

  const auto support = static_cast<int64_t>(max_interp_size * 0.5);
  // Half-open window [unbound_index_min, unbound_index_max) before clipping to the input range
  const auto unbound_index_min = input_index - support + 1;
  const auto unbound_index_max = input_index + support + 1;
  index_min = std::max(unbound_index_min, static_cast<int64_t>(0));
  index_size = std::min(unbound_index_max, input_size) - index_min;
  // There are rare cases when, due to precision, index_size can be larger than max_interp_size by one.
  // We have to clip the value.
  index_size = std::clamp(index_size, static_cast<int64_t>(0), max_interp_size);

  // Below, the weights are computed using filter_fn, accumulating the values for indices that are out of bounds.
  // For example, in bicubic mode for output index i = 0, we have input_index = -1,
  // so unbound_index_min = -2 and unbound_index_max = 2 (exclusive) => the unbounded input indices are
  // [-2, -1, 0, 1] and the valid input indices are [0, 1].
  // For the unbounded input indices we compute four non-zero weight values [w0, w1, w2, w3] and, as only two
  // weights can be used with the valid input indices, we accumulate them as follows: [w0 + w1 + w2, w3, 0.0, 0.0].
  // This is equivalent to the float path, which would compute indices as [0, 0, 0, 1] and weights as [w0, w1, w2, w3].
  // A similar accumulation is done for unbounded indices larger than the input size.
  auto w_index = 0;
  scalar_t wt_max = 0.0;
  for (const auto j : c10::irange(max_interp_size)) {
    // initialize the weight value, as we accumulate into it below
    wt_ptr[j] = 0.0;

    scalar_t w = filter_fn(static_cast<scalar_t>(j + 1 - support) - lambda);
    if (unbound_index_min + j <= 0) {
      // tap at or before the first pixel: fold it into the first weight
      w_index = 0;
    } else if (unbound_index_min + j >= input_size - 1) {
      // tap at or past the last pixel: fold it into the last valid weight
      w_index = index_size - 1;
    }
    wt_ptr[w_index] += w;
    wt_max = std::max(wt_max, wt_ptr[w_index]);
    w_index++;
  }

  return wt_max;
}
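The boundary handling described in the comments of the listing above can be tried out in isolation. The following is a minimal stand-alone sketch, not the PyTorch implementation: `bicubic_window`, `Window`, and `cubic_filter` are hypothetical names introduced here, the filter is a Keys cubic kernel with an assumed coefficient a = -0.5, the half-pixel source-index mapping and the index guard are written out from the assumed behavior of `area_pixel_compute_source_index` and `guard_index_and_lambda` with `align_corners = false`, and the `wt_max` return value is omitted.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Keys cubic convolution kernel. The coefficient a = -0.5 is an assumption
// made for this illustration only.
inline double cubic_filter(double x) {
  const double a = -0.5;
  x = std::abs(x);
  if (x < 1.0) return ((a + 2.0) * x - (a + 3.0)) * x * x + 1.0;
  if (x < 2.0) return (((x - 5.0) * x + 8.0) * x - 4.0) * a;
  return 0.0;
}

// Hypothetical result type: the clipped window and its folded weights.
struct Window {
  std::int64_t index_min = 0;
  std::int64_t index_size = 0;
  std::vector<double> weights;  // 4 entries; unused trailing entries stay 0
};

// Hypothetical stand-alone rendition of the listing for bicubic mode
// (max_interp_size = 4), scalar_t = double and align_corners = false.
inline Window bicubic_window(std::int64_t i, std::int64_t input_size, double scale) {
  assert(i >= 0 && input_size > 0 && scale > 0.0);
  const std::int64_t max_interp_size = 4;
  const std::int64_t support = max_interp_size / 2;

  // Assumed half-pixel mapping; in cubic mode negative values are kept.
  const double real_index = scale * (static_cast<double>(i) + 0.5) - 0.5;
  // Assumed guard: floor, cap at the last pixel, clamp the fraction to [0, 1].
  const std::int64_t input_index = std::min(
      static_cast<std::int64_t>(std::floor(real_index)), input_size - 1);
  const double lambda = std::min(
      std::max(real_index - static_cast<double>(input_index), 0.0), 1.0);

  // Half-open window [unbound_min, unbound_max), then clipped to the input.
  const std::int64_t unbound_min = input_index - support + 1;
  const std::int64_t unbound_max = input_index + support + 1;
  Window out;
  out.index_min = std::max<std::int64_t>(unbound_min, 0);
  out.index_size = std::min(unbound_max, input_size) - out.index_min;
  out.index_size = std::max<std::int64_t>(0, std::min(out.index_size, max_interp_size));
  out.weights.assign(static_cast<std::size_t>(max_interp_size), 0.0);

  // Fold the weights of out-of-range taps into the first or last valid slot.
  std::int64_t w_index = 0;
  for (std::int64_t j = 0; j < max_interp_size; ++j) {
    const double w = cubic_filter(static_cast<double>(j + 1 - support) - lambda);
    if (unbound_min + j <= 0) {
      w_index = 0;
    } else if (unbound_min + j >= input_size - 1) {
      w_index = out.index_size - 1;
    }
    out.weights[static_cast<std::size_t>(w_index)] += w;
    ++w_index;
  }
  return out;
}
```

Under these assumptions, `bicubic_window(0, 8, 0.5)` (first output pixel of a 2x upsampling of 8 input pixels) gives `input_index = -1`, so the window is clipped to `index_min = 0`, `index_size = 2`, and the first three filter taps are summed into `weights[0]`. Because the taps are only moved and never dropped, the folded weights still sum to 1, which is what keeps the result equivalent to the float path that repeats the edge pixel.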