cpu_masked_scatter_kernel Function - pytorch Architecture
Architecture documentation for the cpu_masked_scatter_kernel function template in IndexKernel.cpp from the pytorch codebase.
Source Code
aten/src/ATen/native/cpu/IndexKernel.cpp lines 359–382
template <typename scalar_t>
void cpu_masked_scatter_kernel(TensorIterator& iter, const TensorBase& source) {
  std::ptrdiff_t source_cntr = 0;
  const scalar_t* source_ptr = source.const_data_ptr<scalar_t>();
  auto numel = source.numel();
  auto loop = [&](char** data, const int64_t* strides, int64_t n) {
    char* dst = data[0];
    const int64_t dst_stride = strides[0];
    char* mask = data[1];
    const int64_t mask_stride = strides[1];
    for (const auto i : c10::irange(n)) {
      auto mask_value = c10::load(reinterpret_cast<bool*>(mask + mask_stride * i));
      if (mask_value) {
        TORCH_CHECK(source_cntr < numel, "Number of elements of source < number of ones in mask");
        *(scalar_t*)(dst + dst_stride * i) = c10::load(source_ptr);
        source_ptr++;
        source_cntr++;
      }
    }
  };
  iter.serial_for_each(loop, {0, iter.numel()});
}
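The kernel above walks the destination and mask in lockstep and consumes one source element for every true mask entry. Because source_ptr and source_cntr are shared state that advances across iterations, the order of traversal matters, which is consistent with the use of serial_for_each in the listing. The following is a minimal standalone sketch of the same idea, not the ATen implementation: it assumes contiguous std::vector storage instead of TensorIterator's byte strides, and masked_scatter_sketch is a hypothetical name introduced here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of ATen): copies consecutive elements of
// `source` into the positions of `dst` where `mask` is true.
// Assumes contiguous storage; the real kernel uses per-operand byte strides.
template <typename T>
void masked_scatter_sketch(std::vector<T>& dst,
                           const std::vector<bool>& mask,
                           const std::vector<T>& source) {
  // The sketch requires dst and mask to have the same length.
  assert(dst.size() == mask.size());

  std::size_t source_cntr = 0;  // number of source elements consumed so far
  for (std::size_t i = 0; i < dst.size(); ++i) {
    if (!mask[i]) {
      continue;  // masked-out positions keep their existing value
    }
    // Mirrors the TORCH_CHECK in the listing: fail if the mask selects
    // more positions than the source has elements.
    if (source_cntr >= source.size()) {
      throw std::invalid_argument(
          "Number of elements of source < number of ones in mask");
    }
    dst[i] = source[source_cntr];
    ++source_cntr;
  }
}
```

In this sketch, surplus source elements are simply left unused, while a source that is too short raises an exception at the first true mask entry that cannot be filled, so earlier positions may already have been written.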