normal_fill_AVX2 Function - PyTorch Architecture
Architecture documentation for the normal_fill_AVX2 function template in DistributionTemplates.h from the PyTorch codebase.
Source Code
aten/src/ATen/native/cpu/DistributionTemplates.h lines 107–135
template<typename RNG>
void normal_fill_AVX2(const TensorBase &self, const float mean, const float std, RNG generator) {
  float *data = self.data_ptr<float>();
  auto size = self.numel();
  std::lock_guard<std::mutex> lock(generator->mutex_);
  for (const auto i : c10::irange(size)) {
    at::uniform_real_distribution<float> uniform(0, 1);
    data[i] = uniform(generator);
  }
  const __m256 two_pi = _mm256_set1_ps(2.0f * c10::pi<double>);
  const __m256 one = _mm256_set1_ps(1.0f);
  const __m256 minus_two = _mm256_set1_ps(-2.0f);
  const __m256 mean_v = _mm256_set1_ps(mean);
  const __m256 std_v = _mm256_set1_ps(std);
  for (int64_t i = 0; i < size - 15; i += 16) {
    normal_fill_16_AVX2(data + i, &two_pi, &one, &minus_two, &mean_v, &std_v);
  }
  if (size % 16 != 0) {
    // Recompute the last 16 values.
    data = data + size - 16;
    for (const auto i : c10::irange(16)) {
      at::uniform_real_distribution<float> uniform(0, 1);
      data[i] = uniform(generator);
    }
    normal_fill_16_AVX2(data, &two_pi, &one, &minus_two, &mean_v, &std_v);
  }
}
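The control flow above (fill with uniforms, transform in blocks of 16, then redo the final 16 elements when the size is not a multiple of 16) can be illustrated with a portable scalar sketch. This is not the PyTorch implementation: the body of normal_fill_16_AVX2 is not shown in this section, so the sketch assumes it applies a Box-Muller transform in which the first 8 values of each block act as u1 and the last 8 as u2. The names normal_fill_16_scalar and normal_fill_scalar are hypothetical, and std::mt19937 with std::uniform_real_distribution stands in for the ATen generator and at::uniform_real_distribution.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <limits>
#include <random>
#include <vector>

// Hypothetical scalar stand-in for normal_fill_16_AVX2 (assumed Box-Muller):
// turns 16 uniform samples into 16 normal samples, in place.
inline void normal_fill_16_scalar(float* data, float mean, float stddev) {
  const float two_pi = 2.0f * 3.14159265358979323846f;
  for (int j = 0; j < 8; ++j) {
    // Map [0, 1) to (0, 1] and clamp so that log() never sees zero.
    const float u1 = std::max(1.0f - data[j], std::numeric_limits<float>::min());
    const float u2 = data[j + 8];
    const float radius = std::sqrt(-2.0f * std::log(u1));
    const float theta = two_pi * u2;
    data[j] = radius * std::cos(theta) * stddev + mean;
    data[j + 8] = radius * std::sin(theta) * stddev + mean;
  }
}

// Hypothetical scalar analogue of the outer routine shown above.
template <typename RNG>
void normal_fill_scalar(std::vector<float>& out, float mean, float stddev, RNG& rng) {
  const int64_t size = static_cast<int64_t>(out.size());
  // The tail step addresses data + size - 16, so fewer than 16 elements is invalid.
  assert(size >= 16);
  std::uniform_real_distribution<float> uniform(0.0f, 1.0f);
  for (auto& v : out) v = uniform(rng);

  float* data = out.data();
  // Transform every complete block of 16.
  for (int64_t i = 0; i < size - 15; i += 16) {
    normal_fill_16_scalar(data + i, mean, stddev);
  }
  if (size % 16 != 0) {
    // The last 16 elements overlap an already transformed block, so they are
    // refilled with fresh uniforms before being transformed again.
    float* tail = data + size - 16;
    for (int j = 0; j < 16; ++j) tail[j] = uniform(rng);
    normal_fill_16_scalar(tail, mean, stddev);
  }
}
```

The refill in the tail step matters: part of the final window has already been converted to normal samples, and feeding those back into the transform as if they were uniforms would give wrong values. The window arithmetic also implies that the routine is only safe for at least 16 elements, which the sketch makes explicit with an assertion.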