normal_fill_AVX2 Function Template: PyTorch Architecture

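Architecture documentation for the normal_fill_AVX2 function template in DistributionTemplates.h from the PyTorch codebase. As the source below shows, it fills a float tensor with uniform samples while holding the generator's mutex, converts them in place 16 values at a time through normal_fill_16_AVX2, and, when the element count is not a multiple of 16, regenerates and converts the final 16 elements.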

Source Code

aten/src/ATen/native/cpu/DistributionTemplates.h lines 107–135

template<typename RNG>
void normal_fill_AVX2(const TensorBase &self, const float mean, const float std, RNG generator) {
  float *data = self.data_ptr<float>();
  auto size = self.numel();
  std::lock_guard<std::mutex> lock(generator->mutex_);
  for (const auto i : c10::irange(size)) {
    at::uniform_real_distribution<float> uniform(0, 1);
    data[i] = uniform(generator);
  }
  const __m256 two_pi = _mm256_set1_ps(2.0f * c10::pi<double>);
  const __m256 one = _mm256_set1_ps(1.0f);
  const __m256 minus_two = _mm256_set1_ps(-2.0f);
  const __m256 mean_v = _mm256_set1_ps(mean);
  const __m256 std_v = _mm256_set1_ps(std);

  for (int64_t i = 0; i < size - 15; i += 16) {
    normal_fill_16_AVX2(data + i, &two_pi, &one, &minus_two, &mean_v, &std_v);
  }

  if (size % 16 != 0) {
    // Recompute the last 16 values.
    data = data + size - 16;
    for (const auto i : c10::irange(16)) {
      at::uniform_real_distribution<float> uniform(0, 1);
      data[i] = uniform(generator);
    }
    normal_fill_16_AVX2(data, &two_pi, &one, &minus_two, &mean_v, &std_v);
  }
}
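
The control flow above (uniform fill, 16-wide blocks, overlapping tail) can be illustrated with a scalar sketch that needs neither AVX2 nor ATen. This is not the PyTorch implementation: normal_fill_16_AVX2 is not shown in this excerpt, so the sketch assumes it applies a Box-Muller transform to each block, which the two_pi, one, and minus_two constants suggest. The names uniform01, box_muller_16, and normal_fill_sketch are hypothetical, std::mt19937 stands in for the ATen generator, and the early return for fewer than 16 elements is an added guard, since the tail step as written appears to assume at least 16 elements.

```cpp
#include <cmath>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical helper: a uniform float in [0, 1) built from 24 random bits.
inline float uniform01(std::mt19937& gen) {
  return static_cast<float>(gen() >> 8) * (1.0f / 16777216.0f);
}

// Hypothetical scalar stand-in for normal_fill_16_AVX2 (assumed Box-Muller):
// turns 16 uniforms in [0, 1) into 16 normal samples, in place.
inline void box_muller_16(float* data, float mean, float stddev) {
  const float two_pi = 2.0f * 3.14159265358979323846f;
  for (int j = 0; j < 8; ++j) {
    const float u1 = 1.0f - data[j];  // [0, 1) -> (0, 1], so log(u1) is finite
    const float u2 = data[j + 8];
    const float radius = std::sqrt(-2.0f * std::log(u1));
    const float theta = two_pi * u2;
    data[j] = radius * std::cos(theta) * stddev + mean;
    data[j + 8] = radius * std::sin(theta) * stddev + mean;
  }
}

// Mirrors the loop structure of normal_fill_AVX2 on a std::vector<float>.
inline void normal_fill_sketch(std::vector<float>& out, float mean,
                               float stddev, std::mt19937& gen) {
  const std::int64_t size = static_cast<std::int64_t>(out.size());
  if (size < 16) return;  // added guard: the tail step needs 16 elements
  float* data = out.data();

  // Step 1: fill the whole buffer with uniform samples.
  for (std::int64_t i = 0; i < size; ++i) data[i] = uniform01(gen);

  // Step 2: transform every full block of 16.
  for (std::int64_t i = 0; i < size - 15; i += 16) {
    box_muller_16(data + i, mean, stddev);
  }

  // Step 3: if elements are left over, redraw and transform the last 16.
  // This window overlaps values already transformed in step 2, so it is
  // refilled with fresh uniforms before the transform is applied.
  if (size % 16 != 0) {
    float* tail = data + size - 16;
    for (int j = 0; j < 16; ++j) tail[j] = uniform01(gen);
    box_muller_16(tail, mean, stddev);
  }
}
```

Calling normal_fill_sketch on a vector of 37 elements, for example, transforms two full blocks (elements 0 to 31) and then redraws elements 21 to 36, so every element ends up normally distributed without a separate scalar path for the remainder.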
