test_benchmark_tile_reduce_various_sizes() — pytorch Function Reference

Architecture documentation for the test_benchmark_tile_reduce_various_sizes() function in bench_nvshmem_tile_reduce.py from the pytorch codebase.

Dependency Diagram

graph TD
  2b246b4f_aea8_6874_ed04_0bed1d42b788["test_benchmark_tile_reduce_various_sizes()"]
  1fb81840_0784_4f41_94ba_38ef34fe84e5["_benchmark_tile_reduce_single()"]
  2b246b4f_aea8_6874_ed04_0bed1d42b788 -->|calls| 1fb81840_0784_4f41_94ba_38ef34fe84e5
  style 2b246b4f_aea8_6874_ed04_0bed1d42b788 fill:#6366f1,stroke:#818cf8,color:#fff

Source Code

benchmarks/distributed/bench_nvshmem_tile_reduce.py lines 133–182

    def test_benchmark_tile_reduce_various_sizes(self) -> None:
        """
        Benchmark tile reduce across various matrix sizes.
        """
        # Test various matrix sizes
        tile_sizes = [512, 1024, 2048, 4096, 8192, 16384]
        full_size = tile_sizes[-1]
        warmup_iters = 5
        bench_iters = 20

        results = []

        for tile_size in tile_sizes:
            try:
                result = self._benchmark_tile_reduce_single(
                    full_size, tile_size, warmup_iters, bench_iters
                )
                results.append(result)

                if self.rank == 0:
                    print(
                        f"Matrix Size: {full_size}x{full_size}, Tile Size: {tile_size}x{tile_size}"
                    )
                    print(
                        f"  Mean Time: {result['mean_time_ms']:.3f} ± {result['std_time_ms']:.3f} ms"
                    )
                    print(f"  Throughput: {result['throughput_gb_s']:.2f} GB/s")
                    print(f"  Bytes: {result['tile_bytes']:.0f}")
                    print()

            except Exception as e:
                if self.rank == 0:
                    print(f"Failed to benchmark matrix size {full_size}: {e}")

        # Print summary
        if self.rank == 0 and results:
            print("=== BENCHMARK SUMMARY ===")
            print(
                f"{'Matrix Size':<12} {'Tile Size':<10} {'Time (ms)':<12} {'Throughput (GB/s)':<18} {'Bytes':<15}"
            )
            print("-" * 70)

            for result in results:
                print(
                    f"{result['full_size']}x{result['full_size']:<7} "
                    f"{result['tile_size']}x{result['tile_size']:<5} "
                    f"{result['mean_time_ms']:<12.3f} "
                    f"{result['throughput_gb_s']:<18.2f} "
                    f"{result['tile_bytes']:<15.0f}"
                )
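One detail of the summary loop is easy to miss: in an f-string such as `f"{a}x{b:<7}"`, the width applies only to `b`, so the combined "NxN" label is not padded to a fixed column width. The sketch below is a minimal illustration, not a change to the pytorch source. The helper names `format_row_original` and `format_row_aligned`, the sample rows, and the trailing `|` marker are hypothetical. It also assumes a tile column width of 12, because `16384x16384` is 11 characters and would overflow the 10-character header column.

```python
# Hypothetical rows shaped like the dicts appended to `results` above
# (only the two keys needed for the label columns are included).
rows = [
    {"full_size": 16384, "tile_size": 512},
    {"full_size": 16384, "tile_size": 16384},
]


def format_row_original(r):
    # Mirrors the source formatting: the width pads only the second number.
    return (
        f"{r['full_size']}x{r['full_size']:<7} "
        f"{r['tile_size']}x{r['tile_size']:<5} |"
    )


def format_row_aligned(r):
    # Build each "NxN" label first, then pad the whole label.
    matrix = f"{r['full_size']}x{r['full_size']}"
    tile = f"{r['tile_size']}x{r['tile_size']}"
    # Width 12 for both columns is an assumption chosen to fit "16384x16384".
    return f"{matrix:<12} {tile:<12} |"


for r in rows:
    print(format_row_original(r))
for r in rows:
    print(format_row_aligned(r))
```

With the original formatting the `|` marker lands in a different column for each row, while the aligned variant keeps it fixed.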

Frequently Asked Questions

What does test_benchmark_tile_reduce_various_sizes() do?
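test_benchmark_tile_reduce_various_sizes() is a benchmark method in benchmarks/distributed/bench_nvshmem_tile_reduce.py. It fixes the matrix at 16384x16384 (the largest entry in tile_sizes) and, for each tile size from 512 to 16384, calls _benchmark_tile_reduce_single() with 5 warmup iterations and 20 timed iterations. Rank 0 prints the mean time, standard deviation, throughput, and tile bytes for each run, then a summary table. An exception in any run is caught and reported on rank 0, so the remaining tile sizes still run.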
What does test_benchmark_tile_reduce_various_sizes() call?
test_benchmark_tile_reduce_various_sizes() calls one function, _benchmark_tile_reduce_single(), once per tile size (six calls in total).
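
The body of _benchmark_tile_reduce_single() is not shown on this page, so the following is only a sketch of how a result dict with the keys read above (`mean_time_ms`, `std_time_ms`, `throughput_gb_s`, `tile_bytes`) could be derived from per-iteration timings. The function name `summarize_timings`, the 4-byte element size, and the throughput definition (tile bytes divided by mean time) are all assumptions, not the pytorch implementation.

```python
import statistics


def summarize_timings(times_ms, full_size, tile_size, bytes_per_element=4):
    """Hypothetical helper: reduce per-iteration timings to a result dict.

    Assumes a square tile of `bytes_per_element`-byte values and defines
    throughput as tile bytes moved per second of mean iteration time.
    """
    tile_bytes = tile_size * tile_size * bytes_per_element
    mean_ms = statistics.mean(times_ms)
    # stdev needs at least two samples; fall back to 0.0 for a single run.
    std_ms = statistics.stdev(times_ms) if len(times_ms) > 1 else 0.0
    throughput_gb_s = tile_bytes / (mean_ms * 1e-3) / 1e9
    return {
        "full_size": full_size,
        "tile_size": tile_size,
        "mean_time_ms": mean_ms,
        "std_time_ms": std_ms,
        "throughput_gb_s": throughput_gb_s,
        "tile_bytes": tile_bytes,
    }


# Made-up timings for illustration only, not measured results.
result = summarize_timings([1.0, 1.0, 1.0], full_size=16384, tile_size=1024)
print(f"Throughput: {result['throughput_gb_s']:.2f} GB/s")
```

A dict of this shape can be passed straight to the per-run and summary print statements in the benchmark method.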
