Benchmark-Driven Models for Energy Analysis and Attribution of GPU-Accelerated Supercomputing
As advances in energy efficiency become the primary limiter of performance gains in power-constrained supercomputing and machine learning, it is imperative that developers, architects, and practitioners understand how modern GPUs consume energy when running HPC and ML applications.

Rather than relying on opaque, coarse-grained metrics, in this paper we develop an extensible, microbenchmark-parameterized energy model that not only attributes application energy by functional unit (FPU, tensor core, integer ALU) and memory level (L1, L2, HBM), but also differentiates control energy from datapath energy.
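
As a rough sketch of what such a decomposition might look like (the notation below is illustrative, not the paper's own formulation), application energy can be written as a sum of microbenchmark-calibrated per-operation energies plus a control term:

```latex
% Illustrative decomposition (assumed notation, not the paper's formulation):
%   N_u      = dynamic operation count on functional unit u (FPU, tensor core, integer ALU)
%   A_m      = access count at memory level m (L1, L2, HBM)
%   e_u, e_m = per-operation / per-access energies calibrated by microbenchmarks
E_{\mathrm{app}} \;\approx\;
  \underbrace{\sum_{u} N_u\, e_u \;+\; \sum_{m} A_m\, e_m}_{\text{datapath energy}}
  \;+\;
  \underbrace{E_{\mathrm{control}}}_{\text{issue, scheduling, fetch/decode}}
```

In a formulation of this kind, the datapath terms combine profiled operation and access counts with microbenchmark-measured per-operation energies, and the remainder relative to measured whole-application energy is attributed to control.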

We examine trends in energy per operation across four generations of GPUs and validate our results using supercomputing and ML/AI procurement workloads. Our insights and extrapolations can inform the future of CMOS and memory technologies, computer architecture research, algorithmic innovation, optimizations for power-constrained and mobile environments, and data center operations.