Presentation

Roofline Analysis of Tightly-Coupled CPU-GPU Superchips: A Study on MI300A and GH200
Description
The introduction of tightly-coupled heterogeneous architectures, such as AMD's MI300A and NVIDIA's Grace-Hopper (GH200), addresses a bottleneck in accelerated computing, namely the CPU-GPU interface.
Whereas the GH200 can be seen as a technological leap in CPU-GPU connectivity greatly exceeding PCIe cadence, the unified memory architecture of the MI300A APU enables seamless communication through coherent caches.
When the CPU and GPU execute concurrently, they contend not only for finite memory bandwidth but also for power in a power-constrained environment.
In this paper, we extend the well-established Roofline model to capture the performance implications of contention in concurrent execution on the MI300A and GH200.
We refine this analysis by examining the impact of different memory allocators, the randomness of data, and the host and device arithmetic intensities.
We conclude with a discussion of the evolution of GPU architectures and the impact on performance, portability, and programmability that emerging tightly-coupled GPUs bring to the HPC landscape.
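
For readers unfamiliar with the Roofline model that the abstract builds on, the classic bound can be sketched as below: attainable performance is the minimum of peak compute and arithmetic intensity times memory bandwidth. This is a minimal sketch, not the extended model from the paper. The `bw_share` parameter is a hypothetical, illustrative way to picture bandwidth contention between a concurrently running CPU and GPU, and all numeric values are placeholders rather than MI300A or GH200 measurements.

```python
def roofline(ai, peak_flops, peak_bw, bw_share=1.0):
    """Classic Roofline bound in FLOP/s.

    ai         -- arithmetic intensity in FLOP per byte
    peak_flops -- peak compute throughput in FLOP/s
    peak_bw    -- peak memory bandwidth in bytes/s
    bw_share   -- ASSUMPTION (not from the paper): fraction of the shared
                  bandwidth this processor obtains under contention
    """
    return min(peak_flops, ai * peak_bw * bw_share)


def ridge_point(peak_flops, peak_bw):
    """Arithmetic intensity at which a kernel stops being bandwidth-bound."""
    return peak_flops / peak_bw


# Hypothetical placeholder values, NOT measurements of any real device.
PEAK_FLOPS = 10.0e12  # FLOP/s
PEAK_BW = 2.0e12      # bytes/s

if __name__ == "__main__":
    for ai in (0.5, 5.0, 50.0):
        alone = roofline(ai, PEAK_FLOPS, PEAK_BW)
        shared = roofline(ai, PEAK_FLOPS, PEAK_BW, bw_share=0.5)
        print(f"AI={ai:5.1f}  alone={alone:.2e}  half bandwidth={shared:.2e}")
```

In this simplified picture, halving the available bandwidth lowers the attainable performance of bandwidth-bound kernels and moves the ridge point to a higher arithmetic intensity, while kernels that remain compute-bound are unaffected. Power contention, which the abstract also names, is not represented here.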