Presentation

Using Hardware Metrics To Understand Performance of the RAJA Performance Suite Kernels in Different GPU Modes on MI300A
Description

Modern GPUs play a crucial role in accelerating a wide range of computational workloads. However, their performance is often limited by the memory access patterns of the kernels they execute. AMD’s MI300A APU supports multiple logical GPU partitioning modes to optimize compute resource allocation, offering new opportunities for performance tuning. In this work, we evaluate how different GPU kernels from the RAJA Performance Suite perform in various partitioning modes. Using hardware counters, we compare two kernels with identical computational complexity but different data layouts, highlighting how memory organization can influence performance outcomes. The results demonstrate that data layout and access patterns have a significant impact on runtime performance across different partitioning modes, even when computational complexity and problem size remain constant.
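
As a rough illustration of how two traversals can do the same amount of work yet touch memory very differently, the following minimal sketch may help. It is not taken from the presentation or from the RAJA Performance Suite: the function names `access_offsets` and `contiguous_fraction` are hypothetical, the array is assumed to be stored row-major, and the loops are a serial stand-in for what would be a parallel GPU kernel.

```python
def access_offsets(rows, cols, row_major_traversal):
    """Return the flat offsets visited, in order, for a rows x cols
    array that is assumed to be stored row-major in memory."""
    if row_major_traversal:
        # Inner loop walks along a row: consecutive offsets are adjacent.
        return [r * cols + c for r in range(rows) for c in range(cols)]
    # Inner loop walks down a column: consecutive offsets are cols apart.
    return [r * cols + c for c in range(cols) for r in range(rows)]


def contiguous_fraction(offsets):
    """Fraction of consecutive accesses that land on the next element."""
    steps = list(zip(offsets, offsets[1:]))
    adjacent = sum(1 for a, b in steps if b - a == 1)
    return adjacent / len(steps)


unit = access_offsets(4, 8, row_major_traversal=True)
strided = access_offsets(4, 8, row_major_traversal=False)

# Same elements, same number of accesses, different access order.
print(len(unit), len(strided))
print(contiguous_fraction(unit), contiguous_fraction(strided))
```

Both traversals visit every element exactly once, so an operation count cannot tell them apart. Only the order of memory accesses differs, which is the kind of effect hardware counters are able to expose.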