Presentation

Multi-GPU Implementation and Roofline Analysis of a Numerical Global Ocean Model
Description

Numerical ocean models are essential tools for climate prediction and marine resource studies, requiring high resolution and realistic physical processes. We developed the global ocean model COCO and implemented it on GPUs using an OpenACC directive-based approach, while maintaining compatibility with CPUs. Performance was evaluated on the Miyabi supercomputer, which includes GPU-based (NVIDIA GH200) and CPU-based (Intel Xeon MAX 9480) systems. Realistic ocean experiments with a 0.17° global grid showed that most components achieved faster execution on GPUs, with the tracer calculation accelerated by a factor of 2.9. Roofline analysis revealed that most loops were memory-bound, and GPU speedup was constrained by memory bandwidth rather than compute capability. Future improvements will require increasing arithmetic intensity and applying kernel-level optimizations, while ensuring compatibility between CPU- and GPU-based codes.
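
To illustrate why a memory-bound loop gains from memory bandwidth rather than from peak compute, the standard roofline bound can be sketched in a few lines of Python. This is a minimal sketch, not the analysis performed in the presentation. The two devices, their peak and bandwidth figures, and the loop's flop and byte counts are hypothetical placeholders chosen for easy arithmetic; they are not the specifications of the GH200 or the Xeon MAX 9480, nor measurements from COCO.

```python
def attainable_gflops(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """Roofline bound: the lower of the compute roof and the memory roof.

    arithmetic_intensity is in flop/byte, peak_gflops in GFLOP/s,
    bandwidth_gbs in GB/s.
    """
    return min(peak_gflops, arithmetic_intensity * bandwidth_gbs)


def ridge_point(peak_gflops, bandwidth_gbs):
    """Arithmetic intensity (flop/byte) at which the two roofs meet."""
    return peak_gflops / bandwidth_gbs


def is_memory_bound(arithmetic_intensity, peak_gflops, bandwidth_gbs):
    """A loop left of the ridge point is limited by memory bandwidth."""
    return arithmetic_intensity < ridge_point(peak_gflops, bandwidth_gbs)


# Hypothetical devices (placeholder numbers, NOT real hardware specs).
DEVICE_A = {"peak_gflops": 1000.0, "bandwidth_gbs": 100.0}
DEVICE_B = {"peak_gflops": 8000.0, "bandwidth_gbs": 400.0}

# Hypothetical stencil-like loop: 10 flops per 80 bytes moved.
LOOP_AI = 10 / 80  # 0.125 flop/byte

bound_a = attainable_gflops(LOOP_AI, **DEVICE_A)
bound_b = attainable_gflops(LOOP_AI, **DEVICE_B)

# When the loop is memory-bound on both devices, the best-case speedup
# equals the bandwidth ratio, regardless of the ratio of compute peaks.
speedup_bound = bound_b / bound_a
bandwidth_ratio = DEVICE_B["bandwidth_gbs"] / DEVICE_A["bandwidth_gbs"]
peak_ratio = DEVICE_B["peak_gflops"] / DEVICE_A["peak_gflops"]

if __name__ == "__main__":
    print("memory-bound on A:", is_memory_bound(LOOP_AI, **DEVICE_A))
    print("memory-bound on B:", is_memory_bound(LOOP_AI, **DEVICE_B))
    print("speedup bound:", speedup_bound)
    print("bandwidth ratio:", bandwidth_ratio, "peak ratio:", peak_ratio)
```

In this sketch the loop sits left of the ridge point on both devices, so the speedup bound follows the bandwidth ratio and not the larger ratio of compute peaks. Raising the loop's arithmetic intensity, as the description proposes, moves it toward the ridge point, where the compute roof starts to matter.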