Presentation

Development of a performance portable distributed FFT interface on top of the Kokkos ecosystem
Description

This paper presents a performance-portable distributed FFT implementation built on top of the Kokkos ecosystem. Building on Kokkos and kokkos-fft greatly simplifies the implementation of a distributed FFT while retaining performance portability. We introduce new features such as batched distributed FFTs and interfaces to vendor distributed FFT libraries. We demonstrate that our distributed FFT runs efficiently on NVIDIA A100 and AMD MI250X GPUs while maintaining reasonable performance on CPUs.