Presentation

Enabling Real-Time, Extreme-Scale Bayesian Inference: FFT-Based GPU-Accelerated Matrix-Vector Products for Block-Triangular Toeplitz Matrices
Description

Adjoint-based, matrix-free Newton-Krylov methods have long been the gold standard for solving high-dimensional, ill-posed inverse problems. These methods require a pair of forward and adjoint PDE solves per iteration, usually making them intractable for real-time inference and prediction. We present FFTMatvec, an FFT-based GPU-accelerated algorithm that exploits intrinsic problem structure to enable real-time, high-fidelity, extreme-scale inference and prediction for linear autonomous dynamical systems. This algorithm was used to solve a Bayesian inverse problem for tsunami early warning with over one billion parameters in under 0.2 seconds. The application is performance-portable and open-source; scaling results are presented for up to 4,096 GPUs on OLCF's Frontier and NERSC's Perlmutter supercomputers. On 512 GPUs, FFTMatvec achieves more than a 200,000x speedup over state-of-the-art matrix-free adjoint-based methods. Communication-aware partitioning and dynamic mixed precision provide additional performance boosts. Other application areas include nuclear treaty verification and monitoring atmospheric CO2.
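The abstract does not spell out the algorithm, so the following is only a minimal CPU sketch of the general technique the title names: multiplying a block lower-triangular Toeplitz matrix by a vector using FFTs along the time dimension. It assumes the standard zero-padding (circulant embedding) approach, in which the product d_i = sum over j <= i of F_(i-j) m_j is a discrete convolution in time. The names `fft`, `dense_matvec`, and `fft_matvec`, the nested-list data layout, and the pure-Python radix-2 FFT are all illustrative assumptions and are not taken from FFTMatvec, which runs batched FFTs on GPUs.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT. len(x) must be a power of two.
    The inverse transform is left unnormalized (caller divides by n)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def dense_matvec(blocks, m):
    """Reference product: d_i = sum_{j<=i} F_{i-j} m_j.
    blocks[k] is the (nd x nm) block F_k; m[j] is the j-th parameter block."""
    nt, nd, nm = len(blocks), len(blocks[0]), len(blocks[0][0])
    d = [[0.0] * nd for _ in range(nt)]
    for i in range(nt):
        for j in range(i + 1):
            Fk = blocks[i - j]
            for r in range(nd):
                d[i][r] += sum(Fk[r][c] * m[j][c] for c in range(nm))
    return d


def fft_matvec(blocks, m):
    """Same product computed with FFTs in time (hypothetical sketch)."""
    nt, nd, nm = len(blocks), len(blocks[0]), len(blocks[0][0])
    # Pad to a power of two >= 2*nt so circular convolution equals linear.
    n = 1
    while n < 2 * nt:
        n *= 2
    pad = [0.0] * (n - nt)
    # Transform each parameter component along the time index.
    m_hat = [fft([m[k][c] for k in range(nt)] + pad) for c in range(nm)]
    d = [[0.0] * nd for _ in range(nt)]
    for r in range(nd):
        acc = [0j] * n
        for c in range(nm):
            # Transform entry (r, c) of every block along the time index.
            f_hat = fft([blocks[k][r][c] for k in range(nt)] + pad)
            for w in range(n):
                acc[w] += f_hat[w] * m_hat[c][w]
        col = fft(acc, inverse=True)
        for i in range(nt):
            d[i][r] = col[i].real / n  # normalize; keep first nt samples
    return d
```

In this sketch the dense reference needs on the order of nt^2 block products, while the FFT version needs on the order of nt log nt work per (row, column) pair of block entries. In a repeated-matvec setting the transforms of the blocks would normally be computed once and reused rather than recomputed on every call as they are here.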