Presentation
High Performance Batch SVD Using GPUs
Description
We consider the problem of computing the singular value decomposition (SVD) of many relatively small matrices on GPUs. This is an essential component of various scientific applications, including computational chemistry and low-rank approximations. Our approach is based on the parallel one-sided Jacobi algorithm, which exposes a large degree of parallelism and relies heavily on compute-bound level-3 BLAS operations such as matrix multiplication. We use two design strategies. The first targets very small matrices with a single GPU kernel for the entire SVD operation. The second uses a blocked version of the parallel Jacobi algorithm, which supports matrices of arbitrary dimensions. The proposed solution supports any matrix shape (square, tall-skinny, or short-wide), imposes no restrictions on the matrix dimensions, and outperforms state-of-the-art solutions. This work is set to be released in the MAGMA library.
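
For readers unfamiliar with the one-sided Jacobi method named above, the following is a minimal CPU sketch in plain Python. It is not the authors' GPU implementation: it is the textbook unblocked (Hestenes) variant, it assumes each matrix has at least as many rows as columns, it returns the singular values unsorted, and the function names `jacobi_svd` and `batched_jacobi_svd` are hypothetical. It only illustrates the idea that column pairs are rotated until they are mutually orthogonal, and that every matrix in a batch can be processed independently.

```python
import math

def jacobi_svd(a, tol=1e-12, max_sweeps=50):
    """One-sided Jacobi SVD of one m x n matrix (list of rows, m >= n assumed).

    Returns (u, sigma, v) such that a[i][j] ~= sum_k u[i][k] * sigma[k] * v[j][k].
    Singular values are NOT sorted in this sketch.
    """
    m, n = len(a), len(a[0])
    w = [list(map(float, row)) for row in a]          # working copy, becomes U * diag(sigma)
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    for _ in range(max_sweeps):
        rotated = False
        for p in range(n - 1):
            for q in range(p + 1, n):
                # Entries of the 2x2 Gram matrix of columns p and q.
                alpha = sum(w[k][p] ** 2 for k in range(m))
                beta = sum(w[k][q] ** 2 for k in range(m))
                gamma = sum(w[k][p] * w[k][q] for k in range(m))
                if abs(gamma) <= tol * math.sqrt(alpha * beta):
                    continue                           # columns already orthogonal
                rotated = True
                # Rotation angle that zeroes the inner product of the two columns.
                zeta = (beta - alpha) / (2.0 * gamma)
                t = math.copysign(1.0, zeta) / (abs(zeta) + math.sqrt(1.0 + zeta * zeta))
                c = 1.0 / math.sqrt(1.0 + t * t)
                s = c * t
                # Apply the same rotation to the columns of W and of V.
                for mat in (w, v):
                    for row in mat:
                        row[p], row[q] = c * row[p] - s * row[q], s * row[p] + c * row[q]
        if not rotated:
            break                                      # a full sweep without rotations

    # Column norms are the singular values; normalized columns form U.
    sigma = [math.sqrt(sum(w[k][j] ** 2 for k in range(m))) for j in range(n)]
    u = [[w[k][j] / sigma[j] if sigma[j] > 0.0 else 0.0 for j in range(n)]
         for k in range(m)]
    return u, sigma, v

def batched_jacobi_svd(batch):
    """Decompose every matrix in the batch; the problems are fully independent."""
    return [jacobi_svd(a) for a in batch]
```

Because the matrices in the batch do not interact, the outer loop in `batched_jacobi_svd` is the natural place for parallelism. A blocked variant such as the one the abstract describes would rotate groups of columns at once so that most of the work becomes matrix multiplication; this sketch does not attempt that.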

Event Type
Research and ACM SRC Posters
Time
Thursday, 20 November 2025, 8:00am - 5:00pm CST
Location
Second Floor Atrium

