Presentation
Slicing Is All You Need: Towards A Universal One-Sided Algorithm for Distributed Matrix Multiplication
Description
Science, data analytics, and AI workloads depend on distributed matrix multiplication. Prior work has developed an array of algorithms suitable for different problem sizes, partitionings, and replication factors. However, each existing algorithm supports only a subset of partitionings, so multiple algorithm implementations are required to cover the full space of possible partitionings. If no algorithm implementation is available for a given combination of partitionings, one or more operands must be redistributed, increasing communication overhead. We present a one-sided algorithm for distributed matrix multiplication that supports all combinations of partitionings and replication factors. Our algorithm uses index arithmetic to compute the sets of overlapping tiles that must be multiplied together. The resulting list of local matrix multiplies can then either be executed directly, or reordered and lowered to an optimized IR to maximize overlap. We implement our algorithm using a high-level C++-based PGAS programming framework and find it competitive with state-of-the-art systems.
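The index arithmetic mentioned in the description can be illustrated with a small single-process sketch. This is not the presenters' implementation. It assumes each matrix is partitioned by sorted boundary lists along each dimension, and it ignores tile ownership, replication, and communication entirely. All function names here (`intervals`, `refine`, `plan_multiplies`, `execute`) are hypothetical and chosen only for illustration.

```python
from itertools import product


def intervals(bounds):
    """Turn sorted boundaries such as [0, 3, 6] into half-open ranges [(0, 3), (3, 6)]."""
    return list(zip(bounds[:-1], bounds[1:]))


def refine(bounds_x, bounds_y):
    """Intersect two 1D partitionings of the same index range.

    Returns every non-empty overlap between a range of the first
    partitioning and a range of the second.
    """
    out = []
    for (x0, x1), (y0, y1) in product(intervals(bounds_x), intervals(bounds_y)):
        lo, hi = max(x0, y0), min(x1, y1)
        if lo < hi:  # keep only non-empty overlaps
            out.append((lo, hi))
    return out


def plan_multiplies(a_rows, a_cols, b_rows, b_cols, c_rows, c_cols):
    """List the local multiplies for C = A @ B under arbitrary tilings.

    Each entry is (m_range, k_range, n_range) in global indices and stands
    for C[m, n] += A[m, k] @ B[k, n], where every slice lies inside exactly
    one tile of A, one tile of B, and one tile of C.
    """
    m_ranges = refine(a_rows, c_rows)  # rows of A against rows of C
    k_ranges = refine(a_cols, b_rows)  # columns of A against rows of B
    n_ranges = refine(b_cols, c_cols)  # columns of B against columns of C
    return list(product(m_ranges, k_ranges, n_ranges))


def execute(plan, A, B, m, n):
    """Run the planned multiplies directly on nested lists (no reordering)."""
    C = [[0] * n for _ in range(m)]
    for (m0, m1), (k0, k1), (n0, n1) in plan:
        for i in range(m0, m1):
            for j in range(n0, n1):
                C[i][j] += sum(A[i][k] * B[k][j] for k in range(k0, k1))
    return C


# Example: A is 4x6, B is 6x4, C is 4x4, each tiled differently.
plan = plan_multiplies(
    a_rows=[0, 2, 4], a_cols=[0, 3, 6],
    b_rows=[0, 2, 4, 6], b_cols=[0, 4],
    c_rows=[0, 4], c_cols=[0, 1, 4],
)
```

Because the three operand tilings are intersected independently along the m, k, and n dimensions, mismatched tile shapes need no redistribution in this sketch; they only produce more, smaller local multiplies. A distributed version would additionally have to map each slice to the process that owns the enclosing tile.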
Event Type
Workshop
Time
Sunday, 16 November 2025
11:50am - 12:10pm CST
Location
230
