BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260202T201804Z
LOCATION:230
DTSTART;TZID=America/Chicago:20251120T110000
DTEND;TZID=America/Chicago:20251120T111500
UID:submissions.supercomputing.org_SC25_sess534_drs110@linklings.com
SUMMARY:Managing Heterogeneous Topologies and Understanding Their Impact o
 n Performance
DESCRIPTION:Stepan Vanecek (Technical University of Munich)\n\nTo solve in
 creasingly complex problems more efficiently\, modern HPC systems fea
 ture highly heterogeneous components: CPUs\, GPUs\, and recently QP
 Us (quantum processing units)\, each with a unique\, complex compute to
 pology. The massive parallelism of GPUs\, combined with emerging memo
 ry technologies on CPUs and GPUs\, makes the memory topologies increa
 singly heterogeneous\, complex\, and dynamically configurable. Underst
 anding these topological details\, especially regarding available mem
 ory and its usage\, is essential to operating the systems and applica
 tions efficiently.\n\nThis thesis presents a framework (sys-sage\, MT
 4G\, GPUscout\, and Mitos modeling) that targets several fundament
 al gaps in the currently available research and tooling. At the co
 re\, the sys-sage library offers a unified approach to maintaining st
 atic and dynamic topological information from different sources and A
 PIs. Its universal architecture handles CPUs\, GPUs\, and QPUs alike. M
 T4G provides an otherwise unavailable\, vendor-agnostic\, and complet
 e report on GPU memory topologies\, integrable with sys-sage. GPUs' ma
 ssive parallelism amplifies the potential performance penalties of im
 proper cache and memory usage. Therefore\, GPUscout identifies root ca
 uses of frequently occurring memory-related bottlenecks\, helping use
 rs efficiently utilize the complex memory subsystem of GPUs. Final
 ly\, to address emerging memory technologies\, such as CXL.mem\, this t
 hesis presents a novel data access modeling workflow as an extension o
 f Mitos. The model predicts the performance impact of CXL.mem-based cr
 oss-node shared-buffer data exchange as an alternative to point-to-poi
 nt MPI communication. Altogether\, these tools capture topologies of H
 PC systems and provide missing insights into application data transfe
 r behavior.\n\nTag: Research & ACM SRC Posters\n\nRecording: Livestrea
 med\, Recorded\n\nRegistration Category: Technical Program Reg Pass\n\n
 Session Chairs: Yanfei Guo (Argonne National Laboratory (ANL))\; Shir
 ley Moore (University of Texas at El Paso)\; Kento Sato (RIKEN Cent
 er for Computational Science (R-CCS))\; Chris Schlipalius (Pawsey Supe
 rcomputing Research Centre\; Commonwealth Scientific and Industrial Re
 search Organisation (CSIRO)\, Australia)\; and Anja Gerbes (Georg-Augu
 st-Universität Göttingen)
END:VEVENT
END:VCALENDAR
