BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:America/Chicago
X-LIC-LOCATION:America/Chicago
BEGIN:DAYLIGHT
TZOFFSETFROM:-0600
TZOFFSETTO:-0500
TZNAME:CDT
DTSTART:19700308T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=2SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0500
TZOFFSETTO:-0600
TZNAME:CST
DTSTART:19701101T020000
RRULE:FREQ=YEARLY;BYMONTH=11;BYDAY=1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20260202T201557Z
LOCATION:263-264
DTSTART;TZID=America/Chicago:20251118T153000
DTEND;TZID=America/Chicago:20251118T170000
UID:submissions.supercomputing.org_SC25_sess299@linklings.com
SUMMARY:Performance: Sparse Matrix and Tensor Computation
DESCRIPTION:StraGCN: GPU-Accelerated Strassen’s Sparse-Dense Matrix Mul
 tiplication for Graph Convolutional Network Training\n\nGraph convolut
 ional networks (GCNs) are a fundamental approach to deep learning on g
 raph-structured data. However\, they face a significant challenge in t
 raining efficiency due to the high computational cost of Sparse-Dense M
 atrix Multiplication (SpMM). This paper presents StraGCN\, the first G
 PU-acce...\n\n\nWeidong He\, Haikun Liu\, Zhuohui Duan\, Xiaofei Li
 ao\, Shuhao Zhang\, Fubing Mao\, and Hai Jin (Huazhong University of S
 cience and Technology)\n---------------------\nFaSTCC: Fast Sparse Ten
 sor Contractions on CPUs\n\nSparse tensor contractions are a core comp
 utational primitive in scientific computing and machine learning. Eff
 ective optimization of such contractions through loop permutation/til
 ing remains an open challenge. Our work performs the first comprehens
 ive comparative analysis of data access costs and m...\n\n\nSaurabh Ra
 je (University of Utah)\, Hunter McCoy (Northeastern University)\, At
 anas Rountev (Ohio State University)\, Prashant Pandey (Northeastern U
 niversity)\, and P. Sadayappan (University of Utah)
 \n---------------------\nSparsified Preconditioned Conjugate Gradient S
 olver on GPUs\n\nPreconditioned iterative sparse linear solvers are me
 mory-efficient for large scientific simulations\, but the dependences b
 etween iterations introduced by preconditioners limit parallelizati
 on. This issue is exacerbated on GPUs\, which feature many parallel co
 res. We propose a sparsified precondition...\n\n\nDa Ma (McMaster Univ
 ersity)\, Khalid Ahmad (University of Utah)\, Kazem Cheshmi (McMaster U
 niversity)\, and Hari Sundar and Mary Hall (University of Utah)
 \n---------------------\nBridging the Gap Between Unstructured SpMM an
 d Structured Sparse Tensor Cores\n\nThe acceleration of Sparse-dense M
 atrix Multiplication (SpMM) using Tensor Cores (TCs) in GPUs has rece
 ntly garnered significant attention. TCs are designed for block-wise m
 atrix multiplication\; however\, block partitioning of general unstruc
 tured sparse matrices often results in low-level density\, c...\n\n\nY
 ukang Dong\, Ziyuan Shen\, Wenbin Jiang\, Zhenghang Liu\, Ye Xu\, Bingy
 i He\, Ran Zheng\, and Hai Jin (Huazhong University of Science and Tec
 hnology)\n\nTag: HPC for Machine Learning\, Performance Measurement\, M
 odeling\, & Tools\, Programming Frameworks\n\nRecording: Livestream
 ed\, Recorded\n\nRegistration Category: Technical Program Reg Pass\n\nS
 ession Chair: Abdel-Hameed A. Badawy (New Mexico State University\, Lo
 s Alamos National Laboratory (LANL))
END:VEVENT
END:VCALENDAR
