Research and ACM SRC Posters: Poster Presentations (Research, ACM SRC Grads/Undergrads)

Session Chairs
Kento Sato, RIKEN Center for Computational Science (R-CCS)
Chris Schlipalius, Pawsey Supercomputing Research Centre; Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
Anja Gerbes, Georg-August-Universität Göttingen

Event Type: Research and ACM SRC Posters
Time: Tuesday, 18 November 2025, 8:00am - 5:00pm CST
Location: Second Floor Atrium
Tags: Research & ACM SRC Posters
Registration Categories: TP

Similar Sessions
Doctoral Showcase I Presentations
PMBS25: The 16th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems
PDSW'25: The 10th International Parallel Data Systems Workshop

Presentations

Intelligent Surrogates Pay Attention to Data, Improving Multi-Objective HPC Optimization
Authors: Ashna Nawar Ahmed, Banooqa Banday, Terry Jones, Tanzima Z. Islam

Scaling Singular Values Beyond GPU Memory Limits: Out-of-Core, GPU-Accelerated, and Unified Across Data Precision and Hardware
Author: Evelyne Ringoot

Evaluating the Usage of Python Libraries on a Production Supercomputer
Author: Thomas Papka

Optimizing the GPU All-Reduce Using Multiple Processes Per GPU
Authors: Michael Adams, Amanda Bienz

DiOMP-Offloading: Portable OpenMP Offloading for Distributed Heterogeneous Systems
Authors: Baodi Shan, Mauricio Araya-Polo, Barbara Chapman

Scalable Multi-Node Multi-GPU Datalog Engine with Energy-Aware Profiling
Authors: Ahmedur Rahman Shovon, Sidharth Kumar

Accelerating Linear Solve with Mixed Precision Nested Recursive Subdivision on AI Hardware
Author: Vicki Carrica

Understanding GPU Utilization Using LDMS Data on Perlmutter
Authors: Onur Cankur, Brian Austin, Abhinav Bhatele

GATSched: Multi-Objective Graph Attention Networks for Energy-Efficient HPC Job Scheduling
Author: Kyrian Adimora

Job Grouping-Based Intelligent Resource Recommendation Framework
Authors: Beste Oztop, Benjamin Schwaller, Vitus J. Leung, Jim Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun

AdversaGuard: A Distributed Data Poisoning Benchmark for Parallel AI
Authors: Yulia Kumar, Solomon Thomas, Dejaun Gayle, J. Jenny Li, Dov Kruger

Understanding LLM Behavior on HPC Data via Mechanistic Interpretability
Authors: Md Mahbubur Rahman, Arjun Guha, Harshitha Menon

European Open Web Index: Large Complex Graph Visualization
Authors: Pavlina Smolkova, Katerina Slaninova

Scalable Execution Framework for R on Manycore Systems
Authors: Xiran Zhang, Javier Conejero, Sameh Abdulah, Jorge Ejarque, Ying Sun, Rosa M. Badia, David E. Keyes, Marc G. Genton

Chameleon Concierge: Retrieval-Augmented Generation (RAG) To Enhance Open Testbed Documentation
Author: Saieda Ali Zada

Performance Engineering of Scientific Applications with MVAPICH and TAU Using Emerging Communication Primitives
Authors: Dhabaleswar K. (DK) Panda, Sameer Shende, Ahmad Abdelfattah, Yifeng Cui

DiffPro: Joint Timestep and Layer-Wise Precision Optimization for Efficient Diffusion Inference
Authors: Farhana Amin, Kanchon Gharami, Dimitrios S. Nikolopoulos

Leveraging Large Language Models for Property Prediction in Polymorphic Organic Semiconductors
Authors: Shreya Pagaria, Mei-Yu Wang, Dana O’Connor, Julian Uran, Paola Buitrago

SRAP: Sender-Side Receiver-Aware Port Selection for High-Speed Multi-Flow TCP
Authors: Shingo Hattori, Osamu Tatebe

Orchid: Towards Heterogeneous Batched Eigenvalue Solvers
Author: Matthew Chung

Mitigating I/O Bottlenecks in LiDAR Pipelines by Directly Merging Neural Decompression and Semantic Segmentation
Authors: Ethan Marquez, Max Faykus, Oyinlolu Odetoye, Melissa Smith, Jon Calhoun

The Impact of Maximum Vector Length on Cache Management Techniques in RISC-V Vector Extension
Authors: Shunya Nomura, Jiaheng Liu, Keichi Takahashi, Hiroyuki Takizawa

Parallel Local Motif Counting on Large-Scale Dynamic Graphs
Authors: Ali Khan, Sanjukta Bhowmick, Michela Taufer

CROSS-HPC System Bayesian Optimization with Adaptive Transfer
Authors: Abrar Hossain, Kishwar Ahmed

Distributed Modular Digital Twin Network for High-Performance and Reliable Data Centers
Authors: Yan Chen, Xing Lu, Cary Faulkner, Alex Vlachokostas, Hanlong Wan, Jeremy Lerond

Mojo: Python-Like MLIR-Based GPU Portable Science Kernels
Author: Tatiana Melnichenko

csDF: A Double-Float Arithmetic Library for the Cerebras CS-2
Authors: Reo Nagashima, Akeru Nakamura, Kai Murakami, Ryunosuke Matsuzaki, Daichi Mukunoki, Takaaki Miyajima

ParaViz3D: MPI Trace Visualization with 3D Video
Authors: Jean-Yves Verhaeghe, Georg Hager, Ayesha Afzal

An Efficient GEMM Acceleration Method for LLM Inference with Variable-Length Sequences
Authors: Yu Zhang, Lu Lu

An Agent-Based Viral Venture: Adaptive Tool Selection for Scalable Genomics
Authors: Naomi Kolodisner, Alok Kamatar (Advisor), J. Greg Pauloski (Advisor)

Real-Time ML-Based Defense Against Malicious Payload in Reconfigurable Embedded Systems
Authors: Rye Stahle-Smith, Rasha Karakchi

From Petabytes to Predictions: Harnessing Large-Scale NeuroBlu Mental Health Data and ML To Mitigate Medication Non-Adherence
Authors: Alyson Collins, Cathy Sandoval, Maya Seshan, Srishti Srivastava, Josh McWilliams

Process-Based Predictors of Vulnerability Reintroduction
Authors: Samiha Shimmi, Nicholas Synovic, Mona Rahimi, George Thiruvathukal

A Quantum Solver for Multidimensional Partial Differential Equations: Practical Case Studies
Authors: Manu Chaudhary, Kareem El-Araby, Alvir Nobel, Ishraq Islam, Manish Singh, Sunday Ogundele, Kieran Egan, Sneha Thomas, Vincent Vordtriede, Devon Bontrager, Serom Kim, Esam El-Araby

Inference-as-a-Service Prototype at NERSC
Authors: Colin Thomas, Po-Han Huang, Hilary Utaegbulam, Johannes Blaschke, Bruno Coimbra, Pengfei Ding, Xiangyang Ju, Andrew Naylor, Michael Wang

WONDERS: Integrating WOW, PONDER, and SCALE for Enhanced Scheduling Performance
Authors: Fabian Lehmann, Jonathan Rau, Jonathan Bader, Odej Kao, Ulf Leser

Detecting Silent Data Corruption in Sparse Matrices Using Hardware Performance Counters
Authors: Minseop Choi, Orlando Arias, Seung Woo Son

Optimizing Task-Driven Offloading in LLVM
Authors: Jan Kraus, Joachim Jenke, Christian Terboven

Mixed Compute Environments with OpenCHAMI
Authors: Sean Gibson, Richard Kim, Samuel Quan, Travis Cotton, Thomas Mackell

Massively Parallel GPU Rasterizer for Next-Generation Computational Lithography
Authors: Loay Hegazy, Mohamed Taher, Sherif Hammouda

Julia with Intelligent Runtime for Heterogeneous Computing
Authors: Narasinga Rao Miniskar, Pedro Valero-Lara, William Godoy, Keita Teranishi, Jeffrey S. Vetter

From Legacy to Portable: An Agentic AI Workflow for Fortran Code Translation and Cross-Architecture Optimization
Authors: Sparsh Gupta, Kamalavasan Kamalakkannan, Maxim Moraru, Galen Shipman, Patrick Diehl

Building the Foundation for Machine Learning-Based Mars Weather Forecasting
Author: Mohammad Altiwainy

Compute System Simulator: Modeling the Impact of Allocation Policy and Hardware Reliability on HPC Cloud Resource Utilization
Authors: Jarrod Leddy, Huseyin Yildiz

Productive Scalable Distributed Task Scheduling Using an MPI-based Backend for Dagger
Author: Yan Guimarães

Tensor Core Accelerated Fast Multipole Method for GROMACS
Authors: Jiamian Huang, Muhammad Umair Sadiq, Rio Yokota, Berk Hess

Local vs. Global FFT Approaches for High-Performance Ultrasound Simulation on Multi-GPU Systems
Authors: Oliver Kuník, Jiri Jaros

Harmony: Converged Supercomputer Scratch and Archival Filesystems
Author: Jake Carroll

Divide, Conquer, and Denoise: Hybrid Parallel Diffusion with Memory-Aware Coarse-to-Fine Inference
Authors: Farhana Amin, Kanchon Gharami, Dimitrios Nikolopoulos

PhySiViT: A Physics Simulation Vision Transformer
Authors: Jessica Ezemba, James Afful, Mei-Yu Wang

Optimizing and Extending Periodogram Computations for Astronomy
Authors: Yuwei Sun, Lehman Garrison

IncineRate: Multi-Modal FPGA Accelerator for SCNNs
Authors: Björn A. Lindqvist, Artur Podobas

HydraCache: LLM Inference Prefill Parallelization Through Distributed Cache Blending
Authors: Adib Rezaei Shahmirzadi, Shayan Shabihi, Mona Moghadampanah, Furong Huang, Dimitrios S. Nikolopoulos

Optimizing Collectives with Large Payloads on GPU-Based Supercomputers
Authors: Siddharth Singh, Mahua Singh, Keshav Pradeep, Abhinav Bhatele

Energy-Efficient Multimodal LLM Inference: Stage-Level Characterization and Input-Aware Controls
Authors: Mona Moghadampanah, Adib Rezaei Shahmirzadi, Dimitrios S. Nikolopoulos

Wafer-Scale Simulation of Mutator Allele Dynamics in Large Asexual Populations
Authors: Matthew Andres Moreno, Emily Dolson, Luis Zaman

Hardware-Aware Quantum Circuit Synthesis
Authors: Nathan Jones, Akhilesh Bondapalli, Toby Cox, Ian Lewis, Rong Ge

Between the NIC and a Hard Place: Evaluating 400 Gb/s Ethernet for HPC Data Transfers
Authors: Adelle Ferris, Evelyn Needham, Nikole Grandez, Jesse Martinez, Doug Egan

A Kokkos-Based Proxy of the Exascale Metagenome Assembler MetaHipMer2: A First Use of Kokkos for Computational Biology
Authors: Logan Williams, Gavin Conant, Michela Becchi, Jan Ciesko, Amy Powell

Learning To Select Scheduling Algorithms in OpenMP
Authors: Jonas H. Müller Korndörfer, Ali Mohammed, Ahmed Eleliemy, Quentin Guilloteau, Reto Krummenacher, Florina Ciorba

Understanding Communication Bottlenecks in Multi-Node LLM Inference
Authors: Prajwal Singhania, Siddharth Singh, Lannie Dalton Hough, Ishan Revankar, Harshitha Menon, Charles Jekel, Abhinav Bhatele

A Scalability Study of Quantum Algorithms for Dimensionality Reduction of Multidimensional Data
Authors: Kareem El-Araby, Thom Popovic, Alvir Nobel, Sunday Ogundele, Katherine Klymko, Daan Camps, Anastasiia Butko, Esam El-Araby

Distributed 3D Gaussian Splatting for High-Resolution Isosurface Visualization
Authors: Mengjiao Han, Andres Sewell, Joseph Insley, Janet Knowles, Victor A. Mateevitsi, Michael E. Papka, Steve Petruzza, Silvio Rizzi

Divergence Prediction System for CFD Simulations
Authors: Takashi Soga, Takanori Uchida, Susumu Date

A Formal Characterization of Non-Monotonicity in Tensor Cores
Authors: Paul Jiang, Vivian Zheng

Practical Viability of Translating Legacy Fortran Code to C++ Using Large Language Models
Authors: Ren Imai, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

Shortcut Mixup Policy: Toward Improving Robustness and Speed in Goal-Conditioned RL
Authors: Matthew Hyatt, Yassir Atlas, Hal Brynteson, Diego Roa Perdomo, Athena Angara, Mengjiao Han, Joseph Insley, Janet Knowles, Yongho Kim, Victor Mateevitsi, Michael Papka, Silvio Rizzi, George Thiruvathukal, Nicola Ferrier

TidalMark: A Scalable Benchmark for Coastal Water Level Forecasting
Authors: Lucas Raicu, Daniel Grzenda, Ian Foster, Kyle Chard

A Toolbox for Load Balancing Development and Analysis in WarpX/AMReX Applications
Authors: Jessica Imlau Dagostini, Sowmya Yellapragada, Kevin Gott, Rebecca Hartman-Baker

Exploring Fine-Grained Parallelism in Data-Flow Runtime Systems on Many-Core Systems
Authors: Wenyi Wang, Maxime Gonthier, Haibin Lai, Poornima Nookala, Haochen Pan, Ian Foster, Ioan Raicu, Kyle Chard

Bridging the Quantum Coding Gap: Instruction-Tuned LLMs for Qiskit
Authors: Sixu Chen, Yuqi Zhang, Qiang Guan

C++ Standard Parallelism for GPU Programming in a Particle-In-Cell Application
Authors: Ester El Khoury, Mathieu Lobet, Julien Bigot, Laurent Colombet

ChatHPC: Building the Foundations for a Productive and Trustworthy AI-Assisted HPC Ecosystem
Authors: Pedro Valero-Lara, Aaron Young, Mohammad Alaul Haque Monil, Swaroop Pophale, Zheming Jin, Jeffrey S. Vetter, Keita Teranishi, William F. Godoy

High-Performance Sparse Attention on Tensor Cores: Fused3S and Beyond
Author: Zitong Li

Applying Lossy Compression Techniques to GNN Training
Authors: Milan Shah, Reece Neff, Michela Becchi

Enabling Real-Time, Extreme-Scale Bayesian Inference: FFT-Based GPU-Accelerated Matrix-Vector Products for Block-Triangular Toeplitz Matrices
Authors: Sreeram Venkat, Omar Ghattas

Classifying Performance Bounds Using Machine Learning
Authors: Lewis Littman, Tom Deakin

Scalable Alternative Route Computation with ACE: A C++17 Library for HPC Traffic Simulations
Authors: Paulo Silva, Pavlína Smolková, Kateřina Slaninová, Jan Martinovič, João Barbosa, Matej Špeťko, Emanuele Vitali

When Label Propagation Outperforms BFS in Breadth-First Graph Traversal
Authors: Kalsuda Lapborisuth, Srinivas Aluru

Sync-Free GPU Parallelization of Sparse Kernels from Sequential Python Code
Author: Malko-Bani Somo

GNNs on Evolving Graphs: A Benchmark of Incremental Updates and Meta-Learning Approaches
Authors: Sriram Srinivasan, Sanjukta Bhowmick, Hamdan Alabsi, Rand Obeidat

Facilitating Mixed Python-Fortran HPC Codes: 4D Drift-Kinetic Simulations with Pyccel
Authors: Emily Bourne, Yaman Güçlü

High Performance Batch SVD Using GPUs
Author: Ahmad Abdelfattah

GPU Kernels for Mixture of Experts
Authors: Arthur Feeney, Ying Wai Li, Aparna Chandramowlishwaran

Echoes of Earth: Building an Autonomous Environmental Lab for Acoustic Sensing
Authors: Hudson Reynolds, Alex Tuecke, Mike Sherman, Kate Keahey

Time-Stepping Hamiltonian Simulation for Solving Nonlinear PDEs via a Quantum-Classical Hybrid Approach
Authors: Sangwon Kim, Junya Onishi, Ayato Takii, Younghwa Cho, Tsubokura Makoto

Towards a GPU-Accelerated Web-Based Graph Rendering Framework for Large-Scale Protein Networks
Authors: Jiaxin Lu, Landon Dyken, Shilpika Shilpika, Venkatram Vishwanath, Michael Papka, Sidharth Kumar

Massively Parallel Bayesian Inference Framework for GPU Supercomputers: Application to Estimation of Coseismic Fault Slip
Authors: Kai Nakao, Tsuyoshi Ichimura, Kohei Fujita

Analyzing Dataset Popularity for Optimizing In-Network Storage
Authors: Gunwoo Kim, Alex Sim, Kesheng Wu

Explicit Low-Order Finite-Element Wave Simulation Accelerated with Variable-Precision Computing Using INT8 Tensor Cores
Authors: Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori, Lalith Maddegedara

Can Lossy Compression Benefit NVMe-Based I/O?
Authors: Darren Ng, Duo Zhang, Sheng Di, Zhaorui Zhang, Xiaoyi Lu

Memory-Efficient CFD Based on MPS: Effective One-Billion-Cell Resolution on a Single Node
Authors: Junya Onishi, Ayato Takii, Sangwon Kim, Younghwa Cho, Makoto Tsubokura

MPI-SGX: Enabling Confidential Computing for MPI Parallel Applications with Intel SGX Technology
Authors: Kota Shimojima, Hayato Yamaki, Hiroki Honda, Shinichiro Matsuo, Atsuko Takefusa, Shinobu Miwa

CUR-MoE: Portable Mixture-of-Experts with Interpretable High-Ratio Compression
Author: Ritesh Bhirud

Shipping HPC Ecosystems Across Platforms: Portable and Composable HPC Clusters as Code
Authors: German Felipe Giraldo Villa, Théo Grivel, George Ioannidis, Edita Kizinevic, Carolina Lindqvist, Nicolas Litchinko, Pablo Llopis, Antonio Javier Russo, Gilles Fourestey

Unmasking Performance Variability in GPU Codes on Supercomputers
Authors: Cunyang Wei, Keshav Pradeep, Abhinav Bhatele

Novel Graph Alignment Algorithms for Identifying Non-Determinism in Large-Scale Simulations
Author: Dhroov Pandey

Can Long-Haul RDMA Benefit Federated Learning?
Authors: Zhonghao Chen, Yuke Li, Duo Zhang, Xiaoyi Lu

Using Hardware Metrics To Understand Performance of the RAJA Performance Suite Kernels in Different GPU Modes on MI300A
Authors: Amr Abouelmagd, Stephanie Brink, Michael McKinsey, David Boehme, Jason Burmark, Brian Ryujin, Tom Scogland, Olga Pearce

VaultX Merge: Breaking Memory Barriers in Proof-of-Space Plot Generation
Authors: Arnav Sirigere, Varvara Bondarenko, Ioan Raicu

Algorithms and Applications of Dynamic Network Analysis Using CANDY
Authors: Aashish Pandey, Arindam Khanda, S.M. Shovan, Ali Y. Khan, Boyana Norris, Sajal K. Das, Sanjukta Bhowmick

Accelerating Scientific Workflows with LLM-Driven Compiler Optimizations for Generated High-Performance Hardware
Authors: Robert Ramstad, Nicolas Bohm Agostini, Antonino Tumeo

JACC: Easy CPU/GPU Performance Portability for Scientific Applications in Julia
Authors: William Godoy, Pedro Valero-Lara, Philip Fackler, Keita Teranishi, Jeffrey Vetter, Jhonny Gonzalez, Jose Gonzalez, Alexis Huante

CATIOS: Time-Resolved I/O-Aware Job Scheduling for HPC Systems
Authors: YuTsen Tseng, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

Numerical Investigation of Radiation Hydrodynamic Instabilities at Scale with FleCSI-HARD
Authors: Måns I. Andersson, Isaac C. Bannerman, Moon B. Hazarika, Akshit Jariwala, Jonathan Mathurin, Madela B. Quashie, Julien Loiseau, Hyun Lim

AutoSlim: Intelligent Automata Graph Optimization for Efficient Acceleration
Authors: Tiffany Yu, Rasha Karakchi

Evaluating the Power-Monitoring Capabilities of Aurora
Author: Precious Eyabi

WiCAT: Reducing Congestion at Wireless Interfaces in Heterogeneous Architectures
Author: Tarun Sharma

Fast Linear Solvers via AI-Tuned Markov Chain Monte Carlo-Based Matrix Inversion
Authors: Anton Lebedev, Won Kyung Lee, Soumyadip Ghosh, Olha I. Yaman, Vassilis Kalantzis, Yingdong Lu, Tomasz Nowicki, Shashanka Ubaru, Lior Horesh, Vassil Alexandrov

Enabling Efficient Runtime Data Analysis to a Crystal Deformation Simulation
Author: Arthur Jaquard

CIRE: LLVM Analysis for Floating-Point Rounding Error Affected by Precision and Optimizations
Authors: Cayden Lund, Tanmay Tirpankar, Ganesh Gopalakrishnan

Seamless Scaling of Applications Across Programming Models
Authors: Reto Krummenacher, Quentin Guilloteau, Jonas H. Müller Korndörfer, Florina M. Ciorba

Range Search on Heterogeneous Systems with Processing-in-Memory Architecture
Authors: Tasmia Jannat, Satish Puri, Michael Gowanlock

Accelerating AI Co-Scientists with HPC Infrastructure
Author: Suryatejas Appana

An Approach for Correlating Compiler Optimizations with Runtime Performance
Authors: Befikir Bogale, Olga Pearce, Tom Scogland, Michela Taufer

Characterizing Performance and Energy Trade-Offs on the Aurora Supercomputer
Authors: Solomon Bekele, Swann Perarnau, Brice Videau

Configuring Large Language Models for Regional Ocean Model Development
Authors: Aidan Janney, Giovanni Seijo-Ellis, Dan Amrhein

Multi-GPU Implementation and Roofline Analysis of a Numerical Global Ocean Model
Authors: Takateru Yamagishi, Masao Kurogi, Takao Kawasaki, Yoshimasa Matsumura, Hiroyasu Hasumi

Luthier: A Dynamic Binary Instrumentation Framework Targeting AMD GPUs
Authors: Matin Raayai-Ardakani, Norman Rubin, David Kaeli

Unraveling Distant Galaxies: Analyzing IFU Data with Parsl and Academy
Author: Daniel Babnigg

Enhancing Usability and Performance in Experimental Environments Management
Authors: Zahra Temori, Paul Marshall, Kate Keahey

Forward Error Bounds and Efficient Algorithms for Computing a Tensor Times Matrix Chain in Low Precision on GPUs
Authors: Julian Bellavita, Piyush Sao, Ramakrishnan Kannan

Heterogeneity-Aware Task Allocation for Modern HPC Systems
Authors: Sowmya Yellapragada, Jessica Imlau Dagostini, Kevin Gott, Rebecca Hartman-Baker

Template Task-Based Multiresolution Analysis in Hybrid Environments
Authors: Nilesh Chaturvedi, Joseph Schuchart, Robert J. Harrison

Evaluating LiDAR Compression for 3D Semantic Segmentation in Diverse Off-Road Environments on GOOSE Dataset
Authors: Adam Niemczura, Max Faykus, Oyinlolu Odetoye, Melissa Smith, Jon Calhoun, Scott Groel

ScODA: An Emerging Pipeline for Evaluating Distributed Database Performance To Support Operational Data Analytics
Authors: Nicholas Synovic, FNU Shilpika, Silvio Rizzi, Doug Waldron, George K. Thiruvathukal, Michael E. Papka

Towards Application Agnostic HPC Profiling
Authors: Hari Teja Jajula, Dhruva Kulkarni, Brian Austin, Purushotham Bangalore

Advancing EEG Signal Analysis with Quantum Machine Learning
Authors: Stephanie Murray, Erika Parsons

Unified Performance Modeling Stack for Distributed GPU Applications: Complementing Analytical Insights with Machine Learning
Author: Urvij Saroliya