Research and ACM SRC Posters: Poster Presentations (Research, ACM SRC Grads/Undergrads)

Session Chairs
Kento Sato, RIKEN Center for Computational Science (R-CCS)
Chris Schlipalius, Pawsey Supercomputing Research Centre; Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
Anja Gerbes, Georg-August-Universität Göttingen

Event Type: Research and ACM SRC Posters
Time: Wednesday, 19 November 2025, 8:00am - 7:00pm CST
Location: Second Floor Atrium
Tags: Research & ACM SRC Posters
Registration Categories: TP

Similar Sessions
Doctoral Showcase I Presentations
PMBS25: The 16th International Workshop on Performance Modeling, Benchmarking, and Simulation of High-Performance Computer Systems
PDSW'25: The 10th International Parallel Data Systems Workshop

Presentations

Accelerating AI Co-Scientists with HPC Infrastructure
Author: Suryatejas Appana

A Kokkos-Based Proxy of the Exascale Metagenome Assembler MetaHipMer2: A First Use of Kokkos for Computational Biology
Authors: Logan Williams, Gavin Conant, Michela Becchi, Jan Ciesko, Amy Powell

GNNs on Evolving Graphs: A Benchmark of Incremental Updates and Meta-Learning Approaches
Authors: Sriram Srinivasan, Sanjukta Bhowmick, Hamdan Alabsi, Rand Obeidat

DiffPro: Joint Timestep and Layer-Wise Precision Optimization for Efficient Diffusion Inference
Authors: Farhana Amin, Kanchon Gharami, Dimitrios S. Nikolopoulos

Luthier: A Dynamic Binary Instrumentation Framework Targeting AMD GPUs
Authors: Matin Raayai-Ardakani, Norman Rubin, David Kaeli

Distributed 3D Gaussian Splatting for High-Resolution Isosurface Visualization
Authors: Mengjiao Han, Andres Sewell, Joseph Insley, Janet Knowles, Victor A. Mateevitsi, Michael E. Papka, Steve Petruzza, Silvio Rizzi

Scalable Multi-Node Multi-GPU Datalog Engine with Energy-Aware Profiling
Authors: Ahmedur Rahman Shovon, Sidharth Kumar

Divergence Prediction System for CFD Simulations
Authors: Takashi Soga, Takanori Uchida, Susumu Date

AdversaGuard: A Distributed Data Poisoning Benchmark for Parallel AI
Authors: Yulia Kumar, Solomon Thomas, Dejaun Gayle, J. Jenny Li, Dov Kruger

GATSched: Multi-Objective Graph Attention Networks for Energy-Efficient HPC Job Scheduling
Author: Kyrian Adimora

Practical Viability of Translating Legacy Fortran Code to C++ Using Large Language Models
Authors: Ren Imai, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

European Open Web Index: Large Complex Graph Visualization
Authors: Pavlina Smolkova, Katerina Slaninova

Fast Linear Solvers via AI-Tuned Markov Chain Monte Carlo-Based Matrix Inversion
Authors: Anton Lebedev, Won Kyung Lee, Soumyadip Ghosh, Olha I. Yaman, Vassilis Kalantzis, Yingdong Lu, Tomasz Nowicki, Shashanka Ubaru, Lior Horesh, Vassil Alexandrov

MPI-SGX: Enabling Confidential Computing for MPI Parallel Applications with Intel SGX Technology
Authors: Kota Shimojima, Hayato Yamaki, Hiroki Honda, Shinichiro Matsuo, Atsuko Takefusa, Shinobu Miwa

CIRE: LLVM Analysis for Floating-Point Rounding Error Affected by Precision and Optimizations
Authors: Cayden Lund, Tanmay Tirpankar, Ganesh Gopalakrishnan

Harmony: Converged Supercomputer Scratch and Archival Filesystems
Author: Jake Carroll

Compute System Simulator: Modeling the Impact of Allocation Policy and Hardware Reliability on HPC Cloud Resource Utilization
Authors: Jarrod Leddy, Huseyin Yildiz

CATIOS: Time-Resolved I/O-Aware Job Scheduling for HPC Systems
Authors: YuTsen Tseng, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

Novel Graph Alignment Algorithms for Identifying Non-Determinism in Large-Scale Simulations
Author: Dhroov Pandey

Applying Lossy Compression Techniques to GNN Training
Authors: Milan Shah, Reece Neff, Michela Becchi

Productive Scalable Distributed Task Scheduling Using an MPI-based Backend for Dagger
Author: Yan Guimarães

Seamless Scaling of Applications Across Programming Models
Authors: Reto Krummenacher, Quentin Guilloteau, Jonas H. Müller Korndörfer, Florina M. Ciorba

Orchid: Towards Heterogeneous Batched Eigenvalue Solvers
Author: Matthew Chung

Evaluating the Power-Monitoring Capabilities of Aurora
Author: Precious Eyabi

Enabling Real-Time, Extreme-Scale Bayesian Inference: FFT-Based GPU-Accelerated Matrix-Vector Products for Block-Triangular Toeplitz Matrices
Authors: Sreeram Venkat, Omar Ghattas

Detecting Silent Data Corruption in Sparse Matrices Using Hardware Performance Counters
Authors: Minseop Choi, Orlando Arias, Seung Woo Son

Configuring Large Language Models for Regional Ocean Model Development
Authors: Aidan Janney, Giovanni Seijo-Ellis, Dan Amrhein

Memory-Efficient CFD Based on MPS: Effective One-Billion-Cell Resolution on a Single Node
Authors: Junya Onishi, Ayato Takii, Sangwon Kim, Younghwa Cho, Makoto Tsubokura

Wafer-Scale Simulation of Mutator Allele Dynamics in Large Asexual Populations
Authors: Matthew Andres Moreno, Emily Dolson, Luis Zaman

The Impact of Maximum Vector Length on Cache Management Techniques in RISC-V Vector Extension
Authors: Shunya Nomura, Jiaheng Liu, Keichi Takahashi, Hiroyuki Takizawa

Algorithms and Applications of Dynamic Network Analysis Using CANDY
Authors: Aashish Pandey, Arindam Khanda, S.M. Shovan, Ali Y. Khan, Boyana Norris, Sajal K. Das, Sanjukta Bhowmick

Multi-GPU Implementation and Roofline Analysis of a Numerical Global Ocean Model
Authors: Takateru Yamagishi, Masao Kurogi, Takao Kawasaki, Yoshimasa Matsumura, Hiroyasu Hasumi

A Quantum Solver for Multidimensional Partial Differential Equations: Practical Case Studies
Authors: Manu Chaudhary, Kareem El-Araby, Alvir Nobel, Ishraq Islam, Manish Singh, Sunday Ogundele, Kieran Egan, Sneha Thomas, Vincent Vordtriede, Devon Bontrager, Serom Kim, Esam El-Araby

JACC: Easy CPU/GPU Performance Portability for Scientific Applications in Julia
Authors: William Godoy, Pedro Valero-Lara, Philip Fackler, Keita Teranishi, Jeffrey Vetter, Jhonny Gonzalez, Jose Gonzalez, Alexis Huante

WONDERS: Integrating WOW, PONDER, and SCALE for Enhanced Scheduling Performance
Authors: Fabian Lehmann, Jonathan Rau, Jonathan Bader, Odej Kao, Ulf Leser

Evaluating LiDAR Compression for 3D Semantic Segmentation in Diverse Off-Road Environments on GOOSE Dataset
Authors: Adam Niemczura, Max Faykus, Oyinlolu Odetoye, Melissa Smith, Jon Calhoun, Scott Groel

GPU Kernels for Mixture of Experts
Authors: Arthur Feeney, Ying Wai Li, Aparna Chandramowlishwaran

Local vs. Global FFT Approaches for High-Performance Ultrasound Simulation on Multi-GPU Systems
Authors: Oliver Kuník, Jiri Jaros

Optimizing Task-Driven Offloading in LLVM
Authors: Jan Kraus, Joachim Jenke, Christian Terboven

Understanding LLM Behavior on HPC Data via Mechanistic Interpretability
Authors: Md Mahbubur Rahman, Arjun Guha, Harshitha Menon

Unraveling Distant Galaxies: Analyzing IFU Data with Parsl and Academy
Author: Daniel Babnigg

Heterogeneity-Aware Task Allocation for Modern HPC Systems
Authors: Sowmya Yellapragada, Jessica Imlau Dagostini, Kevin Gott, Rebecca Hartman-Baker

An Agent-Based Viral Venture: Adaptive Tool Selection for Scalable Genomics
Authors: Naomi Kolodisner, Alok Kamatar (Advisor), J. Greg Pauloski (Advisor)

Massively Parallel GPU Rasterizer for Next-Generation Computational Lithography
Authors: Loay Hegazy, Mohamed Taher, Sherif Hammouda

Enhancing Usability and Performance in Experimental Environments Management
Authors: Zahra Temori, Paul Marshall, Kate Keahey

Evaluating the Usage of Python Libraries on a Production Supercomputer
Author: Thomas Papka

Unified Performance Modeling Stack for Distributed GPU Applications: Complementing Analytical Insights with Machine Learning
Author: Urvij Saroliya

Massively Parallel Bayesian Inference Framework for GPU Supercomputers: Application to Estimation of Coseismic Fault Slip
Authors: Kai Nakao, Tsuyoshi Ichimura, Kohei Fujita

Performance Engineering of Scientific Applications with MVAPICH and TAU Using Emerging Communication Primitives
Authors: Dhabaleswar K. (DK) Panda, Sameer Shende, Ahmad Abdelfattah, Yifeng Cui

IncineRate: Multi-Modal FPGA Accelerator for SCNNs
Authors: Björn A. Lindqvist, Artur Podobas

Chameleon Concierge: Retrieval-Augmented Generation (RAG) To Enhance Open Testbed Documentation
Author: Saieda Ali Zada

When Label Propagation Outperforms BFS in Breadth-First Graph Traversal
Authors: Kalsuda Lapborisuth, Srinivas Aluru

Inference-as-a-Service Prototype at NERSC
Authors: Colin Thomas, Po-Han Huang, Hilary Utaegbulam, Johannes Blaschke, Bruno Coimbra, Pengfei Ding, Xiangyang Ju, Andrew Naylor, Michael Wang

A Scalability Study of Quantum Algorithms for Dimensionality Reduction of Multidimensional Data
Authors: Kareem El-Araby, Thom Popovic, Alvir Nobel, Sunday Ogundele, Katherine Klymko, Daan Camps, Anastasiia Butko, Esam El-Araby

AutoSlim: Intelligent Automata Graph Optimization for Efficient Acceleration
Authors: Tiffany Yu, Rasha Karakchi

Scalable Execution Framework for R on Manycore Systems
Authors: Xiran Zhang, Javier Conejero, Sameh Abdulah, Jorge Ejarque, Ying Sun, Rosa M. Badia, David E. Keyes, Marc G. Genton

Mitigating I/O Bottlenecks in LiDAR Pipelines by Directly Merging Neural Decompression and Semantic Segmentation
Authors: Ethan Marquez, Max Faykus, Oyinlolu Odetoye, Melissa Smith, Jon Calhoun

Exploring Fine-Grained Parallelism in Data-Flow Runtime Systems on Many-Core Systems
Authors: Wenyi Wang, Maxime Gonthier, Haibin Lai, Poornima Nookala, Haochen Pan, Ian Foster, Ioan Raicu, Kyle Chard

Optimizing Collectives with Large Payloads on GPU-Based Supercomputers
Authors: Siddharth Singh, Mahua Singh, Keshav Pradeep, Abhinav Bhatele

Accelerating Linear Solve with Mixed Precision Nested Recursive Subdivision on AI Hardware
Author: Vicki Carrica

Real-Time ML-Based Defense Against Malicious Payload in Reconfigurable Embedded Systems
Authors: Rye Stahle-Smith, Rasha Karakchi

C++ Standard Parallelism for GPU Programming in a Particle-In-Cell Application
Authors: Ester El Khoury, Mathieu Lobet, Julien Bigot, Laurent Colombet

Tensor Core Accelerated Fast Multipole Method for GROMACS
Authors: Jiamian Huang, Muhammad Umair Sadiq, Rio Yokota, Berk Hess

Divide, Conquer, and Denoise: Hybrid Parallel Diffusion with Memory-Aware Coarse-to-Fine Inference
Authors: Farhana Amin, Kanchon Gharami, Dimitrios Nikolopoulos

Optimizing and Extending Periodogram Computations for Astronomy
Authors: Yuwei Sun, Lehman Garrison

Scalable Alternative Route Computation with ACE: A C++17 Library for HPC Traffic Simulations
Authors: Paulo Silva, Pavlína Smolková, Kateřina Slaninová, Jan Martinovič, João Barbosa, Matej Špeťko, Emanuele Vitali

Explicit Low-Order Finite-Element Wave Simulation Accelerated with Variable-Precision Computing Using INT8 Tensor Cores
Authors: Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori, Lalith Maddegedara

Shortcut Mixup Policy: Toward Improving Robustness and Speed in Goal-Conditioned RL
Authors: Matthew Hyatt, Yassir Atlas, Hal Brynteson, Diego Roa Perdomo, Athena Angara, Mengjiao Han, Joseph Insley, Janet Knowles, Yongho Kim, Victor Mateevitsi, Michael Papka, Silvio Rizzi, George Thiruvathukal, Nicola Ferrier

Optimizing the GPU All-Reduce Using Multiple Processes Per GPU
Authors: Michael Adams, Amanda Bienz

Echoes of Earth: Building an Autonomous Environmental Lab for Acoustic Sensing
Authors: Hudson Reynolds, Alex Tuecke, Mike Sherman, Kate Keahey

Enabling Efficient Runtime Data Analysis to a Crystal Deformation Simulation
Author: Arthur Jaquard

Energy-Efficient Multimodal LLM Inference: Stage-Level Characterization and Input-Aware Controls
Authors: Mona Moghadampanah, Adib Rezaei Shahmirzadi, Dimitrios S. Nikolopoulos

Parallel Local Motif Counting on Large-Scale Dynamic Graphs
Authors: Ali Khan, Sanjukta Bhowmick, Michela Taufer

Numerical Investigation of Radiation Hydrodynamic Instabilities at Scale with FleCSI-HARD
Authors: Måns I. Andersson, Isaac C. Bannerman, Moon B. Hazarika, Akshit Jariwala, Jonathan Mathurin, Madela B. Quashie, Julien Loiseau, Hyun Lim

CUR-MoE: Portable Mixture-of-Experts with Interpretable High-Ratio Compression
Author: Ritesh Bhirud

Shipping HPC Ecosystems Across Platforms: Portable and Composable HPC Clusters as Code
Authors: German Felipe Giraldo Villa, Théo Grivel, George Ioannidis, Edita Kizinevic, Carolina Lindqvist, Nicolas Litchinko, Pablo Llopis, Antonio Javier Russo, Gilles Fourestey

High Performance Batch SVD Using GPUs
Author: Ahmad Abdelfattah

Analyzing Dataset Popularity for Optimizing In-Network Storage
Authors: Gunwoo Kim, Alex Sim, Kesheng Wu

A Toolbox for Load Balancing Development and Analysis in WarpX/AMReX Applications
Authors: Jessica Imlau Dagostini, Sowmya Yellapragada, Kevin Gott, Rebecca Hartman-Baker

Scaling Singular Values Beyond GPU Memory Limits: Out-of-Core, GPU-Accelerated, and Unified Across Data Precision and Hardware
Author: Evelyne Ringoot

From Petabytes to Predictions: Harnessing Large-Scale NeuroBlu Mental Health Data and ML To Mitigate Medication Non-Adherence
Authors: Alyson Collins, Cathy Sandoval, Maya Seshan, Srishti Srivastava, Josh McWilliams

ParaViz3D: MPI Trace Visualization with 3D Video
Authors: Jean-Yves Verhaeghe, Georg Hager, Ayesha Afzal

Learning To Select Scheduling Algorithms in OpenMP
Authors: Jonas H. Müller Korndörfer, Ali Mohammed, Ahmed Eleliemy, Quentin Guilloteau, Reto Krummenacher, Florina Ciorba

Distributed Modular Digital Twin Network for High-Performance and Reliable Data Centers
Authors: Yan Chen, Xing Lu, Cary Faulkner, Alex Vlachokostas, Hanlong Wan, Jeremy Lerond

csDF: A Double-Float Arithmetic Library for the Cerebras CS-2
Authors: Reo Nagashima, Akeru Nakamura, Kai Murakami, Ryunosuke Matsuzaki, Daichi Mukunoki, Takaaki Miyajima

PhySiViT: A Physics Simulation Vision Transformer
Authors: Jessica Ezemba, James Afful, Mei-Yu Wang

DiOMP-Offloading: Portable OpenMP Offloading for Distributed Heterogeneous Systems
Authors: Baodi Shan, Mauricio Araya-Polo, Barbara Chapman

WiCAT: Reducing Congestion at Wireless Interfaces in Heterogeneous Architectures
Author: Tarun Sharma

Using Hardware Metrics To Understand Performance of the RAJA Performance Suite Kernels in Different GPU Modes on MI300A
Authors: Amr Abouelmagd, Stephanie Brink, Michael McKinsey, David Boehme, Jason Burmark, Brian Ryujin, Tom Scogland, Olga Pearce

A Formal Characterization of Non-Monotonicity in Tensor Cores
Authors: Paul Jiang, Vivian Zheng

Advancing EEG Signal Analysis with Quantum Machine Learning
Authors: Stephanie Murray, Erika Parsons

Towards Application Agnostic HPC Profiling
Authors: Hari Teja Jajula, Dhruva Kulkarni, Brian Austin, Purushotham Bangalore

Intelligent Surrogates Pay Attention to Data, Improving Multi-Objective HPC Optimization
Authors: Ashna Nawar Ahmed, Banooqa Banday, Terry Jones, Tanzima Z. Islam

Range Search on Heterogeneous Systems with Processing-in-Memory Architecture
Authors: Tasmia Jannat, Satish Puri, Michael Gowanlock

Sync-Free GPU Parallelization of Sparse Kernels from Sequential Python Code
Author: Malko-Bani Somo

Forward Error Bounds and Efficient Algorithms for Computing a Tensor Times Matrix Chain in Low Precision on GPUs
Authors: Julian Bellavita, Piyush Sao, Ramakrishnan Kannan

CROSS-HPC System Bayesian Optimization with Adaptive Transfer
Authors: Abrar Hossain, Kishwar Ahmed

Time-Stepping Hamiltonian Simulation for Solving Nonlinear PDEs via a Quantum-Classical Hybrid Approach
Authors: Sangwon Kim, Junya Onishi, Ayato Takii, Younghwa Cho, Tsubokura Makoto

Process-Based Predictors of Vulnerability Reintroduction
Authors: Samiha Shimmi, Nicholas Synovic, Mona Rahimi, George Thiruvathukal

Characterizing Performance and Energy Trade-Offs on the Aurora Supercomputer
Authors: Solomon Bekele, Swann Perarnau, Brice Videau

Accelerating Scientific Workflows with LLM-Driven Compiler Optimizations for Generated High-Performance Hardware
Authors: Robert Ramstad, Nicolas Bohm Agostini, Antonino Tumeo

Julia with Intelligent Runtime for Heterogeneous Computing
Authors: Narasinga Rao Miniskar, Pedro Valero-Lara, William Godoy, Keita Teranishi, Jeffrey S. Vetter

Can Lossy Compression Benefit NVMe-Based I/O?
Authors: Darren Ng, Duo Zhang, Sheng Di, Zhaorui Zhang, Xiaoyi Lu

ChatHPC: Building the Foundations for a Productive and Trustworthy AI-Assisted HPC Ecosystem
Authors: Pedro Valero-Lara, Aaron Young, Mohammad Alaul Haque Monil, Swaroop Pophale, Zheming Jin, Jeffrey S. Vetter, Keita Teranishi, William F. Godoy

Building the Foundation for Machine Learning-Based Mars Weather Forecasting
Author: Mohammad Altiwainy

Job Grouping-Based Intelligent Resource Recommendation Framework
Authors: Beste Oztop, Benjamin Schwaller, Vitus J. Leung, Jim Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun

Unmasking Performance Variability in GPU Codes on Supercomputers
Authors: Cunyang Wei, Keshav Pradeep, Abhinav Bhatele

Template Task-Based Multiresolution Analysis in Hybrid Environments
Authors: Nilesh Chaturvedi, Joseph Schuchart, Robert J. Harrison

Facilitating Mixed Python-Fortran HPC Codes: 4D Drift-Kinetic Simulations with Pyccel
Authors: Emily Bourne, Yaman Güçlü

TidalMark: A Scalable Benchmark for Coastal Water Level Forecasting
Authors: Lucas Raicu, Daniel Grzenda, Ian Foster, Kyle Chard

High-Performance Sparse Attention on Tensor Cores: Fused3S and Beyond
Author: Zitong Li

Hardware-Aware Quantum Circuit Synthesis
Authors: Nathan Jones, Akhilesh Bondapalli, Toby Cox, Ian Lewis, Rong Ge

Understanding Communication Bottlenecks in Multi-Node LLM Inference
Authors: Prajwal Singhania, Siddharth Singh, Lannie Dalton Hough, Ishan Revankar, Harshitha Menon, Charles Jekel, Abhinav Bhatele

Understanding GPU Utilization Using LDMS Data on Perlmutter
Authors: Onur Cankur, Brian Austin, Abhinav Bhatele

Mojo: Python-Like MLIR-Based GPU Portable Science Kernels
Author: Tatiana Melnichenko

An Approach for Correlating Compiler Optimizations with Runtime Performance
Authors: Befikir Bogale, Olga Pearce, Tom Scogland, Michela Taufer

SRAP: Sender-Side Receiver-Aware Port Selection for High-Speed Multi-Flow TCP
Authors: Shingo Hattori, Osamu Tatebe

Towards a GPU-Accelerated Web-Based Graph Rendering Framework for Large-Scale Protein Networks
Authors: Jiaxin Lu, Landon Dyken, Shilpika Shilpika, Venkatram Vishwanath, Michael Papka, Sidharth Kumar

From Legacy to Portable: An Agentic AI Workflow for Fortran Code Translation and Cross-Architecture Optimization
Authors: Sparsh Gupta, Kamalavasan Kamalakkannan, Maxim Moraru, Galen Shipman, Patrick Diehl

ScODA: An Emerging Pipeline for Evaluating Distributed Database Performance To Support Operational Data Analytics
Authors: Nicholas Synovic, FNU Shilpika, Silvio Rizzi, Doug Waldron, George K. Thiruvathukal, Michael E. Papka

Mixed Compute Environments with OpenCHAMI
Authors: Sean Gibson, Richard Kim, Samuel Quan, Travis Cotton, Thomas Mackell

HydraCache: LLM Inference Prefill Parallelization Through Distributed Cache Blending
Authors: Adib Rezaei Shahmirzadi, Shayan Shabihi, Mona Moghadampanah, Furong Huang, Dimitrios S. Nikolopoulos

Classifying Performance Bounds Using Machine Learning
Authors: Lewis Littman, Tom Deakin

Between the NIC and a Hard Place: Evaluating 400 Gb/s Ethernet for HPC Data Transfers
Authors: Adelle Ferris, Evelyn Needham, Nikole Grandez, Jesse Martinez, Doug Egan

Bridging the Quantum Coding Gap: Instruction-Tuned LLMs for Qiskit
Authors: Sixu Chen, Yuqi Zhang, Qiang Guan

Leveraging Large Language Models for Property Prediction in Polymorphic Organic Semiconductors
Authors: Shreya Pagaria, Mei-Yu Wang, Dana O’Connor, Julian Uran, Paola Buitrago

Can Long-Haul RDMA Benefit Federated Learning?
Authors: Zhonghao Chen, Yuke Li, Duo Zhang, Xiaoyi Lu

An Efficient GEMM Acceleration Method for LLM Inference with Variable-Length Sequences
Authors: Yu Zhang, Lu Lu

VaultX Merge: Breaking Memory Barriers in Proof-of-Space Plot Generation
Authors: Arnav Sirigere, Varvara Bondarenko, Ioan Raicu