Research and ACM SRC Posters: Poster Presentations (Research, ACM SRC Grads/Undergrads)

Session Chairs
Kento Sato, RIKEN Center for Computational Science (R-CCS)
Chris Schlipalius, Pawsey Supercomputing Research Centre; Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia
Anja Gerbes, Georg-August-Universität Göttingen

Event Type: Research and ACM SRC Posters
Time: Thursday, 20 November 2025, 8:00am - 5:00pm CST
Location: Second Floor Atrium
Tags: Research & ACM SRC Posters
Registration Categories: TP

Similar Sessions
Best Poster Finalist Presentations (ACM SRC Grads)
Best Poster Finalist Presentations (ACM SRC Undergrads)
Doctoral Showcase I Presentations

Presentations

Can Lossy Compression Benefit NVMe-Based I/O?
Authors: Darren Ng, Duo Zhang, Sheng Di, Zhaorui Zhang, Xiaoyi Lu

Enabling Real-Time, Extreme-Scale Bayesian Inference: FFT-Based GPU-Accelerated Matrix-Vector Products for Block-Triangular Toeplitz Matrices
Authors: Sreeram Venkat, Omar Ghattas

From Petabytes to Predictions: Harnessing Large-Scale NeuroBlu Mental Health Data and ML To Mitigate Medication Non-Adherence
Authors: Alyson Collins, Cathy Sandoval, Maya Seshan, Srishti Srivastava, Josh McWilliams

Optimizing and Extending Periodogram Computations for Astronomy
Authors: Yuwei Sun, Lehman Garrison

MPI-SGX: Enabling Confidential Computing for MPI Parallel Applications with Intel SGX Technology
Authors: Kota Shimojima, Hayato Yamaki, Hiroki Honda, Shinichiro Matsuo, Atsuko Takefusa, Shinobu Miwa

A Toolbox for Load Balancing Development and Analysis in WarpX/AMReX Applications
Authors: Jessica Imlau Dagostini, Sowmya Yellapragada, Kevin Gott, Rebecca Hartman-Baker

A Formal Characterization of Non-Monotonicity in Tensor Cores
Authors: Paul Jiang, Vivian Zheng

Enabling Efficient Runtime Data Analysis to a Crystal Deformation Simulation
Author: Arthur Jaquard

C++ Standard Parallelism for GPU Programming in a Particle-In-Cell Application
Authors: Ester El Khoury, Mathieu Lobet, Julien Bigot, Laurent Colombet

Real-Time ML-Based Defense Against Malicious Payload in Reconfigurable Embedded Systems
Authors: Rye Stahle-Smith, Rasha Karakchi

European Open Web Index: Large Complex Graph Visualization
Authors: Pavlina Smolkova, Katerina Slaninova

High Performance Batch SVD Using GPUs
Author: Ahmad Abdelfattah

Facilitating Mixed Python-Fortran HPC Codes: 4D Drift-Kinetic Simulations with Pyccel
Authors: Emily Bourne, Yaman Güçlü

Performance Engineering of Scientific Applications with MVAPICH and TAU Using Emerging Communication Primitives
Authors: Dhabaleswar K. (DK) Panda, Sameer Shende, Ahmad Abdelfattah, Yifeng Cui

CATIOS: Time-Resolved I/O-Aware Job Scheduling for HPC Systems
Authors: YuTsen Tseng, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

A Kokkos-Based Proxy of the Exascale Metagenome Assembler MetaHipMer2: A First Use of Kokkos for Computational Biology
Authors: Logan Williams, Gavin Conant, Michela Becchi, Jan Ciesko, Amy Powell

An Approach for Correlating Compiler Optimizations with Runtime Performance
Authors: Befikir Bogale, Olga Pearce, Tom Scogland, Michela Taufer

Forward Error Bounds and Efficient Algorithms for Computing a Tensor Times Matrix Chain in Low Precision on GPUs
Authors: Julian Bellavita, Piyush Sao, Ramakrishnan Kannan

Time-Stepping Hamiltonian Simulation for Solving Nonlinear PDEs via a Quantum-Classical Hybrid Approach
Authors: Sangwon Kim, Junya Onishi, Ayato Takii, Younghwa Cho, Tsubokura Makoto

Scaling Singular Values Beyond GPU Memory Limits: Out-of-Core, GPU-Accelerated, and Unified Across Data Precision and Hardware
Author: Evelyne Ringoot

Evaluating the Usage of Python Libraries on a Production Supercomputer
Author: Thomas Papka

Unraveling Distant Galaxies: Analyzing IFU Data with Parsl and Academy
Author: Daniel Babnigg

Mixed Compute Environments with OpenCHAMI
Authors: Sean Gibson, Richard Kim, Samuel Quan, Travis Cotton, Thomas Mackell

Mojo: Python-Like MLIR-Based GPU Portable Science Kernels
Author: Tatiana Melnichenko

AutoSlim: Intelligent Automata Graph Optimization for Efficient Acceleration
Authors: Tiffany Yu, Rasha Karakchi

Sync-Free GPU Parallelization of Sparse Kernels from Sequential Python Code
Author: Malko-Bani Somo

CROSS-HPC System Bayesian Optimization with Adaptive Transfer
Authors: Abrar Hossain, Kishwar Ahmed

csDF: A Double-Float Arithmetic Library for the Cerebras CS-2
Authors: Reo Nagashima, Akeru Nakamura, Kai Murakami, Ryunosuke Matsuzaki, Daichi Mukunoki, Takaaki Miyajima

Scalable Alternative Route Computation with ACE: A C++17 Library for HPC Traffic Simulations
Authors: Paulo Silva, Pavlína Smolková, Kateřina Slaninová, Jan Martinovič, João Barbosa, Matej Špeťko, Emanuele Vitali

Productive Scalable Distributed Task Scheduling Using an MPI-based Backend for Dagger
Author: Yan Guimarães

Compute System Simulator: Modeling the Impact of Allocation Policy and Hardware Reliability on HPC Cloud Resource Utilization
Authors: Jarrod Leddy, Huseyin Yildiz

Accelerating Linear Solve with Mixed Precision Nested Recursive Subdivision on AI Hardware
Author: Vicki Carrica

ChatHPC: Building the Foundations for a Productive and Trustworthy AI-Assisted HPC Ecosystem
Authors: Pedro Valero-Lara, Aaron Young, Mohammad Alaul Haque Monil, Swaroop Pophale, Zheming Jin, Jeffrey S. Vetter, Keita Teranishi, William F. Godoy

Configuring Large Language Models for Regional Ocean Model Development
Authors: Aidan Janney, Giovanni Seijo-Ellis, Dan Amrhein

Energy-Efficient Multimodal LLM Inference: Stage-Level Characterization and Input-Aware Controls
Authors: Mona Moghadampanah, Adib Rezaei Shahmirzadi, Dimitrios S. Nikolopoulos

The Impact of Maximum Vector Length on Cache Management Techniques in RISC-V Vector Extension
Authors: Shunya Nomura, Jiaheng Liu, Keichi Takahashi, Hiroyuki Takizawa

Job Grouping-Based Intelligent Resource Recommendation Framework
Authors: Beste Oztop, Benjamin Schwaller, Vitus J. Leung, Jim Brandt, Brian Kulis, Manuel Egele, Ayse K. Coskun

VaultX Merge: Breaking Memory Barriers in Proof-of-Space Plot Generation
Authors: Arnav Sirigere, Varvara Bondarenko, Ioan Raicu

Template Task-Based Multiresolution Analysis in Hybrid Environments
Authors: Nilesh Chaturvedi, Joseph Schuchart, Robert J. Harrison

JACC: Easy CPU/GPU Performance Portability for Scientific Applications in Julia
Authors: William Godoy, Pedro Valero-Lara, Philip Fackler, Keita Teranishi, Jeffrey Vetter, Jhonny Gonzalez, Jose Gonzalez, Alexis Huante

Mitigating I/O Bottlenecks in LiDAR Pipelines by Directly Merging Neural Decompression and Semantic Segmentation
Authors: Ethan Marquez, Max Faykus, Oyinlolu Odetoye, Melissa Smith, Jon Calhoun

Accelerating Scientific Workflows with LLM-Driven Compiler Optimizations for Generated High-Performance Hardware
Authors: Robert Ramstad, Nicolas Bohm Agostini, Antonino Tumeo

WONDERS: Integrating WOW, PONDER, and SCALE for Enhanced Scheduling Performance
Authors: Fabian Lehmann, Jonathan Rau, Jonathan Bader, Odej Kao, Ulf Leser

Multi-GPU Implementation and Roofline Analysis of a Numerical Global Ocean Model
Authors: Takateru Yamagishi, Masao Kurogi, Takao Kawasaki, Yoshimasa Matsumura, Hiroyasu Hasumi

Parallel Local Motif Counting on Large-Scale Dynamic Graphs
Authors: Ali Khan, Sanjukta Bhowmick, Michela Taufer

Unified Performance Modeling Stack for Distributed GPU Applications: Complementing Analytical Insights with Machine Learning
Author: Urvij Saroliya

Process-Based Predictors of Vulnerability Reintroduction
Authors: Samiha Shimmi, Nicholas Synovic, Mona Rahimi, George Thiruvathukal

A Quantum Solver for Multidimensional Partial Differential Equations: Practical Case Studies
Authors: Manu Chaudhary, Kareem El-Araby, Alvir Nobel, Ishraq Islam, Manish Singh, Sunday Ogundele, Kieran Egan, Sneha Thomas, Vincent Vordtriede, Devon Bontrager, Serom Kim, Esam El-Araby

Divergence Prediction System for CFD Simulations
Authors: Takashi Soga, Takanori Uchida, Susumu Date

Shipping HPC Ecosystems Across Platforms: Portable and Composable HPC Clusters as Code
Authors: German Felipe Giraldo Villa, Théo Grivel, George Ioannidis, Edita Kizinevic, Carolina Lindqvist, Nicolas Litchinko, Pablo Llopis, Antonio Javier Russo, Gilles Fourestey

Echoes of Earth: Building an Autonomous Environmental Lab for Acoustic Sensing
Authors: Hudson Reynolds, Alex Tuecke, Mike Sherman, Kate Keahey

Memory-Efficient CFD Based on MPS: Effective One-Billion-Cell Resolution on a Single Node
Authors: Junya Onishi, Ayato Takii, Sangwon Kim, Younghwa Cho, Makoto Tsubokura

PhySiViT: A Physics Simulation Vision Transformer
Authors: Jessica Ezemba, James Afful, Mei-Yu Wang

Understanding GPU Utilization Using LDMS Data on Perlmutter
Authors: Onur Cankur, Brian Austin, Abhinav Bhatele

Luthier: A Dynamic Binary Instrumentation Framework Targeting AMD GPUs
Authors: Matin Raayai-Ardakani, Norman Rubin, David Kaeli

Optimizing Collectives with Large Payloads on GPU-Based Supercomputers
Authors: Siddharth Singh, Mahua Singh, Keshav Pradeep, Abhinav Bhatele

Towards Application Agnostic HPC Profiling
Authors: Hari Teja Jajula, Dhruva Kulkarni, Brian Austin, Purushotham Bangalore

Julia with Intelligent Runtime for Heterogeneous Computing
Authors: Narasinga Rao Miniskar, Pedro Valero-Lara, William Godoy, Keita Teranishi, Jeffrey S. Vetter

Distributed Modular Digital Twin Network for High-Performance and Reliable Data Centers
Authors: Yan Chen, Xing Lu, Cary Faulkner, Alex Vlachokostas, Hanlong Wan, Jeremy Lerond

Applying Lossy Compression Techniques to GNN Training
Authors: Milan Shah, Reece Neff, Michela Becchi

WiCAT: Reducing Congestion at Wireless Interfaces in Heterogeneous Architectures
Author: Tarun Sharma

A Scalability Study of Quantum Algorithms for Dimensionality Reduction of Multidimensional Data
Authors: Kareem El-Araby, Thom Popovic, Alvir Nobel, Sunday Ogundele, Katherine Klymko, Daan Camps, Anastasiia Butko, Esam El-Araby

Can Long-Haul RDMA Benefit Federated Learning?
Authors: Zhonghao Chen, Yuke Li, Duo Zhang, Xiaoyi Lu

IncineRate: Multi-Modal FPGA Accelerator for SCNNs
Authors: Björn A. Lindqvist, Artur Podobas

Hardware-Aware Quantum Circuit Synthesis
Authors: Nathan Jones, Akhilesh Bondapalli, Toby Cox, Ian Lewis, Rong Ge

Understanding LLM Behavior on HPC Data via Mechanistic Interpretability
Authors: Md Mahbubur Rahman, Arjun Guha, Harshitha Menon

Bridging the Quantum Coding Gap: Instruction-Tuned LLMs for Qiskit
Authors: Sixu Chen, Yuqi Zhang, Qiang Guan

GPU Kernels for Mixture of Experts
Authors: Arthur Feeney, Ying Wai Li, Aparna Chandramowlishwaran

TidalMark: A Scalable Benchmark for Coastal Water Level Forecasting
Authors: Lucas Raicu, Daniel Grzenda, Ian Foster, Kyle Chard

Shortcut Mixup Policy: Toward Improving Robustness and Speed in Goal-Conditioned RL
Authors: Matthew Hyatt, Yassir Atlas, Hal Brynteson, Diego Roa Perdomo, Athena Angara, Mengjiao Han, Joseph Insley, Janet Knowles, Yongho Kim, Victor Mateevitsi, Michael Papka, Silvio Rizzi, George Thiruvathukal, Nicola Ferrier

Leveraging Large Language Models for Property Prediction in Polymorphic Organic Semiconductors
Authors: Shreya Pagaria, Mei-Yu Wang, Dana O’Connor, Julian Uran, Paola Buitrago

Chameleon Concierge: Retrieval-Augmented Generation (RAG) To Enhance Open Testbed Documentation
Author: Saieda Ali Zada

Unmasking Performance Variability in GPU Codes on Supercomputers
Authors: Cunyang Wei, Keshav Pradeep, Abhinav Bhatele

Learning To Select Scheduling Algorithms in OpenMP
Authors: Jonas H. Müller Korndörfer, Ali Mohammed, Ahmed Eleliemy, Quentin Guilloteau, Reto Krummenacher, Florina Ciorba

CUR-MoE: Portable Mixture-of-Experts with Interpretable High-Ratio Compression
Author: Ritesh Bhirud

Algorithms and Applications of Dynamic Network Analysis Using CANDY
Authors: Aashish Pandey, Arindam Khanda, S.M. Shovan, Ali Y. Khan, Boyana Norris, Sajal K. Das, Sanjukta Bhowmick

Divide, Conquer, and Denoise: Hybrid Parallel Diffusion with Memory-Aware Coarse-to-Fine Inference
Authors: Farhana Amin, Kanchon Gharami, Dimitrios Nikolopoulos

SRAP: Sender-Side Receiver-Aware Port Selection for High-Speed Multi-Flow TCP
Authors: Shingo Hattori, Osamu Tatebe

When Label Propagation Outperforms BFS in Breadth-First Graph Traversal
Authors: Kalsuda Lapborisuth, Srinivas Aluru

Wafer-Scale Simulation of Mutator Allele Dynamics in Large Asexual Populations
Authors: Matthew Andres Moreno, Emily Dolson, Luis Zaman

Exploring Fine-Grained Parallelism in Data-Flow Runtime Systems on Many-Core Systems
Authors: Wenyi Wang, Maxime Gonthier, Haibin Lai, Poornima Nookala, Haochen Pan, Ian Foster, Ioan Raicu, Kyle Chard

CIRE: LLVM Analysis for Floating-Point Rounding Error Affected by Precision and Optimizations
Authors: Cayden Lund, Tanmay Tirpankar, Ganesh Gopalakrishnan

Massively Parallel Bayesian Inference Framework for GPU Supercomputers: Application to Estimation of Coseismic Fault Slip
Authors: Kai Nakao, Tsuyoshi Ichimura, Kohei Fujita

Analyzing Dataset Popularity for Optimizing In-Network Storage
Authors: Gunwoo Kim, Alex Sim, Kesheng Wu

Novel Graph Alignment Algorithms for Identifying Non-Determinism in Large-Scale Simulations
Author: Dhroov Pandey

DiffPro: Joint Timestep and Layer-Wise Precision Optimization for Efficient Diffusion Inference
Authors: Farhana Amin, Kanchon Gharami, Dimitrios S. Nikolopoulos

Local vs. Global FFT Approaches for High-Performance Ultrasound Simulation on Multi-GPU Systems
Authors: Oliver Kuník, Jiri Jaros

Explicit Low-Order Finite-Element Wave Simulation Accelerated with Variable-Precision Computing Using INT8 Tensor Cores
Authors: Kohei Fujita, Tsuyoshi Ichimura, Muneo Hori, Lalith Maddegedara

Between the NIC and a Hard Place: Evaluating 400 Gb/s Ethernet for HPC Data Transfers
Authors: Adelle Ferris, Evelyn Needham, Nikole Grandez, Jesse Martinez, Doug Egan

ParaViz3D: MPI Trace Visualization with 3D Video
Authors: Jean-Yves Verhaeghe, Georg Hager, Ayesha Afzal

Advancing EEG Signal Analysis with Quantum Machine Learning
Authors: Stephanie Murray, Erika Parsons

Classifying Performance Bounds Using Machine Learning
Authors: Lewis Littman, Tom Deakin

Using Hardware Metrics To Understand Performance of the RAJA Performance Suite Kernels in Different GPU Modes on MI300A
Authors: Amr Abouelmagd, Stephanie Brink, Michael McKinsey, David Boehme, Jason Burmark, Brian Ryujin, Tom Scogland, Olga Pearce

Range Search on Heterogeneous Systems with Processing-in-Memory Architecture
Authors: Tasmia Jannat, Satish Puri, Michael Gowanlock

Characterizing Performance and Energy Trade-Offs on the Aurora Supercomputer
Authors: Solomon Bekele, Swann Perarnau, Brice Videau

Intelligent Surrogates Pay Attention to Data, Improving Multi-Objective HPC Optimization
Authors: Ashna Nawar Ahmed, Banooqa Banday, Terry Jones, Tanzima Z. Islam

Numerical Investigation of Radiation Hydrodynamic Instabilities at Scale with FleCSI-HARD
Authors: Måns I. Andersson, Isaac C. Bannerman, Moon B. Hazarika, Akshit Jariwala, Jonathan Mathurin, Madela B. Quashie, Julien Loiseau, Hyun Lim

Scalable Multi-Node Multi-GPU Datalog Engine with Energy-Aware Profiling
Authors: Ahmedur Rahman Shovon, Sidharth Kumar

Evaluating the Power-Monitoring Capabilities of Aurora
Author: Precious Eyabi

Massively Parallel GPU Rasterizer for Next-Generation Computational Lithography
Authors: Loay Hegazy, Mohamed Taher, Sherif Hammouda

High-Performance Sparse Attention on Tensor Cores: Fused3S and Beyond
Author: Zitong Li

Evaluating LiDAR Compression for 3D Semantic Segmentation in Diverse Off-Road Environments on GOOSE Dataset
Authors: Adam Niemczura, Max Faykus, Oyinlolu Odetoye, Melissa Smith, Jon Calhoun, Scott Groel

Towards a GPU-Accelerated Web-Based Graph Rendering Framework for Large-Scale Protein Networks
Authors: Jiaxin Lu, Landon Dyken, Shilpika Shilpika, Venkatram Vishwanath, Michael Papka, Sidharth Kumar

GATSched: Multi-Objective Graph Attention Networks for Energy-Efficient HPC Job Scheduling
Author: Kyrian Adimora

Inference-as-a-Service Prototype at NERSC
Authors: Colin Thomas, Po-Han Huang, Hilary Utaegbulam, Johannes Blaschke, Bruno Coimbra, Pengfei Ding, Xiangyang Ju, Andrew Naylor, Michael Wang

Heterogeneity-Aware Task Allocation for Modern HPC Systems
Authors: Sowmya Yellapragada, Jessica Imlau Dagostini, Kevin Gott, Rebecca Hartman-Baker

Accelerating AI Co-Scientists with HPC Infrastructure
Author: Suryatejas Appana

Understanding Communication Bottlenecks in Multi-Node LLM Inference
Authors: Prajwal Singhania, Siddharth Singh, Lannie Dalton Hough, Ishan Revankar, Harshitha Menon, Charles Jekel, Abhinav Bhatele

An Efficient GEMM Acceleration Method for LLM Inference with Variable-Length Sequences
Authors: Yu Zhang, Lu Lu

Enhancing Usability and Performance in Experimental Environments Management
Authors: Zahra Temori, Paul Marshall, Kate Keahey

HydraCache: LLM Inference Prefill Parallelization Through Distributed Cache Blending
Authors: Adib Rezaei Shahmirzadi, Shayan Shabihi, Mona Moghadampanah, Furong Huang, Dimitrios S. Nikolopoulos

Tensor Core Accelerated Fast Multipole Method for GROMACS
Authors: Jiamian Huang, Muhammad Umair Sadiq, Rio Yokota, Berk Hess

Fast Linear Solvers via AI-Tuned Markov Chain Monte Carlo-Based Matrix Inversion
Authors: Anton Lebedev, Won Kyung Lee, Soumyadip Ghosh, Olha I. Yaman, Vassilis Kalantzis, Yingdong Lu, Tomasz Nowicki, Shashanka Ubaru, Lior Horesh, Vassil Alexandrov

Detecting Silent Data Corruption in Sparse Matrices Using Hardware Performance Counters
Authors: Minseop Choi, Orlando Arias, Seung Woo Son

From Legacy to Portable: An Agentic AI Workflow for Fortran Code Translation and Cross-Architecture Optimization
Authors: Sparsh Gupta, Kamalavasan Kamalakkannan, Maxim Moraru, Galen Shipman, Patrick Diehl

GNNs on Evolving Graphs: A Benchmark of Incremental Updates and Meta-Learning Approaches
Authors: Sriram Srinivasan, Sanjukta Bhowmick, Hamdan Alabsi, Rand Obeidat

ScODA: An Emerging Pipeline for Evaluating Distributed Database Performance To Support Operational Data Analytics
Authors: Nicholas Synovic, FNU Shilpika, Silvio Rizzi, Doug Waldron, George K. Thiruvathukal, Michael E. Papka

An Agent-Based Viral Venture: Adaptive Tool Selection for Scalable Genomics
Authors: Naomi Kolodisner, Alok Kamatar (Advisor), J. Greg Pauloski (Advisor)

Building the Foundation for Machine Learning-Based Mars Weather Forecasting
Author: Mohammad Altiwainy

Optimizing the GPU All-Reduce Using Multiple Processes Per GPU
Authors: Michael Adams, Amanda Bienz

Distributed 3D Gaussian Splatting for High-Resolution Isosurface Visualization
Authors: Mengjiao Han, Andres Sewell, Joseph Insley, Janet Knowles, Victor A. Mateevitsi, Michael E. Papka, Steve Petruzza, Silvio Rizzi

Harmony: Converged Supercomputer Scratch and Archival Filesystems
Author: Jake Carroll

AdversaGuard: A Distributed Data Poisoning Benchmark for Parallel AI
Authors: Yulia Kumar, Solomon Thomas, Dejaun Gayle, J. Jenny Li, Dov Kruger

Optimizing Task-Driven Offloading in LLVM
Authors: Jan Kraus, Joachim Jenke, Christian Terboven

Orchid: Towards Heterogeneous Batched Eigenvalue Solvers
Author: Matthew Chung

Practical Viability of Translating Legacy Fortran Code to C++ Using Large Language Models
Authors: Ren Imai, Masatoshi Kawai, Keichi Takahashi, Hiroyuki Takizawa

DiOMP-Offloading: Portable OpenMP Offloading for Distributed Heterogeneous Systems
Authors: Baodi Shan, Mauricio Araya-Polo, Barbara Chapman

Scalable Execution Framework for R on Manycore Systems
Authors: Xiran Zhang, Javier Conejero, Sameh Abdulah, Jorge Ejarque, Ying Sun, Rosa M. Badia, David E. Keyes, Marc G. Genton

Seamless Scaling of Applications Across Programming Models
Authors: Reto Krummenacher, Quentin Guilloteau, Jonas H. Müller Korndörfer, Florina M. Ciorba