Presentation

Inference-as-a-Service Prototype at NERSC
Description

The increasing scale and complexity of scientific experiments have led to a growing need for efficient and scalable machine learning (ML) model inference serving systems. High-energy physics experiments and simulations of complex climate models involve petabytes of data and require massive computational resources to produce accurate results. Scientists are therefore increasingly turning to ML techniques to analyze and interpret the vast amounts of data generated by these experiments.

However, deploying ML models in scientific applications poses significant challenges. Traditional approaches, in which individual users deploy models on local resources or small clusters, often suffer from long startup times and inefficient resource utilization. To address this challenge, we present a prototype system that provides on-demand inference serving for multiple scientific ML models. Our system is deployed across the NERSC Perlmutter supercomputer and the NERSC Kubernetes (K8s) cluster, enabling on-demand scalability.
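
To make the serving workflow concrete, the sketch below shows how a user-side client might submit a batch of inputs to such a service over HTTP. The endpoint URL, model name, route, and JSON schema are hypothetical assumptions for illustration only; the prototype's actual API may differ.

```python
"""Minimal client sketch for an inference-as-a-service endpoint.

Hypothetical: the service URL, model name, and request/response schema
below are illustrative assumptions, not the NERSC prototype's real API.
"""
import numpy as np
import requests

SERVICE_URL = "https://inference.example.nersc.gov"  # hypothetical endpoint
MODEL_NAME = "climate-surrogate"                     # hypothetical model name


def infer(inputs: np.ndarray) -> list:
    """Send a batch of inputs to the service and return its predictions."""
    payload = {
        "model": MODEL_NAME,
        "inputs": inputs.tolist(),  # JSON-serializable batch
    }
    # Assumed REST route; a real deployment might instead expose a
    # Triton- or KServe-style protocol.
    resp = requests.post(f"{SERVICE_URL}/v1/infer", json=payload, timeout=60)
    resp.raise_for_status()
    return resp.json()["outputs"]


if __name__ == "__main__":
    batch = np.random.rand(8, 128).astype(np.float32)  # dummy input batch
    preds = infer(batch)
    print(f"Received {len(preds)} predictions")
```

In this model, users interact only with a stable service endpoint, while the backend schedules the actual model execution on Perlmutter or the K8s cluster, which is what enables the on-demand scaling described above.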