Presentation
A Query Engine for Scientific Data Exploration using Theory, Simulation, and Artificial Intelligence Models
Session: AI4S: 6th Workshop on Artificial Intelligence and Machine Learning for Scientific Applications
Description: Modern scientific discovery increasingly integrates simulations, data, and AI models. Existing systems rarely let scientists compose expressive queries that retrieve multi-modal datasets and invoke complex simulations or AI inferences. We introduce the Intelligent Data Search (IDS) framework to bridge this gap. IDS extends the Cray Graph Engine to provide a scalable in-memory datastore (feature, vector, and knowledge graph), a unified query engine combining keyword, set-theoretic, and linear-algebraic operators, a model repository for UDFs and pre-trained AI models, and a distributed multi-tier cache for intermediate and simulation outputs. We evaluate IDS on a life-sciences workflow with the NCNPR, integrating AlphaFold, AutoDock Vina, and Smith–Waterman within a single query. Results show strong HPC scaling, a complex “what-could-be” query executing millions of searches and thousands of inferences in seconds, and 5–15× end-to-end speedup from caching. IDS empowers scientists to ask and iterate model-driven “what-if” questions over petascale data with minimal latency.
