Presentation
Intelligent Surrogates Pay Attention to Data, Improving Multi-Objective HPC Optimization
Description
High-performance computing (HPC) schedulers must balance runtime and power. We present a surrogate-assisted multi-objective Bayesian optimization (MOBO) framework that uses TabNet regressors and models trained on attention-based embeddings, coupled with active-learning sample selection. The surrogates predict runtime and power, enabling MOBO to efficiently discover Pareto-optimal node allocations. We quantify trade-offs with Pareto fronts, hypervolume (HV), and Spread on the PM100 and Adastra production traces. MOBO improves HV over single-objective baselines by 24% (PM100) and 37% (Adastra) and attains lower Spread in 75% of surrogate families. Active learning reduces the number of evaluations by roughly 53% to 70%. To our knowledge, this is the first demonstration of embedding-informed surrogates for MOBO applied to HPC job scheduling traces, optimizing runtime–power trade-offs on production datasets.
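To make the hypervolume (HV) metric concrete, the following is a minimal sketch of how a Pareto front and its two-objective HV could be computed when both runtime and power are minimized. It is not the authors' implementation: the function names, the sample points, and the reference point are all hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# Each point is (runtime, power); both objectives are minimized.
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, sorted by increasing runtime."""
    front: List[Point] = []
    best_power = float("inf")
    # After sorting by runtime (then power), a point is non-dominated
    # only if its power is strictly lower than every earlier point's.
    for runtime, power in sorted(set(points)):
        if power < best_power:
            front.append((runtime, power))
            best_power = power
    return front


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by the front and bounded by the reference point."""
    # Points that do not improve on the reference point contribute nothing.
    front = [p for p in pareto_front(points) if p[0] < ref[0] and p[1] < ref[1]]
    hv = 0.0
    prev_power = ref[1]
    # Sum horizontal slabs: each front point adds a rectangle reaching
    # from its runtime to the reference runtime.
    for runtime, power in front:
        hv += (ref[0] - runtime) * (prev_power - power)
        prev_power = power
    return hv


# Hypothetical (runtime, power) values for four candidate allocations.
candidates = [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0), (3.0, 3.0)]
reference = (5.0, 5.0)  # assumed worst-case bound on both objectives

front = pareto_front(candidates)
hv = hypervolume_2d(candidates, reference)
print(front)  # [(1.0, 4.0), (2.0, 2.0), (4.0, 1.0)]
print(hv)     # 11.0
```

A larger HV means the front covers more of the objective space below the reference point, which is why it serves as a single-number summary when comparing MOBO against single-objective baselines.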

Event Type
Research and ACM SRC Posters
Time
Tuesday, 18 November 2025, 8:00am - 5:00pm CST
Location
Second Floor Atrium


