Presentation
GATSched: Multi-Objective Graph Attention Networks for Energy-Efficient HPC Job Scheduling
Description
High-performance computing (HPC) systems face an urgent sustainability crisis: leading facilities consume 10–60 MW and incur multimillion-dollar annual energy costs. Traditional schedulers such as SLURM and PBS treat energy as a secondary concern, leading to energy consumption 30%–50% above theoretically optimal levels. We present GATSched, a multi-objective graph attention network (GAT) scheduler that models HPC workloads as dynamic graphs and processes them with specialized attention heads. Our approach jointly optimizes energy efficiency, performance, and resource utilization using four attention mechanisms: energy, performance, balance, and temporal. In trace-driven simulation on 389,604 production jobs across three HPC architectures, GATSched achieves a 27%–35% energy reduction while maintaining substantial resource utilization. In the poster session, we will demonstrate the GAT architecture and benchmark comparisons through interactive visualizations.
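
As a rough illustration of the idea described above, the following is a minimal sketch of GAT-style attention over a small job graph, with one head per objective named in the abstract. It is not the GATSched implementation: the feature layout, the edge definition, the zero-valued parameters, and the helper names (`attention_head`, `multi_head`) are all assumptions made for the example, and the learned linear projection of a full GAT layer is omitted for brevity.

```python
import math

# Head names come from the abstract; everything else here is illustrative.
HEADS = ("energy", "performance", "balance", "temporal")


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_head(features, neighbors, a_src, a_dst):
    """GAT-style head without the learned linear projection (a simplification)."""
    out = {}
    for i, nbrs in neighbors.items():
        # Unnormalized score for each neighbor j of job i.
        scores = [
            leaky_relu(dot(a_src, features[i]) + dot(a_dst, features[j]))
            for j in nbrs
        ]
        alphas = softmax(scores)  # attention weights over the neighborhood
        dim = len(features[i])
        # New embedding: attention-weighted average of neighbor features.
        out[i] = [
            sum(alpha * features[j][k] for alpha, j in zip(alphas, nbrs))
            for k in range(dim)
        ]
    return out


def multi_head(features, neighbors, params):
    """Run one head per objective and return the per-head job embeddings."""
    return {h: attention_head(features, neighbors, *params[h]) for h in HEADS}


# Hypothetical job features: [normalized power draw, normalized runtime].
features = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
# Hypothetical edges (with self-loops), e.g. jobs competing for the same nodes.
neighbors = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2]}
# Untrained placeholder parameters: zero vectors give uniform attention.
params = {h: ([0.0, 0.0], [0.0, 0.0]) for h in HEADS}

embeddings = multi_head(features, neighbors, params)
```

In a trained model each head would have its own learned parameters, so the energy head could, for instance, weight power-hungry neighbors differently than the temporal head does. How GATSched combines the head outputs into a scheduling decision is not stated in the abstract and is not modeled here.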

Event Type
Research and ACM SRC Posters
Time
Tuesday, 18 November 2025, 8:00am - 5:00pm CST
Location
Second Floor Atrium

