Presentation

Implementing support for Interactive and AI workloads in a traditional HPC environment
Description

This paper presents an example technique for running interactive and AI workloads alongside traditional high-performance computing (HPC) jobs. It explains how to combine a short maximum runtime for all jobs with a resource-limited, high-priority QOS so that interactive jobs start quickly without reducing system capacity for traditional HPC jobs. It includes background on the history and culture that contributed to the implementation and technique. The reference implementation, technique, and paper are Slurm-centric in terminology, but the scheduling concepts and methods translate to other implementations. The paper concludes with observations on the conditions that made these techniques effective and on possible areas of future work to make them more broadly applicable.