Presentation
From Edge to HPC: Investigating Cross-Facility Data Streaming Architectures
Session: The 12th Annual International Workshop on Innovating the Network for Data-Intensive Science (INDIS)
Description: In this paper, we investigate three cross-facility data streaming architectures: Direct Streaming (DTS), Proxied Streaming (PRS), and Managed Service Streaming (MSS). We examine their architectural variations in dataflow paths and deployment feasibility, and detail their implementation using the DS2HPC architectural framework and the SciStream memory-to-memory streaming toolkit on the production-grade ACE infrastructure at OLCF. We present a workflow-specific evaluation of these architectures using three synthetic workloads derived from the streaming characteristics of scientific workflows. Through simulated experiments, we measure streaming throughput, round-trip time, and overhead under three messaging patterns commonly found in AI-HPC communication motifs: work sharing, work sharing with feedback, and broadcast and gather. Our study shows that DTS offers a minimal-hop path, resulting in higher throughput and lower latency, whereas MSS provides greater deployment feasibility and scalability across multiple users but incurs significant overhead. PRS lies in between, offering a scalable architecture whose performance matches that of DTS in most cases.

