Presentation

Integrating Distributed SQL Query Engines with Object-Based Computational Storage
Description
Existing object storage systems like AWS S3 and MinIO offer only limited in-storage compute capabilities, typically restricted to simple SQL WHERE-clause filtering. Consequently, high-impact operators, such as aggregation and top-N, are still executed entirely at the compute layer. Recent advances in Object-based Computational Storage (OCS) enable these complex operators to run natively within storage, creating opportunities for substantial reductions in data movement and query time. To demonstrate these benefits in distributed SQL engines, we used Presto as a case study and developed the Presto-OCS connector, which analyzes execution plans to identify pushdown-eligible operators and offloads them to OCS for efficient in-storage execution. Evaluations with real-world HPC analytics queries and the TPC-H benchmark show that our approach achieves up to a 4.07× speedup and a 99% reduction in data movement compared to filter-only pushdown. When combined with compression techniques, our approach delivers a 1.39× speedup over compressed filter-only pushdown, demonstrating that advanced query pushdown complements existing optimizations.
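The plan-analysis step described in the abstract can be illustrated with a minimal sketch. This is not the Presto-OCS connector's code: the `Operator` type, the `PUSHDOWN_ELIGIBLE` set, and the `split_plan` function are hypothetical names invented for illustration, and the plan is simplified to a linear chain of operators starting at a table scan. The sketch assumes that storage can run only a contiguous run of eligible operators directly above the scan.

```python
from dataclasses import dataclass

# Assumption: the operator kinds the storage layer can execute.
# The abstract names filtering, aggregation, and top-N.
PUSHDOWN_ELIGIBLE = {"filter", "aggregation", "topn"}


@dataclass(frozen=True)
class Operator:
    kind: str          # e.g. "scan", "filter", "aggregation", "topn", "join"
    detail: str = ""   # free-form description, used only for display


def split_plan(plan):
    """Split a scan-to-output operator chain into (pushed, remaining).

    Operators are pushed down only while they form an unbroken run of
    eligible kinds directly above the scan; the first ineligible
    operator and everything after it stay at the compute layer.
    """
    if not plan or plan[0].kind != "scan":
        return [], list(plan)
    pushed = [plan[0]]
    for i, op in enumerate(plan[1:], start=1):
        if op.kind not in PUSHDOWN_ELIGIBLE:
            return pushed, list(plan[i:])
        pushed.append(op)
    return pushed, []


# Hypothetical plan: the join blocks the top-N above it from being pushed.
plan = [
    Operator("scan", "lineitem"),
    Operator("filter", "l_shipdate <= DATE '1998-09-02'"),
    Operator("aggregation", "sum(l_quantity) GROUP BY l_returnflag"),
    Operator("join", "orders"),
    Operator("topn", "10"),
]
pushed, remaining = split_plan(plan)
print([op.kind for op in pushed])     # ['scan', 'filter', 'aggregation']
print([op.kind for op in remaining])  # ['join', 'topn']
```

A filter-only pushdown baseline corresponds to shrinking `PUSHDOWN_ELIGIBLE` to `{"filter"}`, in which case the aggregation also stays at the compute layer and more data has to leave storage.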
Event Type
Workshop
Time
Monday, 17 November 2025, 4:45pm - 5:10pm CST
Location
265