Presentation

Parallel Data Object Creation: Scalable Metadata Management in Parallel I/O Library
Description
High-level I/O libraries, such as PnetCDF and HDF5, are commonly used by large-scale scientific applications to perform I/O tasks in parallel. These I/O libraries store the metadata of data objects in files along with their raw data. To ensure metadata consistency during parallel data object creation, they require applications to call the metadata APIs collectively with consistent metadata. This requirement can result in an expensive consistency check, because its cost grows with both the metadata volume and the number of processes. To address this limitation, we propose a new file header format that uses partitioned metadata blocks to enable independent data object creation and to reduce the number of objects that require a consistency check. Our performance evaluation shows that the new design achieves scalable performance, cutting data object creation times by up to 196x when running on 4096 MPI processes to create 5,684,800 data objects in parallel.
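The cost argument in the description can be illustrated with a small, MPI-free toy model. This is only a sketch under stated assumptions, not the file header format proposed in the talk: `ObjectMeta`, `make_objects`, `collective_check_cost`, and `partitioned_create` are hypothetical names, processes are simulated as plain Python lists, and the "cost" is simply the number of metadata entries compared against rank 0. It also assumes that, in a partitioned design, only entries still defined by all processes need checking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMeta:
    """Hypothetical metadata record for one data object."""
    name: str
    dtype: str
    shape: tuple


def make_objects(prefix, count):
    """Build `count` illustrative metadata records with a common name prefix."""
    return [ObjectMeta(f"{prefix}_{i}", "float64", (1024,)) for i in range(count)]


def collective_check_cost(per_process_meta):
    """Compare every simulated process's metadata against rank 0.

    Returns the number of entry comparisons performed, which grows with
    (number of processes - 1) * (number of objects). Raises ValueError
    if any process holds inconsistent metadata.
    """
    reference = per_process_meta[0]
    comparisons = 0
    for rank, meta in enumerate(per_process_meta[1:], start=1):
        if len(meta) != len(reference):
            raise ValueError(f"rank {rank} defined a different number of objects")
        for ref_entry, entry in zip(reference, meta):
            comparisons += 1
            if ref_entry != entry:
                raise ValueError(f"rank {rank} disagrees on {ref_entry.name}")
    return comparisons


def partitioned_create(shared_per_process, private_per_process):
    """Toy model of partitioned metadata blocks (an assumption, not the real format).

    Each process places its own objects in a private block without any
    cross-process comparison; only the shared entries are checked.
    """
    cost = collective_check_cost(shared_per_process)
    blocks = {rank: list(objs) for rank, objs in enumerate(private_per_process)}
    return blocks, cost


if __name__ == "__main__":
    nprocs, per_proc = 4, 100

    # Collective style: every process must define all objects identically.
    all_objects = make_objects("var", nprocs * per_proc)
    print("collective comparisons:", collective_check_cost([all_objects] * nprocs))

    # Partitioned style: each process defines only its own objects.
    shared = make_objects("dim", 2)
    private = [make_objects(f"rank{r}_var", per_proc) for r in range(nprocs)]
    _, cost = partitioned_create([shared] * nprocs, private)
    print("partitioned comparisons:", cost)
```

In this toy model the collective path compares every object on every non-root process, while the partitioned path compares only the small shared set, which mirrors the scaling concern the description raises without claiming anything about the actual implementation.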
Event Type
Workshop
Time
Monday, 17 November 2025, 11:30am - 12:00pm CST
Location
230
Tags
Data Analytics
High Performance I/O, Storage, Archive, & File Systems
Storage
Recordings
Livestreamed
Recorded
Registration Categories
TP
W