Execution- and Prediction-Based Auto-Tuning of Parallel Read and Write Parameters
Megha Agarwal, Pragya Jain, Divyansh Singhvi, and 1 more author
In 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), 2021
Parallel I/O tuning is useful for scientific applications that read and write huge amounts of data. I/O performance depends on multiple tunable parameters, such as the stripe size, the stripe count, the collective I/O buffer size, and the number of collective I/O aggregators. Because the search space is large, it is cumbersome to tune the I/O parameters for every system to achieve optimal results. We propose active-learning-based, execution- and prediction-based tuning models that recommend a good set of I/O parameter values for an application on a given system. These models use optimization to find the parameter values, with the objective of minimizing I/O time. The models allow users to focus on improving read performance, write performance, or both, and they support separate tuning of reads and writes. We evaluated our models using I/O kernels of scientific applications (S3D-IO, BT-IO and GenericIO) and the highly configurable IOR benchmark on an Intel-based supercomputer, HPC2010. We achieved an increase in I/O bandwidth of up to 8x over the default parameters when reads and writes are optimized together, and up to 20x in read bandwidth when they are optimized separately.
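To make the tuning problem concrete, the following is a minimal Python sketch of searching a small space of the four parameters named in the abstract for the configuration with the lowest I/O time. It is not the paper's method: plain random search stands in for the active-learning models, and `synthetic_io_time` is a made-up cost function that stands in for actually running an I/O kernel. The hint names (`striping_factor`, `striping_unit`, `cb_buffer_size`, `cb_nodes`) are the usual ROMIO/Lustre MPI-IO hints; all values, the `DEFAULT` configuration, and the function names are assumptions for illustration.

```python
import math
import random

# Hypothetical search space. Keys are MPI-IO hint names; values are illustrative.
SEARCH_SPACE = {
    "striping_factor": [1, 4, 8, 16, 32],             # stripe count
    "striping_unit": [1 << 20, 4 << 20, 16 << 20],    # stripe size in bytes
    "cb_buffer_size": [4 << 20, 16 << 20, 64 << 20],  # collective buffer size in bytes
    "cb_nodes": [1, 2, 4, 8],                         # number of collective aggregators
}

# Assumed default configuration, used only as a baseline in this sketch.
DEFAULT = {
    "striping_factor": 1,
    "striping_unit": 1 << 20,
    "cb_buffer_size": 16 << 20,
    "cb_nodes": 1,
}


def synthetic_io_time(cfg):
    """Made-up cost model standing in for a measured I/O time (seconds).

    A real tuner would run the application or benchmark with these hints
    and time its reads and/or writes instead.
    """
    parallelism = min(cfg["striping_factor"], 2 * cfg["cb_nodes"])
    mismatch = abs(math.log2(cfg["cb_buffer_size"] / cfg["striping_unit"]))
    return 100.0 / parallelism + 0.5 * mismatch


def random_search(space, measure, start, budget, seed=0):
    """Evaluate `start`, then `budget` random configurations; keep the fastest."""
    rng = random.Random(seed)
    best_cfg, best_time = dict(start), measure(start)
    for _ in range(budget):
        cfg = {name: rng.choice(values) for name, values in space.items()}
        elapsed = measure(cfg)
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg, best_time


default_time = synthetic_io_time(DEFAULT)
best_cfg, best_time = random_search(SEARCH_SPACE, synthetic_io_time, DEFAULT, budget=30)
print("search space size:", math.prod(len(v) for v in SEARCH_SPACE.values()))
print("best configuration:", best_cfg)
print("speedup over default (synthetic):", default_time / best_time)
```

Even this toy space has 180 combinations, and each evaluation on a real system means a full I/O run, which is why a model that chooses or predicts promising configurations is attractive compared with exhaustive search.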