publications | Divyansh Singhvi

2024

Knowing Your Nonlinearities: Shapley Interactions Reveal the Underlying Structure of Data

Divyansh Singhvi, Andrej Erkelens, Raghav Jain, and 2 more authors

arXiv preprint arXiv:2403.13106, 2024

Measuring nonlinear feature interaction is an established approach to understanding complex patterns of attribution in many models. In this paper, we use Shapley Taylor interaction indices (STII) to analyze the impact of underlying data structure on model representations in a variety of modalities, tasks, and architectures. Considering linguistic structure in masked and auto-regressive language models (MLMs and ALMs), we find that STII increases within idiomatic expressions and that MLMs scale STII with syntactic distance, relying more on syntax in their nonlinear structure than ALMs do. Our speech model findings reflect the phonetic principle that the openness of the oral cavity determines how much a phoneme varies based on its context. Finally, we study image classifiers and illustrate that feature interactions intuitively reflect object boundaries. Our wide range of results illustrates the benefits of interdisciplinary work and domain expertise in interpretability research.
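
As a rough illustration of the quantity named in this abstract, the sketch below computes a pairwise (order-2) Shapley Taylor interaction index by exact enumeration over feature subsets. It is not the paper's code: `pairwise_stii` and `toy_value` are hypothetical names, and the toy value function stands in for a real model's output on a masked input. Exact enumeration is exponential in the number of features, so this is only practical for very small examples.

```python
from itertools import combinations
from math import comb


def pairwise_stii(value, n, i, j):
    """Order-2 Shapley Taylor interaction index for the feature pair (i, j).

    `value` maps a frozenset of present feature indices to a model output.
    `n` is the total number of features.
    """
    others = [p for p in range(n) if p not in (i, j)]
    total = 0.0
    for size in range(len(others) + 1):
        for subset in combinations(others, size):
            s = frozenset(subset)
            # Discrete second derivative: the joint effect of i and j
            # beyond the sum of their individual effects, given context s.
            delta = (value(s | {i, j}) - value(s | {i})
                     - value(s | {j}) + value(s))
            total += delta / comb(n - 1, size)
    return (2 / n) * total


def toy_value(coalition):
    # Hypothetical "model": outputs 1 only when features 0 and 1 are both present.
    return 1.0 if {0, 1} <= coalition else 0.0


if __name__ == "__main__":
    print(pairwise_stii(toy_value, 3, 0, 1))  # features that interact
    print(pairwise_stii(toy_value, 3, 0, 2))  # features that do not
```

In the toy example, the whole output comes from features 0 and 1 acting together, so that pair receives all of the interaction and any pair involving feature 2 receives none.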

2021

Execution- and Prediction-Based Auto-Tuning of Parallel Read and Write Parameters

Megha Agarwal, Pragya Jain, Divyansh Singhvi, and 1 more author

In 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), 2021

Parallel I/O tuning is useful for scientific applications that read and write huge amounts of data. I/O performance depends on multiple tunable parameters such as the stripe size, the stripe count, the collective I/O buffer size, and the number of collective I/O aggregators. Because the search space is large, it is cumbersome to tune the I/O parameters for every system to achieve optimal results. We propose execution-based and prediction-based tuning models built on active learning. These models recommend a good set of I/O parameter values for an application on a given system, using optimization to find the parameter values with the objective of minimizing I/O time. The models can focus on improving read performance, write performance, or both, and they support tuning reads and writes separately. We evaluated our models using I/O kernels of scientific applications (S3D-IO, BT-IO and GenericIO) and the highly configurable IOR benchmark on an Intel-based supercomputer, HPC2010. We achieved an increase in I/O bandwidth of up to 8× over the default parameters when reads and writes are optimized together, and up to 20× in read bandwidth when they are optimized separately.

2019

Active Learning-based Automatic Tuning and Prediction of Parallel I/O Performance

Megha Agarwal, Divyansh Singhvi, Preeti Malakar, and 1 more author

In 2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW), Nov 2019

Parallel I/O is an indispensable part of scientific applications. The current stack of parallel I/O contains many tunable parameters. While changing these parameters can increase I/O performance many-fold, application developers usually resort to default values because tuning is a cumbersome process and requires expertise. We propose two auto-tuning models, based on active learning, that recommend a good set of parameter values (currently tested with Lustre parameters and MPI-IO hints) for an application on a given system. These models use Bayesian optimization to find the values of parameters by minimizing an objective function. The first model runs the application to determine these values, whereas the second model uses an I/O prediction model instead, which significantly reduces the training time relative to the first model (e.g., from 800 seconds to 18 seconds). Both models can also focus on improving either read or write performance. To keep the tuning process generic, we have focused on both read and write performance. We have validated our models using an I/O benchmark (IOR) and three scientific application I/O kernels (S3D-IO, BT-IO and GenericIO) on two supercomputers (HPC2010 and Cori). Using the two models, we achieve an increase in I/O bandwidth of up to 11× over the default parameters. We obtained up to 3× improvements for 37 TB writes, corresponding to 1 billion particles in GenericIO. We also achieved up to 3.2× higher bandwidth for 4.8 TB of noncontiguous I/O in the BT-IO benchmark.