“With Big Data Comes Big Responsibility.” 

The goal of my research is to enable artificial intelligence (AI) to truly benefit society by responsibly addressing the emerging challenges of fairness, explainability, and privacy.

Counterfactual Explanations: Robust Recourse Under Model Multiplicity

Abstract: Counterfactual explanations inform ways to achieve a desired outcome from a machine learning model. However, such explanations are not robust to certain real-world changes in the underlying model (e.g., retraining, changing hyperparameters), which calls into question their reliability in applications such as credit lending. We note that models can change substantially in parameter space when retrained on very similar data. We first introduce a novel metric -- which we call Counterfactual Stability -- that quantifies how robust a counterfactual is likely to be to model changes under retraining, and that enjoys desirable theoretical properties. We also provide probabilistic guarantees on robustness under an abstraction of naturally occurring model change. We then compare the performance of our strategies with popular counterfactual-generation methods across benchmark datasets. The results demonstrate that our strategy generates counterfactuals that are significantly more robust (nearly 100% validity after actual model changes) and more realistic (in terms of local outlier factor) than existing state-of-the-art methods. Our work has interesting connections with model multiplicity, also known as the Rashomon Effect.
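
As an illustration, here is a minimal sketch of one way a stability-style score could be computed (the function name, the scikit-learn-style predict_proba interface, and the mean-minus-standard-deviation aggregation are illustrative assumptions, not the exact formulation from the paper): perturb the counterfactual with Gaussian noise, score the perturbed points with the current model, and prefer counterfactuals whose neighborhood scores are both high and concentrated.

    import numpy as np

    def counterfactual_stability(model, x_cf, sigma=0.1, n_samples=1000, seed=0):
        """Illustrative stability score: sample Gaussian perturbations of the
        counterfactual x_cf, score them with the model's probability of the
        favorable class, and return mean score minus standard deviation.
        Higher values suggest the counterfactual sits in a region where a
        similar, retrained model is more likely to still accept it."""
        rng = np.random.default_rng(seed)
        neighbors = x_cf + sigma * rng.standard_normal((n_samples, x_cf.shape[0]))
        scores = model.predict_proba(neighbors)[:, 1]  # P(favorable outcome)
        return scores.mean() - scores.std()

A robust-recourse search can then prefer candidate counterfactuals whose stability score exceeds a threshold, trading off proximity to the original instance against robustness to model change.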

Select Publication(s):

Fairness & Explainability

Abstract: With the growing use of AI in high-stakes applications such as hiring and lending, it is of utmost importance that the decisions do not propagate biases and stereotypes. How do we identify which features are propagating biases and stereotypes? For instance, consider hiring a software engineer for a safety-critical application. Critical features, like coding test scores, might be weighed strongly in the decision even if biased, whereas aptitude tests and reference letters may be used only to the extent that they do not discriminate. We propose a novel information-theoretic decomposition of the total disparity into a non-exempt component, which quantifies the part of the disparity that cannot be accounted for by the critical features, and an exempt component, which quantifies the remaining disparity. Our decomposition enables explainability as well as selective removal of the non-exempt component if desired. We arrive at this decomposition through examples and counterexamples that enable us to first obtain a set of desirable properties that any measure of non-exempt disparity should satisfy. We then demonstrate that our proposed quantification of non-exempt disparity satisfies all of them. This decomposition leverages a body of work from information theory called Partial Information Decomposition (PID). We also explore applications such as feature selection and debiasing filter bubbles.
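
To make the decomposition concrete, here is a hedged sketch in standard PID notation (the symbols Z for the protected attribute, \hat{Y} for the model output, and X_c for the critical features are mine; the paper's exact non-exempt measure involves further refinements). Measuring the total disparity by the mutual information I(Z; \hat{Y}), the standard PID identity gives

    I(Z; \hat{Y}) \;=\; \mathrm{Uni}(Z : \hat{Y} \setminus X_c) \;+\; \mathrm{Red}(Z : \hat{Y}, X_c),

where the redundant information Red captures disparity that can already be explained through the critical features X_c, and the unique information Uni, i.e., the information \hat{Y} carries about Z that X_c cannot account for, is the natural starting point for a non-exempt component.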

Select Publication(s):

Fairness-Accuracy Tradeoffs: A Perspective from Mismatched Hypothesis Testing

Abstract: In this work, using a tool from information theory called Chernoff information, we derive fundamental limits on the tradeoff between fairness and accuracy that explain why the accuracy on a given dataset often decreases as fairness increases. Novel to this work, we examine the problem of fair classification through the lens of mismatched hypothesis testing, i.e., we are trying to find a classifier that distinguishes between two "ideal" distributions but are instead given two mismatched distributions that are biased. Based on this perspective, we contend that measuring accuracy with respect to the given (possibly biased) dataset is a problematic measure of performance. Instead, one should also consider accuracy with respect to an ideal dataset that is unbiased. We formulate an optimization problem to find such ideal distributions and show that it is feasible. Lastly, when the Chernoff information for one group is strictly less than that of another in the given dataset, we derive the information-theoretic criterion under which collecting more features can actually improve the Chernoff information and achieve fairness without compromising accuracy on the available data.
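
For context, Chernoff information is the textbook exponent governing the best achievable error decay in binary hypothesis testing (the notation P_0, P_1 for the two class-conditional distributions is mine):

    C(P_0, P_1) \;=\; -\min_{\lambda \in [0,1]} \log \left( \sum_{x} P_0(x)^{\lambda}\, P_1(x)^{1-\lambda} \right),

so the minimum probability of error of the optimal test on n i.i.d. samples decays roughly as \exp(-n\, C(P_0, P_1)). A group with smaller Chernoff information is therefore intrinsically harder to classify accurately, which is what drives the fairness-accuracy tension studied here.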

Work featured in New Scientist.

Select Publication(s):

Fairness & Privacy: Can Querying for Bias Leak Protected Attributes?

Abstract: Existing regulations often prohibit model developers from accessing protected attributes (gender, race, etc.) during training. This leads to scenarios where fairness assessments might need to be done on populations without knowing their memberships in protected groups. In such scenarios, institutions often adopt a separation between the model developers (who train their models with no access to the protected attributes) and a compliance team (who may have access to the entire dataset solely for auditing purposes). However, the model developers might be allowed to test their models for disparity by querying the compliance team for group fairness metrics. In this paper, we first demonstrate that simply querying for fairness metrics, such as statistical parity and equalized odds, can leak the protected attributes of individuals to the model developers. We demonstrate that there always exist strategies by which the model developers can identify the protected attribute of a targeted individual in the test dataset from just a single query. Furthermore, we show that one can reconstruct the protected attributes of all the individuals using techniques from compressed sensing. Our results raise an interesting debate in algorithmic fairness: Should querying for fairness metrics be viewed as a neutral way of ensuring compliance with regulations? Or does it constitute a violation of regulations and privacy if the number of queries answered is enough for the model developers to identify the protected attributes of specific individuals? To address this potential violation, we also propose Attribute-Conceal, a novel technique that achieves differential privacy by calibrating noise to the smooth sensitivity of our bias query function, outperforming naive techniques such as the Laplace mechanism. We also include experimental results on the Adult dataset and on synthetic data.
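
The core leakage mechanism can be seen in a toy example (a simplified two-query illustration under my own assumptions, not the paper's exact single-query attack): if the developer changes only the prediction for one targeted individual between two audits, the size of the resulting change in the statistical parity gap depends on which protected group that individual belongs to.

    import numpy as np

    def statistical_parity_gap(y_pred, z):
        """Difference in positive-prediction rates between groups z=1 and z=0."""
        return y_pred[z == 1].mean() - y_pred[z == 0].mean()

    rng = np.random.default_rng(0)
    z = rng.integers(0, 2, size=100)        # protected attribute, held only by the compliance team
    y_pred = rng.integers(0, 2, size=100)   # developer-controlled predictions

    target = 7                              # index of the targeted individual
    gap_before = statistical_parity_gap(y_pred, z)

    y_flipped = y_pred.copy()
    y_flipped[target] = 1 - y_flipped[target]   # flip only the target's prediction
    gap_after = statistical_parity_gap(y_flipped, z)

    # The gap moves by +/- 1/n_1 if the target is in group 1, and by -/+ 1/n_0
    # if in group 0; when the group sizes differ, the magnitude of the change
    # alone reveals the target's protected attribute.
    print(abs(gap_after - gap_before), 1 / (z == 1).sum(), 1 / (z == 0).sum())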

Select Publication(s):


Coded Computing: Fundamental Limits and Practical Strategies

Abstract: Faced with the saturation of Moore's law and the increasing dimensionality of data, system designers have increasingly resorted to parallel and distributed computing. However, distributed computing is often bottlenecked by a small fraction of slow processors, called "stragglers," which reduce the speed of computation because the fusion node has to wait for all processors to finish. My work proposes novel coding-theory-based solutions for speeding up distributed machine learning and large-scale computing on unreliable nodes prone to faults or stragglers.
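
A minimal sketch of the underlying idea, using the classic (3, 2) MDS-coded matrix-vector product (the splitting and worker assignment here are illustrative, not a specific scheme from the publications below): the product Ax can be recovered from any two of three workers, so a single straggler no longer delays the result.

    import numpy as np

    # Split A row-wise into two halves and add one "parity" block A1 + A2.
    A = np.arange(24, dtype=float).reshape(6, 4)
    x = np.ones(4)
    A1, A2 = A[:3], A[3:]
    tasks = {"w1": A1, "w2": A2, "w3": A1 + A2}   # three workers, any 2 of 3 suffice

    # Suppose worker w2 straggles; we only receive results from w1 and w3.
    r1 = tasks["w1"] @ x
    r3 = tasks["w3"] @ x

    # Decode: A2 @ x = (A1 + A2) @ x - A1 @ x, then stack to obtain A @ x.
    recovered = np.concatenate([r1, r3 - r1])
    assert np.allclose(recovered, A @ x)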

Select Publication(s):

       (Extended version at IEEE Transactions on Information Theory, Oct 2019)

Slow and Stale Gradients Can Win the Race: Error-Runtime Tradeoffs in Distributed SGD

Abstract: Distributed Stochastic Gradient Descent (SGD), when run in a synchronous manner, suffers from delays while waiting for the slowest learners (stragglers). Asynchronous methods can alleviate stragglers but cause gradient staleness that can adversely affect convergence. In this work, we present a novel theoretical characterization of the speedup offered by asynchronous methods by analyzing the trade-off between the error in the trained model and the actual training runtime (wallclock time). The novelty of our work is that our runtime analysis considers random straggler delays, which helps us design and compare distributed SGD algorithms that strike a balance between straggling and staleness. We also present a new convergence analysis of asynchronous SGD variants without bounded or exponential delay assumptions.
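
As a toy illustration of the runtime side of this trade-off (my own simplified simulation, assuming i.i.d. exponential straggler delays rather than the general delay models analyzed in the paper): a fully synchronous step must wait for the slowest of P workers, while an asynchronous update proceeds as soon as any one worker finishes.

    import numpy as np

    def mean_step_time(num_workers, num_trials=10000, seed=0):
        """Monte Carlo estimate of per-step wallclock time under i.i.d.
        exponential(1) worker delays: sync waits for the max, async for the min."""
        rng = np.random.default_rng(seed)
        delays = rng.exponential(1.0, size=(num_trials, num_workers))
        return delays.max(axis=1).mean(), delays.min(axis=1).mean()

    sync_time, async_time = mean_step_time(num_workers=16)
    # For exponential delays, E[max] = H_16 ≈ 3.38 while E[min] = 1/16 ≈ 0.06,
    # so async completes many more (possibly stale) updates per unit of wallclock time.
    print(sync_time, async_time)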

Select Publication(s):

Event Extraction for Natural Language Processing Using Graph Neural Networks

Select Publication(s):

Information Flow in Computational Systems

Select Publication(s):

Compressed Sensing & Sparse Reconstruction

Select Publication(s):

(Received Best Undergraduate Thesis Award for this work)