Arpita Biswas - Projects

Projects

Quantifying Effects of Solar Power Adoption on CO2 Emissions Reduction [Harvard School of Public Health]

In response to the imperative to curb carbon dioxide (CO2) emissions from fossil fuel power plants, we investigate how solar power adoption can be used strategically. We study solar energy's impact on CO2 reductions both within a region and in neighboring regions, using a data-driven distributed lag model. This research examines the delayed effects of additional solar power generation on CO2 emissions reduction and uncovers regional disparities. We also examine interregional connections, underscoring the potential for collaborative solar adoption efforts to collectively reduce CO2 emissions over the next decade. This study has the potential to provide vital guidance to policymakers and stakeholders working to meet CO2 emissions reduction targets within the U.S. power sector.
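
For readers unfamiliar with the model class, a generic distributed lag regression has the form below. The symbols are assumptions introduced for illustration: $y_{r,t}$ is CO2 emissions in region $r$ at time $t$, $x_{r,t-\ell}$ is solar generation $\ell$ periods earlier, and $L$ is the maximum lag. The specification actually used in the project may differ.

```latex
y_{r,t} = \alpha_r + \sum_{\ell=0}^{L} \beta_{r,\ell}\, x_{r,t-\ell} + \varepsilon_{r,t}
```

Under this form, a negative $\beta_{r,\ell}$ would indicate that additional solar generation is associated with lower emissions $\ell$ periods later.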

Fair Allocation of Conflicting Items [In collaboration with Northwestern University]

The problem of fair allocation of indivisible items becomes more complicated when certain pairs of items conflict with each other, so that both items of a conflicting pair cannot be allocated to the same agent. This setting is relevant in scenarios such as course allocation, where students (the agents) express preferences for courses (the items), and courses may have conflicting schedules, represented by an interval conflict graph. Additionally, courses have finite seat capacities, and students may have constraints on the number of courses they can enroll in. The goal is to obtain a fair and feasible allocation of items among the agents while ensuring that each allocated bundle constitutes an independent set within the interval conflict graph. While the problem is NP-hard under a general conflict graph, we devise efficient solutions when items are represented as intervals, that is, under an interval conflict graph. We investigate various fairness notions pertinent to this setting, such as maximin fairness and almost envy-freeness, and devise solutions using techniques tailored to different assumptions on the agents' preferences over a bundle of items: uniform additive, binary additive, identical additive, and non-identical (general) additive preferences.
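
The feasibility condition above, that each bundle forms an independent set in the interval conflict graph, can be checked with a short sketch. Representing a course as a half-open `(start, end)` time slot and the function name `is_conflict_free` are assumptions made for illustration, not part of the project.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # hypothetical (start, end) time slot, half-open


def is_conflict_free(bundle: List[Interval]) -> bool:
    """Return True if no two intervals in the bundle overlap.

    A bundle with no overlapping pair is an independent set in the
    interval conflict graph.
    """
    ordered = sorted(bundle)  # sort by start time (ties broken by end time)
    for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
        # After sorting, an overlap exists iff some interval starts
        # before its predecessor ends.
        if next_start < prev_end:
            return False
    return True


timetable = [(9.0, 10.5), (10.5, 12.0), (13.0, 14.0)]  # made-up course slots
print(is_conflict_free(timetable))                    # True
print(is_conflict_free(timetable + [(11.0, 13.5)]))   # False
```

Sorting first means only consecutive intervals need to be compared, which is one reason interval conflict graphs are easier to handle than general conflict graphs.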

Estimating the risk of missing required vaccination among infants in India, Mali, and Nigeria [Harvard CRCS Postdoctoral Fellow with GAVI grant]

Many children in low-income and middle-income countries fail to receive any routine vaccinations. There is little evidence on how to effectively and efficiently identify and target such ‘zero-dose’ (ZD) children. We examined how well predictive algorithms can characterize a child’s risk of being ZD based on predictor variables that are available in routine administrative data. We applied supervised learning algorithms with three increasingly rich sets of predictors and multiple years of data from India, Mali, and Nigeria. We assessed performance based on specificity, sensitivity, and the F1 score, and investigated feature importance. We also examined how performance decays when the model is trained on older data. For data from India in 2015, we further compared the inclusion and exclusion errors of the algorithmic approach with a simple geographical targeting approach based on district full-immunization coverage. We observed that cost-sensitive ridge classification correctly classifies most ZD children as being at high risk in most country-years (high sensitivity). Performance did not meaningfully increase when predictors were added beyond an initial sparse set of seven variables. Region and measures of contact with the health system (antenatal care and birth in a facility) had the highest feature importance. Model performance decreased as the time gap grew between the data on which the model was trained and the data to which it was applied (test data). The exclusion error of the algorithmic approach was about 9.1% lower than the exclusion error of the geographical approach. Furthermore, the algorithmic approach was able to detect ZD children across 176 more areas than the geographical rule, for the same number of children targeted. In summary, predictive algorithms applied to existing data can effectively identify ZD children and could be deployed at low cost to target interventions to reduce ZD prevalence and inequities in vaccination coverage.
This work is supported by the Global Alliance for Vaccines and Immunization (GAVI).
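
The evaluation metrics named above can be computed from binary labels as follows. This is a minimal sketch that assumes ZD children are the positive class (label 1); the function name and the toy labels are illustrative and not taken from the study.

```python
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Sensitivity, specificity, and F1 for binary labels (1 = zero-dose)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # ZD flagged as high risk
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # ZD missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-ZD flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-ZD not flagged

    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {"sensitivity": sensitivity, "specificity": specificity, "f1": f1}


# Made-up labels for eight children.
truth = [1, 1, 1, 0, 0, 0, 0, 0]
preds = [1, 1, 0, 1, 0, 0, 0, 0]
print(classification_metrics(truth, preds))
```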

Healthcare Intervention for Telehealth using Restless Multi-Armed Bandits [Harvard CRCS Postdoctoral Fellow]

In many public health settings, it is important to provide interventions to ensure that patients adhere to health programs, such as taking medications and attending periodic health checks. This is especially crucial among low-income communities that have limited access to preventive care information and healthcare facilities. To tackle this, a non-profit organization called ARMMAN conducts a telehealth program that provides free automated voice messages to spread preventive care information among pregnant women. One of the key challenges we tackle is to ensure that the enrolled women continue listening to the voice messages throughout their pregnancy and even after childbirth. Disengagement is detrimental to their health since they often have no other source of timely healthcare information. Systematic interventions, for example, scheduling in-person visits by healthcare workers, can help increase their listenership. However, interventions are often expensive and can be provided to only a small fraction of the enrolled women. We model this as a restless multi-armed bandit (RMAB) problem, where each beneficiary is assumed to transition from one state to another depending on the intervention provided to them. We establish convergence of our proposed algorithm when the transition probabilities are unknown a priori. On average, our method improves listenership by a factor of 1.64 over the state-of-the-art Myopic policy (which intervenes on those most likely to drop out of the program) and also outperforms other baselines. Additionally, challenges such as uncertain behavior dynamics, changing sets of patients, and disparity in effort distribution among health workers add further dimensions to the health intervention problem. I have developed solutions for constrained intervention planning using RMABs while tackling challenges related to unknown transition probabilities, robustness, streaming arms, and fairness.
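
The Myopic baseline mentioned above (intervene on those most likely to drop out) can be sketched as a top-k selection. The beneficiary identifiers, the drop-out probabilities, and the budget `k` are made-up values for illustration; this shows the baseline only, not the proposed algorithm.

```python
import heapq
from typing import Dict, List


def myopic_selection(dropout_prob: Dict[str, float], k: int) -> List[str]:
    """Pick the k beneficiaries with the highest estimated drop-out probability."""
    # Iterating a dict yields its keys; the key function ranks them by probability.
    return heapq.nlargest(k, dropout_prob, key=dropout_prob.get)


# Hypothetical one-step drop-out estimates for four beneficiaries.
estimates = {"b1": 0.10, "b2": 0.65, "b3": 0.40, "b4": 0.80}
print(myopic_selection(estimates, k=2))  # ['b4', 'b2']
```

Such a rule looks only one step ahead, which is why index-based RMAB policies that account for future state transitions can outperform it.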

Mobile Health Van Demand Prediction and Placement [Harvard CRCS Postdoctoral Fellow]

Mobile health vans play an important role in increasing access to preventive healthcare for patients from low-income populations. The effective functioning of mobile health vans requires an accurate prediction of the daily user demand (number of footfalls) at a particular location. In collaboration with The Family Van (TFV), we develop a novel methodology to predict future demand. We extract features from the daily user demand data provided by TFV, together with data curated from various public sources, such as weather, bike usage, and ferry usage. Empirical evaluation on a real-world dataset from TFV demonstrates that our AI-based method achieves 26.4% lower Root Mean Squared Error (RMSE) than the historical-average-based estimation presently employed by TFV. Our algorithm makes it possible for mobile clinics to plan proactively, rather than reactively. We are working towards leveraging these predictions to help The Family Van’s daily scheduling of staff and healthcare resources.
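
The comparison above rests on RMSE and on the relative reduction against a baseline, which can be computed as in the following sketch. The function names and the footfall numbers are hypothetical and are not drawn from the TFV dataset.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


def relative_reduction(baseline_rmse: float, model_rmse: float) -> float:
    """Percentage by which model_rmse is lower than baseline_rmse."""
    return 100.0 * (baseline_rmse - model_rmse) / baseline_rmse


# Made-up daily footfalls and two sets of predictions.
observed = [12.0, 18.0, 9.0, 15.0]
historical_average = [13.5, 13.5, 13.5, 13.5]
model_forecast = [11.0, 16.0, 10.0, 15.0]

base = rmse(observed, historical_average)
ours = rmse(observed, model_forecast)
print(f"RMSE reduction: {relative_reduction(base, ours):.1f}%")
```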

Visiting Researcher, Google Research (July to October 2020)
I worked with a non-profit organization that runs a call-based program for spreading maternal care information among pregnant women, targeted towards low-income households in India. The problem was to dynamically decide who should receive a personalized intervention (say, a visit by a healthcare worker), with the goal of improving the overall well-being of the pregnant women enrolled in the free call-based program.

We formulated the maternal health intervention problem as a restless multi-armed bandit (RMAB) problem with an unknown uncertainty model, where each beneficiary (i.e., each enrolled woman) is assumed to transition from one state to another depending on the intervention provided to them. If the transition probabilities are known beforehand, then one can compute Whittle Indices for indexable RMABs and use them to select the subset of women who would benefit most from the interventions. However, in practice, the transition probabilities are unknown a priori. We propose a Q-Learning-based mechanism that balances the explore-exploit trade-off and show that it converges to the Whittle Indices under the indexability assumption.
Our empirical evaluation demonstrates that the intervention scheme employed by our proposed mechanism significantly improves the engagement among beneficiaries compared to other benchmark algorithms.
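
For the known-transition case mentioned above, a Whittle Index can be approximated by searching for the passive-action subsidy at which intervening and not intervening are equally good. The sketch below assumes a two-state arm (0 = not engaged, 1 = engaged), a reward of 1 in the engaged state, a discount factor of 0.9, and made-up transition probabilities. It illustrates the standard definition and is not the Q-Learning mechanism proposed in the project.

```python
from typing import List

# p[a][s] is the probability of moving to state 1 (engaged) from state s
# under action a (a = 0: no intervention, a = 1: intervention).
Transitions = List[List[float]]


def q_values(p: Transitions, subsidy: float, beta: float = 0.9,
             sweeps: int = 500) -> List[List[float]]:
    """Value iteration on the subsidised arm; returns q[s][a]."""
    v = [0.0, 0.0]
    q = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(sweeps):
        for s in (0, 1):
            for a in (0, 1):
                cont = p[a][s] * v[1] + (1.0 - p[a][s]) * v[0]
                # Reward: 1 in the engaged state, plus the subsidy when passive.
                q[s][a] = float(s) + (subsidy if a == 0 else 0.0) + beta * cont
        v = [max(q[0]), max(q[1])]
    return q


def whittle_index(p: Transitions, state: int, beta: float = 0.9,
                  iters: int = 40) -> float:
    """Binary search for the subsidy at which both actions tie in `state`.

    Assumes the arm is indexable, so the passive action stays optimal
    for every subsidy above the returned value.
    """
    bound = 1.0 / (1.0 - beta)
    lo, hi = -bound, bound
    for _ in range(iters):
        mid = (lo + hi) / 2
        q = q_values(p, mid, beta)
        if q[state][0] >= q[state][1]:
            hi = mid  # passive already optimal: the index is at most mid
        else:
            lo = mid  # active still better: the index is above mid
    return (lo + hi) / 2


# Hypothetical arm on which an intervention raises the chance of engagement.
arm = [[0.2, 0.6], [0.6, 0.9]]
print(whittle_index(arm, state=0), whittle_index(arm, state=1))
```

Given such indices for every beneficiary, a Whittle Index policy intervenes on those whose current state has the largest index, subject to the intervention budget.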

Doctoral Thesis, Indian Institute of Science, India (2016-2020)
My doctoral thesis addresses fairness concerns that arise in the areas of Computational Social Choice Theory and Machine Learning. I have investigated fairness notions in three important contexts: (1) allocation of indivisible resources (for example, allocating computing resources among interested departments), (2) recommendation in two-sided platforms, and (3) classification problems (for example, predicting recidivism). The problem of fairly allocating indivisible items spurs challenging existential and algorithmic questions in the field of theoretical computer science, namely, computational social choice theory. My doctoral thesis significantly improves the theory of the fair allocation of indivisible goods, by establishing that fairness can be achieved in an efficient manner, even for a broad class of problems with structured set constraints. I have demonstrated the generality of this theory via various application domains, such as recommendation systems and college admissions. I have formalized novel fairness notions and provided a hierarchical relationship between the new and existing fairness notions. My work also elucidates the importance of auditing and mitigating unfairness in classification problems. These results put forward several simple abstractions for fair decision-making and connect the models to concrete applications where fairness is a core priority.

Research Intern, Microsoft Research Cambridge, UK (April to July 2019)
The broad focus of the internship was to understand fairness concerns that may arise while predicting the severity of mental-health conditions in patients. The learning problem was to predict the improvement in depression scores by observing behavioral data (obtained from an online platform that provides Cognitive Behavioral Therapy to over 200,000 users). Joint work with Danielle Belgrave, Sebastian Tschiatschek, Isabel Chien, Tim Regan, David Carter, Anja Thieme, Jan Stuehmer and Aditya Nori.

We defined appropriate fairness notions for the regression problem to audit the prediction models.
The fairness criteria ensured that a flag is raised whenever a model performs disproportionately worse for a particular sub-population (say, for patients who are more than 60 years old).
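
A flag of this kind can be sketched as a per-group error comparison. The choice of mean absolute error, the tolerance of 1.25, the group labels, and the scores below are all assumptions made for illustration; the fairness notions actually defined during the internship are not specified here.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def audit_by_group(y_true: Sequence[float], y_pred: Sequence[float],
                   groups: Sequence[str], tolerance: float = 1.25) -> List[str]:
    """Return groups whose mean absolute error exceeds tolerance x overall MAE."""
    errors = [abs(t - p) for t, p in zip(y_true, y_pred)]
    overall = sum(errors) / len(errors)

    per_group: Dict[str, List[float]] = defaultdict(list)
    for group, err in zip(groups, errors):
        per_group[group].append(err)

    return sorted(
        group for group, errs in per_group.items()
        if sum(errs) / len(errs) > tolerance * overall
    )


# Made-up score improvements, predictions, and age groups.
actual = [10, 12, 8, 15, 9, 11]
predicted = [9, 12, 9, 10, 14, 11]
age_group = ["<60", "<60", "<60", "60+", "60+", "<60"]
print(audit_by_group(actual, predicted, age_group))  # ['60+']
```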

Research Intern, Microsoft Research India (August to December 2018)
The main goal of the internship was to explore various fairness issues in classification problems, and to develop a prototype of an algorithmic fairness tool that can be used for detecting discrimination and ensuring fairness.

We formalized an appropriate fairness concept and provided an end-to-end method to ensure fairness with minimal compromise on accuracy, even under distributional changes between the training and test datasets. This is joint work with Suvam Mukherjee, Microsoft Research India.
We provided a theoretically sound explanation for the accuracy-fairness trade-off and conducted an extensive set of experiments to identify situations where it is better to simply maximize accuracy and situations where it is not. This is joint work with Amit Deshpande, Amit Sharma, Navin Goyal and Siddharth Barman.

Xerox Research Centre India (XRCI) (July 2014 to July 2016)

Ride Share Project: Developed the back-end of the “ride matching module” for the Urban Mobility Project which includes features like multi-modal trip planning, ride sharing and cost sharing of the rides.
Electric Vehicle infrastructure management: Designed solutions to find optimal locations for charging stations such that each station has sufficient demand, the queue waiting time is less than t minutes (t > 0), and there is at least one charging station within a distance of d km (d > 0).
Increasing utilization of park-and-charge stations: Designed a mechanism for reducing the overstaying time of electric vehicles at park-and-charge facilities by modeling user behavior using real data and introducing adaptive penalties.
Smart City Management: Designed an algorithm that uses gamification to incentivize people to engage in city management activities such as reporting manholes and traffic jams.
Skill Management System: This project aimed at building an online platform for facilitating skill gap analysis in an organization. It helps to maintain and update skill records for each employee, uses information about upcoming projects to determine the skills that are in shortage, and recommends skill training to the employees. My major contribution to this project was developing the module for recommending skills to employees while balancing personalized and popular recommendations.

During Master of Engineering (August 2012 to June 2014): Selected projects

Designed a truthful mechanism for the Budgeted Multi-armed Bandit problem with application to crowdsourcing (Post Graduation Thesis).
Designed a website for collecting alumni details, where crowdsourcing methods were implemented to create an alumni network (Game Theory course project in a group of three).
Achieved 94% correct predictions on the KDD Cup 2001 data set (Thrombin.dat) using feature selection methods and a combination of supervised learning methods, as part of the Data Mining course project.
Wrote a program for null pointer analysis over Jimple code as part of the Program Analysis and Verification course project.

During Bachelor of Technology (August 2008 to June 2012): Final year projects

Designed a web-based Learner’s Management System as part of the B.Tech final year project (in a group of four) under the supervision of Mr. Tamal Chakraborty, IEM.
Improved execution time for pattern search as a part of B.Tech final year research project (group of four) under the guidance of Mr. Tamal Chakraborty, IEM.
