Research Projects

Distributional Reinforcement Learning

Guide: Dr. Balaraman Ravindran, RBCDSAI, IIT Madras [B.Tech Project Thesis Link]

During my final semester, I worked on distributional reinforcement learning under the guidance of Dr. Balaraman Ravindran for my undergraduate thesis. Specifically, I examined how the additional information that distributional reinforcement learning provides over standard value-based methods, namely an estimate of the full return distribution rather than just its expectation, can be put to beneficial use. My thesis focused on the performance side of the project: using this extra information to better address the most fundamental problem in RL, the explore vs. exploit dilemma. As an extension of this project, I am currently combining my proposed techniques with others to demonstrate their broader applicability and publish the results.
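The thesis itself is linked above; the sketch below only illustrates the general idea of using distributional information for exploration. Instead of acting greedily on the expected return, an agent can act on an upper quantile of its learned return distribution, which biases it towards uncertain but potentially rewarding actions. The quantile representation, the optimistic_action helper, and the toy numbers are illustrative assumptions, not the specific method from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def optimistic_action(quantiles, tau=0.9):
    """Pick the action whose learned return distribution has the
    highest tau-quantile (optimism in the face of uncertainty).

    quantiles: array of shape (n_actions, n_quantiles), each row holding
               sorted quantile estimates of that action's return.
    """
    idx = int(tau * (quantiles.shape[1] - 1))
    upper = quantiles[:, idx]          # tau-quantile of each action's return
    return int(np.argmax(upper))

# Toy example: 3 actions, 51 quantile estimates each (e.g. from a QR-DQN-style learner).
n_actions, n_quantiles = 3, 51
means = np.array([1.0, 1.2, 1.1])
spreads = np.array([0.1, 0.1, 1.0])    # action 2 is uncertain / high-variance
quantiles = np.sort(
    rng.normal(means[:, None], spreads[:, None], size=(n_actions, n_quantiles)),
    axis=1,
)

greedy = int(np.argmax(quantiles.mean(axis=1)))   # standard expected-value greedy choice
explore = optimistic_action(quantiles, tau=0.9)   # quantile-optimistic choice
print("greedy action:", greedy, "| optimistic action:", explore)
```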

Multi-Armed Bandits

Guide: Dr. Arun Rajkumar, CSE Dept., IIT Madras [PBMAB Findings and Results Summary]

In a preference-based bandits problem, arms are played against each other, with the outcome of each duel (win/loss) modelled as a Bernoulli random variable. The field grew largely out of the observation that people are often unable to assign an accurate numeric value to an outcome, yet can compare two options quite easily, so results that rely on data collected from users tend to be more accurate when built on such pairwise comparisons.

My main focus has been on ranking a set of arms so that the number of upsets is minimized, where an upset is a pair of arms B1, B2 such that B1 is ranked higher but B2 beats B1 with probability greater than 50%. Even when all pairwise win/loss probabilities are known, finding a ranking that minimizes the number of upsets is NP-hard, so ranking algorithms typically find an optimal ranking under the assumption that the pairwise probabilities satisfy certain structural conditions. The link above summarises the algorithm we have developed and how it relaxes the restrictions imposed by current work in the field. I am now testing whether the ranking algorithm developed under this relaxed set of restrictions works in practice, on synthetic and real-world datasets, to confirm the practical applicability of the results.
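As a concrete illustration of the objective only (not the algorithm in the linked summary), the sketch below counts upsets for a candidate ranking given a matrix of pairwise win probabilities, and builds a simple Copeland-style baseline ranking by the number of opponents each arm beats with probability above 0.5. The probability matrix is made up for the example.

```python
import numpy as np

def count_upsets(ranking, P):
    """Number of pairs (i, j) where i is ranked above j but P[j, i] > 0.5,
    i.e. j would beat i more often than not."""
    upsets = 0
    for pos, i in enumerate(ranking):
        for j in ranking[pos + 1:]:
            if P[j, i] > 0.5:
                upsets += 1
    return upsets

def copeland_ranking(P):
    """Rank arms by how many opponents they beat with probability > 0.5.
    A heuristic baseline; minimizing upsets exactly is NP-hard in general."""
    wins = (P > 0.5).sum(axis=1)
    return list(np.argsort(-wins))

# Toy pairwise win-probability matrix: P[i, j] = probability that arm i beats arm j.
P = np.array([
    [0.5, 0.7, 0.6],
    [0.3, 0.5, 0.8],
    [0.4, 0.2, 0.5],
])

ranking = copeland_ranking(P)
print("ranking:", ranking, "| upsets:", count_upsets(ranking, P))
```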

Parameterized Algorithms for Red-Blue Weighted Vertex Cover on Trees [arXiv]

In the weighted vertex cover problem, each vertex is assigned a weight, and the goal is to find a vertex cover that minimizes the total weight of the chosen vertices. We initially explored how various standard kernelization techniques for vertex cover can be generalised to the weighted setting. We then aimed to find an algorithm for the Red-Blue Weighted Vertex Cover problem, in which every vertex is coloured red or blue and the selected vertex cover must obey a strict bound on the number of red vertices it may include.
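For context, the sketch below shows the standard dynamic program for (unconstrained) minimum-weight vertex cover on a tree, which the red-blue variant builds on. The tree and weights are arbitrary toy data, and the red-budget constraint studied in the work is not handled here.

```python
def min_weight_vertex_cover_tree(n, edges, weight, root=0):
    """Standard tree DP: for each vertex v,
    take[v] = min cover weight of v's subtree if v is in the cover,
    skip[v] = min cover weight if v is not (then every child must be)."""
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Iterative DFS to get a parent array and an order with parents before children.
    parent = [-1] * n
    order = []
    seen = [False] * n
    seen[root] = True
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                parent[v] = u
                stack.append(v)

    take = [0.0] * n
    skip = [0.0] * n
    for u in reversed(order):          # process leaves before their parents
        take[u] = weight[u]
        skip[u] = 0.0
        for v in adj[u]:
            if v != parent[u]:
                take[u] += min(take[v], skip[v])
                skip[u] += take[v]     # if u is skipped, child v must be in the cover
    return min(take[root], skip[root])

# Toy tree: a path 0-1-2-3 with cheap middle vertices.
print(min_weight_vertex_cover_tree(4, [(0, 1), (1, 2), (2, 3)], [5.0, 1.0, 1.0, 5.0]))  # 2.0
```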

TCS Research & Innovation Labs

Intern, Hyderabad, India

Developed algorithms based on locality sensitive hashing for identifying genetic sequences, improving the running time of existing approaches on very large databases without a significant decrease in accuracy or an increase in the number of samples required (a generic sketch of the underlying idea follows below).

May '18 to July '18
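As a generic illustration of the locality sensitive hashing idea (MinHash over k-mer sets), not the specific algorithms developed during the internship: near-identical sequences can be matched by comparing short signatures of their k-mer sets instead of the full sequences. The k-mer length, number of hash functions, and toy sequences are arbitrary choices for this sketch.

```python
import hashlib

def kmers(seq, k=8):
    """Set of all length-k substrings (k-mers) of a DNA sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def minhash_signature(items, num_hashes=64):
    """MinHash signature: for each of num_hashes salted hash functions,
    keep the minimum hash value over the item set."""
    sig = []
    for h in range(num_hashes):
        salt = str(h).encode()
        sig.append(min(
            int(hashlib.md5(salt + item.encode()).hexdigest(), 16)
            for item in items
        ))
    return sig

def estimated_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

query = "ACGTACGTGGCTAGCTAGGATCCAGTACGT"
reference = "ACGTACGTGGCTAGCTAGGATCCAGTTCGT"   # near-identical sequence
unrelated = "TTTTGGGGCCCCAAAATTTTGGGGCCCCAA"

sq = minhash_signature(kmers(query))
print("vs reference:", estimated_jaccard(sq, minhash_signature(kmers(reference))))
print("vs unrelated:", estimated_jaccard(sq, minhash_signature(kmers(unrelated))))
```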