Some of the interesting stuff I've worked on, in roughly chronological order.

Conventional robotic mapping algorithms have used occupancy grids to maintain volumetric maps. However, occupancy grids make very strong, incorrect assumptions about the independence of cells when raytracing sensor observations. These errors are magnified when rays are glancing or the sensor data is very noisy. With MRFMaps I present a means of incorporating forward sensor noise models by explicitly associating all voxels traversed by a ray using a Markov random field. Multiple rays from multiple camera views explicitly couple information and are then used to infer occupancy using belief propagation on the GPU. Here's a quick five-minute spotlight video of the RSS 2020 paper! Please visit the linked page for code!
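The single-ray version of this coupling can be sketched as exact sum-product belief propagation on a chain of binary occupancy variables. This is a toy illustration with made-up potentials, not the paper's GPU implementation: the unaries encode a simple forward sensor model (likely free before the depth return, likely occupied at it), and a weak pairwise potential couples neighbouring voxels.

```python
import numpy as np

def ray_unaries(n_voxels, hit_idx, p=0.7):
    """Unary potentials [P(free), P(occupied)] from one ray's forward model."""
    phi = np.full((n_voxels, 2), 0.5)
    phi[:hit_idx] = [p, 1 - p]      # space before the return: likely free
    phi[hit_idx] = [1 - p, p]       # at the measured depth: likely occupied
    return phi                      # voxels behind the return stay unknown

def chain_bp(phi, smooth=0.6):
    """Exact sum-product on a chain: forward/backward messages, then beliefs."""
    n = len(phi)
    psi = np.array([[smooth, 1 - smooth],
                    [1 - smooth, smooth]])  # neighbouring voxels tend to agree
    fwd = np.ones((n, 2))
    bwd = np.ones((n, 2))
    for i in range(1, n):                   # messages towards the ray's far end
        m = (phi[i - 1] * fwd[i - 1]) @ psi
        fwd[i] = m / m.sum()
    for i in range(n - 2, -1, -1):          # messages back towards the sensor
        m = psi @ (phi[i + 1] * bwd[i + 1])
        bwd[i] = m / m.sum()
    belief = phi * fwd * bwd
    return belief / belief.sum(axis=1, keepdims=True)

phi = ray_unaries(10, hit_idx=6)
beliefs = chain_bp(phi)   # beliefs[i, 1] = P(voxel i occupied)
```

In the full system, rays from multiple views share voxels, so the graph is no longer a chain and the messages couple information across observations; this chain case just shows the mechanics.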

A key insight I acquired while deploying downward-facing tracking for MAVs is that the faster you track, the easier it is for Lucas-Kanade to work: the per-frame displacement is smaller, and the brightness-constancy assumption is better satisfied. In this paper we push conventional downward-facing frame-to-frame odometry to be as accurate as possible by exploiting the locally planar ground assumption and the accurate pitch and roll estimates available from onboard IMUs in a dense, direct homography estimation formulation. The loss in consistency incurred by not using a local windowed optimisation algorithm is more than offset by the robustness and accuracy of our high-speed frame-to-frame tracking in challenging real-world situations, as can be seen in the attached video.
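The planar-ground formulation rests on the homography induced between two views of a plane, H = K (R - t nᵀ / d) K⁻¹; with roll and pitch known from the IMU, most of R is fixed and far fewer parameters remain to estimate. A toy numpy sketch with illustrative intrinsics and motion (not the paper's solver):

```python
import numpy as np

# Assumed pinhole intrinsics (focal length 400 px, centre at 320x240).
K = np.array([[400., 0., 320.],
              [0., 400., 240.],
              [0., 0., 1.]])

def planar_homography(R, t, n, d, K):
    """Homography mapping pixels of a plane n.X = d between two views."""
    return K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)

# Pure sideways translation over flat ground viewed straight down:
R = np.eye(3)                   # IMU says no roll/pitch change, no yaw
n = np.array([0., 0., 1.])      # ground-plane normal in the camera frame
t = np.array([0.1, 0., 0.])     # camera moved 0.1 m to the right
H = planar_homography(R, t, n, d=1.0, K=K)  # flying 1 m above the ground

p = np.array([320., 240., 1.])  # a pixel at the principal point
q = H @ p
q /= q[2]                       # ground texture shifts by f * tx / d = 40 px
```

Dense direct tracking then searches over the remaining motion parameters for the warp that best aligns the two frames' intensities.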

Reasoning directly in the space of distributions over the world enables operating with varying-fidelity, application-specific representations. Building off of my prior MI tracking work, in this paper we cast localization as maximizing the log-likelihood of the sensor data (depth) having been sampled from an underlying world distribution (for convenience, described by GMMs). To account for observability issues we use a particle filter and get very reasonable real-time performance! Head over to Aditya's website to see how some of these ideas have evolved.
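A toy sketch of the scoring step (an illustrative 2-D version with made-up names, not the paper's code): transform the sensed points by each particle's candidate pose, evaluate their log-likelihood under the GMM world model, and use that as the particle's weight.

```python
import numpy as np

def gmm_loglik(points, mix, means, covs):
    """Sum over points of log sum_k w_k N(p; mu_k, Sigma_k), 2-D case."""
    total = 0.0
    for p in points:
        comp = []
        for w, mu, cov in zip(mix, means, covs):
            diff = p - mu
            norm = 1.0 / (2 * np.pi * np.sqrt(np.linalg.det(cov)))
            comp.append(w * norm * np.exp(-0.5 * diff @ np.linalg.inv(cov) @ diff))
        total += np.log(sum(comp))
    return total

def pf_update(particles, scan, mix, means, covs):
    """Reweight particles (here just x,y offsets) by how well the
    transformed scan matches the GMM world model, then normalise."""
    logw = np.array([gmm_loglik(scan + p, mix, means, covs)
                     for p in particles])
    w = np.exp(logw - logw.max())   # subtract max for numerical stability
    return w / w.sum()

# World model: a single Gaussian blob at the origin.
means, covs, mix = [np.zeros(2)], [np.eye(2)], [1.0]
scan = np.array([[0.1, 0.0], [-0.1, 0.1]])          # depth points, body frame
particles = np.array([[0., 0.], [5., 5.]])          # candidate poses
weights = pf_update(particles, scan, mix, means, covs)
```

The particle at the origin, whose transformed scan lies on the Gaussian, dominates the weight; a real implementation would also resample and propagate particles through a motion model.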

To navigate efficiently in real-world unstructured environments using visual odometry, it is imperative that the odometry algorithms be robust to dynamic lighting changes. However, conventional visual odometry is predicated on the illumination-constancy assumption remaining valid, which is exactly what is violated in flight. In this paper, inspired by prior work, I present Mutual Information as an alternative cost function that treats images as spatial distributions of intensities, and tracking as a process of minimizing the divergence between these relative distributions. This paper won the Best Student Paper award at SSRR '16!
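As a rough illustration of the idea (not the paper's implementation), mutual information between two images can be estimated from their joint intensity histogram. Because MI depends only on the statistical co-occurrence of intensities, not on their absolute values, it is far less sensitive to lighting changes than a direct photometric error.

```python
import numpy as np

def mutual_information(img_a, img_b, bins=32):
    """MI of two equally sized images from their joint intensity histogram."""
    joint, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
    pxy = joint / joint.sum()               # joint distribution p(a, b)
    px = pxy.sum(axis=1, keepdims=True)     # marginal p(a)
    py = pxy.sum(axis=0, keepdims=True)     # marginal p(b)
    nz = pxy > 0                            # avoid log(0) terms
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())
```

Tracking then searches over warps of one image and keeps the warp that maximises MI against the other.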

After graduating from my undergrad in June 2012 (with distinction!), I headed to CMU again as a Summer Scholar, and later as Research Staff for the Fall, under Prof. Drew Bagnell and Prof. Martial Hebert on the BIRD MURI project. The project envisions getting mini Unmanned Aerial Vehicles to autonomously navigate through densely cluttered environments like forests using only monocular vision. I implemented a mouse-based controller for the ARDrone, co-developed a ROS driver for the ARDrone 2.0, implemented a PD-controller-based trajectory follower and calibrated it in a VICON motion-capture arena, and integrated various systems to run state-of-the-art imitation learning algorithms on the drones.

While on this project, I presented this research poster and contributed to a paper presented at ICRA '13.

I continued this research as an MS student at RI with a more locally deliberative approach that used receding horizon control on an Arducopter platform with custom onboard hardware, which led to an FSR journal paper.

In the Summer of 2011 I had the unique opportunity of working at the Robotics Institute at Carnegie Mellon on the Cooperative Robotic Watercraft project. The project envisioned deploying a large number of inexpensive boats to provide situational awareness for flood relief and to serve varied purposes such as environmental sampling.

During the Summer I developed the communication backbone for the boats, using the still-in-alpha ROSJava and ActionLib to create a platform-independent communications library, and contributed to overall systems development. As part of the Robotics Institute Summer Scholars (RISS) program, I also presented my first research poster and contributed as a co-author to an extended abstract and a workshop paper at AAMAS '12.

At the Delhi Technological University, during my Junior year, I was the head developer for Image Processing on the Unmanned Aerial Systems team. In 2011, the year I participated in the Annual AUVSI Student UAS competition held at Pax River Naval Air Station, USA, we won the Director's Safety Award and placed seventh overall. Our journal paper was adjudged the third-best entry in the competition. The problem statement was to get a UAS to autonomously take off, follow certain waypoints, and then hover over a search area to autonomously recognize scattered wooden targets in various geometric shapes with letters within. The UAS was to beam back 'actionable intelligence': the shape of each target, the letter contained within, their corresponding colours, and the GPS coordinates.

In our system, I decided to use a single-axis, roll-compensated gimbal carrying a digital still camera attached to an onboard SBC as the acquisition system. After identifying colour as the main discriminative feature, I implemented a morphological gradient operation for detecting shape boundaries, which, after connected-components detection and subsequent filtering, allowed me to narrow down the candidate target blobs. Analysis of the hue plane allowed me to distinguish between the shape and letter boundaries. The shapes were recognized using a novel ray-tracing procedure, while the letters were recognized using a discriminative classifier based on Hu moments. This allowed us to correctly identify all target shapes (though not the letters) present in the 2011 AUVSI SUAS competition for the first time, contributing significantly to the team's standing. Here's a paper that we wrote for the competition.
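The front end of that pipeline can be sketched roughly as follows (a scipy-based toy version with assumed thresholds; the hue-plane analysis, the ray-tracing shape recognizer, and the Hu-moment letter classifier are omitted). A morphological gradient (dilation minus erosion) highlights shape boundaries, connected-component labelling then isolates candidate blobs, which are filtered by area.

```python
import numpy as np
from scipy import ndimage

def candidate_blobs(gray, thresh=10, min_area=8):
    """Return labels of boundary blobs large enough to be targets."""
    # Morphological gradient: bright ring wherever intensity changes sharply.
    grad = (ndimage.grey_dilation(gray, size=(3, 3))
            - ndimage.grey_erosion(gray, size=(3, 3)))
    mask = grad > thresh                    # keep strong boundaries only
    labels, n = ndimage.label(mask)         # connected-components detection
    areas = ndimage.sum(mask, labels, index=range(1, n + 1))
    return [i + 1 for i, a in enumerate(areas) if a >= min_area]

# A bright square target on dark ground produces one boundary blob:
img = np.zeros((20, 20))
img[5:11, 5:11] = 100.
blobs = candidate_blobs(img)
```

In the real system the surviving blobs were then split into shape and letter boundaries via the hue plane before classification.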


For my Machine Learning project we tackled a standard reinforcement learning task: learning to follow a demonstrated trajectory for a simulated helicopter, the nose-in-funnel manoeuvre. The essential idea was to hybridize iLQR-based methods with policy-search-based methods. The former achieves faster performance by virtue of linearization, but accumulates error quadratically. The latter evaluates the true cost-to-go by storing policies rather than value functions and rolling forward through the system dynamics at every iteration. We presented our implementation at the end of the class in the adjoining video and also submitted this paper as part of the project.
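The iLQR half of that hybrid repeats one building block per iteration: linearise the dynamics around the current rollout, solve a backward Riccati recursion for time-varying feedback gains, and roll forward again. For already-linear dynamics that step reduces to finite-horizon LQR, sketched here on a toy double integrator (illustrative model and cost weights, not our helicopter simulator):

```python
import numpy as np

def lqr_gains(A, B, Q, R, horizon):
    """Backward Riccati recursion; returns gains ordered t = 0 .. T-1."""
    P = Q.copy()
    gains = []
    for _ in range(horizon):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

dt = 0.1
A = np.array([[1., dt], [0., 1.]])   # double-integrator dynamics
B = np.array([[0.], [dt]])
Q, R = np.eye(2), 0.1 * np.eye(1)    # penalise state error and effort

x = np.array([1., 0.])               # start 1 unit away from the reference
for K in lqr_gains(A, B, Q, R, horizon=100):
    x = A @ x + (B @ (-K @ x)).ravel()   # closed-loop forward rollout
```

Around the hover point the helicopter dynamics are close to this linear regime, which is why iLQR's linearization works so well there.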

The performance of the algorithm wasn't better than that obtained with the iLQR method on the given trajectory, possibly because the chosen trajectory kept the dynamics close to linear around the hover point, which suits iLQR well.

For our Mechanics of Manipulation project we tackled autonomous pile manipulation on a robotic-arm testbed. Our approach was to phrase the problem as finding a ranked list of desirable actions at each time step. Using human-labelled data to encode the latent reward function and control, we learnt an ordering over object-action pairs discriminatively by formulating it as a Ranking SVM problem. We successfully tested our proposed method in an end-to-end robotic manipulation system, which is demonstrated in this video and reported in this paper.
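The Ranking SVM reduction can be sketched as follows (a toy subgradient version with made-up features, not the project's solver): every labelled preference between two object-action pairs becomes a margin constraint w·(x_better - x_worse) ≥ 1 on the difference of their feature vectors, and the learnt w then scores and ranks new candidates.

```python
import numpy as np

def rank_svm(pairs, dim, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the pairwise hinge loss + L2 regulariser."""
    w = np.zeros(dim)
    for _ in range(epochs):
        for better, worse in pairs:
            d = better - worse
            if w @ d < 1:                   # margin violated: step towards d
                w += lr * (d - lam * w)
            else:                           # satisfied: only shrink w
                w -= lr * lam * w
    return w                                # rank candidates by descending w @ x

# Two labelled preferences where the first feature signals desirability:
pairs = [(np.array([1., 0.]), np.array([0., 1.])),
         (np.array([2., 1.]), np.array([1., 2.]))]
w = rank_svm(pairs, dim=2)
```

At run time, every feasible object-action pair is featurised, scored with w, and the top-ranked action is executed.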

For my Senior project in the Spring of 2012, I decided to work on autonomous grasping with a 5-degree-of-freedom arm that was lying around in one of the labs; the initial aim was to use visual servoing in the planner. I created a 3D CAD model of the arm and imported it into OpenRAVE by defining its custom XML file. I then worked on interfacing the OpenRAVE controller with the real-world robot, but since the arm had a USB controller, the only way to talk to it was through an undocumented proprietary DLL, which paucity of time (in trying to do too much at once!) didn't allow me to implement. Here's a very rushed thesis that I wrote for this project.

At Hi-Tech Robotics Systemz, Gurgaon, in the Spring of 2012, I worked for a short while on creating a photorealistic 3D environment for teleoperating mobile vehicles by fusing imagery with 3D laser rangefinder data, with the long-term aim of enabling autonomy.

During my internship I performed extrinsic calibration of a 3D laser rangefinder and a camera. I first simulated the entire process in Gazebo/ROS and later ran my code on a Velodyne HDL-64E laser rangefinder and a GoPro camera.

The corresponding blog post is linked here.

I used to update this blog regularly before starting my MS work. Unfortunately, disclosure constraints prevented me from continuing to post about my research. Nevertheless, it's fun to look back on my chronicle of my very first efforts at developing image processing algorithms for aerial vehicles and my initial forays into MS research.