Key Projects

GAN-based 3D-Consistent Novel View Synthesis

Supervised by: Prof. Shubham Tulsiani (CMU)

Merged concepts from the pi-GAN and pixelNeRF papers to develop a GAN-based approach that generates a probability distribution over 3D-consistent novel views of a scene from a single view
The key insight was to modify the GAN architecture to use an image-conditioned NeRF model as the generator, paired with a discriminator, to learn a 360-degree, 3D-consistent probability distribution over objects

Supervised by: Prof. David Held (CMU)

Improved the performance of PlanT, a learning-based route planner for autonomous driving, by 4%
The key insights were incorporating the historical states of vehicles and keeping the frame of reference for vehicle state parameters consistent, which improved route optimization in complex environments

Supervised by: Prof. Mausam, Prof. Rohan Paul (IIT Delhi)

Using goal-reaching demonstrations as a source of weak supervision, built an imitation learning model to predict a set of grounded constraints necessary to accomplish a task, given a natural language instruction and the world state
Incorporated Graph Neural Networks (GNNs) to encode the environment state and an SBERT model to encode the language instruction
Applied goal-conditioned spatial attention over the GNN; predicted the necessary constraints auto-regressively using policy gradient

Supervised by: Prof. Rohan Paul (IIT Delhi)

Using an Extended Kalman Filter, estimated the state of a drone in simulation with noisy, non-linear motion and observation models
Designed a data association strategy for a single-sensor, two-vehicle simulation with differing noise characteristics
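
For readers unfamiliar with the filter, one predict/update cycle can be sketched as below. This is a minimal scalar illustration, not the project's code: the motion model `x + dt*sin(x)`, the observation model `x**2`, and the names `ekf_step`, `q`, and `r` are all assumptions chosen to keep the example short.

```python
import math

def ekf_step(x, P, z, q, r, dt=0.1):
    """One predict/update cycle of a scalar Extended Kalman Filter.

    x, P : prior state estimate and its variance
    z    : new measurement
    q, r : process and measurement noise variances (hypothetical values)
    """
    # Predict: push the mean through the (assumed) non-linear motion model
    # and propagate the variance through its Jacobian F = df/dx.
    x_pred = x + dt * math.sin(x)
    F = 1.0 + dt * math.cos(x)
    P_pred = F * P * F + q

    # Update: linearise the (assumed) observation model h(x) = x**2
    # around the predicted state, with Jacobian H = dh/dx.
    H = 2.0 * x_pred
    innovation = z - x_pred ** 2
    S = H * P_pred * H + r          # innovation variance
    K = P_pred * H / S              # Kalman gain
    x_new = x_pred + K * innovation
    P_new = (1.0 - K * H) * P_pred
    return x_new, P_new
```

In a real drone simulation the state is a vector and `F`, `H`, `P` become matrices, but the two-phase structure is the same.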

Supervised by: Prof. Mausam (IIT Delhi)

Built a logistic regression model using an ensemble of TF-IDF features and balanced class weights for multi-class text classification
Developed a deep neural classifier using bi-directional LSTMs and a modified F1 loss, achieving an F1 score of 88% (2% higher than the second-best in the batch) with faster training; used trainable GloVe embeddings for the initial text encoding
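
The exact modification made to the F1 loss is not described here. As an illustration only, a common differentiable surrogate is the macro "soft F1" loss, which replaces hard counts of true/false positives with sums of predicted probabilities. The function name and input layout below are assumptions.

```python
def macro_soft_f1_loss(probs, labels, eps=1e-8):
    """Macro-averaged soft F1 loss (a sketch, not the project's loss).

    probs  : list of per-sample rows of class probabilities
    labels : list of integer class indices, one per sample
    Returns 1 - mean(per-class soft F1), so lower is better.
    """
    n_classes = len(probs[0])
    f1_sum = 0.0
    for c in range(n_classes):
        tp = fp = fn = 0.0
        for row, y in zip(probs, labels):
            p = row[c]
            t = 1.0 if y == c else 0.0
            tp += p * t                  # probability mass on a true positive
            fp += p * (1.0 - t)          # mass wrongly assigned to class c
            fn += (1.0 - p) * t          # mass missing from the true class
        f1_sum += 2.0 * tp / (2.0 * tp + fp + fn + eps)
    return 1.0 - f1_sum / n_classes
```

Because the counts are sums of probabilities, the expression stays differentiable when written with tensors in a deep learning framework, which lets a network optimise an F1-like objective directly.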

Supervised by: Prof. Rijurekha Sen (IIT Delhi)

In collaboration with Dartmouth College, built a respiratory sound classifier model to study the harmful effects of Delhi’s air pollution on human respiration; reached a precision score of 94% in classifying sounds of cough, sneeze, and wheeze
Trained a random forest classifier on Google AudioSet using features like Mel-Frequency Cepstral Coefficients (MFCCs) and spectral entropy; employed bagging and boosting techniques with random under-sampling to address class imbalance

Supervised by: Prof. Mausam (IIT Delhi)

Developed an AI bot for a 2-player strategy game using the minimax algorithm with iterative deepening and alpha-beta pruning
Implemented search heuristics such as PV search and caching based on Zobrist hashing to find the best moves in minimal time
Experimented with temporal difference (TD) learning to learn parameters of the scoring heuristic for comparing different states
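
The search core described above can be sketched on a toy game. The example below uses Nim (take 1 to 3 stones, whoever takes the last stone wins) as a stand-in for the actual strategy game, which is not named here; PV search, Zobrist caching, and the learned scoring heuristic are omitted, and all function names are illustrative.

```python
def negamax(stones, depth, alpha, beta):
    """Alpha-beta search in negamax form.

    Returns the score for the side to move:
    +1 proven win, -1 proven loss, 0 unknown at the depth cutoff.
    """
    if stones == 0:
        return -1            # opponent took the last stone: side to move lost
    if depth == 0:
        return 0             # placeholder for a real scoring heuristic
    best = -2
    for take in (1, 2, 3):
        if take > stones:
            break
        score = -negamax(stones - take, depth - 1, -beta, -alpha)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:    # opponent already has a better option elsewhere
            break
    return best

def best_move(stones, max_depth):
    """Iterative deepening: search depth 1, 2, ... until the result is proven."""
    move, best_score = 1, 0
    for depth in range(1, max_depth + 1):
        best_score = -2
        for take in (1, 2, 3):
            if take > stones:
                break
            score = -negamax(stones - take, depth - 1, -1, 1)
            if score > best_score:
                best_score, move = score, take
        if best_score != 0:  # win or loss proven, deeper search is unnecessary
            break
    return move, best_score
```

For example, `best_move(5, 10)` takes one stone, leaving the opponent a pile of four, which is a lost position in this variant of Nim.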

Supervised by: Prof. Chetan Arora (IIT Delhi)

Calculated the camera calibration matrix for the webcam using multiple views of an ArUco marker; estimated the pose of the markers to obtain a homography matrix and rendered a 3D ball-shaped object with the same homography as the markers
Created a ping-pong AR game using the pose of two markers, such that the virtual ball bounces between the markers, which act as racquets

Supervised by: Prof. Maya Ramnath (IIT Delhi)

Developed an end-to-end, interactive web app for managing the US Patent Assignment dataset, which has 14.9M entries
Depending on their access rights, users can query or update the database and view data statistics in real time (implemented with SQL)
