Robotics and Optimization for Analysis of Human Motion (ROAHM) lab
I worked at the ROAHM lab from fall 2024 through the end of my winter 2025 semester. I primarily worked on two projects: VLM-based object localization and grasping, and computing wrench spaces from visual demonstrations to automate a cell manufacturing process.
In this project, I developed a vision-based approach to compute wrench spaces—the set of forces and torques applied during manipulation—from real-world demonstrations in a cell manufacturing environment.
Unlike traditional physics-based models, which may not reflect the variability in human handling, this method captures visual data of human-object interactions and estimates the 6-DoF pose of the object relative to the hand. From these estimates, we compute the corresponding wrenches, enabling the system to learn practical force limits directly from human demonstrations.
Through multiple trials and visual observations, we derive empirically bounded wrench spaces that are both safe and adaptable for robotic systems—particularly for parallel-jaw grippers used in manufacturing. After evaluating several object and hand pose estimation methods, we adopted the UMI gripper system for reliable data collection.
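The core wrench computation is simple once poses and contact forces are known. A minimal sketch (function and frame names are illustrative, not from the actual pipeline, which estimates forces from visual pose trajectories):

```python
import numpy as np

def contact_wrench(force_world, contact_point_world, object_origin_world):
    """Wrench (force, torque) about the object frame origin produced by a
    single contact force applied at a known contact point.
    All quantities are expressed in the world frame."""
    f = np.asarray(force_world, dtype=float)
    r = np.asarray(contact_point_world, dtype=float) - np.asarray(object_origin_world, dtype=float)
    tau = np.cross(r, f)             # torque = lever arm x force
    return np.concatenate([f, tau])  # 6-vector [fx, fy, fz, tx, ty, tz]

# Example: 10 N push along +x, applied 0.1 m above the object origin
w = contact_wrench([10.0, 0.0, 0.0], [0.0, 0.0, 0.1], [0.0, 0.0, 0.0])
```

Stacking such 6-vectors over many demonstrations and taking per-axis bounds is one simple way to get an empirically bounded wrench space.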
This work enhances robot safety, performance, and generalization in manipulation tasks, bridging the gap between human intuition and robotic control.
ROB 535
This work builds on BEVFormer, a state-of-the-art framework for generating a 2D bird's-eye-view representation of the environment around a vehicle. We use these outputs as the inputs to our planner, which is a model predictive controller augmented with control barrier functions (hence MPC-CBF) for obstacle avoidance. Click here for code and implementation details.
At a high level, our contribution was as follows:
Utilize classical computer vision techniques to extract ego vehicle and obstacle positions from BEVFormer output
Mathematically transform the detected obstacles into the static world frame
Compute bounding ellipses for each obstacle
Assign a control barrier function (CBF) to each obstacle
Develop a model predictive control (MPC) formulation to plan a path from the ego vehicle's current location to an arbitrary goal location, integrating the CBFs from the previous step for obstacle avoidance. We used CasADi for the trajectory optimization.
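To make the CBF step concrete: for each bounding ellipse we define a barrier value h that is positive outside the obstacle, and the MPC enforces a discrete-time CBF condition at every horizon step. The full planner used CasADi; this standalone numpy sketch only illustrates the constraint with made-up numbers (the gamma value and obstacle geometry here are illustrative):

```python
import numpy as np

def ellipse_cbf(state, center, a, b):
    """Barrier value for a bounding ellipse: h > 0 outside, h < 0 inside."""
    dx, dy = state[0] - center[0], state[1] - center[1]
    return (dx / a) ** 2 + (dy / b) ** 2 - 1.0

def cbf_step_ok(x_k, x_next, center, a, b, gamma=0.5):
    """Discrete-time CBF condition h(x_{k+1}) >= (1 - gamma) * h(x_k).
    In the MPC this is imposed as a hard constraint at every horizon step,
    which keeps the planned trajectory outside each bounding ellipse."""
    return ellipse_cbf(x_next, center, a, b) >= (1.0 - gamma) * ellipse_cbf(x_k, center, a, b)

obstacle = np.array([2.0, 0.0])   # ellipse center (illustrative)
# A step that skirts the obstacle satisfies the condition;
# a step that dives straight toward it does not.
safe = cbf_step_ok(np.array([0.0, 1.5]), np.array([0.5, 1.5]), obstacle, 1.0, 1.0)
unsafe = cbf_step_ok(np.array([0.0, 0.0]), np.array([1.2, 0.0]), obstacle, 1.0, 1.0)
```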
CSE 592 & ROAHM Lab
Given a set of posed RGB-D frames captured from a calibrated camera, the objective is to generate a set of feature-enhanced Gaussian representations that facilitate natural language-based querying for autonomous robotic manipulation tasks. It is assumed that the synthesized novel views provide sufficient information to associate the localized Gaussians with stable grasp points.
We extend this pipeline by creating a similar environment in PyBullet, featuring a Franka Emika FR3 7-DoF serial-link manipulator and a selection of commonly encountered household objects from the YCB dataset. Click here to read the entire report.
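The language-querying step reduces to ranking Gaussians by the similarity of their attached features to a text embedding. A minimal sketch, assuming per-Gaussian feature vectors and a text feature (e.g. from a CLIP-style encoder) are already available:

```python
import numpy as np

def query_gaussians(gaussian_feats, text_feat, top_k=5):
    """Rank feature-enhanced Gaussians by cosine similarity to a text query
    embedding; returns indices of the best matches and all similarities.
    Shapes: gaussian_feats (N, D), text_feat (D,)."""
    g = gaussian_feats / np.linalg.norm(gaussian_feats, axis=1, keepdims=True)
    t = text_feat / np.linalg.norm(text_feat)
    sims = g @ t
    return np.argsort(-sims)[:top_k], sims

# Toy data: plant one Gaussian whose feature is aligned with the query
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 8))
feats[42] = np.ones(8)
idx, sims = query_gaussians(feats, np.ones(8), top_k=3)
```

In the real pipeline the top-ranked Gaussians would then be associated with stable grasp points, as described above.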
Barton Research Group
6D pose estimation is a crucial task in robot manipulation, enabling robots to accurately determine the position and orientation of objects for effective grasping and manipulation. In this project, we address the problem of 6D pose estimation for object grasping using a Microsoft Azure Kinect DK RGB-D camera mounted on the end-effector of an iiwa robotic arm.
We propose a method that extracts a point cloud of the object of interest from the RGB-D data, performs global registration to estimate the 6D pose, and then refines the pose with point-to-point Iterative Closest Point (ICP) against a point cloud pre-generated from the object's corresponding STL file. Before running ICP on the two point clouds, we apply a few random perturbations to the orientation of the point cloud generated from the CAD model. These random perturbations improve the precision of pose estimation by exploring various initial orientations of the target point cloud.
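The perturb-and-refine idea can be sketched in a few dozen lines. This is a self-contained toy version (brute-force nearest neighbours, tiny clouds) rather than the actual implementation, which operated on real sensor clouds:

```python
import numpy as np

def best_rigid_transform(A, B):
    """Kabsch algorithm: rotation R and translation t minimizing ||R A + t - B||
    for paired point sets A, B of shape (N, 3)."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:        # guard against a reflection
        Vt[-1] *= -1
        R = Vt.T @ U.T
    return R, cb - R @ ca

def icp(source, target, iters=30):
    """Point-to-point ICP with brute-force nearest-neighbour matching
    (fine for small clouds). Returns the aligned cloud and mean residual."""
    src = source.copy()
    for _ in range(iters):
        d = np.linalg.norm(src[:, None] - target[None], axis=2)
        matches = target[d.argmin(axis=1)]
        R, t = best_rigid_transform(src, matches)
        src = src @ R.T + t
    d = np.linalg.norm(src[:, None] - target[None], axis=2)
    return src, d.min(axis=1).mean()

def icp_with_perturbations(model, scene, n_tries=8, seed=0):
    """Re-run ICP from several randomly rotated initial orientations of the
    CAD-model cloud (plus the unperturbed one) and keep the lowest-residual
    result, reducing the chance of getting stuck in a local minimum."""
    rng = np.random.default_rng(seed)
    inits = [np.eye(3)]
    for _ in range(n_tries - 1):
        axis = rng.normal(size=3)
        axis /= np.linalg.norm(axis)
        ang = rng.uniform(0, np.pi)
        K = np.array([[0, -axis[2], axis[1]],
                      [axis[2], 0, -axis[0]],
                      [-axis[1], axis[0], 0]])
        inits.append(np.eye(3) + np.sin(ang) * K + (1 - np.cos(ang)) * K @ K)
    best_cloud, best_err = None, np.inf
    for R0 in inits:
        aligned, err = icp(model @ R0.T, scene)
        if err < best_err:
            best_cloud, best_err = aligned, err
    return best_cloud, best_err

# Toy stand-in for the CAD-model cloud; the scene is a rigidly moved copy
model = np.array([[0., 0, 0], [1, 0, 0], [0, 2, 0],
                  [0, 0, 3], [1, 2, 0], [1, 0, 3]])
th = 0.15
Rz = np.array([[np.cos(th), -np.sin(th), 0],
               [np.sin(th),  np.cos(th), 0],
               [0, 0, 1.]])
scene = model @ Rz.T + np.array([0.1, 0.05, 0.0])
_, err = icp_with_perturbations(model, scene)
```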
Our approach aims to provide accurate and reliable 6D pose estimation, facilitating precise robot-object interactions in various manipulation tasks. Experimental results demonstrate the effectiveness of our method in achieving accurate pose estimation, thereby enhancing the robot's grasping capabilities.
The 6D pose estimation pipeline is then augmented with a custom grasp planning algorithm GraspMixer, for a parallel-jaw gripper.
Click here for code and implementation details. Click here for documentation about hardware experiments.
Linear Feedback Systems
The goal of this project is to implement a Kalman Filter to estimate the rotation dynamics of a satellite, focusing on the angular position and gyro bias using noisy sensor measurements from a star tracker and gyroscope. The underlying model of the satellite’s rotation is described in continuous time, which reflects the natural dynamics of satellite motion, while the measurements from the sensors are sampled in discrete intervals, mirroring real-world data collection scenarios. The filter was designed to handle real-world challenges such as constant gyroscopic bias and Gaussian white noise in measurements. Through systematic simulation, I set the satellite to undergo a constant rotation, with true conditions unknown to the estimator to mimic practical scenarios.
The project includes detailed visualizations such as the comparison of estimated values with true and measured data, error dynamics over time, and the effectiveness of bias estimation. These plots not only demonstrated the efficacy of the Kalman Filter in reducing estimation errors but also highlighted its critical role in enhancing the reliability of satellite navigation systems. The implementation and subsequent analysis provided valuable insights into the adaptive capabilities of Kalman Filters in complex, noisy environments.
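The filter structure described above can be written compactly: the state is the angle and the gyro bias, the prediction step integrates the gyro reading minus the estimated bias, and the star-tracker angle measurement corrects both states. A minimal sketch with assumed noise levels and a simulated constant rotation (not the project's actual parameters):

```python
import numpy as np

def run_kf(gyro, star, dt, q_theta=1e-6, q_bias=1e-8, r_star=1e-2):
    """Discrete Kalman filter for state x = [angle, gyro_bias].
    The gyro enters as a control input; the bias feeds back into the angle."""
    F = np.array([[1.0, -dt], [0.0, 1.0]])
    B = np.array([dt, 0.0])
    H = np.array([[1.0, 0.0]])      # star tracker observes angle only
    Q = np.diag([q_theta, q_bias])
    x = np.zeros(2)
    P = np.eye(2)
    for u, z in zip(gyro, star):
        # predict: integrate gyro reading, propagate covariance
        x = F @ x + B * u
        P = F @ P @ F.T + Q
        # update with the star-tracker angle measurement
        y = z - H @ x
        S = H @ P @ H.T + r_star
        K = (P @ H.T) / S
        x = x + (K * y).ravel()
        P = (np.eye(2) - K @ H) @ P
    return x

# Simulated constant rotation with a constant gyro bias (assumed values)
rng = np.random.default_rng(1)
dt, n, omega, bias = 0.1, 2000, 0.05, 0.01
theta_true = omega * dt * np.arange(1, n + 1)
gyro = omega + bias + rng.normal(0, 1e-3, n)   # gyro reads rate + bias + noise
star = theta_true + rng.normal(0, 0.1, n)      # noisy absolute angle
x_hat = run_kf(gyro, star, dt, r_star=0.1**2)
```

The estimated bias converges toward the true constant bias even though the estimator never sees it directly, which is exactly the behaviour the plots in the project demonstrate.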
ROB 550 Course work
The purpose of this project is to gain an understanding of SLAM and how it can be employed in autonomous indoor navigation tasks, such as in a warehouse environment.
The botlab project spanned a wide range of tasks, from simple odometry to autonomously performing pick-and-place tasks using SLAM and a custom-built forklift mechanism.
We employed gyrodometry (sensor fusion between the IMU and wheel odometry) to obtain a robust odometry estimate (when compared against ground truth). We implemented Monte Carlo localization with custom action and sensor models, wrote the mapping code in C++, and finally put it all together to perform Simultaneous Localization and Mapping (SLAM) using an RPLiDAR.
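The gyrodometry idea fits in a few lines: trust wheel odometry by default (no gyro drift), but switch to the gyro rate whenever the two sources disagree sharply, which indicates wheel slip or a bump. A minimal sketch with an illustrative threshold (not our tuned value):

```python
def gyrodometry_step(theta, omega_odo, omega_gyro, dt, threshold=0.125):
    """One heading update: use the wheel-odometry rate unless it disagrees
    with the gyro rate by more than `threshold` rad/s, in which case the
    gyro rate is used instead."""
    rate = omega_gyro if abs(omega_gyro - omega_odo) > threshold else omega_odo
    return theta + rate * dt

# Smooth motion: rates agree, odometry rate (0.50) is integrated
theta_smooth = gyrodometry_step(0.0, 0.50, 0.52, 0.01)
# Wheel slip: odometry spikes to 1.50, so the gyro rate (0.50) is used
theta_slip = gyrodometry_step(0.0, 1.50, 0.50, 0.01)
```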
In addition, we performed camera calibration with an 8x11 checkerboard pattern on a 5MP Arducam to enable visual servoing for tasks such as picking and placing crates/boxes of set dimensions in the workspace, simulating a warehouse scenario.
The robot, also known as the 'mbot', is powered by a low-level Pico microcontroller, while an NVIDIA Jetson Nano runs computationally intensive tasks such as path planning algorithms like A* and visual servoing. This particular mbot is based on a differential-drive platform, on which a simple yet robust PID controller has been implemented.
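For reference, the A* planner mentioned above follows the textbook grid formulation. A minimal sketch on a 4-connected occupancy grid (the mbot implementation ran in C++ on the Jetson; this toy version just shows the algorithm):

```python
import heapq
import itertools

def astar(grid, start, goal):
    """A* on a 4-connected occupancy grid (1 = obstacle, 0 = free).
    Manhattan distance is an admissible heuristic for 4-connected motion."""
    rows, cols = len(grid), len(grid[0])
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])
    tie = itertools.count()          # breaks heap ties without comparing nodes
    open_set = [(h(start), next(tie), 0, start, None)]
    came_from, g_cost = {}, {start: 0}
    while open_set:
        _, _, g, cur, parent = heapq.heappop(open_set)
        if cur in came_from:
            continue                 # already expanded with a better cost
        came_from[cur] = parent
        if cur == goal:
            path = []
            while cur is not None:   # walk parents back to the start
                path.append(cur)
                cur = came_from[cur]
            return path[::-1]
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nxt = (cur[0] + dr, cur[1] + dc)
            if 0 <= nxt[0] < rows and 0 <= nxt[1] < cols and not grid[nxt[0]][nxt[1]]:
                ng = g + 1
                if ng < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = ng
                    heapq.heappush(open_set, (ng + h(nxt), next(tie), ng, nxt, cur))
    return None                      # goal unreachable

grid = [[0, 0, 0, 0],
        [1, 1, 1, 0],
        [0, 0, 0, 0]]
path = astar(grid, (0, 0), (2, 0))
```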
ROB 550 Course work
This project focused on developing autonomy for a 5-DOF robotic arm to manipulate a variety of objects using integrated computer vision, kinematics, and motion planning techniques. I engineered an end-to-end manipulation pipeline that fused sensing, reasoning, and actuation modules to achieve robust object interaction.
Key Contributions:
Implemented forward and inverse kinematics for a 5-DOF manipulator.
Performed rigid-body transformations using homogeneous transformation matrices.
Enabled object grasping and manipulation through motion control and joint coordination.
Calibrated a depth camera and integrated 3D image data for perception.
Detected and localized objects using OpenCV and depth sensing technologies.
Designed a finite state machine to orchestrate robot behavior across tasks.
Developed path planning and trajectory smoothing algorithms to ensure safe and efficient motion.
Outcome:
Achieved a modular robotic manipulation pipeline capable of autonomously sensing, planning, and acting in structured environments.
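As an illustration of the kinematics portion, forward kinematics for a serial manipulator is just a chain of homogeneous transforms. A minimal sketch using standard Denavit-Hartenberg parameters on a made-up 2-link planar arm (the actual 5-DOF arm's parameters are not reproduced here):

```python
import numpy as np

def dh_transform(theta, d, a, alpha):
    """Standard Denavit-Hartenberg homogeneous transform for one joint."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, a * ct],
                     [st,  ct * ca, -ct * sa, a * st],
                     [0,        sa,       ca,      d],
                     [0,         0,        0,      1]])

def forward_kinematics(joint_angles, dh_params):
    """Chain the per-joint transforms; returns the 4x4 base-to-end-effector
    homogeneous transformation. dh_params rows are (d, a, alpha)."""
    T = np.eye(4)
    for q, (d, a, alpha) in zip(joint_angles, dh_params):
        T = T @ dh_transform(q, d, a, alpha)
    return T

# Illustrative 2-link planar arm (d = alpha = 0), both link lengths 1
dh = [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
T = forward_kinematics([np.pi / 2, -np.pi / 2], dh)
ee_position = T[:3, 3]   # matches a1*cos(q1)+a2*cos(q1+q2), etc.
```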
Final Year BE Project -
This project primarily focuses on the localization and control problem for the holonomic class of mobile robots. It presents computationally lightweight, high-rate localization and navigation techniques for a three-wheeled omnidirectional mobile robot using wheel encoders and kinematic relations. It offers insight into an elegant and pragmatic approach to navigating the robot from point A to point B in a defined workspace, enabling the robot to autonomously reach a target position and orientation defined by the user. The 'Go to Pose' algorithm is implemented for both single-waypoint and multi-waypoint navigation. The experimental results reinforce the robustness of the algorithm, which incorporates a PID controller.
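The control loop behind 'Go to Pose' can be sketched with just the proportional term: rotate the world-frame position error into the body frame, command body-frame velocities (a holonomic robot can translate and rotate independently), and integrate. Gains and the toy simulation below are illustrative, not the tuned values from the paper:

```python
import numpy as np

def go_to_pose_step(pose, goal, kp_lin=1.0, kp_ang=1.5, dt=0.02):
    """One proportional-control step of a 'Go to Pose' loop for a holonomic
    robot with pose (x, y, theta)."""
    x, y, th = pose
    ex, ey = goal[0] - x, goal[1] - y
    eth = np.arctan2(np.sin(goal[2] - th), np.cos(goal[2] - th))  # wrap to [-pi, pi]
    # rotate the position error into the body frame
    vx = np.cos(th) * ex + np.sin(th) * ey
    vy = -np.sin(th) * ex + np.cos(th) * ey
    v = kp_lin * np.array([vx, vy])
    w = kp_ang * eth
    # integrate the commanded body-frame velocity in the world frame
    x += (np.cos(th) * v[0] - np.sin(th) * v[1]) * dt
    y += (np.sin(th) * v[0] + np.cos(th) * v[1]) * dt
    th += w * dt
    return np.array([x, y, th])

# The loop drives the robot from the origin to an arbitrary goal pose
pose = np.array([0.0, 0.0, 0.0])
goal = np.array([1.0, 0.5, np.pi / 4])
for _ in range(600):
    pose = go_to_pose_step(pose, goal)
```

Multi-waypoint navigation then just runs this loop for each waypoint in sequence. The full controller adds the integral and derivative terms of the PID.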
This work led to a conference publication at IEEE DISCOVER. DOI: https://doi.org/10.1109/DISCOVER52564.2021.9663644
The challenges included: design and construction of the three-wheeled omnidirectional mobile robot, the odometry sensor network, and PID tuning (exploring the result space).
Winner 2nd place in National Finals 2019-20 - Indian Institute of Technology, Bombay
Recently, we have witnessed many natural disasters in India and across the world. One such disaster is flooding. Monsoon floods have been rampant across the country, so providing relief and rescue operations has been at the forefront of activities. Most of these operations revolve around evacuating residents and livestock to safer places, over and above providing sacks of food, medical aid, and other necessities to affected areas where people are stranded.
With the incessant downpour of rain in recent times and metropolitan cities getting flooded, e-Yantra came up with a flood disaster management theme called "Supply Bot". The idea is to build a robot to provide the affected district or city with aid. The aid to be received by the district is communicated wirelessly to the bot using an overhead imaging camera that emulates a satellite.
Once the bot has identified the beacon for help, the release of food, medical aid, or other goods necessary for surviving floods or evacuation is carried out. The package of supplies required by the district or city is embodied (metaphorically) in the beacon. Moving this package as close as possible to the affected district or city is the primary task of the Supply Bot. If the capital of the state is affected, the bot has to attend to it first.
The challenges in this theme included designing and building a robot from the basic components given, Python programming, image processing, and embedded C programming.
Designed and Implemented
A health-care monitoring system aimed at paralytic patients was developed, comprising several sensor modules and actuators. The project consisted of a glove housing electronics that continuously monitored the patient's vital signs, such as body temperature and beats per minute (BPM). This data was logged to a real-time database at fixed intervals, then retrieved from the database, visualized remotely in graphs and other formats, and used to alert doctors, nurses, and guardians accordingly.
The existing system communicates over GSM, which is not very efficient; the proposed system uses an ESP8266 for communication, which is much more efficient. The proposed system's output can be viewed by the doctor or guardian on a computer screen, and the OLED can be built into a wrist watch given to the concerned person to monitor the heartbeat continuously. Where the existing system uses a separate LM35 temperature sensor, the proposed system uses an MPU6050, which can also measure temperature. The proposed system also aims to use RFID tags that store all information about a particular patient, so the doctor or guardian can simply scan a tag to obtain detailed patient information. This makes the system more secure and easier to handle.
Of all the domains that need the utmost attention, health care seemed to crave a bit more. In India, most common people cannot afford expensive hospital facilities, and if their health conditions are not addressed within the stipulated time, the patient cannot be cured. Moreover, paralytic patients face great difficulty in performing day-to-day activities: due to their restricted body movement, they struggle and take time to convey even simple messages, such as telling their guardian when they are thirsty or need to use the washroom. This project aims to eliminate this communication barrier by assigning simple functions to hand tilts using the electronic sensors and actuators discussed above. To keep track of vital signs such as temperature and heart rate, all of these health signatures are pushed to a real-time cloud database (Firebase). The same information is retrieved automatically and given to doctors on a timely basis, so that even when no immediate check on the patient is required, the doctor or nurse can monitor the health of paralytic patients in real time.
Control of mobile robots is a vast and diverse field within robotics and automation. We set out to design an entirely autonomous robot and successfully did so. The project's objective is to develop a robot that follows an n-sided polygonal path, given the (x, y) coordinates of the polygon's vertices up front. These coordinates are taken in as system variables, and a PID controller governs the robot's traversal of the desired trajectory.
This project was developed as practical exposure to the course "Control of Mobile Robots" by Prof. Magnus Egerstedt of the Georgia Institute of Technology.
Two-factor authentication (2FA) is a security process in which the user provides two different authentication factors to verify themselves, protecting both the user's credentials and the resources the user can access. 2FA provides a higher level of assurance than methods that depend on single-factor authentication, in which the user provides only one factor, predominantly a password or passcode. This 2FA method relies on the user providing a biometric scan as well as a second factor: a passcode or one-time password (OTP). Passwords only provide proof of knowledge, whereas biometrics offer a unique advantage by identifying the user by "who they are" rather than "what they know" or "what they have". The scanning process starts when one places a finger on the glass surface and a CCD camera takes a picture; the scanner has its own light source, typically an array of light-emitting diodes, to illuminate the ridges of the finger. OTP generation methods predominantly make use of pseudo-randomness, making it difficult for an attacker to predict successive OTPs. An SMS-based OTP is sent to the user's registered mobile number via a Global System for Mobile communication (GSM) module. This adds an additional layer of security to the authentication process, since knowing the victim's password alone is not sufficient to obtain full access.
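For context on the OTP side, the widely used HMAC-based OTP scheme (RFC 4226, the basis of most OTP generators) derives each code from a shared secret and a counter. This is a standard reference sketch, not the exact generator used in the project:

```python
import hashlib
import hmac
import struct

def hotp(secret: bytes, counter: int, digits: int = 6) -> str:
    """HMAC-based one-time password (RFC 4226). The server and the token
    share `secret`; each authentication consumes one counter value, so an
    intercepted code cannot be replayed."""
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                                  # dynamic truncation
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 4226 Appendix D test vector: secret "12345678901234567890"
otp0 = hotp(b"12345678901234567890", 0)   # "755224"
otp1 = hotp(b"12345678901234567890", 1)   # "287082"
```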
Two-factor authentication has long been used to control access to sensitive security systems and data, and online service providers are increasingly using 2FA to protect their users' credentials from hackers who have stolen passwords or used phishing campaigns to obtain them.
Presented a conference paper at the International Conference on Recent Trends on Science and Technology - 2020 based on this work.