Ultimately, I want embodied agents to understand and reason about the physical world in an inherently 3D manner, much like humans do. My goal is to develop agents that not only understand 3D geometry and semantics but can also ground and fuse multimodal concepts, such as language, in the 3D world. Specific research topics of interest include 4D reconstruction, scene flow, and vision-language reasoning.
VLA-3D: A Dataset for 3D Semantic Scene Understanding and Navigation
RSS 2024, SemRob Workshop
Haochen Zhang, Nader Zantout, Pujith Kachana, Zongyuan Wu, Ji Zhang, Wenshan Wang
Neural Field Dynamics Model for Granular Object Piles Manipulation
CoRL 2023
ICRA 2023, Representing and Manipulating Deformable Objects Workshop
(Oral Presentation, Best Paper Finalist)
Shangjie Xue, Shuo Cheng, Pujith Kachana, Danfei Xu
[ARXIV] [VIDEO] [PROJECT PAGE]
Persistent Pick: Enhanced Grasping with Tactile Feedback
AMLC 2023, Robot Learning Workshop
(Oral Presentation)
Pujith Kachana, Nathalie Hager, Taskin Padir