Hi there! I'm Shuo Cheng (成硕), a Ph.D. student at Georgia Tech advised by Prof. Danfei Xu. I'm broadly interested in robotics, computer vision, and machine learning. My current focus is enabling robots to reason and act in complex, highly variable environments in order to accomplish long-horizon tasks. I'm open to discussions and collaborations, so please feel free to drop me an email if you are interested.

Before coming to Georgia Tech, I was very fortunate to work with Prof. Hao Su, Prof. Ravi Ramamoorthi, Prof. Lin Shao, and Prof. Bingbing Ni.

Featured Publications (Full List)

NOD-TAMP: Multi-Step Manipulation Planning with Neural Object Descriptors

Shuo Cheng, Caelan Garrett*, Ajay Mandlekar*, Danfei Xu

Under Review

CoRL 2023, Workshop on Learning Effective Abstractions for Planning (Oral presentation)


A TAMP-based framework featuring neural object descriptors, capable of learning from only a handful of brief demonstrations yet exhibiting robust performance in long-horizon tasks involving diverse object shapes, poses, and goal configurations.

Neural Field Dynamics Model for Granular Object Piles Manipulation

Shangjie Xue, Shuo Cheng, Pujith Kachana, Danfei Xu

CoRL 2023

ICRA 2023, Workshop on Representing and Manipulating Deformable Objects (Oral presentation, best paper finalist)


A new field-based representation (occupancy density) to model and optimize granular object manipulation.

LEAGUE: Guided Skill Learning and Abstraction for Long-Horizon Manipulation

Shuo Cheng, Danfei Xu

RA-L 2023

CoRL 2022, Long Horizon Planning Workshop (Oral presentation, best paper finalist)


We use Task and Motion Planning (TAMP) as guidance for learning generalizable and composable sensorimotor skills.

Learning to Regrasp by Learning to Place

Shuo Cheng, Kaichun Mo, Lin Shao

CoRL 2021


We propose a point-cloud-based system that enables robots to transform an initial grasp pose into a desired grasp pose, using stable object placements as intermediate waypoints. We also introduce a challenging synthetic dataset for training and evaluating the proposed approach.

Deep Stereo using Adaptive Thin Volume Representation with Uncertainty Awareness

Shuo Cheng*, Zexiang Xu*, Shilin Zhu, Zhuwen Li, Li Erran Li, Ravi Ramamoorthi, Hao Su (*equal contribution)

CVPR 2020 (Oral presentation, top 5.7% of accepted papers)


We design a novel multi-stage framework that progressively subdivides the vast scene space with increasing depth resolution and precision, enabling scene reconstruction with high completeness and accuracy in a coarse-to-fine fashion.

Fine-grained Video Captioning for Sports Narrative

Huanyu Yu*, Shuo Cheng*, Bingbing Ni*, Minsi Wang, Jian Zhang, Xiaokang Yang (*equal contribution)

CVPR 2018 (Spotlight presentation, top 6.7% of accepted papers)


We develop a well-modularized pipeline for automatic sports game narration. We incorporate human pose and optical flow to capture each player's movements, which facilitates action recognition and the modeling of human interactions.



Co-organizer: Learning for Task and Motion Planning Workshop, RSS 2023