CtRNet-X: Camera-to-Robot Pose Estimation in Real-world Conditions Using a Single Camera

Jingpei Lu*, Zekai Liang*, Tristin Xie, Florian Ritcher, Shan Lin, Sainan Liu, Michael C. Yip

CTRNet-X

CtRNet-X is a novel framework capable of estimating the robot pose with partially visible robot manipulators. Our approach leverages the Vision-Language Models for fine-grained robot components detection, and integrates it into a keypoint-based pose estimation network, which enables more robust performance in varied operational conditions. 

Robot mask (blue) rendered based on the estimated robot pose on DRIOD robot learning dataset.


Our demo video

final_demo_5.mp4