ME 326 (CS 339R) Collaborative Robotics
Problem Statement
How do we make a robot a safe, effective collaborator and assistant for a human counterpart?
Methodology
We use the TidyBot++ base with a custom bimanual manipulation setup built on two 6-DoF WidowX arms. The TidyBot++ has two head-mounted Intel RealSense depth cameras and a LiDAR for localization. Our implementation uses a ROS2-based finite state machine to coordinate a full pick-and-place pipeline, translating voice commands into actions through Gemini-powered NLP and multi-modal perception. The system integrates YOLOv8n and MediaPipe for 3D object and hand localization, then navigates via SLAM Toolbox coupled with Navigation2 and executes a manipulation sequence that relies on numerical IK and collision checking.
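The coordinator's control flow can be sketched as a minimal finite state machine. The state names and the linear success-path transitions below are illustrative only, not the exact states or error handling of our ROS2 node:

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    PARSE_COMMAND = auto()  # Gemini-powered NLP step
    SCAN = auto()           # perception: YOLOv8n / MediaPipe
    NAVIGATE = auto()       # SLAM Toolbox + Navigation2
    PICK = auto()           # numerical IK + collision checking
    PLACE = auto()
    DONE = auto()

# Illustrative transition table: each state maps to its successor on success.
TRANSITIONS = {
    State.IDLE: State.PARSE_COMMAND,
    State.PARSE_COMMAND: State.SCAN,
    State.SCAN: State.NAVIGATE,
    State.NAVIGATE: State.PICK,
    State.PICK: State.PLACE,
    State.PLACE: State.DONE,
}

def run(state: State = State.IDLE) -> list[State]:
    """Step through the pipeline, recording each state visited."""
    visited = [state]
    while state is not State.DONE:
        state = TRANSITIONS[state]
        visited.append(state)
    return visited
```

In the real node each transition is triggered by subscriber callbacks and action-server results rather than a simple loop.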
Presentation » Link
GitHub Repository » Link
Task 1: Object Retrieval
We use natural language to request that a desired object be taken to a desired location (e.g. "Locate the banana in the scene and bring it back to origin."). Our robot can interpret the verbal command, scan and navigate the scene, locate the banana in a cluttered environment, navigate to the banana, pick it up with the manipulator, and place it at the desired destination.
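Localizing a detected object in 3D reduces to back-projecting the center of its YOLO bounding box, with the measured RealSense depth, through the pinhole camera model. The intrinsic values below are placeholders rather than our calibrated values (`pyrealsense2` exposes the same math as `rs2_deproject_pixel_to_point`):

```python
def deproject(u: float, v: float, depth_m: float,
              fx: float, fy: float, cx: float, cy: float) -> tuple[float, float, float]:
    """Back-project pixel (u, v) with measured depth into the camera frame
    using the pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)

# Placeholder intrinsics for a 640x480 depth stream (illustrative only).
FX, FY, CX, CY = 615.0, 615.0, 320.0, 240.0

# A detection centered on the principal point lies on the optical axis.
point = deproject(320.0, 240.0, 1.0, FX, FY, CX, CY)
```

The resulting camera-frame point is then transformed into the map frame (via TF2 in a ROS2 system) to produce a navigation and grasp target.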
Task 2: Sequential Task
We use natural language to request a sequence of actions to be performed in a given environment. For example, "Find the banana and find the basket, then place the banana in the basket."
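One way to represent such a request is as an ordered list of primitive actions that the state machine executes one at a time. The primitive verbs and the hand-written plan below are illustrative, since in our system the decomposition comes from the Gemini NLP step:

```python
from dataclasses import dataclass

@dataclass
class Action:
    verb: str      # e.g. "find", "pick", "place"
    target: str    # object or location name

def execute(plan: list[Action]) -> list[str]:
    """Run each primitive in order, returning a log of completed steps.
    In the real system each verb dispatches to navigation or manipulation."""
    return [f"{step.verb} {step.target}" for step in plan]

# Hypothetical decomposition of "place the banana in the basket".
plan = [Action("find", "banana"), Action("find", "basket"),
        Action("pick", "banana"), Action("place", "basket")]
```

Keeping the plan as flat, serializable steps makes it easy to log, replay, and resume a partially completed sequence.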
Task 3: Object Retrieval with Human Handoff
Using a natural language request, this task requires our robot to navigate a scene, find and retrieve an object, navigate toward a human, and safely hand off the object to the person's open palm.
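A simple way to gate the handoff is a geometric openness check on the MediaPipe hand landmarks: if every fingertip lies farther from the wrist than its corresponding knuckle, the fingers are extended and the palm is treated as open. The landmark indices follow MediaPipe's 21-point hand model; the heuristic itself is an illustrative sketch, not necessarily our exact release condition:

```python
import math

# MediaPipe 21-point hand model: wrist = 0; (tip, MCP knuckle) per finger.
WRIST = 0
FINGERS = [(8, 5), (12, 9), (16, 13), (20, 17)]  # index, middle, ring, pinky

def palm_is_open(landmarks: list[tuple[float, float, float]]) -> bool:
    """True if every fingertip is farther from the wrist than its knuckle,
    i.e. the fingers are extended rather than curled."""
    wrist = landmarks[WRIST]
    return all(
        math.dist(landmarks[tip], wrist) > math.dist(landmarks[mcp], wrist)
        for tip, mcp in FINGERS
    )
```

Gating the gripper release on this check (plus a proximity threshold from the depth camera) keeps the robot from dropping the object onto a closed or absent hand.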