Collaborative Robotics Final Group Project:
Robotic Dexterity: Principles and Practice | Stanford Graduate Mechanical Engineering
Our team successfully completed all four pick-and-place tasks, as shown below. The XArm manipulator detected the target object and zone in a cluttered scene, planned and executed a grasping policy (with a custom gripper in Task 3), traversed to the target zone, and safely placed the object in it.
Task 1: Box Pick/Place
Task 2: Peg Insertion
Task 3: Bill Pick/Place
Task 4: Coin Pick/Place
Object detection was a particularly challenging aspect of this project, requiring many design iterations before we achieved working object segmentation for each task. For Tasks 1 and 2, a broad segmentation over general HSV ranges for red, blue, and green was sufficient for robust detection of each object. For Tasks 3 and 4, however, the target objects (dollar bill and coin) were not uniform in color and were often confused with objects in the surrounding environment (e.g., the table). In these tasks, the unintentionally segmented environmental regions always appeared larger in the camera frame than the object itself. Therefore, taking the second-largest contour yielded accurate segmentation of the object of interest.
Gripper Design
Designing a gripper that could reliably grasp each object, from a rigid box and peg to a flat dollar bill and a thin coin, was a substantial mechanical design challenge in its own right.
Simulation
Our group adopted a simulation-first approach to testing our algorithms, allowing for rapid iteration and safer debugging.
After simulating the first task, however, the sim-to-real gap proved a significant bottleneck due to limitations in contact modeling, sensor fidelity, and the lack of realistic object deformation. Furthermore, the force-torque sensing pipeline did not function reliably in the Gazebo simulation environment.
As a result, the algorithms for each task were developed and tuned directly on the real robot.
Future Work
Integrate DenseTact-mini fingers with the gripper design for high-resolution tactile feedback.
Utilize more sophisticated models for object detection (e.g., YOLOv8/v11, Grounded-SAM2).
Use an anthropomorphic hand with more dexterous gripping capabilities (e.g., LEAP, Allegro).
Implement a physics-based simulation of each object, incorporating its material properties.