Sawyer Robotic Arm:
The Sawyer robotic arm is a single-arm collaborative robot designed for tasks that require precision, flexibility, and adaptability. Developed by Rethink Robotics, it features seven degrees of freedom (DOF) and a lightweight design, making it suitable for complex movements and confined spaces.
RealSense Depth Camera:
Intel’s RealSense Depth Camera is a line of devices designed to capture 3D depth information by using stereoscopic vision and other depth-sensing technologies. It can calculate the distance of objects in its field of view with high accuracy.
Parallel Gripper:
A parallel gripper is an end-effector mounted on a robotic arm to grasp and manipulate objects. The gripper used here is fitted with silicone anti-slip covers to improve its grip on objects.
AR Tags:
Fiducial markers placed strategically in the workspace to provide the reference frames needed for coordinate transformations.
Hardware Setup:
Sawyer Robot & Gripper: A standard Sawyer robotic arm with a parallel gripper, mounted on a work surface.
Camera Mount: A RealSense camera positioned above the workspace. Although the camera's optical axis is not perfectly orthogonal to the work surface, careful placement minimizes positional error.
Printed AR Tags: Placed in the camera’s field of view to aid in coordinate transformations.
Key Software Components:
Python Scripts:
capture.py: Captures images from the RealSense camera.
upload_img.py: Uploads captured images to Google Colab for classification.
download_img.py: Downloads classification results from Google Colab.
main.py: Executes the main sorting logic based on classification and coordinates.
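The scripts themselves are not shown in this document, so the following is a minimal sketch of how upload_img.py and download_img.py might package images for a text-based handoff to Google Colab. The JSON field names and base64 encoding are assumptions for illustration, not the project's actual wire format.

```python
import base64
import json


def encode_image_for_upload(image_bytes, frame_id):
    # Package raw image bytes as base64 inside JSON so they survive a
    # text-based transfer to the Colab notebook (field names illustrative).
    return json.dumps({
        "frame_id": frame_id,
        "image_b64": base64.b64encode(image_bytes).decode("ascii"),
    })


def decode_uploaded_image(payload):
    # Inverse step, as the Colab side might perform before running inference.
    msg = json.loads(payload)
    return msg["frame_id"], base64.b64decode(msg["image_b64"])
```

A round trip through these two helpers returns the original bytes unchanged, which is the property any such handoff scheme must preserve.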
ROS Nodes:
Joint Trajectory Action Server: Controls the Sawyer arm's movements.
MoveIt Configuration: Handles motion planning for the Sawyer arm with the electric gripper.
Camera Display: Visualizes the camera feed for monitoring.
Image Capture and Processing:
The RealSense camera continuously publishes images to a ROS topic.
capture.py subscribes to this topic, captures images, and triggers upload_img.py to send images to Google Colab.
Google Colab processes the images with ViLD ("Open-vocabulary Object Detection via Vision and Language Knowledge Distillation") and returns classification results with 2D bounding boxes.
download_img.py retrieves these results and publishes them to another ROS topic for further processing.
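The classification results arrive as labels with 2D bounding boxes, and the downstream transformation step needs a single pixel coordinate per object. A small sketch of that reduction, assuming each detection is a dict with a `label` and a `[x_min, y_min, x_max, y_max]` box (the exact result format is an assumption):

```python
def bbox_centers(detections):
    # Each detection is assumed to look like:
    #   {"label": str, "box": [x_min, y_min, x_max, y_max]}
    # Return (label, pixel-center) pairs for the coordinate-transform step.
    results = []
    for det in detections:
        x0, y0, x1, y1 = det["box"]
        results.append((det["label"], ((x0 + x1) / 2.0, (y0 + y1) / 2.0)))
    return results
```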
Coordinate Transformation and Motion Planning:
AR tags are detected using the ar_track_alvar ROS package, which provides the reference frames needed to map 2D image coordinates into the robot's 3D base frame.
The system computes a transformation matrix that converts 2D bounding-box centers into 3D positions at a fixed height along the Z-axis.
main.py interprets these coordinates and sends motion commands to the Sawyer arm via the Joint Trajectory Action Server.
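Because the objects lie on a plane at a known height, the pixel-to-base-frame mapping can be fitted from a handful of AR-tag correspondences. A minimal sketch, assuming an affine (least-squares) fit rather than the project's exact transformation matrix:

```python
import numpy as np


def fit_plane_homography(pixel_pts, robot_xy):
    # Least-squares affine map from pixel (u, v) to robot (x, y).
    # Valid because all objects sit on a plane at a known, fixed Z.
    # pixel_pts and robot_xy come from AR-tag correspondences.
    A = np.hstack([np.asarray(pixel_pts, float),
                   np.ones((len(pixel_pts), 1))])
    M, *_ = np.linalg.lstsq(A, np.asarray(robot_xy, float), rcond=None)
    return M  # 3x2 matrix


def pixel_to_base(M, u, v, z_fixed=0.05):
    # Apply the fitted map to a bounding-box center and attach the
    # fixed grasp height (z_fixed is an illustrative value in meters).
    x, y = np.array([u, v, 1.0]) @ M
    return float(x), float(y), z_fixed
```

Four or more non-collinear tag correspondences are enough to fit the map; a full homography would additionally handle perspective distortion if the camera tilt were significant.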
Grasping and Sorting:
The Sawyer arm moves above the target object, descends vertically to grasp it, and then transports the item to the appropriate bin based on its classification.
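The grasp-and-sort routine in main.py is not reproduced here; the sketch below generates the waypoint sequence the paragraph describes (hover, vertical descent, grasp, lift, transport, release). The command tuples and the hover offset are hypothetical stand-ins for the actual Joint Trajectory Action Server calls.

```python
def plan_pick_and_place(obj_xyz, bin_xyz, hover=0.15):
    # Hypothetical waypoint generator mirroring the sorting sequence:
    # hover above the object, descend straight down, close the gripper,
    # lift, travel to the bin chosen by classification, and release.
    x, y, z = obj_xyz
    bx, by, bz = bin_xyz
    return [
        ("move", (x, y, z + hover)),     # pre-grasp pose above the object
        ("move", (x, y, z)),             # vertical descent to grasp height
        ("close_gripper", None),
        ("move", (x, y, z + hover)),     # lift clear of the surface
        ("move", (bx, by, bz + hover)),  # travel to the classified bin
        ("open_gripper", None),
    ]
```

In the real system each `("move", …)` tuple would become a Cartesian goal sent through the Joint Trajectory Action Server or a MoveIt plan; keeping the descent purely vertical avoids side-loading the gripper during the grasp.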
(Camera → Capture → Upload → Classification → Download → Transformation → Motion Planning → Grasping → Sorting)