The goal of the project is to set up a UR3e robot arm to serve coffee to cafeteria customers. To achieve this, the following tasks were carried out:
Set up MoveIt2 to control the UR3e arm
Set up a perception system so that the UR3e arm is aware of its surroundings
Create a web application to control and monitor the whole process
Containerize all the applications using Docker
First, the whole scenario was simulated and tested in Gazebo, including the depth camera and the UR3e arm. Then, small tweaks were made to optimize the results on the real arm.
The overall node configuration is shown in the diagram below. The realsense node continuously feeds the barista detector node with RGB and depth images. The barista detector node processes this information and calculates the positions of the holes where the cup must be placed. As soon as the pick & place node receives a request from the website, it calls a service to request the coordinates of each hole in the barista robot. Finally, the pick & place node moves the arm.
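As a rough sketch of this flow, the pick & place node can be a plain rclpy service client. The `barista_interfaces` package, the `GetHoles` service, and its `holes` response field are hypothetical names, since the actual interface definition is not given here:

```python
import rclpy
from rclpy.node import Node

# Hypothetical custom interface: the real package/service names may differ.
from barista_interfaces.srv import GetHoles


class PickAndPlaceNode(Node):
    def __init__(self):
        super().__init__('pick_and_place')
        # Client for the barista detector's hole-coordinates service.
        self.holes_client = self.create_client(GetHoles, 'get_holes')

    def request_holes(self):
        self.holes_client.wait_for_service()
        future = self.holes_client.call_async(GetHoles.Request())
        rclpy.spin_until_future_complete(self, future)
        # Assumed response field: a list of geometry_msgs/Point hole centers.
        return future.result().holes
```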
Hardware: Intel RealSense D415
Stereoscopic depth camera.
RGB resolution: 1920 × 1080.
Depth resolution: 1280 × 720.
Depth encoding: 16UC1 (values in mm).
Configuration:

```bash
ros2 param set /D415/D415 enable_color True
ros2 param set /D415/D415 enable_depth True
ros2 param set /D415/D415 rgb_camera.profile 480x270x6
ros2 param set /D415/D415 depth_module.profile 480x270x6
ros2 param set /D415/D415 align_depth.enable True
```
To fuse the RGB and depth images, they must have the same resolution and the align_depth parameter must be set to true. As seen in the images below, the two streams align perfectly; a synchronization sketch follows the images.
RGB image.
Depth image.
Fused images.
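With aligned streams of equal resolution, the detector node can synchronize the two topics before fusing them. A minimal sketch using message_filters; the topic names are assumptions based on the D415 namespace used above:

```python
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import Image
from message_filters import Subscriber, ApproximateTimeSynchronizer
from cv_bridge import CvBridge


class BaristaDetector(Node):
    def __init__(self):
        super().__init__('barista_detector')
        self.bridge = CvBridge()
        rgb = Subscriber(self, Image, '/D415/D415/color/image_raw')
        depth = Subscriber(self, Image,
                           '/D415/D415/aligned_depth_to_color/image_raw')
        # Pair RGB and depth frames whose timestamps are within 50 ms.
        sync = ApproximateTimeSynchronizer([rgb, depth],
                                           queue_size=5, slop=0.05)
        sync.registerCallback(self.on_images)

    def on_images(self, rgb_msg, depth_msg):
        rgb = self.bridge.imgmsg_to_cv2(rgb_msg, 'bgr8')
        depth = self.bridge.imgmsg_to_cv2(depth_msg, '16UC1')  # depth in mm
        # ... detect holes in rgb, read Z at each center from depth ...
```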
Then, using OpenCV, the holes can be detected with the Hough circle transform. It is important to adjust the minDist parameter to avoid detecting the same circle several times, as shown in the image below (yellow and blue circles); a detection sketch follows the images.
OpenCV documentation.
Circle detection.
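A minimal detection sketch; the parameter values (minDist, param1, param2, and the radius bounds) are placeholders that would need tuning on the real images, and the input file name is hypothetical:

```python
import cv2
import numpy as np

img = cv2.imread('barista_top.png')  # hypothetical test image
gray = cv2.medianBlur(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 5)

circles = cv2.HoughCircles(
    gray, cv2.HOUGH_GRADIENT, dp=1,
    minDist=40,      # min distance between centers: avoids duplicate detections
    param1=100,      # upper Canny edge threshold
    param2=30,       # accumulator threshold: lower values yield more (false) circles
    minRadius=10, maxRadius=50)

if circles is not None:
    for x, y, r in np.round(circles[0]).astype(int):
        cv2.circle(img, (x, y), r, (0, 255, 0), 2)  # draw each detected hole
```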
Then, the intrinsic parameters are used to transform the image coordinates into camera coordinates. The equations used are shown below.
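These are the standard pinhole back-projection equations, where $(u, v)$ is the pixel position of a detected hole, $(c_x, c_y)$ is the principal point, and $(f_x, f_y)$ are the focal lengths from the camera calibration (published on the camera_info topic):

$$Z = \mathrm{depth}(u, v), \qquad X = \frac{(u - c_x)\,Z}{f_x}, \qquad Y = \frac{(v - c_y)\,Z}{f_y}$$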
Then, the camera coordinates must be transformed to the world frame using the extrinsic parameters, i.e. the transform from the optical link to the world link.
Frame diagram.
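A sketch of that transform using tf2; the optical frame name is an assumption based on the D415 namespace used above:

```python
import rclpy
from rclpy.duration import Duration
from rclpy.node import Node
from geometry_msgs.msg import PointStamped
from tf2_ros import Buffer, TransformListener
import tf2_geometry_msgs  # noqa: registers PointStamped with tf2


class HoleTransformer(Node):
    def __init__(self):
        super().__init__('hole_transformer')
        self.tf_buffer = Buffer()
        self.tf_listener = TransformListener(self.tf_buffer, self)

    def camera_to_world(self, x, y, z):
        # Point in the camera's optical frame (frame name is an assumption).
        p = PointStamped()
        p.header.frame_id = 'D415_color_optical_frame'
        p.point.x, p.point.y, p.point.z = x, y, z
        # Apply the extrinsic chain optical link -> world via the TF tree.
        return self.tf_buffer.transform(p, 'world',
                                        timeout=Duration(seconds=1.0))
```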
Results
The arm was set up using MoveIt2. To avoid collisions with the environment, a planning scene was created and the models were imported. An orientation constraint was also implemented to avoid spilling the coffee by rotating the cup upside down.
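A minimal sketch of such a constraint as a moveit_msgs message; the link and frame names are assumptions for a UR e-series setup:

```python
from moveit_msgs.msg import Constraints, OrientationConstraint

oc = OrientationConstraint()
oc.header.frame_id = 'world'
oc.link_name = 'tool0'               # assumed UR flange link carrying the cup
oc.orientation.w = 1.0               # identity: keep the gripper upright
oc.absolute_x_axis_tolerance = 0.1   # small roll tolerance (rad), no spilling
oc.absolute_y_axis_tolerance = 0.1   # small pitch tolerance (rad)
oc.absolute_z_axis_tolerance = 3.14  # yaw free: spinning about vertical is fine
oc.weight = 1.0

path_constraints = Constraints(name='keep_cup_upright',
                               orientation_constraints=[oc])
```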
The approach used to move the arm is shown in the image below. The main strategy is to use joint-space targets as much as possible, since they are more robust and faster to compute than Cartesian targets.
Also, to simplify the movement and speed up the Cartesian planning under the orientation constraint, an intermediate position is defined: the Hover position.
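A sketch of this strategy with the MoveIt2 Python API; the planning group name and the named "hover" configuration are assumptions:

```python
from moveit.planning import MoveItPy

moveit = MoveItPy(node_name='pick_and_place_moveit')
arm = moveit.get_planning_component('ur_manipulator')  # assumed group name

# Joint-space move to the pre-defined Hover configuration (assumed SRDF name):
# robust and fast, no Cartesian IK involved.
arm.set_start_state_to_current_state()
arm.set_goal_state(configuration_name='hover')
plan = arm.plan()
if plan:
    moveit.execute(plan.trajectory, controllers=[])

# From Hover, only a short Cartesian move to the hole remains, which keeps
# planning under the orientation constraint fast.
```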
Demonstration.
On-site video.