Implementation

Summary

Our implementation consists of two parts. The first part generates a structure in simulation, takes photos of it, and reconstructs the bottom-left corners of the structure from these images. The second part has Robob explore a separate world with blocks scattered around it, then pick up and place those blocks to rebuild the structure. The two parts communicate through a text file that the bottom-left corners are written to and read from.

Implementation (CV)

Main components used:

We used a Kinect camera (it does provide a depth image, but we used pairs of parallel RGB images instead so we could apply the triangulation techniques we learned in class). We also generated our own blocks using URDF files, and we simulated the cameras and blocks with Gazebo.

URDF Files:

We wrote URDF files for the different types of blocks we would be using.

    1. Structure block (cookie monster): This was the block we used to build the structures we took images of. We made this block very heavy so the structure would be stable.

    2. ArUco block (oscar): This was the block we used when rebuilding the structure. It is dumbbell shaped to make it easier to pick up, and it has an ArUco marker on top which helps Robob locate and pick it up.

    3. Placeholder block (count von count): This represents a placeholder we used to tell Robob where to place a block. It is just a very thin ArUco marker.

We named our structure block "cookie monster" because it is blue.

We named our ArUco block "oscar" because we named the other block cookie monster.

We named our placeholder block "Count Von Count" because of his sharply defined features.

World Simulation:

The world_simulation package is used for generating both the world with the structure and the world for Robob to move around in.

    1. The world_simulation.py file generates one of the two worlds, depending on which function is called in main().

    2. After creating a structure, the functions in take_photos.py are used to take photos of it. Originally, we spawned 24 Kinect cameras to take photos of the structure from different angles, but after getting some advice during our presentation, we modified it so we use a single camera that moves around and takes 16 photos (8 pairs of parallel photos). The user can also specify a different number of images or different camera angles. We move the camera by writing to the Gazebo model state topic, as sketched below.
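Moving the camera amounts to publishing a gazebo_msgs/ModelState message. The following is a minimal sketch of that step; the model name "kinect" and the pose arguments are illustrative rather than our exact take_photos.py code.

    import rospy
    from gazebo_msgs.msg import ModelState

    def move_camera(pub, x, y, z, qx, qy, qz, qw):
        # Reposition the simulated Kinect by overwriting its model state.
        state = ModelState()
        state.model_name = 'kinect'  # assumed model name in the .world file
        state.pose.position.x = x
        state.pose.position.y = y
        state.pose.position.z = z
        state.pose.orientation.x = qx
        state.pose.orientation.y = qy
        state.pose.orientation.z = qz
        state.pose.orientation.w = qw
        pub.publish(state)

    # Usage: pub = rospy.Publisher('/gazebo/set_model_state', ModelState, queue_size=1)
    # then call move_camera(pub, ...) once per viewpoint before saving an image.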


Generate Schematic:

The generate_schematic package is used to process the images we take and generate the bottom left coordinates of the structure to send to our robot. The steps that we use to generate these corners are:

  1. Detect the corners of the blocks in each image using Harris corner detection (feature_detect.py)

  2. For each pair of images, we matched the corners found in the two images using OpenCV's BFMatcher. We then filtered these matches to only include those with an epipolar error less than 0.01, using the techniques from the CV lab. Finally, we generated the 3D coordinates corresponding to these matches using the triangulation technique from lab (image_matching.py); a sketch of this matching and triangulation step appears after this list. Unfortunately, as you can see in the "Raw world coordinates" section of the diagram below, there was a lot of noise in these 3D coordinates.

This shows the matches detected between the corners in one pair of images.

  3. To get rid of the incorrect 3D points, we applied our custom-designed filtering algorithm in generate_schematic.py, explained below:

    1. We don't know where the corners of the blocks are, but we do know the block sizes. The goal of this first step is to find the x and y offsets of the closest corner from the origin (the z-offset is trivially 0). To find these offsets, we loop through many candidate offsets (spaced 0.01 apart), filter out points that don't make sense given each candidate and the block sizes, and choose the offsets that filter out the fewest points. This search is sketched below the list.

    2. Once we know the offsets of one particular corner from the origin, we can again filter out points that don't make sense given these offsets and the block sizes. Then, we use the block sizes to determine all potential corner locations (essentially an infinite grid). We then round all remaining points to the closest possible corner location, and remove duplicate points.

    3. We divide the 3D coordinates into layers, where each layer contains all points with the same z-coordinate. Then, we loop through the layers from top to bottom, and at each layer apply a "square filter" (which removes points that aren't part of a square of points). Then, we project all remaining points down to the layer below, because every corner that appears on a layer must also appear on the layer below it.

    4. Finally, we take our set of corners and calculate the bottom-left corner of each block (essentially all corners excluding the top layer, the corners with the max x coordinate in every layer, and the corners with the max y coordinate in every layer). We write these corners to output/schematic.txt.

    5. The method get_robob_corners() reads the bottom-left corners back in from this file. Currently, it removes all coordinates whose z-coordinate is not 0, and it takes a spacing argument that specifies how far apart the corners should be (since the robot places blocks farther apart than in the original structure).
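To make step 2 of the outer list concrete, here is a rough sketch of the matching, epipolar filtering, and triangulation. It assumes the Harris corners, the 3x4 projection matrices P1/P2, and the fundamental matrix F for the image pair are already available, and it computes ORB descriptors at the corner locations so BFMatcher has something to compare; our actual image_matching.py may differ in these details.

    import cv2
    import numpy as np

    def match_and_triangulate(img1, img2, corners1, corners2, P1, P2, F,
                              max_epipolar_error=0.01):
        # Build keypoints at the detected Harris corners and describe them
        # with ORB (an illustrative choice of descriptor).
        orb = cv2.ORB_create()
        kp1 = [cv2.KeyPoint(float(x), float(y), 10.0) for x, y in corners1]
        kp2 = [cv2.KeyPoint(float(x), float(y), 10.0) for x, y in corners2]
        kp1, des1 = orb.compute(img1, kp1)
        kp2, des2 = orb.compute(img2, kp2)

        matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
        matches = matcher.match(des1, des2)

        pts1, pts2 = [], []
        for m in matches:
            p1 = np.array([*kp1[m.queryIdx].pt, 1.0])
            p2 = np.array([*kp2[m.trainIdx].pt, 1.0])
            # Epipolar constraint: for a correct match, p2^T F p1 should be ~0.
            if abs(p2 @ F @ p1) < max_epipolar_error:
                pts1.append(p1[:2])
                pts2.append(p2[:2])

        # Triangulate the surviving matches into (noisy) 3D points.
        pts4d = cv2.triangulatePoints(P1, P2,
                                      np.array(pts1, dtype=np.float64).T,
                                      np.array(pts2, dtype=np.float64).T)
        return (pts4d[:3] / pts4d[3]).T  # Nx3 world coordinates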

"Raw world coordinates" shows the coordinates reconstructed from triangulation, "world coords before filtering" shows the rounded coordinates before we apply the square filter/push-down, and "after processing" shows our final corners.
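The offset search and grid snapping from steps 1 and 2 of the filtering algorithm are sketched below. The function names, the tolerance, and the block size are illustrative; the real generate_schematic.py also handles the layer-by-layer square filter and push-down.

    import numpy as np

    BLOCK_SIZE = 0.5  # assumed block edge length in meters

    def fits_grid(points, x_off, y_off, block=BLOCK_SIZE, tol=0.05):
        # Boolean mask of points lying close to some corner of the (infinite)
        # grid defined by the candidate offsets and the block size.
        rel = points[:, :2] - np.array([x_off, y_off])
        err = np.abs(rel - np.round(rel / block) * block)
        return np.all(err < tol, axis=1)

    def find_offsets(points, block=BLOCK_SIZE, step=0.01):
        # Try candidate (x, y) offsets spaced `step` apart and keep the pair
        # that filters out the fewest 3D points.
        best = (0.0, 0.0, -1)
        for x_off in np.arange(0.0, block, step):
            for y_off in np.arange(0.0, block, step):
                kept = int(fits_grid(points, x_off, y_off, block).sum())
                if kept > best[2]:
                    best = (x_off, y_off, kept)
        return best[0], best[1]

    def snap_to_grid(points, x_off, y_off, block=BLOCK_SIZE):
        # Round every surviving point to the nearest possible corner location
        # and drop duplicates.
        keep = fits_grid(points, x_off, y_off, block)
        snapped = points[keep].copy()
        snapped[:, 0] = np.round((snapped[:, 0] - x_off) / block) * block + x_off
        snapped[:, 1] = np.round((snapped[:, 1] - y_off) / block) * block + y_off
        snapped[:, 2] = np.round(snapped[:, 2] / block) * block
        return np.unique(snapped, axis=0)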

Implementation (Robob Kinematics)

Main Components Used:

The backbone of the kinematics portion of our implementation is the set of nodes that facilitate moving to a newly discovered block and picking up/placing down a block. To allow Robob to find a new block in its world, we utilize SLAM and gmapping as well as the navigation and determine_navigation_goal nodes. When Robob is ready to pick up or place a block, we take advantage of ArUco, an augmented reality package that allows us to detect a block's pose by finding a unique ArUco marker through Robob's camera.

The high-level communication graph is shown on the left, and a detailed diagram of "Determine Nav Goal" is shown on the right.

Tiago model in Gazebo, developed by PAL Robotics
High level diagram of Kinematics Implementation
Detailed pipeline of moving TiaGo to pick up/place down blocks

Launch Files

In this part of the implementation, we used 3 launch files:

  1. One to launch the ROS Gazebo program as well as load in the TiaGo model and its controllers. This launch file also takes care of initializing SLAM and gmapping.

  2. Another to launch the nodes that facilitate picking up and placing down blocks. This also launches two ArUco nodes: (1) to find blocks to pick up, and (2) to find markers on the ground at which to place blocks.

  3. The last launch file takes care of navigation. We launch 2 nodes: (1) to use the gmapping grid to find new blocks to pick up, and (2) to navigate to the block's destination.

Finding a Block to Pick Up

(Top) View of initial world in Gazebo, (Bottom) Costmap generated by gmapping to allow TiaGo to move to a specified goal

To find a block to pick up in the initial world, we take advantage of the SLAM and gmapping costmaps. The costmap is essentially a large matrix where each cell corresponds to a certain grid-point in the world. When gmapping detects a block using TiaGo's lasers, the cell in the costmap associated with that block will have a high value indicating there is a high probability that an object is there.

Then, gmapping sends the Nav Goal node an updated costmap, which we process to find block coordinates in the world. When we find the first block (scanning left to right, top to bottom), we send Move Base a 3D position, and the action client will attempt to actuate TiaGo continuously until the position is reached. Not only do we tell Move Base to move to the position, but we also ensure that once it arrives, TiaGo is facing the block in the correct orientation. A sketch of this step is shown below.
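The sketch below scans the occupancy grid for the first high-cost cell, converts it to world coordinates, and hands Move Base a goal through the standard actionlib interface. The grid topic, cost threshold, and yaw value are illustrative assumptions rather than our exact node.

    import actionlib
    import rospy
    from move_base_msgs.msg import MoveBaseAction, MoveBaseGoal
    from nav_msgs.msg import OccupancyGrid
    from tf.transformations import quaternion_from_euler

    def find_first_block(grid, threshold=90):
        # Scan the costmap cells for the first one marked as occupied.
        info = grid.info
        for row in range(info.height):
            for col in range(info.width):
                if grid.data[row * info.width + col] >= threshold:
                    # Convert the cell index back to world coordinates.
                    x = info.origin.position.x + col * info.resolution
                    y = info.origin.position.y + row * info.resolution
                    return x, y
        return None

    def send_nav_goal(x, y, yaw):
        client = actionlib.SimpleActionClient('move_base', MoveBaseAction)
        client.wait_for_server()
        goal = MoveBaseGoal()
        goal.target_pose.header.frame_id = 'map'
        goal.target_pose.header.stamp = rospy.Time.now()
        goal.target_pose.pose.position.x = x
        goal.target_pose.pose.position.y = y
        # Orient TiaGo so it faces the block on arrival.
        qx, qy, qz, qw = quaternion_from_euler(0, 0, yaw)
        goal.target_pose.pose.orientation.x = qx
        goal.target_pose.pose.orientation.y = qy
        goal.target_pose.pose.orientation.z = qz
        goal.target_pose.pose.orientation.w = qw
        client.send_goal(goal)
        client.wait_for_result()

    # grid = rospy.wait_for_message('/map', OccupancyGrid)  # assumed gmapping grid topic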

Picking up a block

(Top left) Camera view of an ArUco block, and a marker indicating the block's pose, (Bottom left) View in RViz of TiaGo preparing to pick up a block, (Right) TiaGo picking up an ArUco block in Gazebo

When TiaGo has arrived at a block, we then utilize the Pickup/Placedown Block node to have the robot detect the block's pose via its camera and actuate its arms and grippers such that it picks the block up. The following procedure is used to pick up an ArUco block:

  1. Actuate the head to face downwards and wait for ArUco to detect the block. This is done by listening to the aruco_single/pose topic for the block's pose (steps 1-3 are sketched after this list).

  2. Transform the object's pose from the ArUco reference frame to TiaGo's base frame (base_footprint).

  3. Use MoveIt to place a planning obstacle for the block, and take advantage of TiaGo's internal controllers to actuate the arms and grippers to pick up the block. Once we have the transformed object pose, we send TiaGo's controllers the position, and they actuate the arms such that the open grippers sit on either side of the block's handle.

  4. Close the grippers with force and lift TiaGo's torso so that the block does not intersect the lasers at the bottom.
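A hedged sketch of steps 1-3, assuming the standard aruco_ros, tf2, and MoveIt Python interfaces; the object name and block dimensions passed to the planning scene are illustrative.

    import sys
    import rospy
    import moveit_commander
    import tf2_ros
    import tf2_geometry_msgs
    from geometry_msgs.msg import PoseStamped

    def get_block_pose_in_base(tf_buffer):
        # Step 1: block pose as published by the aruco_ros single-marker node.
        marker_pose = rospy.wait_for_message('/aruco_single/pose', PoseStamped)
        # Step 2: express that pose in TiaGo's base frame.
        transform = tf_buffer.lookup_transform('base_footprint',
                                               marker_pose.header.frame_id,
                                               rospy.Time(0), rospy.Duration(2.0))
        return tf2_geometry_msgs.do_transform_pose(marker_pose, transform)

    # Setup (once per node):
    # moveit_commander.roscpp_initialize(sys.argv)
    # tf_buffer = tf2_ros.Buffer(); listener = tf2_ros.TransformListener(tf_buffer)
    # scene = moveit_commander.PlanningSceneInterface()

    # Step 3: add the block as a planning obstacle before actuating the arm.
    # block_pose = get_block_pose_in_base(tf_buffer)
    # scene.add_box('aruco_block', block_pose, size=(0.05, 0.05, 0.1))  # assumed dimensions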


Placing a block down

(Top left) Camera view of an ArUco place-down marker, (Bottom left) Gazebo view of TiaGo placing a block over the ArUco marker, (Bottom) RViz illustration of TiaGo moving to the block's destination

When TiaGo has successfully picked up a block, we tell Move Base to navigate to the middle of the world, and then to the block's destination. We do this because, for computational reasons, the costmap is too small to cover the entire world at once. Once TiaGo moves to the middle, it is able to detect the surrounding blocks in the right half of the world and can successfully plan a path. (As a side note, the block's destination is generated by the CV portion of this project.)

Once TiaGo has arrived at the goal position, it will actuate its arms to move the block away from the camera view. Then, we tilt the head down to find the ArUco place-down marker (as seen in the top left). Similar to the previous section, we listen to the aruco_single/pose topic to find the pose of the place-down marker. Once it is detected, we manually actuate the arms in 3 phases: (1) actuate the arm such that the block is hovering over the marker, (2) open the grippers all the way in a rapid manner, and finally (3) lower the end-effector in the z-direction. These phases are sketched below; an image of the aftermath is shown in the bottom left.
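A sketch of the three phases using MoveIt's Python interface. The group names ("arm_torso", "gripper"), the named "open" target, and the hover/drop offsets are assumptions about the configuration rather than our exact controller calls.

    import sys
    import moveit_commander

    def place_block(marker_pose_in_base, hover_height=0.25, drop_height=0.10):
        arm = moveit_commander.MoveGroupCommander('arm_torso')
        gripper = moveit_commander.MoveGroupCommander('gripper')

        # (1) Hover the block above the place-down marker.
        hover = marker_pose_in_base.pose
        hover.position.z += hover_height
        arm.set_pose_target(hover)
        arm.go(wait=True)

        # (2) Open the grippers all the way.
        gripper.set_named_target('open')
        gripper.go(wait=True)

        # (3) Lower the end-effector in the z-direction.
        lower = arm.get_current_pose().pose
        lower.position.z -= drop_height
        arm.set_pose_target(lower)
        arm.go(wait=True)
        arm.stop()

    # moveit_commander.roscpp_initialize(sys.argv) must be called once beforehand.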

Once the above has finished, we gracefully actuate TiaGo to back out of its current position without hitting the block. This consists of a short jolt backward, followed by a 90-degree clockwise rotation, and then a steady movement backward for a couple of seconds, as sketched below.
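The back-out maneuver can be expressed as timed velocity commands. In this sketch the topic name follows TiaGo's mobile base controller, and the speeds and durations are illustrative.

    import rospy
    from geometry_msgs.msg import Twist

    def timed_drive(pub, linear_x, angular_z, seconds):
        # Publish a constant velocity command for a fixed duration.
        cmd = Twist()
        cmd.linear.x = linear_x
        cmd.angular.z = angular_z
        end = rospy.Time.now() + rospy.Duration(seconds)
        rate = rospy.Rate(10)
        while rospy.Time.now() < end and not rospy.is_shutdown():
            pub.publish(cmd)
            rate.sleep()

    def back_out(pub):
        timed_drive(pub, -0.4, 0.0, 0.5)    # short jolt backward
        timed_drive(pub, 0.0, -1.57, 1.0)   # roughly 90-degree clockwise rotation
        timed_drive(pub, -0.2, 0.0, 3.0)    # steady reverse for a few seconds

    # pub = rospy.Publisher('/mobile_base_controller/cmd_vel', Twist, queue_size=1)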

At this point, TiaGo will navigate back to the origin and repeat this entire process until all blocks have been picked up and placed down.