Our primary goal of getting Baxter to identify a towel, find important corner points, and execute a folding motion was accomplished. We were able to perform this task effectively and consistently across different sizes and colors of clothing, positions of the table with respect to the gripper, and lighting conditions. We did, however, fall short of our auxiliary tasks of stacking the clothes and placing them on a Turtlebot.
Baxter consistently executed a triangle fold in an efficient manner. The arm first moved to a consistent calibration position so as to capture the AR tag and the towel in the same image. This position helped ensure that our pictures remained similar between trials, which increased robustness. The system then identified the AR tag on the table and calculated the transform between the camera, the gripper, and the tag. The picture was also passed to the image analysis pipeline, which performed a homography to straighten the view of the table. Together, these two steps made the pipeline invariant to shifts in the towel's placement. The known markers around the AR tag were used to convert pixel distances into physical distances. A convolutional neural network then identified the towel's corners, and the corresponding offsets were computed; the network found the corners correctly nearly every time. Finally, the folding motion was executed with these calculated offsets using an inverse kinematics solver.
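To make the straightening step concrete, the sketch below shows how such a homography can be computed and applied with OpenCV. The marker pixel coordinates, output size, and function name are illustrative assumptions rather than our exact implementation.

```python
# Minimal sketch of the homography "straightening" step using OpenCV.
# The marker coordinates and output size here are hypothetical.
import cv2
import numpy as np

def straighten_table_view(image, marker_px, out_size=(600, 600)):
    """Warp the raw camera image so the table appears as if viewed top-down.

    marker_px: 4x2 array of pixel coordinates of the known reference markers,
               ordered top-left, top-right, bottom-right, bottom-left.
    out_size:  (width, height) of the straightened output image.
    """
    w, h = out_size
    # Where the four markers should land in the straightened image.
    dst = np.float32([[0, 0], [w - 1, 0], [w - 1, h - 1], [0, h - 1]])
    src = np.float32(marker_px)

    # Homography mapping the raw view onto the top-down view.
    H, _ = cv2.findHomography(src, dst)
    return cv2.warpPerspective(image, H, (w, h)), H
```

Because the physical spacing of the markers is known, a single scale factor then converts pixel distances in the straightened image into physical distances on the table.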
The model was robust to changing conditions. Even when we moved the towel and the table around, the system detected the new position of the AR tag and adjusted its positioning appropriately. Additionally, we tried the algorithm with different towels and even a jacket, and Baxter folded all of them. Lighting in the room also varied between trials, and this did not affect the folding performance.
Most of our technical challenges were related to either hardware availability or some non-deterministic behavior on the robot or computer.
Particularly at the beginning of the project, we had a hard time getting the cameras to function properly. We were not able to reliably read from the head camera, the left and right gripper cameras, and the USB camera back-to-back, which made evaluating the performance of each camera difficult. Even after we had chosen a camera, we occasionally ran into issues with it not being detected, which required a restart.
Additionally, at the beginning of the project we had a hard time finding a gripper that could close all the way, because the working ones were being used by other robots during our sessions. This led us to consider 3D printing our own gripper or using foam blocks to allow it to close fully. However, we were later able to find a reliable gripper that we used for the rest of the project.
Another issue we faced was Baxter not being able to run multiple services at a time, such as the camera, planning, and AR tracking services. We originally thought this was due to a bug in our code and spent considerable time trying to debug it. However, switching computers while using the same robot solved the issue, so the problem was with the computer we had been using rather than with our code.
Furthermore, another challenge we encountered was that towel folding, and clothing folding in general, is very difficult because the fabric is very thin, only a few millimeters thick. Therefore, every motion by our robot had to be very precise, with little margin for error. We were informed very early in the process by TAs that the Baxter robot itself is not a very precise machine, casting initial doubt on whether we would be able to execute any fold at all. The fact that we were able to execute such folds goes to show how robust the rest of our system is, from sensing the towel to path planning to executing the folds.
The majority of our work needed to be tested on the physical robot fairly quickly, but between robots being out of commission and the number of other teams that also needed them, the amount of time we got to spend testing directly on the Baxter robots was limited. Much of our code was initially written without knowing whether it would run successfully on the robot, and without full access to the Baxter, a lot of debugging was necessary later on to ensure that everything ran smoothly.
We think that our current solution is quite robust, avoiding many of the hacks we relied on along the way, but there is always room for improvement. Here are some of the ways we could improve our project.
To make our system more generalizable, we could remove the red dots at the corners of the AR tag and the towel. Currently, these dots make it significantly easier for our model to detect corners. However, with enough training data, we should be able to train a model that detects the corners without relying on the red dots.
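As an illustration of the direction we have in mind, the sketch below shows a small corner-regression network in PyTorch; the architecture, input size, and layer widths are placeholders rather than our trained model.

```python
# Illustrative corner-regression CNN; layer sizes are arbitrary placeholders,
# not those of our actual model. It maps a downscaled table image to the
# (x, y) pixel coordinates of the four towel corners.
import torch
import torch.nn as nn

class CornerNet(nn.Module):
    def __init__(self, num_corners=4):
        super().__init__()
        self.num_corners = num_corners
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64 * 16 * 16, 256), nn.ReLU(),
            nn.Linear(256, num_corners * 2),  # one (x, y) pair per corner
        )

    def forward(self, x):  # x: (batch, 3, 128, 128)
        out = self.head(self.features(x))
        return out.view(-1, self.num_corners, 2)

# Training without the red dots would simply regress hand-labeled corner
# coordinates, e.g. with nn.MSELoss() over a larger set of annotated images.
```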
Similarly to removing the dots, removing the AR tag would allow our system to work in more general conditions. The main function of the AR tag is to provide the transform between the table and Baxter's gripper using the existing ar_track_alvar package. However, we could consider either calculating a similar transform ourselves from one or more images, or trying a different approach entirely. One idea we had was to bring Baxter's gripper down towards the table until it felt resistance and use that to obtain height information.
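For context, the transform we would need to replace is currently read from the TF frame that ar_track_alvar publishes for the tag. A minimal sketch of that lookup is below; the frame names ("base", "ar_marker_0") are assumptions and would need to match the actual launch configuration.

```python
# Minimal sketch of looking up the tag-to-base transform published by
# ar_track_alvar; the frame names are assumptions.
import rospy
import tf

rospy.init_node("table_transform_lookup")
listener = tf.TransformListener()

# Wait for ar_track_alvar to publish the marker frame, then read the tag's
# translation and rotation relative to Baxter's base frame.
listener.waitForTransform("base", "ar_marker_0", rospy.Time(0), rospy.Duration(4.0))
trans, rot = listener.lookupTransform("base", "ar_marker_0", rospy.Time(0))
print("tag position (m):", trans)
```

Removing the tag would mean producing an equivalent transform from images alone, or from the resistance-based height probe described above.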
Our model is currently trained using a fixed calibration position, which allowed us to get a usable model with fewer images. However, a model that could find the corners from a flexible calibration position would make our system more robust to cases where Baxter cannot move its arm to the exact calibration spot. This could be achieved by training the model on pictures taken from a variety of calibration positions.
Currently, Baxter folds everything the same way: no matter the type of clothing it detects, it looks for only four corners. This ultimately does not lead to the most desirable folding results for every garment. In the future, we hope to implement different folding sequences based on the type of clothing item (e.g., t-shirts, pants, jackets).