Although this is a research project, we want to ensure that our system is practical for use beyond it. We decided that our system must:
Generalize to any orientation of the channel.
Generalize to any type of cable.
Run quickly.
Maintain a low cost of operation.
Overall, this leads to a design that emphasizes sensing the position and orientation of a cable (rope or tubing) and inserting it snugly into a channel.
We made various design choices in our setup to fit the criteria listed above:
We built our vision algorithm to detect both endpoints of the channel, allowing us to compute the channel's orientation and execute the pushing actions for fitting along its length. Because we perform these calculations on every run, the channel can be placed at any orientation on the workspace and we still get the desired result.
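As a minimal sketch of this step, the channel's orientation can be derived directly from the two detected endpoints; the function name and 2-D coordinate convention below are our own illustration, not the exact code from our pipeline:

```python
import math

def channel_orientation(p_start, p_end):
    """Angle of the channel in the workspace plane, computed from its
    two detected endpoints given as (x, y) coordinates."""
    dx = p_end[0] - p_start[0]
    dy = p_end[1] - p_start[1]
    return math.atan2(dy, dx)  # radians, in (-pi, pi]
```

Recomputing this angle from a fresh scan on each run is what lets the channel sit at any orientation on the workspace.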
To find the endpoint of a cable regardless of its material, we knew we could not rely solely on color thresholding to isolate the cable: a threshold tuned for a red cable would not necessarily work on a blue one. Nor could we rely on IR tags, as we want our solution to work on any rope, not only ropes specifically modified to be detectable. Instead of color or IR, we use a "flood-fill" method that, given one point on the rope, runs a breadth-first search to expand outward and build a point cloud containing only points on the rope. The method uses a simple breadth-first-search queue to visit neighboring points, adding a point to the point cloud if its depth is similar to that of its neighbors; a significantly different depth tells us the point in question is no longer on the rope but on the table beneath it. Because this relies on depth rather than appearance, it works on a rope of any material or color, making it ideal for our design goal. We still need color to find one seed pixel on the rope, but this is far more flexible than relying on color alone. Because our channel is metallic, we first detect and remove any glare, then take the brightest remaining pixel in the image, which is most likely on the rope, and feed it into the algorithm above.
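The flood fill described above can be sketched as follows; the function name, 4-connectivity, and depth tolerance are illustrative assumptions, not our exact parameters:

```python
from collections import deque

import numpy as np

def flood_fill_depth(depth, seed, tol=0.005):
    """BFS flood fill over a depth image. Starting from a seed pixel
    known to be on the rope, expand to 4-connected neighbors whose
    depth differs by less than `tol` (here, meters) and return the set
    of rope pixels. A large depth jump means the search has stepped
    off the rope onto the table beneath it."""
    h, w = depth.shape
    visited = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < h and 0 <= nc < w and (nr, nc) not in visited:
                # Similar depth to the current pixel -> still on the rope.
                if abs(depth[nr, nc] - depth[r, c]) < tol:
                    visited.add((nr, nc))
                    queue.append((nr, nc))
    return visited
```

Because the expansion criterion is purely geometric, the same code runs unchanged on a red rope, a blue rope, or clear tubing.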
Because processing point-cloud data in Python is computationally slow, and taking depth scans is itself slow, we want to minimize the number of depth scans we make. There are no independently moving parts on the workspace, so we can safely assume the image taken at the beginning remains representative of the channel position throughout, and we need no additional images during runtime. The only additional depth scan comes at the end, to check whether we need another round of pressing into the channel. Because most runs require no more than two passes, these scans are rarely needed.
We used cheap materials for the wooden base, cable, channel, and foam to ensure that the cost of all materials, excluding the robot, is under $50. Most labs that wish to replicate or build upon this work will already have a YuMi or similar robot, so ensuring that the cost of all other materials is low makes it easy for this work to be replicated for practical use or adapted for future research projects.
To simplify the development process, we initially taped one end of the cable to the channel, reducing the task to placing one end of the cable into the channel and fitting it in. Once we got this working, we relaxed this constraint and left both ends unfixed, requiring the robot to solve a harder problem.
To ensure that our system was generalizable, we decided not to use AR tags to detect the ends of the cable and channel. Doing so would have made our task significantly easier, but would have significantly limited the usability of our system. The tradeoff was that we spent a significant portion of our time on reliably identifying the cable and the channel for the sake of generalization; as a result, our solution is robust and generalizable.
In designing our project, we consistently considered both robustness and efficiency in every decision made.
We made sure our project was robust to perturbations in the rope's position.
We detect the endpoints of the rope with a scan at the start of each run to account for either end of it moving.
We detect the endpoints of the channel with a scan at the start of each run to account for the current position and orientation of the channel.
Our decision not to use AR tags makes the system robust to the rope rotating or being occluded by the channel.
After pushing the rope into the channel, we double-check that the fit is firm.
We Gaussian-blur the image from the PhoXi so the system is robust to specks of dust and other noise.
We scan the environment only once at the start.
In the week after our presentation, we adjusted the robot's procedure to perform additional run-throughs of the fitting sequence only if the rope's fit in the channel falls short of a threshold, avoiding unnecessary pushes and making the system more intelligent.
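This end-of-run check can be sketched as a thresholded comparison on the final depth scan; the mask, expected depth, and both thresholds below are illustrative assumptions:

```python
import numpy as np

def needs_another_pass(depth, channel_mask, seated_depth,
                       tol=0.003, max_protruding_frac=0.05):
    """Decide whether to run another fitting pass. `depth` is the
    final depth scan, `channel_mask` a boolean mask of the channel
    region, and `seated_depth` the expected depth of a well-seated
    rope. Pixels closer to the camera than `seated_depth - tol` are
    protruding; if more than `max_protruding_frac` of the channel
    pixels protrude, the fit is not yet firm."""
    protruding = depth[channel_mask] < seated_depth - tol
    return protruding.mean() > max_protruding_frac
```

Gating extra passes on this fraction, rather than always pressing a fixed number of times, is what eliminates the unnecessary pushes.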