Design

In this section, we share design criteria and high-level design descriptions for our system. We also discuss design choices and tradeoffs we needed to make, and the impact these choices had on the robustness, durability and efficiency of our system. Note that a more detailed description of our implementation can be found on the implementation page of this website.

Design Criteria

The desired functionality of the system is as follows:

  • Successfully navigate a ball through the maze 80%+ of the time in less than 30s

  • Solve arbitrary paths through the maze

  • Navigate difficult maze features where walls cannot be used to direct the path of the ball or where the robot needs to control ball velocity precisely to not overshoot targets

  • *Stretch criterion:* Navigate the maze with minimal contact with maze boundaries

To be successful, our system needed to meet physical, control, and software sub-criteria.

Hardware/physical criteria

  • Maze must rigidly attach to Sawyer end effector

  • Maze must have strong contrast between paths and boundaries

  • Maze must be within full view of camera with enough resolution for maze reconstruction and pathfinding

  • Maze boundaries must have sufficient width for detection by vision system

  • All boundaries of the maze must be visible to the camera so that pathfinding algorithms do not explore outside of the maze

  • Maze paths must have sufficient width for ball to travel through

  • Ball and end point must be colored with sufficient contrast for detection by the vision system

  • Camera must be fixed above the maze with close-to-normal gaze angle and constant horizontal orientation to enable accurate maze reconstruction and maintain consistent coordinates

Control criteria

  • Controller must not rotate the maze around the z-axis to simplify/improve the accuracy of maze reconstruction and pathfinding

  • Controller must keep the maze under the webcam in full view

  • Controller must not tilt the maze by large angles so as to maintain efficacy of vision system based on top-down view

  • Controller must move the ball to target positions with sufficient accuracy so as to solve the maze when given a series of correct waypoints

Sensing/planning software criteria

  • Vision system must accurately reconstruct maze and solve it so that valid target positions are returned to the controller

  • Vision system must accurately determine ball and end position to enable path finding

  • Vision system must be able to accurately convert measurements in pixels to meters

  • Vision system must be able to accurately relay ball position and target waypoint on path to controller at 20-50 Hz

Hardware design

Maze mount and control system:

  • Controlling the last two joints of the Sawyer

  • The 6th joint controls the pitching motion of the maze; the 7th joint controls the rolling motion of the maze

  • Maze assembly attached to the end-effector

Top-down view of maze with endpoints:

  • Color mapping for CV to easily distinguish the objects

  • Blue: destination, Black: wall, White: open space, Green & Purple: ball

Overall hardware setup

Our final system consists of a webcam mounted to a tripod, positioned so that it looks down on the maze at a small angle from the normal to the maze surface. The maze is mounted rigidly to the end of the Sawyer arm with screws, and the arm is positioned so that it can control the pitch and roll of the maze using only its last two joints, which keeps the maze in view of the camera.

Hardware design choices / tradeoffs

Webcam mount:

When designing this system, we decided to mount the webcam to a tripod rather than to the arm itself for a few reasons.

  1. We wanted the background behind the maze to remain consistent and controlled to enable the vision system to effectively use thresholding techniques.

  2. We mounted the webcam at a slight angle rather than facing directly down at the maze. This lets the vision system detect the shadows on the sides of the maze boundaries during maze reconstruction, which makes the system resilient to the glare that often occurred on the tops of the boundaries.

  3. We wanted to keep the mass of the end effector as small as possible so that the robot could rapidly adjust the pitch and roll of the maze.

In the end, we traded off a consistent, direct, and precise view of the maze (which we might have achieved by mounting the camera to the maze apparatus) for the benefits listed above. This design choice was an important one that enabled our project to be successful.

Maze construction:

We first needed to decide what material to build the maze out of. We decided to 3D print the maze to keep it rigid and light, and also to allow us to precisely mount it to the end of the Sawyer arm with screws.

From there, we decided to laser-cut white paper to line the bottom of the maze. We did this to create a stark color contrast between the paths and the maze boundaries, as well as to create a consistent, smooth surface for the ball to roll on. The paper also very slightly increased the friction between the ball and the maze, slowing the ball just enough to make our controller effective.

We also needed to determine the best height and thickness of the maze boundaries. We ultimately made the boundaries relatively thin and on the taller end of our range (1 inch). We did this because the tops of the maze boundaries were prone to glare even after being sanded and colored black, making them difficult for the vision system to detect. With a greater boundary height, combined with a slight angle of the webcam, the vision system was able to reliably detect the boundaries by relying in part on the shadows on their sides.

Finally, we needed to pick a ball with a color distinct from anything in the scene (including all background features), and also mark the endpoint with an equally distinct color to enable location detection. We picked colored plastic marbles as our balls (which also gave us a good mass & rotational inertia to work with) and marked our endpoint(s) with colored tape.

Software design

Overall software setup

The software we wrote to enable this robot consists of three main components. The first is a computer vision (sensing) system that measures the ball position and end position in pixel space using color thresholding and radius constraints. The CV system also detects the boundaries of the maze using thresholding and reconstructs a "grid-world" version of the maze using max-pooling and buffering processes.

The second component is a planner, which determines the desired position of the ball at any given point in time. It does so by solving the reconstructed maze using A* or BFS, given the ball position and the end position, to create a viable path for the ball to follow. The planner then saves this path, and at any given point in time returns the next position the ball should be aiming for given its current position. On regular time intervals, the planner re-solves the maze to account for any errors.

The third component is the controller. The planner passes the desired position of the ball, as well as the ball's current position, to the controller. The controller uses these two positions to calculate an error vector as well as the velocity of the ball, and uses PID control to determine the desired acceleration of the ball. This acceleration is then converted into desired joint angles through a 2D ball-and-beam model and sent to the Sawyer for actuation.
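As an illustration of the last step, the sketch below converts a desired acceleration along one axis into a tilt angle. It is not our exact implementation: it assumes a solid sphere rolling without slipping, for which the acceleration on a plane tilted by θ is a = (5/7) g sin θ, and it treats the two axes independently. The function name `accel_to_tilt` and the 10-degree tilt limit are hypothetical values chosen for the example.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def accel_to_tilt(accel, max_tilt=math.radians(10)):
    """Convert a desired ball acceleration (m/s^2) along one axis to a tilt angle (rad).

    Assumes a solid sphere rolling without slipping: a = (5/7) * g * sin(theta).
    The result is clamped to +/- max_tilt (a hypothetical limit) so the maze
    never tilts by a large angle.
    """
    sin_theta = 7.0 * accel / (5.0 * G)
    sin_theta = max(-1.0, min(1.0, sin_theta))  # keep asin in its domain
    theta = math.asin(sin_theta)
    return max(-max_tilt, min(max_tilt, theta))
```

One tilt angle would be computed per axis and mapped to the pitch and roll joints; which image axis maps to which joint, and with what sign, depends on how the maze and camera are oriented.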

Software design choices / tradeoffs

Vision / sensing:

For the CV system to detect the ball and end position, we first needed to determine our detection technique. We landed on color thresholding because it is computationally cheap, keeping our overall response time fast. It is also effective in tracking position given good color contrast, which is not hard to achieve.
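A minimal sketch of color thresholding with a radius constraint is shown below. It is written in plain Python over a nested list of RGB tuples so that it is self-contained; a real pipeline would more likely use an image library. The function name `find_blob`, the RGB bounds, and the use of an equivalent-circle radius (from the matching pixel count) as the radius check are all assumptions made for illustration.

```python
import math


def find_blob(image, lower, upper, min_radius, max_radius):
    """Return the (row, col) centroid of pixels inside the color range, or None.

    image: list of rows, each a list of (r, g, b) tuples.
    lower, upper: inclusive per-channel bounds for the target color.
    The blob is rejected if the radius of a circle with the same pixel area
    falls outside [min_radius, max_radius].
    """
    row_sum = col_sum = count = 0
    for r, row in enumerate(image):
        for c, pixel in enumerate(row):
            if all(lo <= ch <= hi for ch, lo, hi in zip(pixel, lower, upper)):
                row_sum += r
                col_sum += c
                count += 1
    if count == 0:
        return None
    radius = math.sqrt(count / math.pi)  # equivalent-circle radius in pixels
    if not (min_radius <= radius <= max_radius):
        return None
    return row_sum / count, col_sum / count
```

The same function can be called once with the ball's color range and once with the endpoint's color range.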

We also needed to determine how to reconstruct the maze such that it could be analyzed by the path-planning algorithm. We decided to use binary thresholding, followed by a max pooling process, to create a "grid-world" representation of the maze. Because max pooling marks a grid cell as a boundary if any pixel inside it is detected as one, even sparse detection of maze boundaries due to glare still results in solid boundaries in our reconstruction. We then added buffers around the boundaries in our maze reconstruction so that our path planner could not pick paths that directly hugged the walls of the maze.
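The two reconstruction steps can be sketched as follows, assuming the thresholding step has already produced a binary mask (1 = wall pixel, 0 = free). The function names, the cell size, and the choice to treat buffer cells as fully blocked (rather than, say, high-cost) are assumptions for this example.

```python
def max_pool(mask, cell):
    """Downsample a binary wall mask into a grid.

    A grid cell is a wall (1) if ANY pixel in its cell x cell block is a wall,
    so sparsely detected boundaries still become solid.
    Rows/columns that do not fill a whole block are dropped.
    """
    n_rows = len(mask) // cell
    n_cols = len(mask[0]) // cell
    grid = []
    for gr in range(n_rows):
        row = []
        for gc in range(n_cols):
            block = (mask[r][c]
                     for r in range(gr * cell, (gr + 1) * cell)
                     for c in range(gc * cell, (gc + 1) * cell))
            row.append(1 if any(block) else 0)
        grid.append(row)
    return grid


def add_buffer(grid, width=1):
    """Block every cell within `width` cells (including diagonals) of a wall."""
    n_rows, n_cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for r in range(n_rows):
        for c in range(n_cols):
            if grid[r][c]:
                for dr in range(-width, width + 1):
                    for dc in range(-width, width + 1):
                        rr, cc = r + dr, c + dc
                        if 0 <= rr < n_rows and 0 <= cc < n_cols:
                            out[rr][cc] = 1
    return out
```

A larger `cell` makes the grid faster to solve but coarser, and a larger `width` keeps paths further from walls at the risk of closing off narrow corridors.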

Path planning:

We tested two different path planning algorithms: breadth-first search (BFS) and A* search with a Euclidean distance heuristic. In our tests, BFS returned slightly better paths, with smoother trajectories and shorter total distances, while A* ran faster and still returned decent paths. We ultimately decided to keep A* search in our final product because its reduced runtime made our controller more effective, which matters more for solving the maze than having a slightly better path.
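For reference, a compact A* search with a Euclidean distance heuristic might look like the sketch below. It assumes a 4-connected grid with unit step costs where 0 is a free cell and 1 is a wall; the connectivity, cost model, and function name `astar` are assumptions and may differ from our implementation.

```python
import heapq
import math


def astar(grid, start, goal):
    """Return a list of (row, col) cells from start to goal, or None if unreachable."""
    def h(cell):
        # Euclidean distance heuristic
        return math.hypot(cell[0] - goal[0], cell[1] - goal[1])

    n_rows, n_cols = len(grid), len(grid[0])
    open_heap = [(h(start), 0, start)]  # entries are (f = g + h, g, cell)
    came_from = {start: None}
    best_g = {start: 0}
    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell == goal:
            path = []
            while cell is not None:  # walk parent links back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > best_g[cell]:
            continue  # stale heap entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < n_rows and 0 <= nc < n_cols and grid[nr][nc] == 0:
                ng = g + 1
                if ng < best_g.get(nxt, math.inf):
                    best_g[nxt] = ng
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (ng + h(nxt), ng, nxt))
    return None
```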

We also decided to run maze reconstruction and path planning only about once every 5 seconds, and to rely on a saved path to determine target positions in between. This way, the publishing process is very fast most of the time, and the computationally expensive process runs only periodically to get the ball "unstuck" if issues are ever encountered.
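One way to structure this caching is sketched below. The class name `CachedPlanner`, the injected `solve` callable, and the injectable clock are hypothetical and exist only to keep the example self-contained.

```python
import time


class CachedPlanner:
    """Re-solve the maze at most once per `period` seconds; otherwise reuse the saved path."""

    def __init__(self, solve, period=5.0, clock=time.monotonic):
        self.solve = solve        # callable (start_cell, goal_cell) -> path
        self.period = period      # seconds between expensive re-solves
        self.clock = clock        # injectable time source, useful for testing
        self.path = None
        self.last_solved = None

    def get_path(self, ball_cell, goal_cell):
        now = self.clock()
        if self.path is None or now - self.last_solved >= self.period:
            self.path = self.solve(ball_cell, goal_cell)
            self.last_solved = now
        return self.path
```

Between re-solves, each call returns immediately with the saved path, so the per-frame cost is dominated by ball detection rather than planning.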

We also needed to determine which cell in the path to return as the target for the controller to hit at any given time step. Instead of returning the very next cell in the path, we return a cell several steps ahead of the cell closest to the current position of the ball, as this is closer to the resolution that our controller could achieve.
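A minimal sketch of this target selection is shown below. The function name `pick_target` and the default lookahead of 3 cells are assumptions; the right lookahead depends on the grid resolution and controller accuracy.

```python
def pick_target(path, ball_cell, lookahead=3):
    """Return the path cell `lookahead` steps past the cell closest to the ball.

    path: list of (row, col) cells from start to goal.
    The index is clamped so the final target is the goal cell itself.
    """
    nearest = min(
        range(len(path)),
        key=lambda i: (path[i][0] - ball_cell[0]) ** 2 + (path[i][1] - ball_cell[1]) ** 2,
    )
    return path[min(nearest + lookahead, len(path) - 1)]
```

Searching for the closest path cell, rather than tracking an index, means the target stays sensible even if the ball is knocked backward along the path.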

Controller:

For our control algorithm, we considered three options: 1) advanced control algorithms, 2) P control, or 3) PID control. Due to hardware constraints (a 30 FPS webcam), we were not able to implement advanced control algorithms that require feedback at around 100 Hz. We then tried P control (because our velocity estimates were very noisy), but this was insufficient to control the direction of the ball when its velocity was high and walls could not be used to direct it. Finally, we implemented PID control and used a moving average of the velocity term to create an effective controller.
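The idea of smoothing the velocity term can be sketched for a single axis as follows. The class name `PIDAxis`, the gains, and the window length are hypothetical. The sketch also assumes the target is fixed between updates, so the derivative of the error equals the negative of the ball velocity.

```python
from collections import deque


class PIDAxis:
    """Single-axis PID whose derivative term uses a moving average of ball velocity."""

    def __init__(self, kp, ki, kd, window=5):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_pos = None
        self.velocities = deque(maxlen=window)  # most recent velocity samples

    def update(self, target, position, dt):
        """Return the desired acceleration given target and measured position."""
        error = target - position
        self.integral += error * dt
        if self.prev_pos is not None:
            # Finite-difference velocity; noisy at webcam frame rates
            self.velocities.append((position - self.prev_pos) / dt)
        self.prev_pos = position
        if self.velocities:
            velocity = sum(self.velocities) / len(self.velocities)
        else:
            velocity = 0.0  # no velocity estimate on the first sample
        # Target assumed stationary, so d(error)/dt = -velocity
        return self.kp * error + self.ki * self.integral - self.kd * velocity
```

One controller instance would be used per axis. A longer window gives a smoother velocity estimate at the cost of added lag.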

How do these design choices impact robustness, durability and efficiency?

Hardware commentary

Tripod camera mount: By mounting our camera on a tripod, we limit the regions in which the robot can solve the maze, as the maze must remain in view of the camera. However, using an external camera also opens the possibility of extending our work so that the robot operates in a hostile environment while the camera observes from a distance, making this design choice suitable for a wider range of applications.

Rigidly-attached, 3D-printed maze: By making our maze rigidly attached to the end effector of the robot, we greatly increased the durability and efficiency of the system. We also created maze hardware that could be transferred to different robotic arms if needed, so long as screw attachment is viable.

Software commentary

Thresholding for position tracking: Using thresholding for position tracking makes it so that our system is reliant on specific color variations for key features of the maze and the background view. Similar lighting conditions are also needed, and re-tuning must take place when lighting conditions change significantly or a new webcam is introduced. This reduces the robustness of our system. At the same time, the computational speed and efficiency achieved via thresholding enable our system to solve more complex maze features, improving our robustness given consistent lighting.

Grid-world maze reconstruction: By transforming our maze into a grid representation and buffering boundaries, we make the system robust to glare on maze boundaries. We also make our maze fast to solve computationally, improving our efficiency and making our controller more effective. On the other hand, we reduce our resolution, making it so that our paths are not perfectly optimal, but close enough given the accuracy of our controller.

A* for path planning & future cell selection for target: By using A*, we guarantee that a path is found whenever one exists, given accurate maze reconstruction and position detection, making our system robust and durable in a variety of maze contexts. We also increase the efficiency of our system. By selecting a cell several steps ahead of the ball for the target, we open ourselves up to the possibility of hitting the walls of the maze around tight turns if buffering is not sufficient, but we also give our PID controller a clear error to minimize, resulting in more robust and smooth control.

PID control: The PID controller we have built is robust to significant deviation in maze features. We can change the distance the webcam is from the maze, change the size of the maze, and add bumps to the surface of the maze paths (all of which we did accidentally during experimentation) and the controller will still be able to solve the maze. It is very durable and robust.