Our original MVP is a 2v2 soccer match with real-time dynamic path planning, obstacle avoidance, passing, dribbling, and shooting. We accomplished this by implementing two teams, each with an aggressor and a goalie, basic passing logic between the two, real-time replanning for obstacle avoidance, and a real-time variant of A*.
We experimented with the path planning algorithms listed below and ultimately settled on a real-time implementation of A*. The main challenge with path planning was choosing an algorithm suited to our application that also adapts well to running in real time. Tradeoffs among these algorithms included balancing real-time performance against accuracy, and obstacle avoidance against direct path planning to the goal.
Overview: RRT generates a random tree given some number of iterations and a list of obstacles. As soon as the tree intersects the goal, the final path is returned. Figure 1 shows an example of RRT.
Implementation: Because our environment is dynamic, we adapted RRT to run in "real-time". This real-time implementation runs a small number of iterations given a snapshot of the current environment and returns the path that ends closest to the objective. Our robot implements this by repeatedly moving along this best returned path and then rerunning RRT at its updated location. In one of our implementations, we smoothed the returned paths with the Ramer-Douglas-Peucker algorithm and quadratic interpolation.
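Below is a minimal sketch of this real-time RRT loop, assuming circular obstacles given as (x, y, radius) tuples; the field bounds, helper structure, and names are illustrative, not our exact code:

```python
import math
import random

def rrt_step(pos, goal, obstacles, iters=4, branch_len=0.5):
    """Grow a tiny tree from `pos`; return the node that ends closest to `goal`."""
    nodes = [pos]
    for _ in range(iters):
        # Sample a random point on the field (bounds are placeholders).
        sample = (random.uniform(-5, 5), random.uniform(-5, 5))
        # Find the nearest existing node to the sample.
        near = min(nodes, key=lambda n: math.dist(n, sample))
        # Steer from that node toward the sample by one branch length.
        dx, dy = sample[0] - near[0], sample[1] - near[1]
        d = math.hypot(dx, dy) or 1e-9
        new = (near[0] + branch_len * dx / d, near[1] + branch_len * dy / d)
        # Keep the branch only if it stays clear of every circular obstacle.
        if all(math.dist(new, (ox, oy)) > r for ox, oy, r in obstacles):
            nodes.append(new)
    return min(nodes, key=lambda n: math.dist(n, goal))
```

Each control cycle, the robot moves a short distance toward the returned point and then rebuilds the tree from its new location.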
Results: To tune our implementation, we experimented with the number of iterations, the maximum branch length, and ways to smooth the returned path, all with the goal of reducing the robot's time spent pathing to its objective. Ultimately, we found that lowering the number of iterations to something extremely low, like 4, setting the branch length to roughly the radius of a player, and reducing the returned path all the way down to a single line between the robot and the path's endpoint gave the best results.
Explanation: The reason for this is that with all players moving at the same speed, there is rarely an advantage to building a complex path around a player. In the time it takes to execute the path, the player being avoided will have already moved elsewhere. Thus we use an extremely low number of iterations (say 4) in combination with a short branch length to reduce the chance that the returned path intercepts the moving obstacle in the following timestep. With these parameters, the returned path for 4 total iterations will on average only be 2 branches long (because RRT is random, the other 2 branches will point backwards). Thus there is little risk in reducing this already primitive path to a single line. It was around this point that we realized our extremely simplified RRT was beginning to look like a worse A*.
Take Away: For a soccer field, which has relatively few obstacles, there is no need to use RRT. Instead, a robot is often better off pathing straight towards the ball, making only minor adjustments so as not to run into other players. This essentially equates to A*.
Figure 1: RRT finding a path to the goal in a static search space.
Figure 2: A "real-time" implementation of RRT. The moving red x represents our player and the blue ellipse represents a moving obstacle.
Overview: RRT* returns an optimal path as the number of iterations approaches infinity. It does so by constantly "re-wiring" the tree such that the path to every node is the shortest given the current tree. Once the path is found, adding more nodes will build a denser tree that returns smoother, more optimal paths.
Implementation: Because our real-time RRT implementation only runs a small number of iterations and RRT* benefits from a large number of iterations, the added computational time of RRT* would not pay off in our setting. We nevertheless verified this by testing a modified implementation from [6, 7].
Overview: A* builds a path by essentially beelining towards its target. If the current path collides with an obstacle, A* will hug this obstacle until the current path becomes longer than an alternative path generated from backtracking and exploring an alternative route.
Implementation: To adapt A* for real time, and keeping in mind what we learned from RRT, we sample a circle of points around the robot and return the point that brings the robot closest to the goal without causing the robot to collide with any obstacles. The robot then moves to this point and resamples at its new location. We tested two heuristics: Euclidean distance and Manhattan distance.
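A minimal sketch of this circle-sampling step, again assuming circular obstacles as (x, y, radius) tuples; the sample count, ring radius, and names are illustrative:

```python
import math

def astar_step(pos, goal, obstacles, radius=0.3, n_samples=16):
    """Sample a ring of candidates around the robot; return the collision-free
    candidate closest to the goal (Euclidean heuristic)."""
    best, best_cost = None, float("inf")
    for k in range(n_samples):
        theta = 2 * math.pi * k / n_samples
        cand = (pos[0] + radius * math.cos(theta),
                pos[1] + radius * math.sin(theta))
        # Reject candidates that land inside any circular obstacle.
        if any(math.dist(cand, (ox, oy)) <= r for ox, oy, r in obstacles):
            continue
        cost = math.dist(cand, goal)  # swap in an abs-sum for Manhattan
        if cost < best_cost:
            best, best_cost = cand, cost
    # None means every direction is blocked: back up or move randomly.
    return best
```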
Results: Overall, this planner is very aggressive and works well for robots fighting for the ball. This real-time A* performed better than the other path-planning implementations we tested, so we used it in our final implementation.
Figure 3: A real-time approach to A*. The green points are potential target points.
Figure 4: A robot can become stuck if it encounters static obstacles. Due to this, we chose to have the robot move randomly or back up if it intersects with an obstacle.
Overview: The dynamic window approach generates a valid search space and dynamically selects an optimal solution within it; the search space is restricted to safe trajectories that can be reached within a short time interval and are free from collisions.
Implementation: We modified an algorithm for real-time dynamic window planning that we obtained from [7] to path around a set of dynamic obstacles, and similarly to our RRT approach, smoothed paths with the Ramer-Douglas-Peucker algorithm and quadratic interpolation.
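As a sketch of the two-stage smoothing (Ramer-Douglas-Peucker followed by quadratic interpolation), assuming paths given as arrays of 2D waypoints; the tolerance and names are our illustrative choices, not the code from [7]:

```python
import numpy as np
from scipy.interpolate import interp1d

def rdp(points, eps=0.05):
    """Recursively drop waypoints closer than `eps` to the start-end chord."""
    points = np.asarray(points, dtype=float)
    start, end = points[0], points[-1]
    chord = end - start
    norm = np.linalg.norm(chord) or 1e-12
    # Perpendicular distance of every point to the chord (2D cross product).
    dists = np.abs(chord[0] * (points[:, 1] - start[1])
                   - chord[1] * (points[:, 0] - start[0])) / norm
    i = int(np.argmax(dists))
    if dists[i] > eps:
        # Split at the farthest point and simplify each half.
        return np.vstack([rdp(points[: i + 1], eps)[:-1], rdp(points[i:], eps)])
    return np.vstack([start, end])

def smooth(points, n_out=50):
    """Simplify with RDP, then resample along a quadratic interpolant."""
    pts = rdp(points)
    t = np.linspace(0, 1, len(pts))
    kind = "quadratic" if len(pts) >= 3 else "linear"  # quadratic needs 3+ points
    fx, fy = interp1d(t, pts[:, 0], kind=kind), interp1d(t, pts[:, 1], kind=kind)
    ts = np.linspace(0, 1, n_out)
    return np.column_stack([fx(ts), fy(ts)])
```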
Results: Dynamic window planning usually resulted in an excessively curvy path, which caused erratic behavior when combined with our other logic once imported into the simulation. Additionally, in some cases it took a long time to generate paths.
Overview: The Extended RRT algorithm [3, 4] builds on RRT by improving replanning efficiency for real-time responses to dynamic stimuli, and introduces a waypoint cache and an adaptive cost penalty search, both of which serve to improve the quality of generated paths.
Implementation: We conducted tests on an implementation we obtained from [6] on a grid-world. Additionally, we worked on modifying the replanning part of the algorithm to account for our use cases (pathing around moving robots and to the moving ball).
Results: We ran into issues converting the online implementation we found into a version usable in the simulator. Due to time constraints we decided to focus more on the standard RRT and A* implementations instead.
Overview: The quintic polynomials algorithm [8] determines the trajectory between two points by using a fifth-order polynomial for path planning. The six coefficients of the quintic polynomial are determined by six constraints in total: position, velocity, and acceleration at each endpoint.
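As a sketch for one axis, the six boundary conditions form a 6x6 linear system that pins down the coefficients of x(t) = a0 + a1·t + ... + a5·t^5; the function below is illustrative, not the implementation from [7] or [8]:

```python
import numpy as np

def quintic_coeffs(x0, v0, a0, xT, vT, aT, T):
    """Solve for [a0..a5] from position/velocity/acceleration at t=0 and t=T."""
    A = np.array([
        [1, 0,    0,       0,        0,         0],        # x(0)   = x0
        [0, 1,    0,       0,        0,         0],        # x'(0)  = v0
        [0, 0,    2,       0,        0,         0],        # x''(0) = a0
        [1, T,    T**2,    T**3,     T**4,      T**5],     # x(T)   = xT
        [0, 1,    2*T,     3*T**2,   4*T**3,    5*T**4],   # x'(T)  = vT
        [0, 0,    2,       6*T,      12*T**2,   20*T**3],  # x''(T) = aT
    ], dtype=float)
    b = np.array([x0, v0, a0, xT, vT, aT], dtype=float)
    return np.linalg.solve(A, b)
```

Running the same solve independently for the y-axis yields a full 2D trajectory.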
Implementation: We conducted tests on an implementation we obtained from [7] on a grid-world.
Results: The results of the algorithm were good in the Jupyter Notebook implementation, but the real-time A* algorithm, which we were testing simultaneously, was more accurate in practice.
Algorithms that performed well in the NumPy grid-world didn't perform well in simulation
While RRT performs well for constrained environments, for a relatively open environment its randomness wastes time
The real-time trajectory tracking problem can be solved in many different ways. We focused our research on two methods in particular, proportional-integral-derivative (PID) and model predictive control (MPC). In the end, we did not have to implement either of these methods due to the simplifying nature of the simulation, but both are good candidates for integration into the actual hardware.
Proportional-integral-derivative (PID) control is an extremely simple and effective method of controlling systems, which is why it is the most widely-used control algorithm in industry. A PID controller takes as input a quantity we wish to regulate (drive) to zero, and its output is (usually) the input to our system, e.g. a motor torque. To achieve this, the controller uses the input, its integral (with respect to time), and its derivative (with respect to time), assigning weights to each and summing them to determine the appropriate system input. PID control offers fast performance, but struggles with systems with constraints such as input saturation.
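A minimal discrete-time PID sketch; the gains and timestep are illustrative placeholders, not tuned values from our project:

```python
class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, error):
        """Return the control input for the current error sample."""
        self.integral += error * self.dt                  # accumulate the I term
        derivative = (error - self.prev_error) / self.dt  # finite-difference D term
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Called once per control cycle, e.g.:
# controller = PID(kp=1.0, ki=0.1, kd=0.05, dt=0.01)
# torque = controller.update(target - measured)
```

The naive integrator here also illustrates the saturation caveat above: while the actuator is pinned at its limit, the integral keeps accumulating (windup).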
Model predictive control uses the power of optimization to compute a feedback law that optimizes some user-defined cost function. Formulating the control problem as an optimization problem allows us to easily incorporate system constraints as constraints on the optimization. Additionally, MPC can perform obstacle avoidance on its own, since the obstacles themselves can be treated as constraints on the robot's position. MPC can be computationally expensive and does not always have a feasible solution; however, these issues can be avoided with a well-designed MPC problem.
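As a sketch, a generic MPC problem over a horizon of N steps might be written as follows (the symbols are generic, not from our project):

```latex
\min_{u_0,\dots,u_{N-1}} \;
  \sum_{k=0}^{N-1}\left(\|x_k - x_k^{\mathrm{ref}}\|_Q^2 + \|u_k\|_R^2\right)
  + \|x_N - x_N^{\mathrm{ref}}\|_P^2
\quad \text{s.t.}\quad
  x_{k+1} = f(x_k, u_k),\quad
  u_k \in \mathcal{U},\quad
  \|x_k - p^{\mathrm{obs}}\|_2 \ge r_{\mathrm{safe}}
```

Here the input constraint set \(\mathcal{U}\) captures, e.g., actuator saturation, and the last constraint keeps planned states outside a safety radius around each obstacle, which is how MPC performs obstacle avoidance on its own.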
We implemented player strategy using finite state machines.
Implementation: Writing the state machine for a 1 vs. 0 scenario is straightforward. The robot should path towards the ball, and once the ball is secured, dribble to the goal and score.
Figure 5: 1 player vs. 0 players. For a 1 vs. 1, this state machine is implemented on both players.
Implementation: To implement passing, we used a hierarchical state machine, where the higher-level states are aggressor, wingman, and ball-less, and lower-level states define the behavior of each high-level state. See figure 1 in the appendix; a sketch of the role assignment follows the state list below.
High level states:
Aggressor: A player becomes an aggressor if it has possession of the ball. The main objective of an aggressor is to score a goal, or pass if the opportunity arises.
Wingman: A player becomes a wingman if its teammate has possession of the ball. The main objective of a wingman is to get in position to receive a pass and subsequently shoot on the goal.
Ball-less: Players become ball-less if their team does not possess the ball. The main objective of a ball-less player is to path directly towards the ball and retrieve it at all costs.
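A minimal sketch of the role assignment and dispatch; the player methods are illustrative stand-ins for our lower-level state machines, not our actual API:

```python
from enum import Enum, auto

class Role(Enum):
    AGGRESSOR = auto()  # this player possesses the ball
    WINGMAN = auto()    # this player's teammate possesses the ball
    BALL_LESS = auto()  # neither teammate possesses the ball

def assign_role(has_ball, teammate_has_ball):
    """Re-derive the high-level role from ball possession each game tick."""
    if has_ball:
        return Role.AGGRESSOR
    if teammate_has_ball:
        return Role.WINGMAN
    return Role.BALL_LESS

def act(role, player):
    """Dispatch each high-level role into its lower-level behavior."""
    if role is Role.AGGRESSOR:
        player.shoot_or_pass()      # score, or pass if the opportunity arises
    elif role is Role.WINGMAN:
        player.get_open_for_pass()  # position to receive a pass, then shoot
    else:
        player.chase_ball()         # path directly to the ball
```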
Figure 6: 2v2. The dashed arrows point to a player's role being reassigned based on the next game state.
We attempted to use OpenCV with an overhead camera to provide each robot with position data for the ball, itself, and all obstacles. For this, we spawned a camera into Gazebo, which provided images over the /rrbot/camera1/raw_info topic. We used cv_bridge to convert from the ROS sensor_msgs/Image type to an OpenCV-compatible type. Then, we masked out all colors not corresponding to the robots/ball, and blurred and grayscaled the result in preparation for blob detection. Unfortunately, blob detection failed to recognize our robots and ball, and, facing time constraints, we defaulted to providing each robot with perfect state knowledge.
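For reference, a sketch of the pipeline we attempted, using real OpenCV and cv_bridge calls but with placeholder HSV bounds and detector settings:

```python
import cv2
import numpy as np
from cv_bridge import CvBridge

bridge = CvBridge()

def detect_blobs(msg):
    """Convert a ROS Image to OpenCV, mask a color band, and run blob detection."""
    frame = bridge.imgmsg_to_cv2(msg, desired_encoding="bgr8")
    hsv = cv2.cvtColor(frame, cv2.COLOR_BGR2HSV)
    # Keep only pixels in the target color band (bounds are placeholders).
    mask = cv2.inRange(hsv, np.array([5, 100, 100]), np.array([15, 255, 255]))
    masked = cv2.bitwise_and(frame, frame, mask=mask)
    # Grayscale and blur the result in preparation for blob detection.
    gray = cv2.cvtColor(masked, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (9, 9), 0)
    params = cv2.SimpleBlobDetector_Params()
    params.filterByColor = False  # the default detector looks for dark blobs only
    detector = cv2.SimpleBlobDetector_create(params)
    keypoints = detector.detect(blurred)
    return [(kp.pt, kp.size) for kp in keypoints]  # blob centers and diameters
```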
Some design criteria for real-life engineering applications include repeatability, which ensures that a system can execute actions multiple times under different initial conditions; reliability, which indicates that a robotic system is robust and able to finish every action that it starts; and constraints, which establish a set of tolerances within which a system is guaranteed to perform optimally. The design choices that we made reflect these criteria. We make no assumptions about the initial positions of the robots or ball, allowing a game to be executed well when started from any position. Our path planning algorithm is designed to work when constrained with any number of obstacles. Our system is close to reliable in that there are very few instances in which robots (or the ball) get stuck or behave strangely, and in many edge cases the system will eventually correct itself.