● Trained agents with both PPO and DQN to navigate a car from a starting point to a goal destination efficiently, without veering off the road.
● Compared the pros and cons of the two models under similar amounts of training, finding that PPO's success rate was 14% higher than DQN's.
● Iterated on an existing Gymnasium environment designed to train race cars: we kept its continuous and discrete input systems, completely revamped the environment itself, and trained on the result.
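Supporting both input systems matters because DQN requires a finite action set, while PPO can act directly on continuous controls. As a minimal sketch (the action names and values below are hypothetical, not our exact mapping), a discrete action space can be mapped onto CarRacing-style continuous (steering, gas, brake) controls like this:

```python
# Hypothetical discretization of a continuous driving action space.
# A continuous action is (steering in [-1, 1], gas in [0, 1], brake in [0, 1]);
# DQN needs a finite set, so each index maps to one fixed control triple.
DISCRETE_ACTIONS = {
    0: (0.0, 0.0, 0.0),   # no-op / coast
    1: (-1.0, 0.0, 0.0),  # steer left
    2: (1.0, 0.0, 0.0),   # steer right
    3: (0.0, 1.0, 0.0),   # accelerate
    4: (0.0, 0.0, 0.8),   # brake (partial, to avoid wheel lock)
}

def to_continuous(action_index: int) -> tuple[float, float, float]:
    """Translate a discrete action index into a continuous control triple."""
    return DISCRETE_ACTIONS[action_index]
```

A wrapper like this lets the same revamped environment serve both agents: DQN picks an index, PPO outputs the triple directly.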
A 5-minute presentation of our two agents, with a quick overview of our process and results, can be found in this video.
For a comprehensive report diving deep into our project, please read here.