Future Work
One of the main next steps for a project like this is to expand the variety of objects the perception system can recognize; at present it detects only cubes, baskets, charging blocks, and hands. One option is to collect data for other common household objects and train the YOLO11n model on those classes as well. In testing, however, we found a more robust solution in switching to the Grounding DINO model: when prompted with user input, it consistently recognized almost any object in an image. Unfortunately, it requires a GPU to run, which was not immediately available on the LocoBot.
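To illustrate the difference prompt-driven detection makes downstream, the sketch below shows one way the task layer might select a target from detector output given a free-text user prompt. The `Detection` container and `select_target` helper are hypothetical names, not part of our system or of either model's API; with a fixed-class detector like YOLO11n the matching only succeeds for trained classes, whereas an open-vocabulary model such as Grounding DINO can return labels for nearly any prompted phrase.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Detection:
    label: str                        # class name returned by the detector
    score: float                      # confidence in [0, 1]
    box: tuple                        # (x1, y1, x2, y2) pixel coordinates

def select_target(detections: List[Detection], prompt: str,
                  min_score: float = 0.5) -> Optional[Detection]:
    """Pick the highest-confidence detection whose label matches the
    user's text prompt; return None if nothing qualifies."""
    matches = [d for d in detections
               if prompt.lower() in d.label.lower() and d.score >= min_score]
    return max(matches, key=lambda d: d.score) if matches else None
```

For example, `select_target(dets, "cube")` would return the most confident cube detection, while an unseen prompt simply yields `None`, letting the robot fall back to scanning.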
Another important next step is an algorithm for grasp orientation. So far we have used objects that can be grasped from any orientation; for objects that must be grasped in a specific orientation, a more advanced affordance model would be needed, or a separate model would have to be trained to determine the correct orientation for a given grasp.
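Short of a learned affordance model, a common geometric baseline is to estimate a top-down grasp yaw from the object's segmentation mask: find the object's major axis via PCA and close the gripper perpendicular to it. The sketch below is a minimal version of that idea, assuming a 2D point set from a mask; it is not part of our current system.

```python
import numpy as np

def grasp_angle(mask_points: np.ndarray) -> float:
    """Estimate a top-down grasp yaw (radians) for an elongated object.

    mask_points: (N, 2) array of (x, y) pixel coordinates belonging to
    the object's segmentation mask. PCA on the points gives the major
    axis of elongation; the gripper closes perpendicular to it.
    """
    centered = mask_points - mask_points.mean(axis=0)
    cov = np.cov(centered.T)
    eigvals, eigvecs = np.linalg.eigh(cov)
    major = eigvecs[:, np.argmax(eigvals)]       # direction of elongation
    axis_angle = np.arctan2(major[1], major[0])  # object's long-axis yaw
    return axis_angle + np.pi / 2                # grasp perpendicular to it
```

This only handles planar yaw; objects needing a particular approach direction or grip width would still require the richer affordance model discussed above.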
Lastly, our system could greatly benefit from improved camera and base control. Right now, the entire base rotates to scan for an object when it is not in view; consolidating that movement to just the camera could greatly shorten scan time and make the robot more efficient overall. Similarly, due to time constraints, the base uses only a simple proportional velocity controller to move toward a desired location. Incorporating a full PID controller for the wheels, along with bringing the LiDAR system online, would allow us to build a map of the environment (SLAM) and use a path-planning algorithm to navigate safely through cluttered areas, retrieving objects faster and more reliably.
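As a sketch of what replacing the proportional controller might look like, the class below implements the standard discrete PID update on a position error. The gains and time step are placeholder values that would need tuning on the real LocoBot wheels; this is an illustration, not our deployed controller.

```python
class PID:
    """Discrete PID controller: output = kp*e + ki*integral(e) + kd*de/dt."""

    def __init__(self, kp: float, ki: float, kd: float, dt: float):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0       # accumulated error (I term)
        self.prev_error = 0.0     # last error, for the D term

    def step(self, error: float) -> float:
        """Advance one time step and return the commanded velocity."""
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return (self.kp * error
                + self.ki * self.integral
                + self.kd * derivative)
```

Compared with the pure proportional term we use now, the integral term removes steady-state offset (e.g. stopping short of the target on carpet) and the derivative term damps overshoot as the base approaches the goal.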