Group 7 - Winter 2026
For this project, our seven-person team was tasked with programming a mobile, collaborative robot to carry out assistive tasks. To complete the project's three tasks, we integrated vision models, speech recognition, mapping/navigation, and grasp control algorithms.
Tidybot2 mobile robot base
2 WX250s 6-DoF arms (bimanual setup)
2 Robotiq 2F-85 adaptive parallel grippers
Intel RealSense D435 depth camera
Code written in Python & ROS
Simulation environment in RViz & MuJoCo
Vision using YOLOv8, AprilTags, and facial recognition
Speech recognition using Google Cloud STT & Gemini
Grasp positioning using GraspNet
Given a verbal command, the robot must scan for, navigate to, and pick up the target object.
"Get the banana."
Given a series of verbal commands, the robot must perform a set of actions in sequence.
"Get the banana and place it in the bin."
After retrieving an object, the robot must find the specified person via facial recognition and bring it back to them. After bringing it to the person, the robot accepts a sorting task.
"Get the banana and give it to Michaël."
"Now, drop it in the banana bin."
Alexander Chen
Mechanical Engineering (MS)
Frances Raphael
Electrical Engineering (BS/MS)
Ananya Sridhar
Mechanical Engineering (BS/MS)
Michaël Dooley
Mechanical Engineering (MS)
Baihan Zhang
Computer Science (MS)
Sophia Huang
Electrical Engineering (MS)
Elisabeth Holm
Computer Science (BS/MS)
Alexander Chen
2D depth-based occupancy map generation
Navigation states/planning + basic outline for frontier exploration.
Assisted with non-occupancy map navigation.
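The depth-based occupancy mapping above can be sketched roughly as follows. This is a minimal, hypothetical illustration (the function name, grid size, and height band are assumptions, not our exact ROS code): depth points are projected into a fixed-resolution 2D grid, keeping only points at obstacle height.

```python
import numpy as np

def depth_to_occupancy(points_xyz, resolution=0.05, grid_size=200,
                       z_min=0.05, z_max=1.5):
    """Project 3D depth points into a 2D occupancy grid (illustrative sketch).

    points_xyz: (N, 3) array of points in the robot frame, in meters.
    Cells hit by points within the height band [z_min, z_max] are marked
    occupied (1); everything else stays free/unknown (0).
    """
    grid = np.zeros((grid_size, grid_size), dtype=np.int8)
    # Keep only points at obstacle height (ignore floor and ceiling)
    band = points_xyz[(points_xyz[:, 2] > z_min) & (points_xyz[:, 2] < z_max)]
    # Convert metric x/y to grid indices, with the robot at the grid center
    idx = (band[:, :2] / resolution).astype(int) + grid_size // 2
    # Discard points that fall outside the grid bounds
    valid = ((idx >= 0) & (idx < grid_size)).all(axis=1)
    grid[idx[valid, 1], idx[valid, 0]] = 1  # row = y, col = x
    return grid
```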
Ananya Sridhar
Created basic navigation system (no occupancy map, simple obstacle avoidance)
Also created a version with no obstacle avoidance for initial task 1 trials
Testing of Sophia's integration pipeline (YOLO → exploration)
Initial code for AprilTag detection
Baihan Zhang
Implemented confirmation pipeline for post-grasp and post-bin-drop checks with YOLO integration
Created baseline navigation pipeline for the task 1 fallback, with no obstacle avoidance and simple navigation logic (direct navigation without A* or frontier exploration)
Integrated the full task 1 pipeline with voice commands
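The "direct navigation" fallback idea can be illustrated with a simple point-and-shoot controller sketch (function name and gains are hypothetical, not the team's exact code): turn toward the goal, drive forward proportionally to distance, and stop inside a tolerance.

```python
import math

def direct_nav_cmd(robot_xy, robot_theta, goal_xy,
                   k_lin=0.5, k_ang=1.5, stop_dist=0.15):
    """Return (linear_vel, angular_vel) for direct goal-seeking (a sketch).

    No obstacle avoidance or planning: steer toward the goal heading,
    drive proportionally to distance, and stop within stop_dist meters.
    """
    dx = goal_xy[0] - robot_xy[0]
    dy = goal_xy[1] - robot_xy[1]
    dist = math.hypot(dx, dy)
    if dist < stop_dist:
        return 0.0, 0.0
    heading_err = math.atan2(dy, dx) - robot_theta
    # Wrap the heading error to [-pi, pi]
    heading_err = math.atan2(math.sin(heading_err), math.cos(heading_err))
    # Turn in place when far off heading; otherwise drive and steer together
    lin = k_lin * dist if abs(heading_err) < math.pi / 4 else 0.0
    ang = k_ang * heading_err
    return lin, ang
```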
Elisabeth Holm
Led navigation and its integration with voice and YOLO
Organized team assignments (e.g., presentation, website) and timeline tracking
Frontier exploration logic
Map refinement (mapping and correcting the map as the robot drives)
Robust navigation of the robot through obstacle-filled space (including obstacle avoidance through inflation)
Integrated the pipeline: voice command → explore until the object is found via YOLO → grasp. This was the basis of our final version, which replaced the map-based navigation with vision-based navigation
Helped with GraspNet and attempted GPD (a grasp pose detection model)
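The frontier-exploration and obstacle-inflation ideas above can be sketched in a few lines. This is a hypothetical illustration using SciPy's `binary_dilation`, not our exact implementation: frontiers are free cells bordering unknown space, and inflation grows occupied cells by the robot's radius so the planner keeps clearance.

```python
import numpy as np
from scipy.ndimage import binary_dilation

FREE, OCCUPIED, UNKNOWN = 0, 1, -1  # assumed cell encoding for this sketch

def inflate_obstacles(grid, robot_radius_cells=3):
    """Grow occupied cells by the robot radius so planning treats
    near-obstacle cells as blocked."""
    occupied = grid == OCCUPIED
    inflated = binary_dilation(occupied, iterations=robot_radius_cells)
    out = grid.copy()
    out[inflated] = OCCUPIED
    return out

def find_frontiers(grid):
    """Return (row, col) indices of free cells that border unknown space;
    these are the candidate frontier cells the robot drives toward."""
    near_unknown = binary_dilation(grid == UNKNOWN)
    frontier = (grid == FREE) & near_unknown
    return np.argwhere(frontier)
```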
Frances Raphael
Implemented and helped integrate the voice command module, allowing users to control the robot using spoken instructions.
Developed and tested the drop-bin behavior, allowing the robot to place a grasped object into a bin.
Helped integrate the voice command pipeline with the main control system for sequential task execution.
Sophia Huang
YOLOv8 setup and testing
Integration of pipeline YOLO → exploration
Bin detection alternative using color masking (backup plan, not used in final)
Exploration alternative method (backup plan, not used in final)
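The color-masking backup can be illustrated with a minimal NumPy sketch. The thresholds and function name here are hypothetical, and a real version would more likely convert to HSV with OpenCV before masking; this version just thresholds RGB channels and returns the centroid of the masked region.

```python
import numpy as np

def detect_bin_by_color(rgb_image, lower=(0, 0, 150), upper=(80, 80, 255)):
    """Toy color-mask detector (a sketch, not the team's exact code).

    Keep pixels whose R, G, B values all fall within [lower, upper]
    (defaults pick out strongly blue pixels), then return the centroid
    (row, col) of the mask, or None if nothing matches.
    """
    lower, upper = np.array(lower), np.array(upper)
    mask = np.all((rgb_image >= lower) & (rgb_image <= upper), axis=-1)
    if not mask.any():
        return None
    rows, cols = np.nonzero(mask)
    return (int(rows.mean()), int(cols.mean()))
```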
Michaël Dooley
Full system integrator (brought all of the parts together)
Revising parts for functionality
Sim tester (getting sim working, providing laptop for sim work to teammates)
System/lab tester (in charge of deploying + testing code on real robot)
Was at all but one lab section for testing + implementation on the real robot
Successful GraspNet codebase implementation + GPD integration (not used on real robot)
Built final grasping approach (min-width sweep + coordinate conversion)
Facial recognition stack/implementation
Codebase restructuring + refactoring
Properly scoped the tasks, sub-tasks, and tech stack, which allowed us to be the first group to finish without a late-night/early-morning session