Our initial ideas for utilizing the Stretch to help mobility impaired individuals were:
Home fall hazard detection for stroke survivors: Deploy the Stretch to detect and remove objects that could be fall hazards. The Stretch would roam walkways and use computer vision to identify objects that could cause a fall, then move them out of the way.
Oral healthcare: Use the Stretch's arm to assist a user with brushing their teeth. The Stretch could help the user maintain their oral healthcare by assisting with applying toothpaste to a toothbrush, then holding the (electric) brush up for the user to use. It could then grab water for the user to rinse their mouth with.
Posture/lumbar support for those with Ankylosing spondylitis (AS): AS is an inflammatory disease that can cause vertebrae in the spine to fuse. The Stretch would use its camera sensors to detect when the user is slumped over, alert them, and perhaps help them adjust their posture. The Stretch could additionally bring lumbar support pillows to the user.
Household chores: Utilize the Stretch's gripper to perform basic household chores for the user, such as folding laundry or sweeping the floor. This would not only support the primary user's independence but could also help alleviate the workload of any human caregivers.
After experimenting with the Stretch, we decided to focus on home fall hazard detection for stroke survivors and posture/lumbar support for AS.
Our first idea is targeted at stroke survivors, who are often at elevated risk of falling. 7% of stroke sufferers fall within a week of their stroke, and 73% fall within one year. This is due to several reasons: strokes can cause weakness or paralysis on one side of the body, can lead to loss of sensation in the feet or legs, and can cause vision disturbances or trouble with concentration. It is important to prevent falls to avoid further injury to the stroke survivor, which may affect the prognosis and level of independence that the stroke survivor can achieve. One exacerbating factor of falls during stroke recovery at home is the presence of fall hazards such as clutter, loose wires, or objects left in walkways. As they are recovering and struggling with balance, stroke survivors may not be able to keep up with the housekeeping and decluttering needed to avoid these fall hazards, or it may not be safe for them to perform these tasks given their balance issues.
As a solution to this problem, the Stretch mobile manipulator can be autonomously deployed to look around common walkways in the home at a time when it is not obtrusive (for example, every night or whenever the person recovering is resting or away for an appointment). It can then use its cameras to identify objects in the middle of walkways in the home and place them in a predetermined location such as a basket. It can also clear walkways in other ways, such as pushing chairs back into place. The Stretch robot is equipped with an adaptive and capable gripper to pick up common household objects, enough degrees of freedom to reach and grab objects on the floor and move them elsewhere, and two cameras on its sensor head to scan the environment.
In order to engage users with this project, we intend to contact people who have recovered from a stroke (or are/were at elevated risk of falling due to other reasons) who are living alone or lived alone during their recovery period; for example, our elderly family members. We also plan to ask the course staff to connect us with people with this experience, if they know any. We plan to ask them how they kept their home clear, what they felt was unsafe about recovering at home, and whether they would benefit from our proposed solution. As we continue to iterate on our project, we plan to touch base with them about design decisions that we've made to see if they are appropriate--does the Stretch fit in their home environment? Would they have felt comfortable having it in their home while recovering?
Our second idea is targeted at individuals suffering from ankylosing spondylitis, an autoimmune condition that can cause inflammation in the joints and spine, and can lead to vertebrae fusing together. In order to prevent the spine from fusing in a hunched over posture, it is important that individuals maintain proper upright posture, so that even if bone does form between the vertebrae the spine will be in its natural, most comfortable position. Remembering to always utilize proper posture is not something that happens overnight; from personal experience, Jeffrey, who has rheumatoid arthritis, often slouches over when sitting, despite his rheumatologist repeatedly emphasizing the importance of sitting upright.
The Stretch mobile manipulator could be used to follow the individual around while in their home, with its camera watching them and monitoring their posture. When the user begins to slouch, the Stretch robot could alert the user, via a reminder from its built-in speaker. In addition, the gripper could bring a lumbar support cushion with it, which the user can use when sitting in a backed chair. When the user moves from the chair, the robot can pick up the support cushion and carry it as it follows the user around the house, freeing up the user's hands.
To find and engage users for this project, we will reach out to people with ankylosing spondylitis--we do not know anyone with this condition, so we plan to look into online groups for recruiting as well as ask the course staff. We will ask them about their experiences with maintaining upright posture, and what resources they have used as reminders. Proper posture is beneficial to people without ankylosing spondylitis as well, so for general user research on the camera interface and reminder system, we can also reach out to a broader base.
One of the biggest challenges we faced when learning ROS2 was understanding the differences between the three major message exchange modes: topics, services, and parameters. At first, we struggled to understand the distinct use cases for each of these strategies (why not just use a topic for everything?). However, as we progressed through the tutorials and saw how the different paradigms were used within TurtleSim (for example, topics being used for continuous velocity information and services being used for distinct rotation requests), we started to understand the benefits and drawbacks of each. While we're looking forward to gaining more experience driving the Stretch with ROS2, we have a good foundation of understanding from the beginner tutorials to get started.
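To make the distinction concrete, here is a minimal rclpy sketch (illustrative only, not code from our project) that streams velocity commands over a TurtleSim topic while issuing a one-off rotation request through a TurtleSim service:

```python
# Minimal rclpy sketch contrasting topics (streaming data) with services
# (discrete request/response), using the TurtleSim interfaces from the tutorial.
import rclpy
from rclpy.node import Node
from geometry_msgs.msg import Twist
from turtlesim.srv import TeleportRelative


class TopicVsService(Node):
    def __init__(self):
        super().__init__('topic_vs_service_demo')
        # Topic: fire-and-forget messages, published continuously.
        self.cmd_pub = self.create_publisher(Twist, '/turtle1/cmd_vel', 10)
        self.timer = self.create_timer(0.1, self.publish_velocity)
        # Service: a one-off request that produces a single response.
        self.teleport = self.create_client(TeleportRelative, '/turtle1/teleport_relative')

    def publish_velocity(self):
        msg = Twist()
        msg.linear.x = 1.0  # keep the turtle moving forward
        self.cmd_pub.publish(msg)

    def rotate_once(self, angle: float):
        self.teleport.wait_for_service()
        request = TeleportRelative.Request()
        request.linear = 0.0
        request.angular = angle
        return self.teleport.call_async(request)  # resolves when the service responds


def main():
    rclpy.init()
    node = TopicVsService()
    node.rotate_once(1.57)  # a single, discrete rotation request
    rclpy.spin(node)


if __name__ == '__main__':
    main()
```

The topic keeps streaming data whether or not anyone acts on it, while the service call waits for a single response, which is why continuous motion fits topics and discrete commands fit services.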
After getting hands-on experience with the Stretch 3, we realized that fine-tuning its controls to perform tasks requiring high dexterity and precision was a lot harder than expected. Initially, when we were controlling the Stretch via the Mujoco simulation, we found it difficult to orient its arm the way we wanted relative to the other simulation objects. This difficulty came in part from unfamiliarity with the controls, but also from the limited viewing angles available. When operating the Stretch in real life, being able to move around to get multiple points of view (much like in crane games) made it far easier to pick up objects. We haven't experienced remotely teleoperating the Stretch using its camera system yet, but similar troubles to the simulation are likely to arise. Getting experience in controlling the Stretch has both limited and broadened our view of what our project focus should be. Based on the contents of the safety tutorial and our difficulty controlling the robot, it would be ill-advised to use the Stretch to perform tasks near delicate parts of the body. We had initially considered using the Stretch for oral healthcare, but this may not be the path we choose to pursue anymore.
Featured: the Stretch 3 picking up a small pink crab plush toy off the floor and lifting it about 1 foot off the ground. This reassured us that small-object manipulation, as well as retrieving items from the floor, is feasible using the Stretch 3.
Featured: In order to learn ROS2, we followed the beginner tutorial online, which guided us through ROS2 fundamentals using a program called TurtleSim. Our ability to control the turtle and examine ROS2 nodes and message exchange modes is shown.
Before operating the real Stretch 3, we practiced picking up objects in-simulation using Stretch Mujoco.
The Stretch Mujoco simulation offered keyboard controls with similar functionality to the controller we used to teleoperate the Stretch 3.
Teleoperating the Stretch 3 was a great first step in testing the limits and functionality of the robot in the real world.
Teamwork will be crucial as we embark on this design journey. Using guidance from our instruction team, we assigned roles to the team in a way that allows each of us to touch both the technical aspect of the project as well as the more human-oriented and organizational aspects. You can see our roles on our About page, but we'll go into more detail about them here.
Responsibilities relating to technical roles are more long-term. As hardware lead, Naama is expected to read up on the technical specifications of the Stretch 3 and advise the rest of the team about what is feasible and not feasible with its components. If additional hardware is added to the Stretch 3 or the environment, Naama is expected to gain experience with the tools required for their construction and maintenance. As perception lead, Karen is expected to find vision-related libraries for use on the Stretch 3 and learn the basics of how to use them so that she can explain them to the rest of the team. She will be responsible for defining the major vision tasks for the project. As design and fabrication lead, Diana is expected to gain familiarity with software and tools useful for fabrication (such as CAD software and 3D printing). If we decide that the Stretch 3 requires attachments or the robot's environment needs modification, Diana will be responsible for leading the design, construction, and maintenance of these components. As ROS2 lead, Jeffrey is expected to have an understanding of ROS2 constructs so that he can advise the team on the structure and overall design of our code. He is expected to be able to troubleshoot common ROS2 bugs and consult documentation as appropriate.
Human-oriented responsibilities have more weekly goals. As manager, Naama is responsible for ensuring channels of communication between the team are open and that everyone feels clear on their responsibilities for the week. To do this, she will create a weekly task checklist for the team and, when appropriate and with input from the rest of the members, designate a leader for each task. Over the quarter, she will ensure that the team keeps its goals in mind by posting reminders and evaluating the progress of the project. As the documentation and communications lead, Diana will record the team's weekly progress, finalize blog posts, and create documentation to guide both developers and users on our final product. As the user research lead, Jeffrey will find and communicate with our users, scheduling meetings with them and preparing materials for these sessions. He will collect and analyze data from the users. As the user interface lead, Karen will take charge of creating a human-friendly and usable interface for our users. She will work with Jeffrey to ensure a good interaction experience for our users.
Even though we all have separate roles, we all hope to learn some common skills from this class. We will regularly rotate the "driver's seat" both in terms of group programming and driving/operating the Stretch. This way, everyone gets experience with programming in ROS2 and familiarizing themselves with the Stretch 3's hardware--the leads in these areas are expected to help out if a non-lead member gets stuck, but we should all understand what is going on. Every group member must also contribute to the draft of the weekly blog posts on the site. This allows for a more even spread of documentation work and ensures everyone is up to speed on our progress. Everyone is responsible for updating the rest of the team about their progress every week--communication is key, and we will not ghost 👻 each other. When making any big design decisions or changing the scope of the project, everyone will give their input and agreement.
After discussing our ideas with the course staff and doing more brainstorming, we realized that our project's direction is shifting. We were all drawn to the way that assistive robotics could connect people socially; therefore, we decided to design a robot to help people with hand dexterity and mobility challenges play tabletop tile games. As a first game, we've decided to focus on Scrabble. The posts below reflect the work we've done on this new project so far.
With our new direction in mind, we've created a sketch of our robot-based solution, shown below. It depicts the Stretch with adapted grippers to grip Scrabble tiles. The Stretch is placed adjacent to the table with the Scrabble board on it, with the top camera being used to see the board. The arm is positioned over the board so that the Stretch can place tiles on the board.
We also created a storyboard depicting why our users would want to utilize our robot:
We asked Dylan, one of our users, what he thought about the sketch and storyboard and how we envision the Stretch being used. He had this to say:
"I really love how you are envisioning Stretch being used! I might be optimistic but I believe that Stretch should be able to do everything when it comes to Scrabble."
We've started a literature review pertaining to our project. During this phase of the review, we've gathered up relevant papers and products that may have implications for our robot-based solution. Our work is collected in the spreadsheet below:
We spent some time figuring out the mechanics of picking up a tile. Though the Stretch was able to successfully pick the tiles up, it was difficult to consistently place a tile down with the letter facing up. We decided to modify the grippers by attaching a pair of chopsticks, with rubber bands tied on the tips, to the gripper. We found that the flat chopstick ends, along with the friction of the rubber bands, were much more suitable than the round gripper cups for stably picking up the tiles. They also added the utility of flipping a tile over if it was facing the wrong way. To make the robot more appealing to both the user and the designers, we added a chef hat as well. These adjustments are temporary--we plan to fabricate more replicable and sturdy adjustments for future iterations. The chef hat, however, is here to stay.
Stretch grabbing a tile
Stretch with adaptive grippers
Once we made the necessary adjustments to pick up tiles, we teleoperated the robot to test them out. We were able to consistently pick up tiles and place them next to each other to spell a word. A video of these adjustments in action is to the left.
We then set about automating the pickup action and creating a user interface to control the robot. Following some examples online and a template provided by the course staff, we were able to create a simple web-based connection to the robot with buttons that allow the user to pick up and put down a tile. We demonstrated our success in picking up and putting down a tile with this "version 0" of the user interface.
In the future, we envision a simple user interface that allows the user to view the board through the camera, overlaid with grid marks to indicate position; the tiles in their hand; and elements that allow them to click and move the tiles from their hand onto the virtual board. On the click of the "Go" button, the robot then autonomously places the specified tiles onto the board. When instructed to, it can also draw another tile to end the turn.
We used our V0 interface to look at the Stretch's head camera and pick up a single tile. This version of the interface is quite simple, but it reflects our intention to create an interface with a small set of buttons that result in a series of robot actions. Our future designs will be more readable and visually appealing.
The lo-fi sketch of our envisioned interface. Our goal for this design is to be as simple as possible so that it requires very little hand dexterity. The user can move the camera around the board to look at the state of the game, use the "play turn" button to place tiles from their hand onto the board, then either confirm or cancel their play. They can also draw a tile from the center.
We created this short speculative video to share with our users that depicts how the robot can assist them. In the video, the robot is teleoperated using a custom backend interface without use of perception--in the future, this will not be the case. The user interface is also simulated for the time being. We hope you enjoy our filming and acting skills!
Based on the video, we asked Dylan which game actions he would prefer the Stretch to do, as well as whether our user interface prototype addresses his needs:
"I would like to try and do everything. It may end up not being possible to do so but I would like to try [...] I think that current interface should work just fine!"
As we start making this project a reality, we sat down to create a list of the robot's technical capabilities as well as the environmental modifications needed to support the robot-based solution. In order to assist in playing a game of Scrabble, the robot needs to do the following:
Board recognition: the robot needs to be able to recognize the board itself, as well as the grid spaces on the actual board. The board will hopefully remain static, so there is no temporal element relating to this.
Tile recognition: the robot needs to be able to recognize the tile object, and to identify the letter on the actual tile, both on the board and in the user's hand. It needs to be able to recognize the letter on the tile even when the tile is at an unusual orientation (i.e., upside down or sideways).
Tile holder recognition: the robot needs to be able to identify the user's tile holder as opposed to their opponents' holders. It needs to be able to know the orientation of the holder and where there is room for tiles to be placed.
Collision awareness: the robot should be able to recognize if a foreign object (e.g. another hand) enters into its path of motion, so it should not hit anything or anyone in the course of its motion.
Overall, perception needs to be fairly precise. The actual letter size is around 1 cm x 1 cm, and the robot will need to be able to detect the grid space accurately enough to place tiles neatly in their position.
Pick up tiles from a holder: this requires grabbing the top of a tile and lifting it in a way that does not interfere with other tiles in the holder.
Draw upside-down tiles and place them in a holder: this requires rotating the tile so that its letter is not visible to opponents, then placing it so that it faces correctly in the holder.
Play a tile on the board in a specified location: the robot will need to be able to hold onto the tiles from either pair of opposing sides; if the robot is trying to play a word vertically, then it needs to grab the left and right sides of the tile, and if it is trying to play a word horizontally, then it needs to grab the top and bottom sides of the tile.
The robot will need to be able to position its gripper over a precise location on the board grid in order to place tiles.
The game area of the board is 21 cm x 31.5 cm. Each space on the grid is 1.8 cm x 2 cm. The robot should successfully be able to navigate within this designated space.
The robot's base will be moving slightly to align the arm with different positions of the board. It will not be rotating--only moving back and forth along the side of a table. The arm, wrist, and gripper will be moving constantly to position tiles appropriately.
The robot will need to be able to determine the state of the board and the user's hand.
The user will need to be able to communicate what word they would like to play on their turn and in what location.
Once the word placement has been completed, the robot should report whether it was successful or not.
The robot should be supervised when moving tiles to be placed on the board in case its collision detection algorithms fail. In case of that happening, the robot should be instructed to stop its current motion and lift its gripper from the board.
Thankfully, the risk of injury is fairly low. The most common issue we anticipate is the robot jostling the board or other tile holders.
If the robot gets lost, it or the board can be moved so that it is in the camera frame.
We will design a tile holder that is wider and has a grippy base. This will provide more spacing between the tiles in the user's hand, making it easier for the robot to grab a single tile without disturbing the others. Adding more friction to the tile holder ensures the robot will not accidentally move it.
To help with the computer vision aspect, we may consider adding ArUco markers to the corners of the board.
Our project will require a lot of perception work--this week, we spent time in the lab working with RViz and ArUco markers to test the capabilities of the robot's vision system.
We first spent some time exploring the options within RViz--we were able to see the point cloud representation of the camera feed, Stretch's transform tree, and the raw camera footage. We explored deep perception, which allowed the Stretch to perform object detection and classification. Lastly, we placed ArUco markers in the environment and successfully tracked them as they moved. We plan to potentially use an ArUco marker to identify a corner of the game board and another to identify the assisted player's tile holder.
Pictured to the left is a rough idea of how we might use ArUco markers in our project. In the final design, the markers will be securely attached and oriented in such a way that they are always fully visible to the robot's cameras.
(1) The robot's point cloud representation of the board on the table and (2) the deep perception algorithm detecting the board itself.
As we moved the ArUco marker across the table, the RViz interface updated with the current position of the marker relative to the robot.
Board recognition: the robot needs to be able to determine the corners of the board in order to determine the location of the board, as well as to unwarp the perspective of the image to get a top-down view of the board (to make tile recognition more consistent). Since the board should not move over the course of gameplay, theoretically this should only need to be done once, though calculating this information multiple times may make the program more resistant to error.
Minimum viable solution
We can modify the environment for board recognition by adding one ArUco marker at each corner of the gameplay area of the board. Built-in libraries for detecting ArUco markers will thus allow the robot to determine the board's position. We would need to ensure that the entire board is in the camera frame without obstruction, and that specific markers always correspond to specific corners. However, these constraints are quite feasible to satisfy, so this is a reasonable solution that should also yield high accuracy (a rough sketch of this approach is shown below).
We can let a user draw a bounding box around the board on the user interface. While this again should be accurate, this seems like a tedious task for the user or caregiver to perform, and is not really a viable solution.
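As a rough sketch of the ArUco-based approach (the marker IDs, dictionary, and output resolution are placeholders we have not finalized, and the detector API shown assumes OpenCV 4.7+), the detection and perspective unwarp could look something like this:

```python
# Rough sketch: detect one ArUco marker per board corner, then unwarp to a
# top-down view. Requires opencv-contrib-python; marker IDs are placeholders.
import cv2
import numpy as np

CORNER_IDS = [0, 1, 2, 3]  # hypothetical IDs: top-left, top-right, bottom-right, bottom-left
BOARD_PX = (630, 420)      # output size in pixels (board is 31.5 cm x 21 cm at 20 px/cm)


def unwarp_board(image):
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    corners, ids, _ = detector.detectMarkers(image)
    if ids is None or not set(CORNER_IDS).issubset(set(ids.flatten())):
        return None  # not all four corner markers are visible

    # Use each marker's center as the corresponding board corner.
    centers = {int(i): c.reshape(4, 2).mean(axis=0) for i, c in zip(ids.flatten(), corners)}
    src = np.float32([centers[i] for i in CORNER_IDS])
    dst = np.float32([[0, 0], [BOARD_PX[0], 0], [BOARD_PX[0], BOARD_PX[1]], [0, BOARD_PX[1]]])

    # Homography mapping the camera view onto a flat, top-down board image.
    H = cv2.getPerspectiveTransform(src, dst)
    return cv2.warpPerspective(image, H, BOARD_PX)
```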
General solution:
The classic Scrabble board has a thick red border around the playing area, as well as brightly-colored special squares in the playing area (red squares on the corners and middle of the edges, yellow squares on the diagonals and center, and blue squares on inner triangles). We can use these differences in color to determine where the corners of the board lie, and use the special tile colors to check this work. Non-classic boards may not use these exact colors, but tend to follow the same pattern of coloring special tiles differently, so an even more general solution would be based on determining the board position and corners through the special tiles only. We can use the OpenCV library to perform segmentation to recognize the boundary of the border as well as the tile positions within the board.
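If we go down this route, a sketch of the color-based detection might look like the following (the HSV thresholds are rough guesses that would need tuning for real lighting conditions):

```python
# Sketch of the color-based fallback: mask the red border of a classic board in
# HSV space and take the largest contour as the playing area. Thresholds are
# rough guesses, not tuned values.
import cv2
import numpy as np


def find_board_by_border(image):
    hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
    # Red wraps around hue 0, so combine two hue ranges.
    mask = cv2.inRange(hsv, (0, 100, 80), (10, 255, 255)) | \
           cv2.inRange(hsv, (170, 100, 80), (180, 255, 255))
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    border = max(contours, key=cv2.contourArea)
    # Approximate the border with a quadrilateral to get the four board corners.
    approx = cv2.approxPolyDP(border, 0.02 * cv2.arcLength(border, True), True)
    return approx if len(approx) == 4 else cv2.boundingRect(border)
```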
Tile holder recognition: the robot needs to be able to determine the position of the tile holder so it can play tiles from the user's hand onto the board.
Minimum viable solution:
We can modify the environment to also have one (or more) ArUco marker(s) to indicate the position of the tile holder. To make image processing easier, it should also be viewable in the same frame as the board. Again, this is a feasible solution that yields high accuracy.
Similarly, we could let a user draw a bounding box around the tile holder on the user interface. Again, while this should be accurate, it is still a tedious task for the user or caregiver to perform, and is not really a viable solution.
General solution:
The general solution would be to train a model to recognize the shape of the tile holder, from various angles and when there are various numbers of tiles on it. We would therefore need to collect a large number of images (preferably with the Stretch's overhead camera) with these different variations in different environments for training and testing. We would then train a model to recognize the tile holder. One approach to this would be through an object detection model such as YOLO.
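For reference, a training run with an off-the-shelf detection package like ultralytics could be as short as the sketch below (the dataset config file is hypothetical, since we have not collected this data):

```python
# Sketch of training an object detector to find the tile holder. The dataset
# config and image are placeholders; we would supply our own labeled images.
from ultralytics import YOLO

model = YOLO('yolov8n.pt')                    # start from a small pretrained model
model.train(data='tile_holder_dataset.yaml',  # hypothetical dataset config
            epochs=50, imgsz=640)
results = model.predict('overhead_view.jpg')  # detect the holder in a new image
```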
Tile recognition: the robot needs to be able to recognize the tiles in order for the user to be able to indicate which tiles they want to move, as well as to display the state of the game nicely for the user.
Minimum viable solution:
At a minimum, the robot does not need to recognize the actual letters on the tiles, but just the tiles themselves. Given that tiles are expected to be in specific positions relative to the board or to the tile holder, the user would only need to indicate which tile to draw from their hand (perhaps with a number 1–7), and place it on the board at a certain location (input a board coordinate). This should be fairly feasible—it would just involve some math to determine positions of letters (and may need to be calibrated if the user has a board that is something other than the classic Scrabble game size).
Another solution would be to custom-create tiles that have markers distinguishable by the robot while still being readable by humans. It is hard to find a feasible option that does not disrupt the game flow--this would almost certainly result in needing larger tiles, which in turn results in needing a larger board, which is harder to read and to keep fully in frame. However, this would be a highly accurate way to determine the identity of tiles.
General solution:
The robot would be able to recognize the letters on the tile, so the user can refer to a tile by its corresponding letter. (Presumably, the user would input the entire word and location on the board, or letters one at a time, and the robot would be able to autonomously select the indicated tile(s) and place them on the board.) This involves optical character recognition. We would need training data of various letters in their processed state (unwarped from the board recognition step and likely thresholded to isolate the black text of the characters). This is generally done with deep learning techniques such as convolutional neural networks, using a library such as TensorFlow. The location of the tile can be determined by the center of the letter.
Special consideration would need to be taken in for the blank tile. To solve this problem, it may be easiest to modify the environment by drawing an asterisk on the blank tile and training the model with this modification.
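As a rough sketch of what this general solution might look like (layer sizes and input resolution are placeholders, not a final architecture), a small Keras CNN with 27 classes--A through Z plus the marked blank tile--could be defined like this:

```python
# Minimal Keras sketch of the letter classifier: a small CNN over cropped,
# thresholded tile images. The architecture and input size are illustrative.
import tensorflow as tf

NUM_CLASSES = 27          # A-Z plus the marked blank tile
TILE_SHAPE = (32, 32, 1)  # grayscale crops around each detected contour


def build_letter_classifier():
    return tf.keras.Sequential([
        tf.keras.layers.Input(shape=TILE_SHAPE),
        tf.keras.layers.Conv2D(16, 3, activation='relu'),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(32, 3, activation='relu'),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(64, activation='relu'),
        tf.keras.layers.Dense(NUM_CLASSES, activation='softmax'),
    ])


model = build_letter_classifier()
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
# model.fit(train_images, train_labels, epochs=10, validation_split=0.1)
```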
We removed the idea for collision awareness. While this is a nice feature to have, it is not strictly necessary to implement. It is reasonable to assume that nobody else will have their hands in the playing area on the Stretch user's turn, and that the path from tile to board will remain unobstructed during the course of the user's turn. Additionally, the robot should not make any sudden movements, but will move at a fairly natural rate, meaning it should not do much damage even if it does hit another object.
This week, we started work on a programming by demonstration tool.
The first video demonstrates our ability to save and reload a relevant position to our project. The pose we demonstrate in the video attached is a "stow" pose needed for the Stretch's camera to get an unobstructed view of the board. We demonstrate saving the stow pose, moving the robot out of the stow pose (what we would need to do to pick up a tile), then using the load feature to move the robot back into the stow position.
The second video shows us teleoperating the robot into a sequence of poses, then replaying all of those poses. The last section of the video shows a closer look at the interface where poses can be added, deleted, and cleared; each individual pose can be replayed individually using its button or given a specific point in the replay sequence (including none) using the dropdown menus.
Our next steps for this tool are to gain more control over the motion planning so that we don't accidentally bump into objects (as can be seen in the first video when the gripper attachments brushed against the table--don't worry, they're fine!).
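For a sense of how the tool works under the hood, here is a simplified sketch of the save/replay bookkeeping (the topic name follows the Stretch driver's defaults, but treat the details as illustrative rather than our exact implementation):

```python
# Simplified sketch of the pose save/replay idea: capture the latest joint
# positions from /joint_states under a name, keep an ordered sequence, and
# replay it later through the driver.
import rclpy
from rclpy.node import Node
from sensor_msgs.msg import JointState


class PoseRecorder(Node):
    def __init__(self):
        super().__init__('pose_recorder')
        self.latest = {}    # joint name -> position, from the most recent message
        self.saved = {}     # pose name -> {joint: position}
        self.sequence = []  # ordered pose names for replay
        self.create_subscription(JointState, '/joint_states', self.on_joint_state, 10)

    def on_joint_state(self, msg: JointState):
        self.latest = dict(zip(msg.name, msg.position))

    def save_pose(self, name: str):
        """Store the current joint positions under a human-readable name (e.g. 'stow')."""
        self.saved[name] = dict(self.latest)
        self.sequence.append(name)

    def replay(self, send_goal):
        # `send_goal` would wrap a FollowJointTrajectory action call to the driver.
        for name in self.sequence:
            send_goal(self.saved[name])
```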
We continued our literature review this week, briefly summarizing all of the papers we looked at for the last review post. Due to increased workload this week, this literature review is a work in progress. Here is what we have so far:
This week, we were able to meet with V Nguyen, OT Clinical Research Lead at Hello Robot, who graciously gave us some helpful pointers for how to proceed with our project. Here is a summary of our discussions and the lessons we learned:
The end-effector prototype
We asked V about our work-in-progress gripper and whether she had any insights on how to improve the prototype.
Overall, we aim to create a consistent attachment so that we can correctly calibrate the Stretch to account for it in its movements.
V had the idea of using foam to prevent jostling of our gripper attachment! This would solve the issue of the attachments being slightly wiggly and vulnerable to jostling.
The user interface
We had a few considerations in mind when refining our user interface:
Placement of buttons: is it preferable to place the buttons close together (so that they are easy to navigate between) or farther apart (so that they are easy to distinguish)?
Level of automation: what is the right balance between automation and user independence? We wanted to make sure that the robot allows the user to fully participate in an in-person game, and we also want to respect that users may want more teleoperation so that they participate on the same level as non-assisted players.
V let us know about how different users might have different interface needs that we should be aware of:
For example, some users may be low-vision and thus require a zoom-in option.
Users may also use a joystick, which has different easy/difficult movements.
She also had the idea of using categories within the interface to save screen space: for example, placing the board and hand in a separate category from the gameplay buttons so that each can take up more of the screen.
We concluded that we can make some progress on our interface prototype, but we'll hold off on a few key decisions until we speak to our users.
Including non-assisted players in the loop
V brought up that we should be more intentional about how we plan to bring the other, non-assisted players into the gameplay loop:
Tile drawing: how might the other players be coached about where to place drawn tiles? If/when we fabricate a new holder, how can the other players know where to place tiles so that they can be easily grabbed by the robot, while at the same time not cheating by seeing the letter on the tile?
We also had a different idea for how tiles can be drawn: non-assisted users can bring a face-down tile to a specific location identified by a marker, then the robot can grab the tile from that location.
Adjusting the board: Currently, the game board is taped down, and the play environment in general is very controlled. When working with the other players, we would need to alert them of the necessary setup to play the game with the Stretch in a way that does not disturb motion or vision.
Overall, including the non-assisted players helps us to offset some of the robot's tasks to humans who are more suited to do them. The important thing is to preserve the independence of the assisted player, which is really what the robot is there for.
Connecting with potential users of this project
V agreed to send out a call for potential users she is connected with and to get back to us ASAP. She's looking to connect us with users of different ages and levels of impairment, which will allow us to gain some diverse perspectives.
Thank you again to V for offering your time and expertise, and thank you Maya for connecting us! We're looking forward to implementing this feedback in our project.
We drafted a formal project proposal to describe all of the work we have done so far as well as where the project is headed.
A video of our servoing in action is above. We based this movement off of Hello Robot's Align to ArUco tutorial; however, this tutorial was not functional out of the gate. We had to make significant modifications in order to exhibit the above behavior, eventually opting to forgo the tutorial entirely and do our own math to calculate the required transform.
In this clip, the Stretch uses its camera and the ArUco marker detector node to identify the large marker (ID 131) on the table. It then rotates until it is parallel with the marker, then "parallel parks" by turning to the side, moving forward, and re-aligning.
We are exploring how the friction of the carpet in our lab room affects the turning of the robot; currently, while the friction is slightly disruptive, we are able to correct for it by realigning after the parallel parking routine.
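For the curious, here is a simplified sketch of the kind of math we ended up doing by hand (the frame conventions and 40 cm standoff are illustrative; the real node gets the marker pose relative to the base via tf2 from the ArUco detection node):

```python
# Simplified alignment math: given the marker's pose in the robot's base frame,
# compute how far to rotate so the base is parallel to the marker, plus the
# remaining sideways/forward errors to reach a fixed standoff.
import math
import numpy as np

DESIRED_MARKER_POSITION = np.array([0.0, -0.4])  # where we want the marker in the base frame (m)


def quaternion_yaw(x, y, z, w):
    """Yaw (rotation about z) extracted from a quaternion."""
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


def alignment_commands(marker_xy, marker_quat):
    """Return (rotation to become parallel, lateral error, forward error)."""
    yaw = quaternion_yaw(*marker_quat)
    rotate_by = yaw  # turning the base by this much makes it parallel to the marker

    # Express the remaining position error in the frame the base will have after
    # rotating, so the two components map onto "drive sideways" and "drive forward".
    c, s = math.cos(-yaw), math.sin(-yaw)
    error = np.array([[c, -s], [s, c]]) @ (np.asarray(marker_xy) - DESIRED_MARKER_POSITION)
    lateral_error, forward_error = error
    return rotate_by, lateral_error, forward_error
```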
This week, we took stock of the navigation and servoing plans for our project. We decided on the following goals:
Navigation: the robot does not need to navigate to the game board; we assume that it can be moved or teleoperated to a position that is "close enough": we define "close enough" as a position such that the robot's camera can see and identify the ArUco markers used for servoing.
We believe this is appropriate since the robot will need to move very little over the course of a Scrabble game--the only base movement that is needed to implement this behavior is moving backwards and forwards by centimeters in order to reposition the arm over a column of the board.
Servoing: At the beginning of the game, the robot must align itself with the game board so that its base is parallel to the board's base.
We will use ArUco marker detection to determine the edges of the game board and program the Stretch to place itself at a fixed offset from those markers. The relevant markers are those denoting the bottom left and bottom right corners of the board (we used marker IDs 2 and 3).
This routine will run once at the beginning of the game after the robot is brought to the table. If the robot is jostled or the board is moved, this routine can be run again, but it must be initiated by a human in the loop.
We believe this is appropriate since the robot does not move much over the course of the game; the only movements of the base are slight centimeter adjustments to align the robot with columns of the board. However, these movements must not misalign the row that the robot's arm is placed over, so we must ensure the base is properly parallel.
We decided not to have the routine automatically run for the sake of simplicity; we believe that keeping the board steady is a reasonable accommodation on behalf of the non-assisted players for use of the robot.
We created the above diagram to document the structure of our code.
Key:
Blue: node
Red: non-node interactors
Orange: fiducial
Purple: device
Single-sided arrow: topic
Double-sided solid arrow: service
Double-sided dotted arrow: action
One notable omission is a set of distinct service requests for moving the arm over a specific tile position. We plan to implement this functionality using the follow_joint_trajectory action within the user interface logic, with target joint positions informed by the vision and alignment nodes.
Note that there are connections between nodes that we did not create, e.g. between the camera node and the detect_aruco_markers node. We omitted these from the diagram, as we did not modify or interact with those links in any way.
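As a sketch of what one of those arm moves could look like through the follow_joint_trajectory action (the action name follows the Stretch ROS 2 driver's convention, and the joint values here are placeholders):

```python
# Sketch: send a single "move the arm over a tile position" request as a
# FollowJointTrajectory goal. Joint/action names follow the Stretch driver's
# conventions; treat the exact values as illustrative.
import rclpy
from rclpy.action import ActionClient
from rclpy.node import Node
from control_msgs.action import FollowJointTrajectory
from trajectory_msgs.msg import JointTrajectoryPoint
from builtin_interfaces.msg import Duration


class ArmMover(Node):
    def __init__(self):
        super().__init__('arm_mover')
        self.client = ActionClient(
            self, FollowJointTrajectory, '/stretch_controller/follow_joint_trajectory')

    def move_over_tile(self, lift_m, extension_m, base_translation_m):
        goal = FollowJointTrajectory.Goal()
        goal.trajectory.joint_names = ['joint_lift', 'wrist_extension', 'translate_mobile_base']
        point = JointTrajectoryPoint()
        point.positions = [lift_m, extension_m, base_translation_m]
        point.time_from_start = Duration(sec=3)
        goal.trajectory.points = [point]
        self.client.wait_for_server()
        return self.client.send_goal_async(goal)
```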
We also came up with implementation responsibilities:
Nodes:
camera, stretch_driver, rosbridge_websocket, detect_aruco_markers: provided by Hello Robot and do not need additional implementation.
alignment_math: Jeffrey and Naama
vision_services: Karen
Non-node interactors:
User interface: Diana
Devices:
Hardware management: Naama
Fiducials:
ArUco markers: Naama
Game board: already provided
Most of the diagram constitutes the minimum viable product for our project. The parts that constitute stretch goals will be implemented within the alignment_math node. As a stretch goal, we would implement additional functionality to auto-align the Stretch to the board when it detects drift instead of having a human in the loop to trigger the realignment.
As we continue working on our project, it's important to consider the ethical implications of this work and of assistive robots in general.
Impact on the economy:
Assistive care robots could allow mobility-impaired individuals to join the workforce by making it possible for them to more readily interact with the world. Additionally, the advancement of this industry would create more jobs in design, manufacturing, quality control, development, and maintenance.
However, if assistive care robots ever become a feasible replacement for caregivers, they could potentially put people out of jobs. Given that there is a social aspect to caregiving, we hope that assistive care robots can help ease this burden on caregivers instead of taking job opportunities away from them. There is already a shortage of caregivers, and caregivers must often work long hours to support their patients: robots would hopefully help fill that gap instead of completely replacing people. However, the risk of eliminating human jobs, especially if robots become cheaper than hiring human labor, is high. We mustn't let profit incentives guide our choices when it comes to assistive care.
Impact on society:
Assistive robots are intended to give the users freedom to participate in society to the extent they choose. With more people integrating into society, more diverse perspectives would become available to the larger public. Hopefully, more people with a need for assistive care could spend time doing the things they love and sharing their work and ideas with society.
On the other hand, robot software is not immune to bugs and security vulnerabilities, and there are privacy concerns associated with the data that assistive robots would likely have to collect from their patient's home to properly perform their work. If an assistive care robot is hacked, potentially private and vulnerable information about their patients and their home could be leaked. This would harm the dignity and right to privacy of people using assistive care robots. Assistive care work is also very human work--a human caregiver not only addresses the physical needs of their patient but their emotional and social needs, too. It is important that we consider which areas of assistive care robots can perform and which will always be done by humans, lest we degrade the quality of care that people receive.
Impact on the environment:
One unexpected positive consequence that assistive care robots may have is the elimination of some plastic waste. Many food products that are accessible to people with disabilities (such as pre-cut vegetables or pre-cooked and peeled eggs) come with additional plastic waste that less accessible versions of the same products do not have. If robots can perform the additional processing needed to make cooking and eating accessible, perhaps the amount of plastic waste related to the purchase of accessible food items would decrease. Any decrease in plastic waste is welcome and helpful.
At the same time, assistive robots require complex electronics and precisely-machined parts. The environmental impact of manufacturing these robots (including the waste of factories, scrap materials, and other byproducts ending up in landfills) could harm nearby communities and the Earth. Similarly, many robots use batteries that, while rechargeable, have a limited lifespan. Battery waste is difficult to dispose of correctly without leaking harmful materials into the ground and water.
It's almost the end of the quarter! Here is what we've achieved up to this week. For each section, the lead team member is listed, but since all of these parts interact with one another, we communicated progress, brainstormed ideas, and contributed work together as a team.
We've been able to successfully incorporate board recognition and letter classification into our project! Given an image of the board on the table, the program is able to detect the ArUco markers at the corners of the board and unwarp the perspective such that a top-down view of the board is displayed. This will be fed to the UI to display a real-image view of the board for the user alongside a virtual display of the board.
For the virtual display, we need to be able to classify the letters in their proper place on the board. Taking the unwarped board, we perform binary inverse thresholding on the value channel (of HSV) to be able to find contours more easily. We pass a cropped image around these contours into the letter classifier. As a baseline, we've taken an existing open-source model trained on Scrabble tiles (using k-nearest neighbors) and adapted various configurations to our board. However, this model still often confuses letter pairs such as Q's and O's, as well as I's and J's, and often cannot distinguish non-letter contour noise from the actual letter contours. The dataset it is trained on appears to be unbalanced, with many more O samples than Q's, and uses a different font with much more distinctive I's and J's than our Scrabble tile set's letters, which may also be part of the issue. We therefore hope to train a better model using a convolutional neural network on a dataset of our tiles. We have gathered and labeled a dataset with almost 100 images of each letter, and will finish training the model this week!
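A condensed sketch of this tile-isolation step is below (the threshold and minimum contour size are illustrative values, not our tuned ones):

```python
# Sketch of the tile-isolation step: invert-threshold the value channel of the
# unwarped board image, find contours, and crop a patch around each one to feed
# to the letter classifier.
import cv2


def tile_crops(unwarped_board, thresh=120, pad=4):
    hsv = cv2.cvtColor(unwarped_board, cv2.COLOR_BGR2HSV)
    value = hsv[:, :, 2]
    # Letters are dark on light tiles, so an inverse binary threshold makes them white.
    _, mask = cv2.threshold(value, thresh, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

    crops = []
    for contour in contours:
        x, y, w, h = cv2.boundingRect(contour)
        if w * h < 50:  # skip tiny noise contours
            continue
        crop = unwarped_board[max(y - pad, 0):y + h + pad, max(x - pad, 0):x + w + pad]
        crops.append(((x, y, w, h), crop))
    return crops
```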
Further work includes similar recognition for the custom hand, detecting the position of the tiles within the hand, and making sure that the letter classification model is able to classify letters in the hand as well.
Board w/ perspective (un)warp
Board w/ thresholding applied & contours detected
Contours classified into letters
We've built a first version of the site with UI for gameplay and other actions using TypeScript and React. The UI features large buttons with multiple sub-menus to ensure that the UI is easily legible and easy to interact with. The user can see the state of the board and their hand as well as play tiles using the interface. We intend to add functionality to swap between the UI depiction of the board and an un-warped image of the board from the Stretch's camera. We've ensured that the UI can be accessed remotely (a special thanks to Henri Fung from Team 4 for setting up rosbridge remotely and allowing the other teams access!).
We've checked that the site is able to communicate with the Stretch and populated a series of configuration buttons that allow for manual teleoperation and adjustment in case any autonomous routines fail. Our next step is to implement the gameplay buttons (selecting and playing tiles), which are simply a series of custom Stretch commands we've already programmed. We have yet to incorporate communication with the computer vision server, but there is an option in the play menu for the user to manually input and remove tiles from the board display to override the vision system in case of an error. A video of our interface is below:
User Interface Demo
We finalized our custom tile holder and gripper attachment! The tile holder features a flat surface for picking up tiles and a lower vertical profile so that the Stretch's arm does not collide with it during normal gameplay. It also has an inset border layer for each tile slot that ensures the tiles cannot be pushed up against the edge of the holder, which would render them impossible to pick up. We are able to achieve consistent tile pickups from this holder; more on this in the robot motion section below. The gripper attachment features a snug fit on the fingers of the Stretch so that we can achieve a consistent attachment point every time (since other teams are not using these attachments, we must remove them regularly). The gripper's "fingers" extend downwards at the same angle as the original gripper cups of the Stretch. We use rubber bands to add friction and make tile pickup easier. We've also created wooden housings for the ArUco markers we intend to use for the board and holder. These ensure a 90-degree angle and consistent offset from the corners of the board and holder, allowing vision and robot alignment routines to rely on specific marker positions. Images of these components are below.
Another decision we made this week revolved around how tiles will be drawn. Due to the complexity of consistently flipping tiles with the Stretch, we found that it is simplest to have the unassisted users at the table draw tiles for the assisted player. To do this, we plan for all tiles in the draw pile to be face-down and visible (instead of in a bag). We then marked the top edge of every tile with a small piece of blue tape; a non-assisted player can then pick up a tile face-down, rotate it to the correct orientation, then place it in the tile holder of the assisted player, flipping it only when it is out of view.
Our custom holder alongside the game board. Note the ArUco marker housings used both for the holder and the board.
Our custom gripper attachment on top of the Stretch's fingers.
A marked tile. All tiles in the set are marked in the same way.
This week, we focused on using our finalized hardware components to make final calibrations for tile pickup and placement routines. We use ArUco marker detection to align parallel to the board and parallel park, but due to hardware limitations of the Stretch (including imperfect marker detection), this process requires some minute manual adjustments, buttons for which are on the user interface. Once the Stretch is aligned, we perform manual calibration of the board's corners and the tile holder--since a Scrabble board has a very small margin of error, this is necessary to avoid placing tiles ambiguously. This calibration needs to be performed once before the game starts; a stretch goal of the project is to be able to automate this calibration at least partially. Much of the calibration data is stored client-side; we intend to relocate this to a separate ROS node on the Stretch so that information can be preserved if the user disconnects.
Once the Stretch is calibrated, gameplay actions can be performed autonomously. The Stretch can be commanded to move to a specific slot of the holder, pick up a tile, move to a designated square on the board, and place a tile. We use dead reckoning based on calibration data to move to holder and board positions, which works remarkably well. Non-assisted users at the table may need to nudge tiles slightly to the correct position, but based on our experimentation, the intended tile position is usually clear. Our custom tile pickup and drop routines are programmed as multi-point joint trajectory action goals. A demonstration of these actions (controlled by our back-end interface) is shown below, and the code for tile pickup is added for reference:
Tile Pickup and Placement Demo
Tile Pickup Code
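To illustrate the dead-reckoning step mentioned above, here is a rough sketch of how a board square could be mapped to joint targets from two calibrated reference corners (the names and calibration format are hypothetical; the real routine also accounts for the gripper attachment offsets):

```python
# Hypothetical dead-reckoning sketch: calibration gives the base translation and
# arm extension at two reference corners of the board, and we interpolate
# linearly to any grid square.

GRID_COLS, GRID_ROWS = 15, 15  # standard Scrabble board


def grid_to_joint_targets(col, row, calibration):
    """Map a (col, row) board square to a base translation and arm extension."""
    origin = calibration['corner_a']  # e.g. {'base': 0.0, 'extension': 0.25}
    far = calibration['corner_b']     # the diagonally opposite calibrated corner
    base = origin['base'] + (far['base'] - origin['base']) * col / (GRID_COLS - 1)
    extension = origin['extension'] + (far['extension'] - origin['extension']) * row / (GRID_ROWS - 1)
    return {'translate_mobile_base': base, 'wrist_extension': extension}
```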
To present the work we've done this quarter, we're working on a final presentation video. Here is our draft script:
This week was focused on putting all of the pieces of the project together and getting over the unexpected difficulties of integration. We also met with Dylan over Zoom this week so that he could try out the interface and the Stretch!
We began training and evaluating our custom model this week. While the model performed well on board data, when we initially ran the classifier on the hand data, it performed poorly. We hypothesized that this was because the hand letters appear a little more "squished" than the board letters due to a less perfect perspective warp, while our original dataset only included letters from the board and did not have meaningful perspective-related noise incorporated in it. We therefore extended the dataset to include 150 real images per letter, 100 from the board and 50 from the hand, hand-labeled the new set of data, and additionally generated synthetic images with realistic perspective warping and more noise, for a total of around 600 images per class. We then began tuning hyperparameters such as learning rate, batch size, and number of layers for the best performance possible. Here are some results from one of our current better-performing models so far:
Unfortunately, when converting the vision code into a ROS node to run on the Stretch, we found that the required compute to classify the tiles on an image was too demanding and severely slowed down the rest of our system even when called once as a triggered service. We had no issues running the vision code locally, so we switched gears to run the vision code on the interface side. More on how we accomplished this is in the UI section!
During some testing this week, we ran into some issues with using React and rosbridge in the same component. Every time React was notified that a variable within a component had changed, it re-rendered that component. This caused the rosbridge connection to be restarted upon almost every UI interaction, which overwhelmed the Stretch. We ended up separating the site state and rosbridge into different components to bypass this issue. Doing so allowed us to get a much smoother camera feed on the site. We also started working out how to incorporate the vision backend with the UI. We set up a backend Python server using Flask to expose an API endpoint to the front end. We've checked that this connection works, but we're still having some difficulties figuring out how to pass the image data in an HTTP request.
Backend Python Vision Server Code
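One option we're considering for getting image data across the HTTP boundary is to base64-encode the frame on the client and decode it in the Flask endpoint; a minimal sketch (the endpoint name and payload format are placeholders, not our final design):

```python
# Minimal sketch of passing an image to the Flask vision server: the client
# base64-encodes a JPEG frame, and the endpoint decodes it into an OpenCV image.
import base64

import cv2
import numpy as np
from flask import Flask, jsonify, request

app = Flask(__name__)


@app.route('/classify_board', methods=['POST'])
def classify_board():
    # Expect JSON like {"image": "<base64-encoded JPEG bytes>"}.
    encoded = request.get_json()['image']
    buffer = np.frombuffer(base64.b64decode(encoded), dtype=np.uint8)
    frame = cv2.imdecode(buffer, cv2.IMREAD_COLOR)
    if frame is None:
        return jsonify({'error': 'could not decode image'}), 400
    # ... run board unwarping and letter classification on `frame` here ...
    return jsonify({'board': [], 'hand': []})


if __name__ == '__main__':
    app.run(port=5000)
```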
One of the first things the motion team worked on this week was bridging our existing robot functionality into one smooth trajectory that can play a whole word on the board. Unfortunately, error seems to accumulate with every added point of a trajectory goal, which meant the Stretch got progressively further off target with every word played. In light of this, we switched to a version of the code that plays one letter at a time.
One of the first steps in setting up the Stretch to play Scrabble is to calibrate it to the board's corners and the position of the tile holder. Prior to this week, we stored this information client-side; this was problematic if the client refreshed their webpage or was disconnected for any reason. Therefore, we created a new node to run on the Stretch that stores calibration information by listening to the corresponding topics and returns the current calibration on request as a service. This allows multiple clients to use the same calibration and allows for disconnect resilience.
We also added a new trajectory that stows the robot's arm in such a way that it does not obscure the board for vision tasks. This trajectory required careful adjustment to ensure the end-effector attachments do not hit the tabletop.
Alignment Math Node
(Partial) Play Word Code
This Sunday (June 1st), Dylan graciously lent his time to the project and tried out controlling the interface remotely. We were hoping to have our project more integrated by the time of this meeting, but we were still able to identify some points of future work and changes to the interface that would make playing Scrabble with the Stretch a lot smoother:
User Interface:
The buttons on the interface can be condensed; it's more important to fit the buttons on the screen at one time without needing to scroll. We updated the UI so that the elements became slightly smaller and switched to displaying the user's hand vertically instead of horizontally. Stacking the hand vertically made it possible for it to fit on screen with the board at the same time. Having a keyboard input for a word to play is easier than clicking and dragging. Dylan also preferred to refer to tiles in the hand with slot numbers as opposed to the letter in the slot.
Motion/Gameplay:
A way to bridge user experience with the inaccuracy of the Stretch is to have interface buttons to get the Stretch arm "close enough" to the required positions in the hand and grid, then allow for small manual adjustments to get the gripper into the correct position. This method requires live camera feedback, which we will include within the interface.
Thank you so much again to Dylan for your time and feedback! We'll be implementing these changes and moving towards a finished final product in the coming week.
Picking up and placing a tile with Dylan!
This video has audio.
We made it to the end of the quarter! This has been such a journey, and we're proud of what we accomplished over these 9 weeks. To wrap everything up, here are some of our thoughts on the course and our experience with it.
Naama: I had a great time seeing all of our hardware and environment modifications come together with the robot motion. It took a long time to understand the inner workings of ROS, and we fabricated several iterations of every part we wanted (thank you, Diana!), but the day we got consistent tile pickup from the custom holder was so joyful. Fine-tuning all of the joint trajectories for the robot was time-consuming but worth it. It was also great to be able to play a game of Scrabble with a user! I'm glad we picked a project that tapped into the social element of assistive robotics, since we were all really passionate about it.
Jeffrey: Successfully playing an entire game of Scrabble (with only a few suspect words played), with a user who had not controlled a Stretch before, was a lot of fun! It was very gratifying to see all of the systems working at the same time in the final project.
Diana: Overall, the course was full of many new and exciting experiences. Many computer science classes lack a physical component. We had a lot of fun when we finally got the robot into a somewhat usable state, especially when we played a game with Dylan teleoperating through the interface. Dylan had endless amounts of patience as we were struggling and provided many insights on the UI.
Karen: Every single milestone we accomplished was so rewarding, from seeing the Stretch pick up a tile, to it correctly recognizing the letters on the board, to the first time someone controlled the robot from our user interface... at every step, it felt like "wow, we're really doing this." As the others mentioned, I loved seeing the project coming together and finally playing an entire game of Scrabble with the Stretch. I felt like this is a project we were all passionate about and had an enthusiastic support team for, and that we are all celebrating our success!
Naama: The extra resources provided by TAs (like Michael's simple web teleop skeleton) helped us a lot when working on the project. In general, the availability of help from the course staff when needed was excellent. I'm also glad that there was a sense of collaboration and information sharing between teams. It made it feel like each group wasn't at it alone.
Jeffrey: The Stretch-specific labs were very useful for getting an understanding of the robot's capabilities and how to control and operate it, as were the insights provided by the TAs about the robot!
Diana: The course staff's knowledge was immensely useful, especially when we had to deal with outdated ROS docs :(. The labs of this class taught us the basics of interacting with the Stretch, while the final project let us perform a deep dive into robotics and develop skills individually. Even though not every team was able to complete the labs during the given timeslot, eavesdropping on other teams' troubleshooting was helpful.
Karen: In addition to class resources, scouring the web occasionally proved useful in finding example projects that we could look at to understand the capabilities of the Stretch or to gain insight into how we could accomplish our tasks.
Naama: While some of the early labs came in handy for understanding ROS later on, the labs that required sharing the Stretch were a lot more difficult to get done during class time. Some of the documentation for ROS and the Stretch 3 was also lacking (I spent a long time trying to fix broken ArUco alignment code before Jeffrey helped by just doing the math by hand), which made programming the Stretch tedious at times. We also discovered a lot more unstated limitations of the Stretch hardware as we continued to work on the project, certainly exacerbated by our project's high precision requirements.
Jeffrey: Discovering and working around the Stretch's undocumented hardware limitations was very time consuming; for example, we were running into an issue where any trajectory that involved telling the base to move 0.0 meters would cause an error, and it was only by finding a years-old issue on the Stretch GitHub repository that we were able to get a workaround. Similarly, trajectories involving the wrist pitch would often error out, and necessitated creating a complex system to handle just a simple stow command for the arm.
Diana: The weekly robot news segments were interesting, but didn't contribute all that much to our learning. They mostly covered surface-level details of robot features rather than their inner workings.
Karen: I suppose there isn't much we can do about it, but 10 weeks is such a short amount of time for a big project! I believe there are several more improvements we could make if we had even a few more days.
Naama: It would have been useful to have some course-created resources about how to work with the Stretch and ROS based on how teams in previous quarters used it. For example, finding documentation (not tutorials) on ROS2 functions was basically impossible--it would have been handy to have access to, for example, what every parameter in some ROS2 functions actually referred to. A lab specifically around launch files and how to navigate the stretch core library would have been helpful, too.
Jeffrey: Naama and I worked together closely on the robot controls, and so my thoughts echo hers; documentation for ROS2 and the Stretch was lacking, with quite a few of the Stretch tutorials either not updated for ROS2 or containing uncompleted TODOs.
Diana: Adding on to the idea of compiling external Stretch and ROS resources, a shared file or folder where teams could put their findings in one place would be nice.
Karen: Once we diverge into our own projects, I think it would be nice to have a quick weekly update where each team shares their progress (we basically did this once, but I mean as a regular occurrence!). I enjoy seeing how other teams are making progress--not only is it encouraging, but it can also help if multiple teams are having trouble with similar issues, and can strengthen the sense of community within the class as a whole.
Keeping up our teamwork was crucial in getting this project off the ground. Some strategies from our first week as a team stuck around and some didn't, but we're happy with the balance we struck.
Some things stayed the same: As ROS lead, Jeffrey's accrued ROS knowledge was invaluable to achieving our robot motion goals; as perception lead, Karen's hard work on the vision system yielded impressive results; as design and fabrication lead, Diana's fabrication skills (especially with the laser cutter) allowed us to have high-quality hardware modifications; as hardware lead, Naama became familiar with the environment setup required for the project and collaborated closely with Jeffrey on robot motion. We all contributed to the weekly blog posts on the site, especially when the posts pertained directly to our area of expertise on the project. Most importantly, we didn't ghost each other. We maintained an active Discord group chat and communicated updates about our progress and plans to meet up often. Decisions about changing the direction of the project (like the pivot in the week following the teamwork plans) felt democratic and fair.
As it turns out, not all of the strategies we came up with were helpful. We didn't rotate the "driver's seat" as much as we'd hoped in that original post, especially in the middle weeks of the project. The primary reason for this was time pressure. Each person (or two people) independently contributed work to different areas of the project that aligned with their role; collaboration across roles became more frequent during the last two weeks as everything was integrated together. We didn't end up creating a weekly task list, partially for this reason; each lead was best suited to determine their own weekly goals as long as they communicated those goals to the rest of the team. Doing this informally worked well for us. We also had a role swap! Diana's previous experience with TypeScript/React meant she had an easier time making progress on the user interface, and Karen's vision duties were consuming a lot of her time. As a result, Diana took over as user interface lead and Karen took over documentation and communications.
One of the strengths of this team was our flexibility. While our assigned roles helped guide our decisions when it came to work allocation, team members dropped in to lend a hand when another team member was struggling. We were also communicative; we kept our team chat active with progress updates, discussions about plans, and deadline reminders (and pictures of our cats!). While we worked hard every week to meet our goals, we also took the time to get to know each other and chat about things outside of the project, which helped the team feel cohesive.
While things were good, things weren't perfect, and there's always room for improvement. We wish we had been more clear about our goals regarding weekly deadlines and our plans for assignments like the project proposal. It might also have been better to expose each other to the other parts of the project and attempt integration earlier to avoid surprises about our system's capabilities near the end. We also wish we had shared our schedules with one another so that we could block out group work times more efficiently.
Alongside the final presentation video, here is the report for the Crabble project. It encapsulates everything we've worked on this quarter, alongside user and system evaluations and where we see the project headed next.