Due: Feb 7 Wednesday, 11:59pm | Canvas link (rubric & submission) | Points: 7
For this assignment you will think more about the perception capabilities needed for your project and get started on implementing them on the Stretch robot.
Post #10: Last week in Post #8 we asked you to identify the perception capabilities needed for your project by answering the following questions: What objects, people, or landmarks does the robot need to be able to detect? At what level of precision and temporal continuity do they need to be detected? Based on your answers (which you are welcome to revise now), make a plan for how you will enable the Stretch robot to perceive the necessary items at the appropriate precision and frequency. For each distinct perceptual capability you need, include at least two alternative approaches, as noted below (e.g., you might need different solutions for perceiving a human user versus perceiving an inanimate object in the environment).
Target solution: What would be a feasible computer-vision-based perception system for your needs, without making any modifications in the environment? What models/libraries can you use? What data would you need to collect? What training procedures can you use?
Minimum viable fallback solution: Implementing autonomous, generalizable robot perception capabilities is challenging; it is hard to know in advance whether an approach will work, and even if it works in most situations, there will always be failure cases. It is therefore important to think about the consequences of a failing perception system and to have a fallback solution for perception failures. We recommend two approaches:
Modify the environment for perception: You might have control over how the robot's environment, or the objects it interacts with, are designed, and you can use that control to make the perception problem easier for the robot. The canonical example of this approach is attaching AR markers (a.k.a. fiducials, e.g., ArUco markers) to relevant landmarks in the environment and taking advantage of existing algorithms for robustly detecting them. Others use bright, unique colors to make perception easier (e.g., a green screen). For this approach, be specific about the proposed modifications and make an argument about their feasibility.
Human-in-the-loop/interactive perception: Another approach is to ask a human to "help" with perception, for instance, by clicking on the target object on an image or creating a bounding box around the object. For example, see the demo of Segment Anything by Meta. For this approach, be specific about the type of human input you will need and the feasibility of obtaining that input from the user or a caregiver.
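Both fallback approaches above can end in the same place: a pixel location for the target, which a depth reading then turns into a 3D point in the camera frame. The following is a minimal sketch of that idea, not the course's reference implementation. It assumes a pinhole camera model; the function names, the color tolerance, and the intrinsics (fx, fy, cx, cy) are hypothetical placeholders, and the image is a plain nested list of RGB tuples rather than a real camera frame.

```python
from typing import List, Optional, Tuple

Pixel = Tuple[float, float]      # (u, v) = (column, row)
RGB = Tuple[int, int, int]


def find_color_centroid(image: List[List[RGB]], target: RGB,
                        tol: int = 40) -> Optional[Pixel]:
    """Environment-modification fallback: locate a uniquely colored object.

    Returns the mean (u, v) of all pixels whose channels are each within
    `tol` of `target`, or None if no pixel matches.
    """
    sum_u = sum_v = count = 0
    for v, row in enumerate(image):
        for u, px in enumerate(row):
            if all(abs(c - t) <= tol for c, t in zip(px, target)):
                sum_u += u
                sum_v += v
                count += 1
    if count == 0:
        return None
    return (sum_u / count, sum_v / count)


def deproject(pixel: Pixel, depth_m: float,
              fx: float, fy: float, cx: float, cy: float
              ) -> Tuple[float, float, float]:
    """Convert a pixel plus its depth (meters) to a 3D camera-frame point.

    The pixel may come from the color detector above or from a human
    click on the image (human-in-the-loop fallback).
    """
    u, v = pixel
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


if __name__ == "__main__":
    # Tiny synthetic image: black background with a 2x2 green patch.
    img = [[(0, 0, 0)] * 6 for _ in range(4)]
    for v in (1, 2):
        for u in (3, 4):
            img[v][u] = (0, 255, 0)
    detected = find_color_centroid(img, (0, 255, 0))
    clicked = (380.0, 240.0)  # stand-in for a user's mouse click
    # Intrinsics below are made-up values, not those of the Stretch camera.
    print(detected, deproject(clicked, 1.0, 600.0, 600.0, 320.0, 240.0))
```

On the real robot the intrinsics and depth value would come from the camera driver, and a marker detector or segmentation model could replace the color threshold without changing the deprojection step.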
Post #11: As part of your next set of labs you will develop a tool for programming the Stretch robot by demonstration. For this assignment you will demonstrate using this tool in the context of your project. Choose a manipulation capability that is important for your project and try to program a sequence of robot end-effector poses to achieve it. In your post include two separate videos: one showing the programming/demonstration process, and another showing the robot executing the programmed actions in at least two different configurations of the target object being manipulated. Your post should give context and describe what the robot is doing and how that fits into your project.
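One common way to make a demonstrated pose sequence work across different object configurations is to store the poses relative to the object rather than relative to the robot. The sketch below illustrates this with planar poses (x, y, theta); it is a simplified assumption for illustration, since the lab tool's actual design is not specified here and real end-effector poses are 6-DoF. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Pose2D = Tuple[float, float, float]  # (x, y, theta in radians)


def compose(a: Pose2D, b: Pose2D) -> Pose2D:
    """Express pose b (given in frame a) in a's parent frame."""
    ax, ay, at = a
    bx, by, bt = b
    return (ax + math.cos(at) * bx - math.sin(at) * by,
            ay + math.sin(at) * bx + math.cos(at) * by,
            at + bt)  # angle is not wrapped to [-pi, pi]


def inverse(a: Pose2D) -> Pose2D:
    """Inverse of a planar rigid transform."""
    ax, ay, at = a
    c, s = math.cos(at), math.sin(at)
    return (-(c * ax + s * ay), s * ax - c * ay, -at)


def to_object_frame(waypoints_base: List[Pose2D],
                    object_in_base: Pose2D) -> List[Pose2D]:
    """Demonstration time: re-express recorded poses relative to the object."""
    obj_inv = inverse(object_in_base)
    return [compose(obj_inv, w) for w in waypoints_base]


def retarget(waypoints_obj: List[Pose2D],
             new_object_in_base: Pose2D) -> List[Pose2D]:
    """Execution time: map stored poses back to the base frame for a new object pose."""
    return [compose(new_object_in_base, w) for w in waypoints_obj]


if __name__ == "__main__":
    demo = [(1.2, 0.0, 0.0), (1.05, 0.0, 0.0)]            # recorded gripper poses
    stored = to_object_frame(demo, (1.0, 0.0, 0.0))       # object pose during demo
    print(retarget(stored, (0.0, 1.0, math.pi / 2)))      # object moved and rotated
```

The object pose itself would have to come from the perception approach chosen for Post #10, which is one reason the two posts are worth planning together.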
Post #12: Continue your literature review by reading the list of papers (or resources) you identified in Assignment 3 (and any additional ones recommended by the teaching staff) and writing a short summary (at least 2-3 sentences describing motivation, methods, and findings) for each. For each paper, your post should include the title, authors, publication venue, and year information, and a link to a PDF, in addition to your summary. As in Assignment 3, we expect to see at least 10 papers but you are welcome to include as many as you like.
Submit your response on Canvas as a link to the latest post on your team's website.