A Robotics Blog

481C Sp22 - Team Nark Weekly Blog

Mission of the project

We are humble students embarking on the journey that is robotics. We look forward to making a global impact on the field of robotics, in a major way. Join us in our adventure to this mysterious unknown land!

Weekly Updates!

Introducing Zucc!

Week 10: Fin

Speech-to-Text Team

Adjusting our speech-based model to the Amazon use case, we made a new version of our app. The user has a list of items that need to be placed on the shelf; they can place an object and say "next" to get the next object that needs to be placed.

- Tanish and Ritadhwaj



Incremental Modeling Team

Most of our deliverables were complete last week, so our main focus this week was setting up communication between the application and the other processes using network requests instead of the command-line interface.
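To give a flavor of what that communication looks like, here is a minimal sketch of the kind of HTTP endpoint the app could post commands to. The route, port, and handle_command helper are made-up names for illustration, not our exact code:

    # Minimal sketch of the HTTP interface between the app and the modeling
    # process; the route, port, and handle_command() are placeholder names.
    from flask import Flask, request, jsonify

    app = Flask(__name__)

    def handle_command(command, item):
        # In the real system this forwards the request to the ROS-side
        # container modeler / picker; here it just echoes the request.
        return {"status": "ok", "command": command, "item": item}

    @app.route("/command", methods=["POST"])
    def command_endpoint():
        data = request.get_json()
        return jsonify(handle_command(data.get("command"), data.get("item")))

    if __name__ == "__main__":
        app.run(host="0.0.0.0", port=5000)

The phone app then only needs to POST a small JSON payload like {"command": "pick", "item": "pill bottle"}.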

- Long and Mrigank

Team Reflection

As a whole, one of the strongest aspects of our team was constant communication between the different sub-groups and an extremely collaborative and friendly relationship. From a strategy perspective, most of the roles we decided on in week 2 held up, with the only big change being that Tanish did more of the perception and Ritadhwaj did more of the UI. One strategy that worked well for us was communicating our progress through messaging instead of group meetings, which tended to be less productive and difficult to schedule. Our workflow throughout the quarter formed fairly naturally, which meant we rarely ran into issues with our strategies.

Course Reflection

What was most fun?

The novelty of actually using the robot to move and interact with the world never wore off. Even on the last day, seeing everything work was pretty cool. The course started fairly strong, with us making quick progress through the labs and seeing multiple cool concepts like Monte Carlo estimation.

What was most useful?

Gripper teleop, AR tag perception, and computing IK through move_group were extremely helpful throughout the course, abstracting away a lot of the tedious parts.

What was not so useful?

The web interfaces with Polymer, specifically the map annotator, were too buggy to use properly at any scale. The labs, while providing a good starting point, were pretty outdated and had a lot of code that led us down the wrong path. For instance, the move_to_pose function wasn't very useful because of its seemingly arbitrary path choice, yet most of the later labs assumed we would use it for moving the robot.

What would have been useful but was missing?

We think that focusing the final projects at the end of the quarter specifically on improving the robot for the Amazon Picking Challenge was a little limiting, and could have been much more flexible. Introducing some elements of creativity earlier in the quarter, in parallel to the labs, may have been useful too, allowing us to develop a much more in-depth final project.

Week 9: Last Step

Speech-to-Text Team

We finished developing the speech-to-text app. Essentially, it records your command, uses the Google Cloud speech-to-text API to convert it to text, then parses the text into a command which it sends to the incremental modeling team.

We implemented speech commands for placing an object onto the shelf and picking an object from the shelf.
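As a rough sketch of the recognize-and-parse step (not our exact app code), assuming a 16 kHz LINEAR16 recording saved to disk, the google-cloud-speech client library, and a command grammar much simpler than ours:

    # Sketch: transcribe a recorded command with Google Cloud Speech-to-Text,
    # then parse it into an (action, object) pair.
    from google.cloud import speech

    def transcribe(path):
        client = speech.SpeechClient()
        with open(path, "rb") as f:
            audio = speech.RecognitionAudio(content=f.read())
        config = speech.RecognitionConfig(
            encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
            sample_rate_hertz=16000,
            language_code="en-US",
        )
        response = client.recognize(config=config, audio=audio)
        return " ".join(r.alternatives[0].transcript for r in response.results)

    def parse_command(text):
        # Very simplified stand-in for our command grammar.
        text = text.lower().strip()
        if text.startswith("pick up "):
            return ("pick", text[len("pick up "):])
        if text.startswith("place "):
            return ("place", text[len("place "):])
        return (None, None)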

- Tanish and Ritadhwaj

The ZuccApp!

Incremental Modeling Team

After locating the objects and building the incremental container model, Zucc now has end-to-end object-picking functionality. When a user requests an item, Zucc will:

  • Look it up in the container model it built to check whether the object is on the shelf.

  • Use the stored positions of the objects on the shelf to move the arm to the bin containing the object, pick up the requested item, and drop it in the tote.

We are now able to pick up objects with nearly 100% accuracy, without colliding with the shelf or picking up the wrong item.
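Stripped of all the ROS plumbing, the control flow is roughly the sketch below; the helper functions and the container-model layout are placeholders used only to illustrate the flow:

    # Illustrative pick flow; the helpers are stand-ins for our real ROS nodes.

    def move_arm_to_bin(bin_id):
        # Placeholder: in the real system this moves the arm to approach the bin.
        print("approaching bin", bin_id)

    def pick_and_drop_in_tote(pose):
        # Placeholder: grasp at `pose`, lift, and release over the tote.
        print("picking object at", pose)

    def handle_pick_request(item_name, container_model):
        # 1. Look the item up in the incrementally built container model.
        entry = container_model.get(item_name)
        if entry is None:
            return "item not on the shelf"
        # 2. Approach the bin containing the item, grasp it, drop it in the tote.
        move_arm_to_bin(entry["bin"])
        pick_and_drop_in_tote(entry["pose"])
        del container_model[item_name]
        return "done"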

- Long and Mrigank


Upcoming Video Script

A Look At The Ethics

Even though robotics can have really profound applications and can reliably help a majority of people, that is not to say it comes without ethical dilemmas. Whenever a new piece of technology is introduced into a workspace, we have to consider how it might affect the population it touches, both positively and negatively. Well-trained robots, which are much more efficient than humans at performing mundane tasks, might pose a threat to people's jobs, especially in the warehouse industry, which employs 1.1 million people in the US alone. One should question the need for pick-up robots in warehouses, or have a strategy in place to replace the jobs lost to them.

Robots pose a threat not only to our society but to our environment, too. Pickup robots would be working day and night in many warehouses throughout the year, leading to a massive increase in energy consumption. At a time when climate change is pushing our planet past the point of full recovery, we have to be mindful of our energy consumption and, in turn, question its necessity. Using clean energy is important, but only while generating more clean energy (even indirectly), since increasing your consumption while buying from existing clean energy reserves only stretches our already thin energy bandwidth and would result in energy injustice.

On the other hand, the positives from robots, and robot pickers specifically, are undeniable. Robot pickers are one of the reasons we have such fast delivery times. Automating the process of stocking items (especially heavy objects), and of finding and packing them when requested, saves a lot of time in the delivery pipeline.

Efficient robot pickers that work well in most situations will free up a large chunk of the population to pursue other, less labor-intensive jobs, while also making every service more efficient and possibly more robust. Faster service, a larger scope of abilities, and more: the potential complexity of a picker's task is nearly limitless, especially with how fast machine learning is progressing. While it might hurt people for a short period of time, as most progress in technology does, it will lead to an overall improvement in the quality of life over time.

- Ritadhwaj with a little help from the others

Week 8: Steady Progress

Speech-to-Text Team

We developed a basic API that records speech and converts it to text, and it works pretty well! Beyond that, we've mostly been working on the backend of the application, so there isn't much to show for it yet, but it should all be up and running by next week!

- Tanish and Ritadhwaj

Incremental Modeling Team

We implemented a basic version of the incremental container modeling, which, while running, essentially has the following flow:

  1. Place an object

  2. Say you placed the object to the program

  3. The program locates the object, draws a box around it and saves the state of the object

From an accuracy perspective, it works with nearly 100% accuracy for large objects, and we're working on improving it for the smaller objects!
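The core of step 3 is a point cloud difference: points in the new cloud with no close neighbour in the previous cloud are treated as the newly placed object, and a box is fit around them. Our real node works on ROS PointCloud2 messages; the numpy sketch below (with a made-up distance threshold) is just to show the idea:

    import numpy as np
    from scipy.spatial import cKDTree

    def find_new_object(prev_pts, new_pts, dist_thresh=0.02):
        """prev_pts, new_pts: (N, 3) XYZ points in the shelf frame."""
        tree = cKDTree(prev_pts)
        dists, _ = tree.query(new_pts)
        added = new_pts[dists > dist_thresh]         # points not explained by the old cloud
        if len(added) == 0:
            return None
        return added.min(axis=0), added.max(axis=0)  # axis-aligned bounding-box corners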

- Long and Mrigank


Week 7: Where we're going next

The Picking Challenge (General Motivation)

Picking robots can have a variety of benefits, from automated packaging and delivery to aiding the disabled population in their daily activities. For us, the motivation behind the challenge is understanding how human life can be improved by these robots, either directly or indirectly. What makes this challenge so difficult, though, is that every object is different and can be hard to both detect and pick up. Furthermore, the environment in which a picking robot works differs dramatically from robot to robot. A picker robot could be responsible for handling package inventory at Amazon or aiding in developing the next Iron Man suit (Dum-E).

Extending the Challenge (Specific Motivation)

Now that we had a baseline solution to the Amazon Picking Challenge (although with lots of possible improvements), we thought we could make it more accessible to a larger percentage of potential users. Our vision was imagining how the robot might be helpful at home, and our first thought for where to go next was improving two main things: first, how it communicates with the user, and second, how well it detects objects on the shelf. To see evidence for why we thought these improvements would be helpful, take a look at the results from the previous week. The robot moved using a command-line interface, which is extremely inconvenient for the ordinary person, and to perform actions we had to constantly communicate with each other (as you can hear), rather than just one person telling the robot what to do. For object detection, we had an extremely large number of false positives and false negatives, both of which we want to reduce.

Solution Roadmap (Technical Approach)

For the interface, we plan to develop a mobile application that uses one of the many common speech-to-text libraries and performs a network request to our system based on the content of the speech. If the user states that they have placed an object, the system will incrementally model the new container, storing information about where the new object has been placed. If the user asks the robot to pick up an object, the system will use its model of the container and segmentation information to detect and pick up the object, placing it in a tote bag.

We plan to split into two teams. One will work primarily on turning speech into text and then into a command request. The other will work on converting the command into detecting and saving the new object's location. Once that's done, both teams will work together on connecting all the modules and improving the system as a whole.

System Design (System Figure)

To the right is a basic system-level diagram. The red nodes are in progress, the blue nodes are complete but could use improvements, and the green nodes are completely done. Here's a more in-depth description of the nodes:

  • Web/Mobile app: This converts speech requests by the humans into network requests to our system. (Worked on by Tanish/Ritadhwaj)

  • Robot Point Cloud: Saves the current robot point cloud. (Very old lab)

  • Container Modeler: Takes in <Put Object> requests and, using point cloud data, calls the new object detector. (Worked on by Mrigank/Long)

  • New Object Detection: Looks for differences in point clouds to incrementally update the container state. (Worked on by Mrigank/Long)

  • Container State: Stores the current state of the shelf at any given moment. (Worked on by Mrigank/Long)

  • Object Pickup Request Solver: Takes in <Pick up Object> requests and attempts to pick the object up. (Worked on by Mrigank)

  • Smart Cropper: Crops the shelf smartly based on AR Tag locations (Worked on by Tanish/Ritadhwaj)

  • AR Tag Tracker: Tracks AR tags (Given)

  • Object Detection: Segments Objects based on point cloud data and trained feature sets (Worked on by Tanish/Ritadhwaj)

  • AutoPicker: Actually picks up the object based on previous info. (Worked on by Mrigank)

  • IKSolver: Calculates IK for a given pose from the robot's current joint positions. (Given but improved by all of us)

  • Gripper/Arm Controller: Moves the respective parts of the robot (Given but improved by all of us incrementally)

Our minimum viable product consists of implementing all the red items (so essentially the entire system), with our stretch goal being to successfully improve all the blue items in the system design.

Storyboard (Finite State Machine)

Here's an ideal version of what our system will look like under either command (a rough dispatcher tying the two flows together is sketched after the lists):

Stow Command

  1. Human places an object and says <put object>

  2. Phone translates and transmits the command to the computer

  3. Computer processes the command and detects the new object

  4. Saves the object to the shelf state

Pick Up Command

  1. Human says <pick up object>

  2. Phone translates and transmits the command to the computer

  3. Computer processes the command and attempts to pick up the object by using a process similar to last week.
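A minimal dispatcher tying the two flows together might look like the sketch below; the function names and the shelf-state representation are illustrative, not our actual node interfaces:

    def stow(object_name, shelf_state):
        # Stow steps 3-4: detect the newly placed object and save it.
        shelf_state[object_name] = "detected pose goes here"

    def pick(object_name, shelf_state):
        # Pick-up step 3: look the object up and attempt the pick.
        print("picking", object_name, "stored at", shelf_state.get(object_name))

    def dispatch(command, object_name, shelf_state):
        if command == "put":
            stow(object_name, shelf_state)
        elif command == "pick":
            pick(object_name, shelf_state)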

Upcoming Milestones (Plan)

Week 8: Develop baseline versions of every red module and get them to work individually

Week 9: Integrate all modules and get everything running together smoothly

Week 10: Improve on any remaining modules and formally evaluate our pipeline through quantitative metrics

Week 6: Segmentation

Zucc can find objects! (Segmentation Results)

Euclid Algorithm:

  • False Negatives: 2
  • False Positives: 2
  • Correctly segmented items: 7 / 9

Region-Based Algorithm:

  • False Negatives: 3
  • False Positives: 9
  • Correctly segmented items: 6 / 9

Color-Region-Based Algorithm:

  • False Negatives: 3
  • False Positives: 22
  • Correctly segmented items: 8 / 9

After analyzing all the metrics, we decided to go with the Euclid algorithm, as it had the fewest errors of all the algorithms and correctly segmented almost all of the items. We believe that tuning some of the parameters of the Euclid algorithm and improving the lighting conditions might make it much more accurate.
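For the curious, the idea behind the Euclid algorithm is Euclidean clustering: starting from a seed point, keep adding points that are within a distance tolerance of the cluster. Our pipeline uses PCL's implementation; the toy sketch below (with made-up tolerance and size values) just illustrates the idea:

    import numpy as np
    from scipy.spatial import cKDTree

    def euclidean_clusters(points, tol=0.02, min_size=50):
        """points: (N, 3) XYZ array; returns a list of clusters (arrays of points)."""
        tree = cKDTree(points)
        unvisited = set(range(len(points)))
        clusters = []
        while unvisited:
            seed = unvisited.pop()
            frontier, cluster = [seed], {seed}
            while frontier:
                idx = frontier.pop()
                for nbr in tree.query_ball_point(points[idx], tol):
                    if nbr in unvisited:
                        unvisited.remove(nbr)
                        cluster.add(nbr)
                        frontier.append(nbr)
            if len(cluster) >= min_size:                 # drop tiny clusters (noise)
                clusters.append(points[list(cluster)])
        return clusters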

Zucc can recognize objects..?! (Recognition Results)

Config 1: First Column, third row

We used Euclid to recognize the objects and it did not segment the gloves properly, resulting in a wrong label. It wasn't able to segment the pill box.

Accuracy: 0%

Config 2: First Column, first row

We used Euclid to recognize the objects and it segmented all of them properly, but it only recognized one object correctly.

Accuracy: 33%

Config 3: Third column, third row

We used Euclid to recognize the objects and it segmented two different parts of the same object, recognizing one half of it correctly.

Accuracy: 0%

Hey Zucc, can you get me [object_name]? (Picking Objects by Segmentation)

Zucc can finally pick objects on its own! When told, it can accurately segment the objects in a shelf bin to find the correct one and pick it up.

In this video, we put two objects (a pill box and a pill bottle) in two different configurations on the shelf. In the first configuration, they are in different bins; in the second, they are in the same bin. The video showcases our attempts in simulation followed by real life.

What should Zucc work on now? (Next Steps)

Now that our robot can pick the items we tell it to, we've essentially hit the core of the Amazon Picking Challenge, and so, inspired by a different Amazon service, Alexa, we've decided to go in a slightly different direction.

Our final idea of what our project will look like is something like this: we begin with an empty shelf and some items to put on it; as we put the items onto the shelf, we tell the robot which item we are placing. After stocking the shelf, we want to be able to say "pick up the ball" and have the robot pick it up and hand it to us (by placing it in the tote).

To achieve this vision, we need to mainly work on the following crucial features:

  • Speech-to-text software that tells the robot to save a new object's position, and which object to pick up.

  • Incremental container modeling to understand what the shelf looks like at any given moment as we put items into the shelf.

  • Improved image recognition software to accurately locate objects on the shelf, aiding the incremental container modeling, since objects can be moved around as we continue to fill up the shelf.

Week 5: Tunnel Vision

Robotics_lab5_part1.mov

Cropping the smart way

We wanted Zucc to focus. Focus only on the task at hand; its vision only needed to see the shelf from which it was going to pick objects. We built an algorithm that takes the input point cloud (what Zucc sees) and smartly and automatically crops it down to just the shelf in front of it. In the video, we demonstrate how, even after moving the shelf around, we only see the shelf's point cloud.
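A much-simplified version of the crop is: transform the cloud into the frame of the AR tag attached to the shelf, then keep only the points inside a box around the tag. The numpy sketch below (with made-up box dimensions, and assuming the box is centered on the tag) shows just the filtering step; the real node does the TF transform and works on PointCloud2 messages:

    import numpy as np

    def crop_to_shelf(points_in_tag_frame, size=(0.9, 0.4, 1.0)):
        """points_in_tag_frame: (N, 3) XYZ points already transformed into the
        AR-tag frame; size: (x, y, z) extent of the crop box in metres."""
        half = np.array(size) / 2.0
        mask = np.all(np.abs(points_in_tag_frame) <= half, axis=1)
        return points_in_tag_frame[mask]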

Robotics_lab5_part2.mov

Let's pick! (much better this time)

Picking up is easy for Zucc when it doesn't have to worry about its surroundings, but Zucc needs to detect collisions and plan its arm's journey to pick objects efficiently in any setting. In this video, we first demonstrate simulated planning based on real-life inputs (Part 1.1 and Part 1.2), and then the robot executes the planned path to the object (Part 1.2). Furthermore, we teach Zucc to pull objects out of the shelf (Part 2), and finally to pick an object and store it in the bin (Part 3).

Week 4: Picking up Progress

Zucc realizes its true potential (spoiler: it's picking)

This week we returned to the arm after a fun map-based journey. To start with, we made an interactive marker that Zucc moves its arm to using inverse kinematics, while avoiding collisions with other objects in the environment. On the left, we have a video that demonstrates using that interactive marker to pick up a blue cube much faster than we did a couple of weeks ago.
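For context, the "move the arm to the marker pose" step can be expressed with moveit_commander roughly as below. Our lab code goes through the course's own wrappers, so treat this as an approximation with an example goal pose, not our exact program:

    import rospy
    import moveit_commander
    from geometry_msgs.msg import PoseStamped

    rospy.init_node("arm_to_marker_demo")
    moveit_commander.roscpp_initialize([])
    arm = moveit_commander.MoveGroupCommander("arm")

    goal = PoseStamped()
    goal.header.frame_id = "base_link"
    goal.pose.position.x = 0.6        # example marker pose in front of the robot
    goal.pose.position.z = 0.9
    goal.pose.orientation.w = 1.0

    arm.set_pose_target(goal)
    success = arm.go(wait=True)       # plans around known collision objects, then executes
    arm.stop()
    arm.clear_pose_targets()
    rospy.loginfo("reached marker pose: %s", success)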

Materializing Zucc

The next step in Zucc's journey was to demonstrate in real life whatever it had practiced in the simulator. The first part of the video shows us giving Zucc instructions to pick an object from the shelf, while in the second part you can see Zucc perform its task in real life.


Zucc embraces its picking potential

The last interactive marker was cool, but we wanted to go further. Moving the arm into position, closing the gripper, and picking up the cube naturally(?) were all processes we could automate into a single smooth program, and thus was born our triple interactive marker. The marker displays the arm in pre-grasp, grasp, and pick-up/lift positions and lets us pick up the cube even faster!

Mastering how to pick (when told how)

Things are only so much fun in the simulator, so in this video we run the above program on the real-life Zucc, with the first part showing us giving it instructions and the second showing it picking up an object from the shelf (this time, much faster!).

Week 3: Movement

Zucc likes to move it move it (Assignment 3 Part 1)

After successfully making the web interface to move the robot last week, we wanted to control it in RViz itself and track the path from where it started. Our path tracker uses a line marker, which draws a green line along the path the robot has traversed. The interactive markers (arrows in our case) let us move the robot forward/back (by 0.5 m) and rotate left (counter-clockwise) or right (clockwise) by 30 degrees. In the video above, we rotate and move the robot around, which generates its path from the point where we started our path tracker script. We increased the speed a little for the purposes of the video to demonstrate the robot's movement better.
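The path tracker itself is conceptually simple: append each new base position to a LINE_STRIP marker and republish it. The sketch below uses the standard odom topic and frame; treat it as an outline rather than our exact script:

    import rospy
    from nav_msgs.msg import Odometry
    from visualization_msgs.msg import Marker

    rospy.init_node("path_tracker")
    pub = rospy.Publisher("path_marker", Marker, queue_size=1)

    marker = Marker()
    marker.header.frame_id = "odom"
    marker.type = Marker.LINE_STRIP
    marker.action = Marker.ADD
    marker.scale.x = 0.02              # line width in metres
    marker.color.g = 1.0               # green path
    marker.color.a = 1.0
    marker.pose.orientation.w = 1.0

    def odom_callback(msg):
        # A real tracker would only append after the robot moves some minimum distance.
        marker.points.append(msg.pose.pose.position)
        marker.header.stamp = rospy.Time.now()
        pub.publish(marker)

    rospy.Subscriber("odom", Odometry, odom_callback)
    rospy.spin()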

Zucc going place to place ~ Edition: CLI (Assignment 3 Part 2)

Now that we can control Zucc interactively, the next step is to save positions of importance for Zucc. We do so using a command-line interface (CLI), shown on the left half of the recording. The CLI's functionality is fairly basic and involves the following instructions, all of which we demonstrate during the recording (a skeleton of the loop behind it is sketched after the list):

  • List: List all currently saved positions

  • Save <name>: Save Zucc's current position under the name "name"

  • Delete <name>: Delete position called "name"

  • Goto <name>: Make Zucc go to a saved position with name "name"

  • Help: Display CLI commands

  • Quit: Quit
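Behind the scenes this is just a read-parse-dispatch loop. Here is a skeleton of it, simplified so that poses are plain placeholders rather than the real robot poses and navigation goals we store:

    def run_cli(saved_positions, goto_fn):
        # saved_positions: dict name -> pose; goto_fn: callable that drives the robot.
        while True:
            parts = input("> ").strip().split(None, 1)
            if not parts:
                continue
            cmd = parts[0].lower()
            name = parts[1] if len(parts) > 1 else None
            if cmd == "list":
                print("\n".join(saved_positions) or "(none)")
            elif cmd == "save" and name:
                saved_positions[name] = "current robot pose"   # placeholder for the real pose
            elif cmd == "delete" and name:
                saved_positions.pop(name, None)
            elif cmd == "goto" and name in saved_positions:
                goto_fn(saved_positions[name])
            elif cmd == "help":
                print("list | save <name> | delete <name> | goto <name> | help | quit")
            elif cmd == "quit":
                break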

Zucc going place to place ~ Edition: UI (Assignment 3 Part 3)

CLIs are great, but hard for a regular user, so we decided to make a more interactive web UI using an experimental library developed by the University of Washington Human-Centered Robotics Lab. The UI is on the left half of the screen, with RViz running on the right half. The UI lets the user create, delete, and move around poses (positions), and also make Zucc go to any of the saved positions; we demonstrate all of those capabilities in the recording to the left.

Note: We mostly moved the markers around in RViz because moving the interactive markers in the UI is still highly prone to issues. Additionally, you may notice that the robot is not visible on the left half of the screen; it seems to load in and out randomly, probably because the package is still experimental.

Week 2: The Beginning

The Journey Begins (Picking a Team/Robot name)

After reading about the Amazon Picking Challenge last week, we decided to take it on ourselves (well, a slightly easier version using the Fetch robot). Step 1: picking a team and robot name. After an inspired discussion, we decided to call ourselves "Team NARK" and our robot "Zucc". The meaning behind the names is left as an exercise to the reader.

Our First Attempt (Part 1 of Assignment 2)

We decided that our first step would be to understand Zucc better through manually controlling a simulated version of Zucc that we like to call iZucc. Here's what our attempt at getting iZucc to pick up a blue cube looked like:

Our interface (the left half of the screen) has two main portions. On the left, we see what iZucc is seeing, and on the right, we have various controllers to control each part of the robot. The torso controller controls the height of the robot's torso. The head controller controls the head pan and tilt to change our field of vision. The arm controller controls the various joints of the arm so that we can move it in any possible way to pick up the object. The grab controller controls the gripper, making it open or close to grab or release the object. The motor controller controls the base of the robot, with linear and angular speed to move it forward, back, left, and right. All of the controllers (except grab) have sliders to change the values.

Assigning Responsibilities (Part 2 of Assignment 2)

With our control of iZucc now proficient, we decided that it was time to split the roles for the rest of our project. Here are the loosely chosen roles and responsibilities we came up with for each person:

  • Long Nguyen

    • ROS Guru: Understands the nitty gritty details of ROS (e.g. what's the difference between a topic and a message?) and knows how to use different ROS tools. Responsible for the back-end ROS-systems.

  • Mrigank Arora

    • Design Guru: Responsible for the system design of Zucc, and how Zucc will interact with the real world.

    • Manager: Makes sure every team member is on the same page at all times. Helps make decisions (laying out pros and cons). Enforces "interface agreements" so components can be easily integrated.

  • Tanish Kapur

    • UI Guru: Understands human factors and usability, makes things look good

    • Documentation and Communication: Responsible for taking notes, writing blog posts, making videos, etc. and acts as spokesperson.

  • Ritadhwaj Roy Choudhury

    • Perception Guru: Understands sensor data and can write software for processing it.

    • Hardware Guru: Has a deep understanding of the underlying hardware and sensors, and what the differences might be between a simulation and real life.

While we have assigned roles within our team for now, we expect all of us to have a hand in almost every area of our codebase, with everyone being responsible for a slightly greater amount of work in their assigned areas. On a week-to-week basis, we will try and understand the workload for each week and divide it among the team as evenly as possible while maintaining as much pair programming as possible.

Every team member's goal for the end of the quarter is to be able to single-handedly recreate the system we design over the quarter if they had to, which means everyone should have at least a moderate understanding of every component of the project.

Week 1: Exploration

The 2016 Amazon Picking Challenge was a robotics competition built around pick and stow tasks. Based on its rules, we designed the following performance metrics to judge participant robots on both tasks.

Pick Task

Performance Metrics:

  • Success Percentage: The fraction of pick attempts that result in the requested item being picked and placed into the tote without any error

  • Average Speed Per Item: The time it takes for the requested item to be picked and placed into the tote, averaged over all error-free attempts

  • Percentage of Work Order Fulfilled: The fraction of the work order that's been fulfilled within the time limit

  • Error Count: Number of each type of error listed below

  • Each of the above metrics is measured both over all bins, and also over the following bin categories:

    • Bins with 1-2 items

    • Bins with 3-4 items

    • Bins with 5+ items

Errors:

  • Removing non-target items from the shelf and not replacing them

  • Selecting the incorrect target item from the correct bin (recognized the wrong target)

  • Selecting the correct target item from the incorrect bin

  • Removing items from the shelf but failing to place target items in the tote bag

  • Failing to remove target items from the shelf (recognized the target but cannot pick it up)

  • Damaging any item or the shelf

  • Dropping an item from a height of more than 0.3 meters

  • Leaving an item protruding out of its bin by more than 0.5cm (mainly for items moved by the picker on the shelf)

Stow Task

Performance Metrics:

  • Success Percentage: The fraction of stow attempts that result in the requested item being picked and placed into a bin without any error

  • Average Speed Per Item: The time it takes for the requested item to be picked and placed into a bin, averaged over all error-free attempts

  • Percentage of Work Order Fulfilled: The fraction of the work order that's been fulfilled within the time limit

  • Error Count: Number of each type of error listed below

  • Each of the above metrics is measured both over all bins, and also over the following bin categories:

    • Bins with 1-2 items

    • Bins with 3-4 items

    • Bins with 5+ items

Errors:

  • Putting the target item into the incorrect bin

  • Failing to remove target items from the tote (recognized the target but cannot pick it up)

  • Damaging any item or the shelf

  • Dropping an item from a height of more than 0.3 meters

  • Leaving an item protruding out of its bin by more than 0.5cm (mainly for items moved by the picker on the shelf)

Using some of these metrics, we wanted to judge the following video:

Judging performance metrics

Looking at the first five objects Team NimbRo attempted for the pick task, their performance metrics were:

  • Success Percentage: 40%

  • Average Speed Per Item: 17s

Clearly it is a difficult task! We wanted to look at the target object attributes to understand what made the challenge so difficult.

Object Attributes

  • Rigid/Non-Rigid: whether the object maintains its shape while being manipulated; e.g. the soap bar is rigid, the dog toy is non-rigid.

  • Irregularly Shaped: whether the object is a regular shape (cube, sphere, etc) or not; e.g. the book is not, but the dog toy is

  • Surface Material: the material(s) that make up the surface of the object (plastic, glass, etc.); e.g. the dog toy has a surface material of silicone

  • Size: Physical dimensions of the object, i.e. width, height, depth;

  • Weight: The weight of the object (in lbs);

  • Handle/No Handle: whether the object has a handle/hole which is used to hang the object; e.g. mugs, objects with sombrero holes.