A user interacts with our proposed system. Upon starting, the user provides a high-level instruction, "pick up the book and insert it into the bookshelf," which induces a low-dimensional control space [Left] for controlling the robot (depicted with the joystick and shaded inputs). This control space is state- and language-conditioned, resulting in meaningful axes: pressing down on the joystick brings the end-effector close to the book, while holding up and left after grasping the book moves the end-effector towards the shelf [Middle]. However, these coarse controls are not enough to perform the task, and the user gets stuck. The core of our approach is the ability to provide corrections [Right] such as "tilt down a little bit," refining the control space so that pressing left reorients the end-effector, allowing the user to complete the task.
[Figure panels: clean-trash, transfer-pen, open-drawer, insert-book, water-plant]
* Underlying method was not revealed to participants.
Control Method A (Language-conditioned Imitation Learning Model)
Broad Summary: The only control input you provide is a button press.
Joystick Buttons:
<start> button: Start the robot
Any other button: Stop the robot
Control Method B (LILA)
Broad Summary: You have two control inputs. You can control the input on the right toggle on the joystick. You can press the button X to return home.
Joystick Buttons:
Toggle: Moves in 2 degrees of freedom
B: Open/Close Gripper (binary value, just “press” not hold!)
X: Return robot arm to home state (and provide new instruction)
Start: End/Terminate the Session.
Tips for Control:
When “dropping” objects, you do not need to wait for the object to be perfectly close to the target - rely on gravity! :)
Be gentle near hard objects; hitting them can result in an automatic task failure.
For gripping, make sure the robot’s gripper is fully surrounding the area you wish to grasp before pressing B.
You can press X to return home at any time.
Please attempt the task in good faith! If you want to play around with inputs for fun, we can do so after :D
Control Method C (LILAC)
Broad Summary: This is the same as Method B, but you can provide language corrections whenever you want. As before, you control the robot with the right toggle on the joystick. If you get stuck at any point during the task, press “A” to provide a correction (see the list of corrections below), then tell the proctor the language correction you wish to provide. After the proctor types in the correction, continue controlling the robot with the right toggle. Once you are done with the correction, press “Y” to end it. This reverts to the previous language instruction, and you can again control the robot with the right toggle. You can also press X at any time to return home.
Note: There is no limit to the number of corrections that you can provide, so feel free to enter as many corrections as you would like!
Corrections:
Moving in any direction (up, down, left, right, back, forward), tilting (up, down, left, right), and twisting (left, right).
Moving relative to known objects on the table: book, bookshelf, marker, marker holder, drawers of the shelf.
Provide your correction using language (e.g., say “move left”)
Joystick Buttons:
Toggle: Moves in 2 degrees of freedom
A: Indicate that you want to provide a correction
B: Open/Close Gripper (binary value, just “press” not hold!)
X: Return robot arm to home state (and provide new instruction)
Y: Indicate that you want to end the current correction and revert to the previous language instruction.
Start: End/Terminate the Session.
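The button protocol above can be read as a small state machine. The following is a minimal sketch of that reading, not the code used in the study: the class `MethodCSession`, its fields, and the `press` method are hypothetical names introduced for illustration, and the choice to clear a correction in effect when X is pressed is an assumption. It only tracks which language input (instruction or correction) would condition the toggle at any moment; decoding toggle input into robot motion is out of scope.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MethodCSession:
    """Hypothetical bookkeeping for the Method C buttons (not the study's code)."""

    instruction: str                   # high-level language instruction
    correction: Optional[str] = None   # correction in effect, if any
    gripper_closed: bool = False
    running: bool = True

    def active_language(self) -> str:
        """Language currently conditioning the toggle's 2-DoF control."""
        return self.correction if self.correction is not None else self.instruction

    def press(self, button: str, utterance: Optional[str] = None) -> None:
        """Apply one button press; `utterance` is the text the proctor types in."""
        if not self.running:
            return  # session already terminated with Start
        if button == "A":
            if not utterance:
                raise ValueError("a correction needs an utterance")
            self.correction = utterance
        elif button == "Y":
            # End the correction and revert to the previous instruction.
            self.correction = None
        elif button == "B":
            # A single press toggles the gripper between open and closed.
            self.gripper_closed = not self.gripper_closed
        elif button == "X":
            # Assumption: returning home also drops any correction in effect.
            self.correction = None
            if utterance:
                self.instruction = utterance  # optional new instruction
        elif button == "Start":
            self.running = False
        else:
            raise ValueError(f"unknown button: {button!r}")


session = MethodCSession("pick up the book and insert it into the bookshelf")
session.press("A", "tilt down a little bit")
print(session.active_language())
session.press("Y")
print(session.active_language())
```

In this sketch, pressing A swaps the conditioning language to the correction, and pressing Y swaps it back, which mirrors the "revert to the previous language instruction" behavior described for Method C.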
insert-book: imitation
(failed by hitting the shelf first and then dropping the book)
insert-book: LILA
(failed to aim precisely enough to insert the book)
insert-book: LILAC
(succeeded with 2 corrections)
water-plant: imitation
(failed by hitting the cup holder)
water-plant: LILA
(failed to aim horizontally to grasp the cup)
water-plant: LILAC
(succeeded with 3 corrections)
[Figure panels: open-drawer, transfer-pen, clean-trash, insert-book, water-plant]