Learning Bimanual Scooping Policies for Food Acquisition

Paper [Link] | Video [Link] | Food Datasets [Link] | Scooper/Pusher CAD Models [Link]


A robotic feeding system must be able to acquire a variety of foods. Prior bite acquisition works consider single-arm spoon scooping or fork skewering, which do not generalize to foods with complex geometries and deformabilities. For example, when acquiring a group of peas, skewering could smoosh the peas while scooping without a barrier could result in chasing the peas on the plate. In order to acquire foods with such diverse properties, we propose stabilizing food items during scooping using a second arm, for example, by pushing peas against the spoon with a flat surface to prevent dispersion. The addition of this second stabilizing arm can lead to a new set of challenges. Critically, these strategies should stabilize the food scene without interfering with the acquisition motion, which is especially difficult for easily breakable high-risk food items, such as tofu. These high-risk foods can break between the pusher and spoon during scooping, which can lead to food waste falling onto the plate or out of the workspace. We propose a general bimanual scooping primitive and an adaptive stabilization strategy that enables successful acquisition of a diverse set of food geometries and physical properties. Our approach, CARBS, learns to stabilize without impeding task progress by identifying high-risk foods and robustly scooping them using closed-loop visual feedback. We find that CARBS is able to generalize across food shape, size, and deformability and is additionally able to manipulate multiple food items simultaneously. CARBS achieves high success rates on scooping a variety of food geometries and properties, and significantly outperforms an analytical and single-arm baseline.

To develop a generalizable scooping policy that can work across foods with difficult geometries and deformabilities, we need to use an additional robot arm holding a pusher to stabilize the food item. However, adding a second arm opens the door to a new set of complications and failures: a pushing arm needs to physically make contact with the food to stabilize it, which can easily break or deform food items. We posit that many breakage-prone, or “high-risk,” foods will break under predictable scenarios when the pusher and scooper are squeezing the food. We employ this idea when scooping high-risk foods by detecting “breakage-imminent” states and adjusting our scooping policy to anticipate and prevent food breakage and waste.

A Bimanual Scooping Primitive

We introduce a novel three-phase bimanual primitive, which is parameterized by the distance of travel for the pusher and the location of the food item, and employs three bimanual stabilizing strategies.

During the Pushing phase, the pusher and scooper move towards each other along the x-axis. This phase uses two stabilizing strategies: Angled Pusher, where the pusher is angled to push food into the spoon after contact, and Cupping Motion, where the concave pusher centers the food in the spoon mouth. A primitive input, the pushing distance a, determines how close the pusher and scooper get to each other.

In the Scooping phase, we rotate the scooper up about the y-axis to “scoop” the food into the bowl of the spoon. During this phase, CARBS employs the Pinning stabilizing strategy, where the pusher moves up with the scooper as it rotates to prevent foods in unstable poses from falling out of the spoon. Lastly, CARBS finishes with the Food Transfer phase by moving the pusher away from the scooper and rotating the scooper towards an end user to prepare for feeding.
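The three phases and their stabilizing strategies can be summarized in a small data structure. This is only a minimal sketch: the names `Phase`, `PrimitiveInput`, `pusher_travel`, and `max_travel` are hypothetical, and treating a as a normalized value in [0, 1] (with 1 meaning the maximum pushing distance) is an assumption, not the authors' implementation.

```python
from dataclasses import dataclass
from enum import Enum


class Phase(Enum):
    """The three phases of the bimanual primitive, in execution order."""
    PUSHING = "pushing"
    SCOOPING = "scooping"
    FOOD_TRANSFER = "food_transfer"


# Stabilizing strategies named in the text, grouped by the phase that uses them.
STRATEGIES = {
    Phase.PUSHING: ("angled_pusher", "cupping_motion"),
    Phase.SCOOPING: ("pinning",),
    Phase.FOOD_TRANSFER: (),
}


@dataclass(frozen=True)
class PrimitiveInput:
    food_x: float  # food location on the plate (units are an assumption)
    food_y: float
    a: float       # assumed normalized pusher travel; 1.0 = maximum distance


def pusher_travel(inp: PrimitiveInput, max_travel: float) -> float:
    """Distance the pusher moves toward the scooper during Pushing."""
    if not 0.0 <= inp.a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    return inp.a * max_travel
```

Iterating over `Phase` yields the phases in the order they are executed, and `STRATEGIES[phase]` lists the stabilizing strategies active in each one.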

Adapting to Breakage Failures

To learn the inputs of the bimanual scooping primitive, CARBS leverages the insight that foods with similar deformabilities encounter similar breakage failures, and posits that learning to distinguish robust foods from breakage-prone ones will help determine the optimal pusher distance for the primitive.

To differentiate between food deformabilities, we learn a Risk Classifier that labels an initial overhead food image as “Robust” or “Fragile”. For robust foods, we set the primitive's pushing-distance input to its maximum, since the food is not in danger of breaking and can benefit from the added pushing stabilization. For fragile foods, it is nontrivial to select a pushing distance given only an initial observation, so we propose a closed-loop system for determining this value.

CARBS uses closed-loop visual feedback in the form of a Failure Classifier that identifies breakage-imminent states, in which the food item is in contact with both the pusher and the scooper but has not yet been squeezed to the point of breaking. This classifier is run at each state during Pushing. When a breakage-imminent state is detected, the Pushing phase is terminated and the pushing distance is cut short.
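The interaction between the two classifiers can be sketched as a short control loop. This is an illustrative sketch under stated assumptions rather than the authors' code: `adaptive_push`, `observe`, and `n_steps` are hypothetical names, the classifiers are passed in as plain callables, and the pushing distance is assumed to be normalized so that 1.0 is the maximum.

```python
from typing import Any, Callable


def adaptive_push(
    initial_image: Any,
    observe: Callable[[float], Any],
    risk_classifier: Callable[[Any], str],
    failure_classifier: Callable[[Any], bool],
    n_steps: int = 10,
) -> float:
    """Return the fraction of the maximum pushing distance actually travelled.

    risk_classifier maps the initial overhead image to "robust" or "fragile".
    failure_classifier returns True when a state looks breakage-imminent.
    observe(a) stands in for moving the pusher to fraction a and capturing
    the resulting image (hypothetical interface).
    """
    # Robust foods: push the full distance, no feedback needed.
    if risk_classifier(initial_image) == "robust":
        return 1.0

    # Fragile foods: advance in small increments and check every state.
    for step in range(1, n_steps + 1):
        a = step / n_steps
        if failure_classifier(observe(a)):
            return a  # terminate Pushing early; distance is cut short
    return 1.0  # no breakage-imminent state was ever detected


# Stand-in classifiers for illustration only: the "image" is just the
# pusher fraction, and breakage is deemed imminent from 0.6 onward.
stopped_at = adaptive_push(
    initial_image=None,
    observe=lambda a: a,
    risk_classifier=lambda img: "fragile",
    failure_classifier=lambda img: img >= 0.6,
)
```

With these stand-ins the loop stops at `a = 0.6`, whereas a food labeled "robust" would always be pushed to `1.0`. Running the Risk Classifier once up front keeps the more expensive per-state check confined to the foods that need it.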


We design a series of experiments scooping 14 food items to demonstrate the advantage of a reactive bimanual strategy over hard-coded or single-arm actions for scooping foods of a wide range of sizes, shapes, and deformabilities. Of these foods, 3 are deformable (tofu, jello, and cheesecake) and require adaptive scooping strategies to prevent breakage.

We find CARBS is able to successfully scoop 87.0% of robust food items and reduce food breakage by 16.2% compared to a hard-coded bimanual baseline.

Single Robust Food Items

CARBS achieves 97.5% success scooping single robust food items due to the three stabilizing strategies (Angled Pusher, Cupping Motion, and Pinning) in its bimanual scooping primitive.

Multiple Robust Food Items

CARBS achieves 77.8% success scooping two or three robust food items, which is 40.0% higher than the static barrier baseline without bimanual stabilizing strategies. This is lower than its success rate on single food items because multiple items can interact with each other, which complicates stabilizing every item in the scene.

Fragile Food Items

CARBS reduces food loss due to breakage by 16.2% compared to a hard-coded baseline, suggesting that adapting the pushing-distance parameter a is effective in avoiding breakage failures.

Hard-Coded Bimanual Baseline (a = 1, maximum pushing distance)

CARBS (adaptive a, learning to adjust pushing distance to prevent breakage)

Scooping Failures

We find three failure modes during scooping robust foods: (1) Roll, where the foods roll out of the scooping trajectory, (2) No Enter, where the foods touch the spoon mouth but do not enter the spoon bowl, and (3) Fall, where the foods enter the spoon bowl but fall out due to an unstable pose. We additionally observe a Breakage failure mode for deformable foods.
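The failure taxonomy above can be written as a simple labeling rule for a single trial. This is a hedged sketch: the function `label_trial` and its boolean inputs are hypothetical, Roll is approximated here as the food never reaching the spoon mouth, and checking breakage first is an assumption about precedence that the text does not specify.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    ROLL = "roll"          # food rolls out of the scooping trajectory
    NO_ENTER = "no_enter"  # food touches the spoon mouth but never enters
    FALL = "fall"          # food enters the bowl but falls out again
    BREAKAGE = "breakage"  # deformable food breaks during scooping


def label_trial(
    touched_spoon_mouth: bool,
    entered_bowl: bool,
    stayed_in_bowl: bool,
    broke: bool = False,
) -> Outcome:
    """Map per-trial observations to one outcome label."""
    if broke:  # assumed to take precedence over the other modes
        return Outcome.BREAKAGE
    if not touched_spoon_mouth:
        return Outcome.ROLL
    if not entered_bowl:
        return Outcome.NO_ENTER
    if not stayed_in_bowl:
        return Outcome.FALL
    return Outcome.SUCCESS
```

Ordering the checks by how far the food progressed (mouth, bowl, stable in bowl) means each trial receives exactly one label.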

Roll Failure

No Enter Failure

Fall Failure

Additional OOD Experiments (New!) [Link]

We present new experiments on scooping three out-of-distribution food classes: two strawberries, rice, and couscous. We assume that both the rice and the couscous are pre-grouped. Additionally, the two strawberries are larger than the bowl of the scooper. We find that CARBS achieves 80.0% success on scooping two strawberries and 79.0% success by weight on scooping rice and couscous. We present example rollouts below, and videos of all 5 trials for each of the foods can be found here: [Link].

2 Strawberries