Robotic Precision Pouring Carbonated Beverages into Transparent Containers

Feiya Zhu, Shuo Hu, Letian Leng, Alison Bartsch, Abraham George, Amir Barati Farimani

Mechanical and Artificial Intelligence Laboratory, Carnegie Mellon University

[Link to paper]

Abstract

With the growing emphasis on the development and integration of service robots within household environments, we will need to endow robots with the ability to pour a variety of liquids reliably. However, liquid handling and pouring is a challenging task due to the complex dynamics and varying properties of different liquids, the exacting precision required to prevent spills and ensure accurate pouring, and the necessity for robots to adapt seamlessly to a multitude of containers in real-world scenarios. 

In response to these challenges, we propose a novel autonomous robotics pipeline that empowers robots to execute precision pouring tasks, encompassing both carbonated and non-carbonated liquids, as well as opaque and transparent liquids, into a variety of transparent containers. Our proposed approach maximizes the potential of RGB input alone, achieving zero-shot capability by harnessing existing pre-trained vision segmentation models. This eliminates the need for additional data collection, manual image annotations, or extensive training. Furthermore, our work integrates ChatGPT, facilitating seamless interaction between individuals without prior expertise in robotics and our pouring pipeline. This integration enables users to request and execute pouring actions effortlessly. 

Our experiments demonstrate the pipeline's capability to successfully pour a diverse range of carbonated and non-carbonated beverages into containers of varying sizes, relying solely on visual input.

Pour Me a Drink: a 90% fill requested through a conversation with ChatGPT.

ChatGPT.mp4

Methods Overview

In our study, we aim to achieve precision in pouring a diverse range of liquids, encompassing both carbonated and non-carbonated beverages. We present an innovative autonomous robotic pouring system that functions based on a single RGB input, as depicted in the accompanying figure. This figure provides a detailed overview of our Robotic Pouring Pipeline:

(A) Pouring objectives are generated through user interactions with ChatGPT. (B) Predefined pouring conditions are cross-referenced using the vision module. (C) The vision module identifies and segments the container and its content. (D) The robotic arm conducts the pouring process.
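As a rough illustration of step (A), the fill target could be pulled out of a chat reply with a small parser. This is a minimal sketch, not the pipeline's actual interface: the function name parse_fill_fraction and the assumption that the reply contains a number followed by "%" or "percent" are hypothetical.

```python
import re


def parse_fill_fraction(reply: str) -> float:
    """Return the first percentage found in `reply` as a fraction in (0, 1].

    Hypothetical helper: assumes the reply states the target as a number
    followed by '%' or 'percent'.
    """
    match = re.search(r"(\d+(?:\.\d+)?)\s*(?:%|percent)", reply, flags=re.IGNORECASE)
    if match is None:
        raise ValueError("no fill percentage found in reply")
    fraction = float(match.group(1)) / 100.0
    if not 0.0 < fraction <= 1.0:
        raise ValueError(f"fill fraction {fraction} is outside (0, 1]")
    return fraction
```

Rejecting values outside (0, 1] keeps an impossible request, such as a 150% fill, from reaching the controller.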

As demonstrated in the provided video, the process begins with the user engaging in a straightforward conversation with ChatGPT. From this interaction, ChatGPT determines the desired fill percentage for the beverage. The Franka robotic arm then executes the pour using a combined PID control and vision model system, pouring the exact amount specified by ChatGPT. Additionally, our volume estimation model displays the quantity of liquid dispensed.
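The closed loop between the vision estimate and the arm can be pictured with a textbook PID controller driven by the fill error. The sketch below is an assumption-laden toy, not the authors' controller: the gains, the flow cap, the stopping tolerance, and the plant model (controller output treated directly as a flow rate) are all hypothetical, and the vision measurement is replaced by the simulated fill level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook PID controller; gains are hypothetical, not from the paper."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def step(self, error: float, dt: float) -> float:
        self.integral += error * dt
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_pour(target: float, dt: float = 1 / 60, max_steps: int = 6000) -> float:
    """Toy plant: the controller output is read as a flow rate (fill fraction per second)."""
    pid = PID(kp=0.8, ki=0.0, kd=0.1)
    fill = 0.0
    for _ in range(max_steps):
        error = target - fill  # in the real system this would come from the vision module
        if error <= 0.005:     # hypothetical stopping tolerance
            break
        # Liquid cannot be un-poured, so clamp at zero; cap the flow to limit splashing.
        flow = min(max(pid.step(error, dt), 0.0), 0.05)
        fill += flow * dt
    return fill


print(round(simulate_pour(0.9), 3))
```

The integral gain is set to zero here because any overshoot is irreversible when pouring; a real controller would have to handle that constraint in whatever way suits the hardware.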


Setup of the System

Our robotic pouring experiment employed a Franka-Emika Panda 7-DOF robotic arm with a custom 3D-printed gripper for pick-and-pour tasks. The pouring control module was integrated using the Franka robot control framework. The robot retrieved the pouring container from a fixed location and positioned it above the target container. An Intel RealSense D435 RGB-D camera captured RGB images of the target container at 60 FPS and a resolution of 640x480 pixels for vision segmentation. Our setup is shown in the left figure. For carbonated beverages, the can was opened manually before robotic pouring, and each pour used a new can to keep carbonation consistent.

In more detail, the Intel RealSense camera captures RGB images for vision segmentation, while a second camera records the pouring process. A 3D-printed support helps the gripper pick up the pouring container (a Coke can). A scale measures the liquid weight as a reference for volume estimation. Numbers 1 and 2 indicate additional camera angles, while 3 and 4 mark the different target containers used in experiments.

Pouring Performance with Coke

The results indicate that our pipeline consistently achieved final fluid levels within 1% of the target, with volume deviations ranging from 2 ml to 7 ml, primarily centered around 4%. Given the total volume of a Coke can (355 ml), this level of error is acceptable in the context of pouring time optimization. The best performance was observed for target percentages from 60% to 80%. The longest pouring time occurred at a target percentage of 30%, primarily because initial segmentation errors were close to the target value, as shown in the Pouring Performance Detail Figure before the Initial Pour stage.

Target: 90 percent fill

Coke_0.9.mp4

Target: 80 percent fill

Coke_0.8.mp4

Target: 70 percent fill

Coke_0.7.mp4

Target: 60 percent fill

Coke_0.6.mp4

Target: 50 percent fill

Coke_0.5.mp4

Target: 40 percent fill

Coke_0.4.mp4

Target: 30 percent fill

Coke_0.3.mp4

Pouring Performance Detail Figure

Illustration of robotic pouring at various time steps and the performance of our Coke pouring pipeline. 

(A) Visualizes the robotic pouring process at different stages. 

(B) Demonstrates the performance of our pipeline during Coke pouring, ranging from 30% (a) to 90% (b) of liquid in the target container. Red and green lines represent foam and liquid level changes, with shaded areas indicating standard deviations.
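One simple way to turn segmentation masks into a level like those plotted in (B) is to compare row extents of the container mask and the content mask. The sketch below assumes masks are nested lists of booleans with image row 0 at the top, and it ignores perspective and wall thickness; the function names are hypothetical and this is not necessarily how the pipeline computes its levels.

```python
def row_extent(mask):
    """Return (top_row, bottom_row) of the rows containing any True pixel, or None."""
    rows = [i for i, row in enumerate(mask) if any(row)]
    return (rows[0], rows[-1]) if rows else None


def fill_fraction(container_mask, content_mask):
    """Height of the content relative to the container height, measured in pixel rows."""
    container = row_extent(container_mask)
    if container is None:
        raise ValueError("container mask is empty")
    content = row_extent(content_mask)
    if content is None:
        return 0.0  # nothing segmented as content yet
    top, bottom = container
    height = bottom - top + 1
    level = bottom - content[0] + 1  # rows from the container bottom up to the content's top row
    return max(0.0, min(1.0, level / height))
```

Passing a liquid-only mask gives the liquid level, while passing a mask that covers liquid plus foam gives the foam level, so the same helper can produce both curves.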

(C) Features sections showing a full Coke can (a), the rotating can before liquid egress (b), a pouring spill due to surface tension (c), and a cross-sectional diagram defining variables used for volume estimation (d).
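The variables in panel (d) are not reproduced here, so the following is only a generic illustration of how a volume can be estimated from a cross-section. It assumes the container is a solid of revolution and that the liquid's width has been measured, in centimetres, at evenly spaced heights; the function name and inputs are hypothetical.

```python
import math


def volume_ml(widths_cm, row_height_cm):
    """Disk method: each slice contributes pi * (w / 2)^2 * dh cubic centimetres (1 cm^3 = 1 ml)."""
    return sum(math.pi * (w / 2.0) ** 2 * row_height_cm for w in widths_cm)


# Example: a straight-walled glass 6 cm wide, filled to 10 cm in 0.1 cm slices.
print(round(volume_ml([6.0] * 100, 0.1), 1))
```

For a straight-walled container the sum reduces to the cylinder formula, which makes the sketch easy to sanity-check against a scale reading.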

Pouring Performance for Different Beverages

Our system consistently achieved accurate pouring results across various beverages, with final fluid levels closely matching target percentages (average error <1%). Low volume deviations (standard deviations) further highlight precision; at a 60% target, the deviation was only 0.22 ml. However, Sprite presented the most significant challenge, because its visual segmentation was easily affected by the bubbles within the beverage. This issue was less prominent in Mountain Dew and Coke, whose distinct colors provide better visual contrast than Sprite. Additionally, water had the shortest average pouring time, primarily because it contains no bubbles to interfere with our vision module. As a result, our pipeline excelled in pouring water, demonstrating both adaptability and efficiency.

Different beverage A: Mountain Dew

Target: 80 percent fill

MTNDEW_0.8.mp4

Target: 60 percent fill

MTNDEW_0.6.mp4

Target: 40 percent fill

MTNDEW_0.4.mp4

Different beverage B: Sprite

Target: 80 percent fill

SPIRIT_0.8.mp4

Target: 60 percent fill

SPIRIT_0.6.mp4

Target: 40 percent fill

SPIRIT_0.4.mp4

Different beverage C: Water

Target: 80 percent fill

Water_0.8.mp4

Target: 60 percent fill

Water_0.6.mp4

Target: 40 percent fill

Water_0.4.mp4

Pouring Performance for Different Target Containers

Due to the large diameter of the transparent bowl, it is not feasible to achieve fill levels of 80% or 60% using a single Coke can. As a result, we tested the bowl with target fill levels of 30%, 35%, and 40%. Our pipeline exhibited exceptional performance with the measuring cup, achieving precise fluid levels that closely matched the target percentages. In contrast, the transparent bowl, owing to its larger diameter, presented a unique challenge. As shown in the table, achieving precise fill levels required slightly more time due to the PID controller's slower angle adjustments.
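The limit on the bowl's fill level follows from simple arithmetic: one can holds 355 ml, so the highest reachable fill is the can volume divided by the container capacity. The sketch below assumes the fill percentage is measured by volume and uses a hypothetical bowl capacity of 800 ml, since the actual capacity is not stated here.

```python
CAN_VOLUME_ML = 355.0  # volume of one Coke can, as stated in the Coke results


def max_fill_fraction(source_ml: float, capacity_ml: float) -> float:
    """Largest fill fraction reachable when pouring the whole source into the container."""
    return min(1.0, source_ml / capacity_ml)


def is_feasible(target: float, source_ml: float, capacity_ml: float) -> bool:
    return target <= max_fill_fraction(source_ml, capacity_ml)


bowl_capacity_ml = 800.0  # hypothetical value, for illustration only
print(max_fill_fraction(CAN_VOLUME_ML, bowl_capacity_ml))
print(is_feasible(0.40, CAN_VOLUME_ML, bowl_capacity_ml))
print(is_feasible(0.60, CAN_VOLUME_ML, bowl_capacity_ml))
```

If the fill level is instead measured by height, the relationship depends on the bowl's profile, so this check would only be a first approximation.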

Different Container A: Measuring cup

Target: 80 percent fill

DiffCon2_0.8.mp4

Target: 60 percent fill

DiffCon2_0.6.mp4

Target: 40 percent fill

DiffCon2_0.4.mp4

Different Container B: Transparent bowl

Target: 40 percent fill

DiffCon3_0.4.mp4

Target: 35 percent fill

DiffCon3_0.35.mp4

Target: 30 percent fill

DiffCon3_0.3.mp4

Pouring Performance for Different Camera Locations

Notably, at "location 2", our system consistently achieved accurate pouring results with minimal deviations. The final fluid levels closely matched the target percentages, and our volume estimation method proved highly effective. For instance, at an 80% target, the predicted volume was 285.7 ml, and the final volume achieved was 278.1 ml. Additionally, pouring times were efficient, ranging from 34.6 seconds to 41.0 seconds for an 80% target. In contrast, at "location 3", the system faced slightly greater challenges. While it still achieved accurate pouring, the deviations in final fluid levels and pouring times were slightly higher compared to "location 2".
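To put the 80% example in relative terms, the gap between the predicted and final volumes can be expressed as a percentage. The sketch below treats the final volume as the reference value, which is an assumption about how the comparison would be made; the function name is illustrative.

```python
def relative_error_percent(predicted_ml: float, measured_ml: float) -> float:
    """Absolute difference as a percentage of the measured value."""
    return abs(predicted_ml - measured_ml) / measured_ml * 100.0


error = relative_error_percent(285.7, 278.1)
print(f"{error:.2f}%")
```

With the figures quoted above, the 7.6 ml difference corresponds to a relative error of about 2.7%.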

Different camera location A: location 2

Target: 80 percent fill

DiffLoc2_0.8.mp4

Target: 60 percent fill

DiffLoc2_0.6.mp4

Target: 40 percent fill

DiffLoc2_0.4.mp4

Different camera location B: location 3

Target: 80 percent fill

DiffLoc3_0.8.mp4

Target: 60 percent fill

DiffLoc3_0.6.mp4

Target: 40 percent fill

DiffLoc3_0.4.mp4

Conclusion

In this paper, we introduce a novel autonomous robotic pipeline designed for real-time pouring using a single RGB input. Our pipeline demonstrates zero-shot robotic pouring capabilities with pre-trained vision models, achieving exceptional accuracy in both pouring control and volume estimation. We validate its performance across various scenarios, pouring liquids with different carbonation and transparency into diverse transparent containers from different camera positions. Moreover, by integrating ChatGPT, our approach becomes user-friendly for individuals with varying expertise. In this work, we focus only on a single pour into a target container with a fixed location; in the future, we aim to expand our pipeline to conduct multi-pour tasks involving multiple target containers in random locations. This extension will challenge our system to handle complex scenarios and interactions, paving the way for more versatile applications. Furthermore, the integration of ChatGPT could be deepened within the pouring pipeline, enabling it to assist in creating task plans for multi-pour scenarios. We believe that our approach represents an exciting direction for the future of robotic applications, especially in the development of household robots.

BibTeX

@article{zhu_pour_2023,

  title={Pour me a drink: Robotic Precision Pouring Carbonated Beverages into Transparent Containers},

  author={Zhu, Feiya and Hu, Shuo and Leng, Letian and Bartsch, Alison and George, Abraham and Farimani, Amir Barati},

  journal={arXiv preprint arXiv:2309.08892v2},

  year={2023}

}