QUAR-VLA: Vision-Language-Action Model for Quadruped Robots 


______________________________________________________________________________________________


Pengxiang Ding, Han Zhao, Wenxuan Song, Wenjie Zhang, Siteng Huang, Ningxi Yang, Donglin Wang

Westlake University, Zhejiang University

ECCV 2024

Here we provide demos in both simulation and real-world scenarios to show the effectiveness and generalization of our work, including:

1. Effectiveness in seen scenes

2. Sim2Real transfer capabilities

3. Robustness in different localization

4. Robustness in different workspaces

5. Robustness in unseen scenes

1. Effectiveness in seen scenes

We show both simulation and real scenarios of all six tasks:

Go through, Crawl, Distinguish, Go avoid, Unload and Go to

Go through

go_through_square.mp4
61_1715934244.mp4

Correct: Go through the square tunnel

go_through_tri.mp4

Crawl the bar

crawl_2_1_无音轨版.mp4
crawl_2_1_仿真.mp4

Distinguish

distinguish_letter_A.mp4
63_1715934558.mp4

Go avoid

avoid_1.mp4
62_1715934457.mp4

Unload

unload_5_5.mp4
69_1715934784.mp4

Go to

goto - 使用 Clipchamp_1715952896350 制作.mp4
54_1715934121.mp4

2. Sim2Real transfer capabilities

(Failure Case Analysis)

Here we show results under different sim2real training paradigms, together with a failure case analysis.

go_1_5.mp4
go_1_6.mp4

2. 10% Simulation + Real Data

go_1_7.mp4

3. Simulation Data

go_1_8.mp4

4. Real Data

3. Robustness in different localization

We compare results under different initial localizations.

The results show that our model is robust to changes in initial localization.

实验大腿们群! 2024-05-05 00.08.45.mp4
实验大腿们群! 2024-05-05 00.08.48.mp4
实验大腿们群! 2024-05-05 00.08.41.mp4

4. Robustness in different workspaces

We compare results across workspaces of different sizes.

The results show that our model is robust to the size of the workspace.

67_1715934747.mp4

large workspace

goto - 使用 Clipchamp_1715952896350 制作.mp4

small workspace

5. Robustness in unseen scenes

We compare results with unseen objects and unseen verbal information.

Unseen Verbal Information

77_1715951094.mp4
74_1715939378.mp4

Unseen Object

76_1715940284.mp4
73_1715938060.mp4