Vision-language-action models (VLAs) have shown great potential as generalist robot policies. However, these models pose urgent safety challenges during deployment, including the risk of physical harm to the environment, the robot itself, and humans. How can safety be explicitly incorporated into VLAs? In this work, we propose SafeVLA, a novel algorithm designed to integrate safety into VLAs, ensuring the protection of the environment, robot hardware and humans in real-world settings. SafeVLA effectively balances safety and task performance by employing large-scale constrained learning within simulated environments. We demonstrate that SafeVLA outperforms the current state-of-the-art method in both safety and task performance, achieving average improvements of 83.58% and 3.85%, respectively, in simulation. By prioritizing safety, our approach eliminates high-risk behaviors and reduces the upper bound of unsafe behaviors to 1/35 of that in the current state-of-the-art, thereby significantly mitigating long-tail risks. Furthermore, the learned safety constraints generalize to diverse, unseen scenarios, including multiple out-of-distribution perturbations and tasks.
During routine navigation, the robot operates safely and stably, and it stays steady and flexible in more complex environments. During long-distance exploration, it remains cautious around all objects in the environment and keeps a safe distance from them. When entering narrow corners, the robot slows down, preserves a safe margin for motion, and adjusts its direction promptly.
The robot flexibly explores narrow spaces while maintaining a safe distance from its surroundings.
The robot anticipates small objects in the environment, avoiding stuttering or collisions when passing them.
The robot is more stable: it slows down while observing the environment and does not rush toward potential target objects.
The robot's environmental perception has improved, allowing it to approach and grasp objects safely.
SafeVLA maintains safety and task performance in the presence of OOD perturbations.
The robot correctly identified the target but ignored environmental objects (e.g., the coffee machine) during execution. While attempting to grasp the target (the mug), it repeatedly and forcefully collided with those objects. Although the task was completed, the environment suffered substantial damage, indicating a very low level of safety in task execution.
The robot mistook a gas cylinder for the actual target (a vase) and repeatedly interacted with the wrong object, shaking it and colliding with it forcefully. As a result, the task was not completed, and the environment was placed in a highly dangerous state.
During task execution, the robot interacted with hazardous items (e.g., a knife). While exploring the path, it also made contact with task-irrelevant environmental objects (e.g., a chair), causing the knife on the table to fall.
Although it identifies the bed correctly, the robot becomes stuck on a door handle, fails to recognize the hazard, and keeps colliding with it for over 20 seconds.
When the target is lost, the robot searches repeatedly within a small area, increasing the risk of unsafe interactions.
The robot misidentifies the target (choosing a cabinet instead of a bed). Lacking safety reinforcement learning, it follows the shortest path and ignores obstacles such as doorframes, especially when they are outside the camera's view.
Our experiments demonstrate that SafeVLA decouples the two optimization objectives, safety and task performance, and optimizes safety as an independent dimension. It achieves the highest task performance and the lowest cumulative cost across all tasks, improving on existing state-of-the-art methods by 83.58% in safety and 3.85% in task performance. These improvements remain stable even in highly OOD scenarios.
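The text describes constrained learning that trades off task performance against a cumulative cost, but it does not give the update rule. One common way to realize such a trade-off is a Lagrangian relaxation of a constrained objective; the sketch below assumes that formulation and should not be read as SafeVLA's actual implementation. The names `LagrangeMultiplier`, `combined_advantage`, `cost_limit`, and `lr` are hypothetical and chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass
class LagrangeMultiplier:
    """Dual variable for one constraint: mean episode cost <= cost_limit.

    Hypothetical helper; the cost limit and step size are assumed values.
    """
    cost_limit: float   # assumed safety budget per episode
    lr: float = 0.05    # assumed dual-ascent step size
    value: float = 0.0  # current multiplier, kept non-negative

    def update(self, mean_episode_cost: float) -> float:
        # Dual ascent: the multiplier grows while the constraint is violated
        # and shrinks (down to zero) once the policy is within budget.
        violation = mean_episode_cost - self.cost_limit
        self.value = max(0.0, self.value + self.lr * violation)
        return self.value


def combined_advantage(reward_adv: float, cost_adv: float, lam: float) -> float:
    """Penalized advantage used in place of the plain reward advantage.

    Dividing by (1 + lam) keeps the magnitude bounded as lam grows.
    """
    return (reward_adv - lam * cost_adv) / (1.0 + lam)


if __name__ == "__main__":
    lam = LagrangeMultiplier(cost_limit=1.0)
    # Illustrative (made-up) mean costs from successive training iterations.
    for cost in [5.0, 3.0, 1.5, 0.5]:
        lam.update(cost)
        print(f"cost={cost:.1f} lambda={lam.value:.3f}")
```

In this formulation the safety signal enters through its own cost advantage and its own multiplier, so it is tuned separately from the task reward. That is one reading of treating safety as an independent dimension.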