We present Interactive Gibson, the first comprehensive benchmark for training and evaluating Interactive Navigation: robot navigation strategies where physical interaction with objects is allowed and even encouraged to accomplish a task. The benchmark has two main components:
- a new experimental setup, the Interactive Gibson Environment, which simulates high-fidelity visuals of indoor scenes and high-fidelity physical dynamics of the robot and of common objects found in these scenes.
- a set of Interactive Navigation metrics which allows one to study the interplay between navigation and physical simulation.
Interactive Gibson Environment
Screenshot of the simulator: (a) 3D view of the scene, (b) RGB camera, (c) surface normals, (d) interactive objects mask, (e) depth.
Top-down view of the scenes. Green indicates objects that have been replaced by CAD models.
Interactive Navigation metrics
Interactive Navigation Score
In order to measure navigation performance we propose a novel score for a single navigation run which captures the following two aspects:
- Path Efficiency: how close the path taken by the robot to the goal is to the optimal path, where the optimal path is the shortest path assuming no interactable obstacles are in the way. A path is considered highly inefficient if the robot does not reach the goal during the run.
- Effort Efficiency: how much effort the robot spends on disturbing itself and its surroundings, where effort is roughly the amount of energy spent moving its own body and/or pushing or manipulating objects out of its way.
Path and Effort Efficiency are measured by scores P_eff and E_eff, respectively, each in the interval [0, 1]. The final metric, called the Interactive Navigation Score (INS), captures both aspects in a soft manner as a convex combination of the Path and Effort Efficiency Scores. The weight of the combination is written as a subscript, so INS_0.5 weights the two scores equally.
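The convex combination described above can be sketched in a few lines. This is a minimal illustration, not the benchmark's reference code: the function name and the `alpha` parameter are assumptions, and so is the convention that `alpha` multiplies the Path Efficiency Score (for `alpha = 0.5` the choice makes no difference).

```python
def interactive_navigation_score(p_eff: float, e_eff: float, alpha: float = 0.5) -> float:
    """Convex combination of Path and Effort Efficiency Scores.

    Assumed form: INS_alpha = alpha * P_eff + (1 - alpha) * E_eff.
    All three inputs are expected to lie in [0, 1].
    """
    for name, value in (("p_eff", p_eff), ("e_eff", e_eff), ("alpha", alpha)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")
    return alpha * p_eff + (1.0 - alpha) * e_eff


# A run with an optimal path but only moderate effort efficiency.
print(interactive_navigation_score(p_eff=1.0, e_eff=0.5))  # 0.75
```

Because both inputs lie in [0, 1] and the weights sum to one, the result also lies in [0, 1].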
To define the Effort Efficiency Score, we denote by m_i the masses of the robot (i = 0) and of the objects (i > 0). Further, G = m_0 g is the gravitational force on the robot, and F_t is the amount of force applied by the robot on the environment at time t ∈ [0, T], excluding the forces applied to the floor for locomotion. The Effort Efficiency Score captures both the excess of displaced mass (kinematic effort) and the applied force (dynamic effort) for interactions.
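One way to turn the quantities above into a score in [0, 1] is sketched below. It rests on several assumptions that are not stated in the text: each body's displacement is summarized by a path length `path_lengths[i]`, the kinematic term is the robot's share of the total mass-weighted displacement, the dynamic term compares the robot's weight accumulated over the run with that weight plus the accumulated interaction force, time is discretized into steps, and the two terms are averaged. All function and parameter names are hypothetical.

```python
from typing import Sequence


def effort_efficiency(
    masses: Sequence[float],
    path_lengths: Sequence[float],
    forces: Sequence[float],
    gravity: float = 9.81,
) -> float:
    """Sketch of an Effort Efficiency Score in [0, 1] (assumed form).

    masses[0] and path_lengths[0] belong to the robot; the remaining
    entries belong to the objects. forces[t] is the magnitude of the
    interaction force at timestep t, excluding locomotion forces.
    """
    if not masses or len(masses) != len(path_lengths):
        raise ValueError("masses and path_lengths must be non-empty and equal in length")

    # Kinematic effort: robot's share of the total mass-weighted displacement.
    total_displaced = sum(m * l for m, l in zip(masses, path_lengths))
    robot_displaced = masses[0] * path_lengths[0]
    # Assumption: if nothing moved at all, no effort was wasted.
    e_kin = robot_displaced / total_displaced if total_displaced > 0 else 1.0

    # Dynamic effort: robot weight G = m_0 * g accumulated over the run,
    # compared with the same quantity plus the accumulated interaction force.
    weight_over_run = len(forces) * masses[0] * gravity
    denominator = weight_over_run + sum(forces)
    e_dyn = weight_over_run / denominator if denominator > 0 else 1.0

    return 0.5 * (e_kin + e_dyn)


# The robot moves 5 m and pushes a 2 kg object by 1 m with brief contact forces.
print(effort_efficiency([10.0, 2.0], [5.0, 1.0], [0.0, 20.0, 20.0, 0.0]))
```

Under these assumptions, a run with no object displacement and no interaction force scores 1, and the score drops as heavier objects are moved farther or pushed harder.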
The Path Efficiency Score is defined as the ratio between the ideal shortest path length L*, computed without any movable objects in the environment, and the cumulative length of the path taken by the robot, weighted by success.
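The ratio described above can be sketched as follows. This is an illustration under stated assumptions: success is treated as a binary indicator (a failed run scores 0), the robot's path is given as a list of 2D waypoints, and the ratio is clamped to 1 to guard against numerical error. The names `path_length`, `path_efficiency`, and `shortest_length` are hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def path_length(positions: Sequence[Point]) -> float:
    """Cumulative Euclidean length of a piecewise-linear path."""
    return sum(math.dist(a, b) for a, b in zip(positions, positions[1:]))


def path_efficiency(positions: Sequence[Point], shortest_length: float, success: bool) -> float:
    """Success-weighted ratio of the ideal path length L* to the travelled length."""
    if not success:
        return 0.0  # assumption: success acts as a 0/1 indicator
    travelled = path_length(positions)
    if travelled == 0.0:
        return 1.0  # assumption: the robot started at the goal
    return min(1.0, shortest_length / travelled)


# The robot travels 9 m to reach a goal whose ideal shortest path is 4.5 m.
trajectory = [(0.0, 0.0), (3.0, 4.0), (3.0, 8.0)]
print(path_efficiency(trajectory, shortest_length=4.5, success=True))  # 0.5
```

Note that L* is computed with movable objects removed, so a robot that detours around an object it could have pushed aside scores below 1 even if its detour is the shortest collision-free route.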
We train agents using three reinforcement learning algorithms (DDPG, SAC, and PPO), each with different values of the interaction penalty k_int.
Qualitative results of the trade-off between Path and Effort Efficiency.
Navigation behaviors under different interaction penalties
We compare navigation performance on the test set using INS_0.5. The best algorithm so far is SAC with an interaction penalty of 0.1. New submissions are welcome!