A 2-DOF finger robot is tasked with getting a spinner to spin above a desired angular velocity.
These visualizations help illustrate how the VoSI metric is realized by comparing the performance of mixed-loop (MixL) execution strategies, which forgo sensory information for a horizon h and subsequently operate closed loop, against a closed-loop execution strategy (denoted CL). The open-loop phase of each rollout is marked with a gray indicator in the top-right corner, and the closed-loop phase with a green indicator.
Media 1(a). This visualization shows the VoSI profile for a state in the reset distribution, where the finger has to kickstart the spin. We observe that sensing is critical in this phase of the task: forgoing sensing delays getting the spinner to the desired spin velocity, which limits how much task reward can be accumulated. The illustrative trajectory from MixL(h=100) also makes it evident that open-loop action plans are ineffective at very long horizons in this phase.
Media 1(b). This visualizes trajectories for a state at which the spinner has exceeded the desired threshold. We observe that open-loop action sequences of up to 50 timesteps do not significantly degrade task performance, but beyond that, extended periods of repeated contact lead to divergence in the states the agent might be in, and performance degrades.
Media 2. Visualization of the VoSI profiles over the course of a closed-loop execution starting from the reset distribution. Observe that the degradation is steeper as the finger reaches the spinner and kickstarts the spin. Once the spinner starts spinning and the motion of the finger enters a periodic phase, the VoSI profiles exhibit tolerance (close to no performance degradation) to periodic sensing every 30 timesteps.
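The MixL versus CL comparison described above can be sketched on a toy problem. This is a minimal illustration, not the finger-spin setup: the 1D dynamics, the proportional controller, the reward, and the names `rollout` and `vosi_profile` are all hypothetical. It also assumes that the VoSI value at horizon h is the mean CL return minus the mean MixL(h) return, which this section does not state explicitly.

```python
import numpy as np

def rollout(x0, h, horizon, rng, noise=0.05):
    """Return of one rollout on a hypothetical noisy 1D system.

    For the first h steps the controller acts open loop, using a noise-free
    model prediction of the state. Afterwards it acts closed loop on the
    true (sensed) state. h = 0 is the pure closed-loop (CL) strategy.
    """
    x, x_model, total = x0, x0, 0.0
    for t in range(horizon):
        believed = x_model if t < h else x    # forgo sensing while t < h
        u = -0.5 * believed                   # toy proportional controller
        x = x + u + noise * rng.standard_normal()
        x_model = x_model + u                 # model ignores the noise
        total += -x ** 2                      # toy reward: stay near zero
    return total

def vosi_profile(x0, horizons, horizon=100, n_rollouts=200, seed=0):
    """Assumed VoSI(h) = mean CL return - mean MixL(h) return."""
    def mean_return(h):
        # Reuse the same seeds for every h so strategies share the same noise.
        return np.mean([
            rollout(x0, h, horizon, np.random.default_rng(seed + i))
            for i in range(n_rollouts)
        ])
    j_cl = mean_return(0)
    return {h: float(j_cl - mean_return(h)) for h in horizons}

profile = vosi_profile(x0=1.0, horizons=[0, 10, 50])
print(profile)
```

Sharing the noise seeds across strategies is a variance-reduction choice for this sketch: it makes the h = 0 entry exactly zero and keeps the differences between horizons from being dominated by sampling noise.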
To visualize how the VoSI profiles change across states, we take an approach similar to the one discussed in the swingup section and project the VoSI profiles onto a single principal component with PCA (90% of variance explained). Since the state space in finger-spin is also high-dimensional, we perform a 2D PCA of the states (91% of variance explained) and visualize, for these projections, the associated characteristic VoSI profile in Figure 1(c). Figure 1(b) highlights the timestamp along the closed-loop execution at which the evaluated state was encountered, and is therefore indicative of the phase of this periodic task.
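The two projections can be sketched with a small SVD-based PCA. The data below is synthetic and the helper `pca` is hypothetical. Array shapes, the number of horizons, and the state dimension are assumptions chosen for illustration, so the explained-variance numbers it prints will not match the 90% and 91% reported here.

```python
import numpy as np

def pca(data, n_components):
    """PCA via SVD of the centered data.

    Returns the projected scores, the principal directions, the data mean,
    and the fraction of variance explained by each kept component.
    """
    mean = data.mean(axis=0)
    centered = data - mean
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    variance = s ** 2
    ratio = variance[:n_components] / variance.sum()
    scores = centered @ vt[:n_components].T
    return scores, vt[:n_components], mean, ratio

# Synthetic stand-ins (assumed shapes): one VoSI profile and one state per row.
rng = np.random.default_rng(0)
n_states, state_dim = 500, 12
horizons = np.arange(0, 101, 10)
severity = rng.uniform(0.0, 1.0, size=n_states)
profiles = (severity[:, None] * horizons[None, :] / 100.0
            + 0.01 * rng.standard_normal((n_states, horizons.size)))
states = rng.standard_normal((n_states, state_dim))

# 1D projection of the VoSI profiles and 2D projection of the states.
profile_scores, _, _, profile_ratio = pca(profiles, 1)
state_scores, _, _, state_ratio = pca(states, 2)
print("profile variance explained:", profile_ratio.sum())
print("state variance explained:", state_ratio.sum())
```

Each row of `state_scores` gives a 2D location for plotting, and the matching entry of `profile_scores` gives the scalar that summarizes that state's VoSI profile.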
Figure 1(a). Reconstructed VoSI profiles from the principal component as it is linearly varied from the smallest observed value (mapped to 0.0) to the largest (mapped to 1.0).
Figure 1(b). Since the time instance of the evaluated state is indicative of the periodic nature of the finger-spin task, we visualize on the projected states the timestamp along the data-collection rollouts at which each state was visited. After 0.5 seconds, the spinner is typically set into the desired spinning motion.
Figure 1(c). We observe that the VoSI profiles for the periodic phase of the task exhibit very similar degradation characteristics, while the phase involving the finger reaching the spinner and kick-starting the spin involves more severe performance degradation -- making immediate sensing more valuable at those states.
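The reconstruction shown in Figure 1(a) can be sketched as a sweep along the first principal component. This is an illustrative sketch on synthetic data. The function `reconstruct_sweep` and the array shapes are hypothetical, and the sign of a principal direction is arbitrary, so which end of the sweep corresponds to the more severe degradation depends on the data.

```python
import numpy as np

def reconstruct_sweep(profiles, n_steps=5):
    """Reconstruct profiles while the first PC score is varied linearly.

    The smallest observed score is mapped to 0.0 and the largest to 1.0.
    Returns the normalized values and one reconstructed profile per value.
    """
    mean = profiles.mean(axis=0)
    centered = profiles - mean
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    direction = vt[0]                      # first principal direction
    scores = centered @ direction          # 1D score per profile
    alphas = np.linspace(0.0, 1.0, n_steps)
    sweep = scores.min() + alphas * (scores.max() - scores.min())
    return alphas, mean + sweep[:, None] * direction[None, :]

# Synthetic profiles (assumed shape): rows are states, columns are horizons.
rng = np.random.default_rng(0)
horizons = np.arange(0, 101, 10)
severity = rng.uniform(0.0, 1.0, size=300)
profiles = severity[:, None] * horizons[None, :] / 100.0

alphas, recon = reconstruct_sweep(profiles)
for a, row in zip(alphas, recon):
    print(f"component value {a:.2f}: degradation at h=100 is {row[-1]:.3f}")
```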