We evaluated BeigeMaps as a drop-in modification to existing behavioral-distance-based RL algorithms and compared performance on 7 environments from the DeepMind Control Suite. Some of the results are shown here; see the paper for full details.
We perform experiments using the following baseline algorithms: Deep Bisimulation for Control (DBC), Robust DBC, Kernel Similarity Metric (KSME), and Reducing Approximation Gap (RAP).
Here are some aggregate performance metrics for all algorithms, averaged over 3 training seeds, 30 evaluation seeds, and 7 environments.
Higher values are better for Median, Interquartile Mean (IQM), and Mean. Lower values are better for the Optimality Gap (OptGap). Error bars correspond to 95% confidence intervals.
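As a rough illustration of how these four aggregates are commonly computed, here is a minimal sketch. It assumes returns have been normalized to [0, 1], arranged as a (runs, environments) array, and that the usual definitions apply (the median is taken over per-environment means, and IQM is the mean of the middle 50% of all scores). The function name `aggregate_metrics` and the target score `gamma` are hypothetical, and the paper's exact procedure may differ.

```python
import numpy as np

def aggregate_metrics(scores, gamma=1.0):
    """Aggregate normalized scores of shape (num_runs, num_envs).

    Assumption: scores are normalized so that `gamma` is the target score.
    """
    scores = np.asarray(scores, dtype=float)

    # IQM: discard the lowest and highest 25% of all scores, average the rest.
    flat = np.sort(scores.ravel())
    cut = flat.size // 4
    iqm = flat[cut:flat.size - cut].mean()

    # Optimality gap: average shortfall below gamma (scores above gamma count as gamma).
    optgap = (gamma - np.minimum(scores, gamma)).mean()

    return {
        "median": float(np.median(scores.mean(axis=0))),  # median of per-env means
        "iqm": float(iqm),
        "mean": float(scores.mean()),
        "optgap": float(optgap),
    }

# Hypothetical scores for 4 runs on 2 environments (not results from the paper).
example = np.array([[0.2, 0.4], [0.4, 0.6], [0.6, 0.8], [0.8, 1.0]])
metrics = aggregate_metrics(example)
```

Because IQM ignores the top and bottom quartiles, it is less sensitive to outlier runs than the mean while using more of the data than the median.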
Here are performance profiles for all algorithms showing the proportion of runs where an algorithm's average return was above a given threshold.
If the curve for one model is strictly above that of another, the former is said to stochastically dominate the latter.
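The following sketch shows one way such a profile and a dominance check could be computed. It assumes each run is summarized by a single normalized average return; the function names `performance_profile` and `dominates` are hypothetical. Since any two profiles coincide at very low and very high thresholds, the check here treats "above" as never below and above at some threshold, which is one common formalization rather than necessarily the paper's.

```python
import numpy as np

def performance_profile(scores, thresholds):
    """Fraction of runs whose score exceeds each threshold."""
    scores = np.asarray(scores, dtype=float).ravel()
    thresholds = np.asarray(thresholds, dtype=float)
    # Compare every score against every threshold, then average over runs.
    return (scores[None, :] > thresholds[:, None]).mean(axis=1)

def dominates(profile_a, profile_b):
    """True if curve A is never below curve B and is above it somewhere."""
    a, b = np.asarray(profile_a), np.asarray(profile_b)
    return bool(np.all(a >= b) and np.any(a > b))

# Hypothetical scores for two models (not results from the paper).
taus = np.array([0.0, 0.5, 1.0])
profile_a = performance_profile([0.2, 0.6, 0.9], taus)
profile_b = performance_profile([0.1, 0.5, 0.8], taus)
a_dominates_b = dominates(profile_a, profile_b)
```

In practice the thresholds would be a dense grid over the score range so that crossings between curves are not missed.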
Here are some videos of trained agents for baseline models. In each video, three different evaluation seeds have been stacked together.
DBC
Robust DBC
KSME
RAP
Here are videos for the BeigeMaps counterparts of the baseline models above.
DBC + BeigeMaps
Robust DBC + BeigeMaps
KSME + BeigeMaps
RAP + BeigeMaps