We analyze a sim2real dataset in which expert videos are available only in simulation.
Translating agent-domain (real robot) frames into the expert domain (simulation).
Original, translated, and predicted frames on a random dataset. The trend of XIPER rewards is closely aligned with ground-truth task rewards.
Original, translated, and predicted frames on a random dataset. XIPER rewards are consistently high and stable because every frame can be predicted well.