CDMPC: Cross Domain Imitation Learning via MPC

We introduce CDMPC, an approach for learning new skill combinations from long-horizon skill trajectories. CDMPC enables agents to chain skills from diverse source domains and integrates them with a low-level policy in the target domain.

CDMPC learns to chain short-horizon skills from long-horizon demonstration trajectories that come from diverse source domains and cover various skill combinations. The learned policy adapts to tasks from any source domain and enables the agent to tackle new tasks that require novel skill combinations.

D4RL Maze Environment:

Source Domains: Discrete Room ID, Demonstrations from MiniGrid.

We implement CDMPC in a 2x2 maze environment with randomly generated obstacles. The agent must avoid obstacles and follow demonstrations to reach target rooms in sequence. The demonstration datasets from the two source domains contain varied skill combinations, some shared between the domains and others unique to one of them.
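To make the "reach target rooms in sequence" criterion concrete, here is a minimal Python sketch of how a rollout could be checked against a demonstrated room order. It is illustrative only and not part of CDMPC: the function names, the row-major room numbering (rooms 1 and 2 on the bottom row, 3 and 4 on the top row), and the unit room size are all assumptions introduced for this example.

```python
from typing import Iterable, List, Sequence, Tuple


def room_id(x: float, y: float, room_size: float = 1.0) -> int:
    """Map a 2D position to a room ID in 1..4 for a 2x2 layout.

    Assumed numbering (hypothetical, not from the source):
    bottom row is rooms 1, 2 and top row is rooms 3, 4.
    """
    col = 0 if x < room_size else 1
    row = 0 if y < room_size else 1
    return 1 + 2 * row + col


def visited_rooms(positions: Iterable[Tuple[float, float]]) -> List[int]:
    """Collapse a trajectory of positions into the sequence of rooms entered."""
    rooms: List[int] = []
    for x, y in positions:
        r = room_id(x, y)
        if not rooms or rooms[-1] != r:  # record only room changes
            rooms.append(r)
    return rooms


def follows_room_order(rooms: Sequence[int], order: Sequence[int]) -> bool:
    """Return True if `order` occurs as a subsequence of `rooms`."""
    it = iter(rooms)
    # `target in it` advances the iterator until the target is found,
    # so later targets must appear after earlier ones.
    return all(target in it for target in order)


# Hypothetical rollout that starts in room 2, then moves to room 1, then room 3.
trajectory = [(1.5, 0.5), (1.2, 0.5), (0.5, 0.5), (0.5, 1.5)]
rooms = visited_rooms(trajectory)
print(rooms, follows_room_order(rooms, [2, 1, 3]))
```

The subsequence check tolerates passing through intermediate rooms on the way to the next target; a stricter exact-match check would be an equally reasonable choice, depending on how success is defined.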

Source Domain: Room ID, Room Order: [2,1,3]. Target Domain: D4RL Maze, Room Order: [2,1,3].

Source Domain: MiniGrid, Room Order: [4,2,1]. Target Domain: D4RL Maze, Room Order: [4,2,1].

CDMPC enables the agent to follow both Room ID and MiniGrid demonstrations, avoiding obstacles and reaching the correct rooms in the demonstrated order.
