Goal-oriented Trajectories for Efficient Exploration

🔍 Exploration (episode 300)

DQN with epsilon-greedy exploration

DQN with Q-map exploration

🚩 First flag

DQN with epsilon-greedy exploration

DQN with Q-map exploration

πŸ† Highest score (with flag)

DQN with epsilon-greedy exploration

DQN with Q-map exploration

📈 Results

Random walk (red) and Q-map walk (green)

DQN with epsilon-greedy (red) and DQN with Q-map (green)

Performance comparison