Goal-oriented Trajectories for Efficient Exploration
Exploration (episode 300)
DQN with epsilon-greedy exploration
DQN with Q-map exploration
First flag
DQN with epsilon-greedy exploration
DQN with Q-map exploration
Highest score (with flag)
DQN with epsilon-greedy exploration
DQN with Q-map exploration
Results
Random walk (red) and Q-map walk (green)
DQN with epsilon-greedy (red) and DQN with Q-map (green)
Performance comparison