Reinforcement Learning Algorithms

Reinforcement Learning Algorithms: Q-learning and Sarsa

As a branch of machine learning, reinforcement learning shares features with both supervised and unsupervised learning: it learns from a feedback signal (the reward), yet it requires no labeled examples. Temporal difference (TD) learning combines ideas from Monte Carlo methods and dynamic programming (DP), and it is a central approach to solving reinforcement learning problems. Two TD control methods from the literature, Q-learning and Sarsa, are built on TD learning. In this research, the algorithm designs of Q-learning and Sarsa are studied, their properties and performance are compared, and reinforcement-learning-based control designs are proposed. Real-world applications are also presented to illustrate the effectiveness of the proposed control designs relative to competing algorithms.
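To make the comparison between the two TD control methods concrete, the following is a minimal sketch of their standard tabular update rules. It is not the control design proposed in this research. The function names, the dictionary-based Q-table, and the step size `alpha` and discount factor `gamma` values are illustrative assumptions, and terminal-state handling is left out for brevity. Q-learning bootstraps from the highest-valued action in the next state (off-policy), while Sarsa bootstraps from the action the agent actually takes next (on-policy).

```python
from typing import Dict, Sequence, Tuple

# Hypothetical tabular representation: (state, action) -> estimated return.
QTable = Dict[Tuple[int, int], float]


def q_learning_update(q: QTable, s: int, a: int, r: float, s_next: int,
                      actions: Sequence[int],
                      alpha: float = 0.5, gamma: float = 0.9) -> float:
    """Off-policy TD update: the target uses the greedy action in s_next."""
    best_next = max(q.get((s_next, b), 0.0) for b in actions)
    td_error = r + gamma * best_next - q.get((s, a), 0.0)
    q[(s, a)] = q.get((s, a), 0.0) + alpha * td_error
    return q[(s, a)]


def sarsa_update(q: QTable, s: int, a: int, r: float, s_next: int, a_next: int,
                 alpha: float = 0.5, gamma: float = 0.9) -> float:
    """On-policy TD update: the target uses the action actually chosen in s_next."""
    td_error = r + gamma * q.get((s_next, a_next), 0.0) - q.get((s, a), 0.0)
    q[(s, a)] = q.get((s, a), 0.0) + alpha * td_error
    return q[(s, a)]


# Same transition (s=0, a=0, r=1, s_next=1) applied to two identical tables.
# In state 1, action 1 has the higher estimate, but the agent picks action 0.
initial = {(1, 0): 1.0, (1, 1): 3.0}
q_table = dict(initial)
sarsa_table = dict(initial)

q_value = q_learning_update(q_table, 0, 0, 1.0, 1, actions=[0, 1])
sarsa_value = sarsa_update(sarsa_table, 0, 0, 1.0, 1, a_next=0)

print(q_value, sarsa_value)  # roughly 1.85 and 0.95
```

With these assumed numbers, the Q-learning target is 1 + 0.9 × 3 = 3.7 and the Sarsa target is 1 + 0.9 × 1 = 1.9, so the same transition yields different estimates whenever the action taken next is not the greedy one.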
