Reinforcement Learning involves training agents to make decisions in an environment so as to maximize cumulative reward. It is based on learning from interaction: the agent takes actions, receives reward and state feedback from the environment, and adjusts its strategy to achieve optimal performance.
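A minimal sketch of this interaction loop is shown below. The Gymnasium library and its CartPole-v1 environment are assumptions made for illustration (they are not mentioned in the original text), and a random policy stands in for the agent's learned strategy:

```python
import gymnasium as gym  # assumed dependency providing a standard environment interface

env = gym.make("CartPole-v1")
obs, info = env.reset(seed=0)

total_reward = 0.0
done = False
while not done:
    action = env.action_space.sample()  # placeholder policy: act at random
    obs, reward, terminated, truncated, info = env.step(action)  # feedback from the environment
    total_reward += reward              # cumulative reward the agent tries to maximize
    done = terminated or truncated

print("episode return:", total_reward)
```

A learning algorithm would replace the random action choice with a policy that is updated from the observed rewards and transitions.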
Model-based Learning in reinforcement learning uses or learns a model of the environment's dynamics to simulate possible scenarios and plan actions accordingly. This helps the agent make informed decisions based on its understanding of how the environment behaves.
Model-based RL can be divided into two main categories:
RL with a learned model - Experiences are collected and used to update an internal model.
Dynamic programming methods (policy iteration, value iteration); a value-iteration sketch follows this list.
Supervised learning methods that approximate the environment's dynamics with a function approximator.
RL with a given model - Future states are simulated with the known model before choosing an action from the current state.
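As one concrete instance of the dynamic programming methods above, here is a minimal value-iteration sketch on a hypothetical two-state MDP; the transition table P and reward table R are invented for illustration and play the role of the given model:

```python
import numpy as np

# Hypothetical two-state, two-action MDP, invented for illustration.
# P[s, a, s'] = transition probability, R[s, a] = expected immediate reward.
P = np.array([
    [[0.8, 0.2], [0.1, 0.9]],   # transitions from state 0 under actions 0 and 1
    [[0.5, 0.5], [0.0, 1.0]],   # transitions from state 1 under actions 0 and 1
])
R = np.array([
    [1.0, 0.0],                 # rewards for (state 0, action 0/1)
    [0.0, 2.0],                 # rewards for (state 1, action 0/1)
])
gamma, theta = 0.9, 1e-8        # discount factor and convergence threshold

# Value iteration: repeatedly apply the Bellman optimality backup
#   V(s) <- max_a [ R(s, a) + gamma * sum_s' P(s, a, s') * V(s') ]
V = np.zeros(2)
while True:
    Q = R + gamma * (P @ V)     # action values under the current value estimate
    V_new = Q.max(axis=1)
    if np.max(np.abs(V_new - V)) < theta:
        break
    V = V_new

policy = Q.argmax(axis=1)       # greedy policy with respect to the converged values
print("optimal values:", V_new, "greedy policy:", policy)
```

In the learned-model case, P and R would instead be estimated from collected experience (for example, from transition counts) before the same planning loop is applied.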
Model-free Learning in reinforcement learning focuses on learning a policy directly from interactions with the environment, without explicitly modeling the environment dynamics. It includes approaches like Q-learning and policy gradient methods.
Some model-free RL techniques include:
Monte Carlo Control
SARSA
Q-learning (a tabular sketch follows this list)
Actor-Critic
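As a sketch of one of these techniques, the following is a minimal tabular Q-learning loop. The Gymnasium library and its FrozenLake-v1 environment are assumptions made purely for illustration; any environment with discrete states and actions would serve:

```python
import numpy as np
import gymnasium as gym  # assumed dependency, chosen for a simple discrete environment

env = gym.make("FrozenLake-v1", is_slippery=False)
n_states, n_actions = env.observation_space.n, env.action_space.n

alpha, gamma, epsilon = 0.1, 0.99, 0.1   # step size, discount factor, exploration rate
Q = np.zeros((n_states, n_actions))      # tabular action-value estimates

for episode in range(5000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy behaviour policy
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        # Q-learning (off-policy) update: bootstrap from the greedy next action.
        # SARSA would instead bootstrap from the action actually taken next.
        target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (target - Q[state, action])
        state = next_state

greedy_policy = Q.argmax(axis=1)         # learned policy: pick the best action per state
```

The same loop with the SARSA update rule would use the value of the action actually chosen in the next state instead of the max, which is the essential difference between the two methods.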