03ReinforcementLearning2.4, Monte-Carlo Methods for Reinforcement Learning
From Wulfram Gerstner
Whereas TD methods exploit the Bellman equation to arrive at estimates of Q-values or V-values, Monte-Carlo methods estimate values directly by averaging over the empirically measured returns. The two approaches are compared.
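
To make the contrast concrete, here is a minimal sketch under stated assumptions. The environment is a hypothetical five-state random walk (not taken from the lecture) with a reward of 1 for exiting on the right, no discounting, and a uniformly random policy. The function names, the learning rate `alpha`, and the choice of first-visit averaging and TD(0) are illustrative choices only; the lecture may use different variants.

```python
import random

GAMMA = 1.0    # assumption: undiscounted returns
N_STATES = 5   # hypothetical chain with states 0..4; leaving either end terminates


def run_episode(rng, start=2):
    """Random walk from `start`; returns a list of (state, reward) steps."""
    state = start
    trajectory = []
    while 0 <= state < N_STATES:
        next_state = state + rng.choice((-1, 1))
        # Reward 1 only when the walk exits on the right-hand side.
        reward = 1.0 if next_state == N_STATES else 0.0
        trajectory.append((state, reward))
        state = next_state
    return trajectory


def mc_first_visit(episodes):
    """Monte-Carlo estimate: average the measured return after each first visit."""
    returns_sum = [0.0] * N_STATES
    returns_count = [0] * N_STATES
    for trajectory in episodes:
        # Compute the return from every time step by a backward pass.
        g = 0.0
        returns = [0.0] * len(trajectory)
        for step in range(len(trajectory) - 1, -1, -1):
            g = trajectory[step][1] + GAMMA * g
            returns[step] = g
        seen = set()
        for step, (state, _) in enumerate(trajectory):
            if state in seen:
                continue
            seen.add(state)
            returns_sum[state] += returns[step]
            returns_count[state] += 1
    return [s / c if c else 0.0 for s, c in zip(returns_sum, returns_count)]


def td0(episodes, alpha=0.02):
    """TD(0) estimate: bootstrap from the next state's value (Bellman equation)."""
    values = [0.0] * N_STATES
    for trajectory in episodes:
        for step, (state, reward) in enumerate(trajectory):
            is_last = step + 1 == len(trajectory)
            next_value = 0.0 if is_last else values[trajectory[step + 1][0]]
            values[state] += alpha * (reward + GAMMA * next_value - values[state])
    return values


rng = random.Random(0)
episodes = [run_episode(rng) for _ in range(5000)]
v_mc = mc_first_visit(episodes)
v_td = td0(episodes)
print("MC :", [round(v, 3) for v in v_mc])
print("TD :", [round(v, 3) for v in v_td])
```

For this particular toy chain the exact V-values are (s + 1) / 6 for s = 0..4, so both sets of estimates can be checked against them. The Monte-Carlo estimate uses only complete returns, while the TD(0) estimate updates after every step using the current value of the successor state.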