04ReinforcementLearning3.1, First steps toward deep reinforcement learning
From Wulfram Gerstner
After a rapid review of reinforcement learning with TD methods, exploiting Q-values and V-values in deep networks, we pose the central question of this lecture series: can we learn the policy directly, i.e., without a detour via Q-values and V-values?