DeepRL1.3, Actor-Critic Architecture and Advantage-Actor-Critic
From Wulfram Gerstner
The standard actor-critic network (in the narrow sense) combines TD learning of the value function (the critic) with a policy-gradient update for the actor. This combination of TD learning with the actor-critic architecture is also called advantage actor-critic.
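As a rough illustration of how the two parts interact, here is a minimal sketch of a single one-step update. It is not taken from the lecture: it assumes a tabular setting with a softmax policy in place of a neural network, and the names `softmax`, `actor_critic_step`, `theta`, `V`, `alpha_actor`, `alpha_critic` and `gamma` are all hypothetical. The critic computes the TD error, and the actor uses that same TD error as an estimate of the advantage in its policy-gradient step.

```python
import math


def softmax(prefs):
    """Turn a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(theta, V, s, a, r, s_next, done,
                      alpha_actor=0.5, alpha_critic=0.1, gamma=0.9):
    """Apply one actor-critic update in place and return the TD error.

    theta[s][b] is the actor's preference for action b in state s
    (assumed tabular softmax policy); V[s] is the critic's value estimate.
    """
    # Critic: one-step TD error, bootstrapping from V[s_next] unless terminal.
    target = r if done else r + gamma * V[s_next]
    delta = target - V[s]

    # Policy probabilities in state s, evaluated before the update.
    pi = softmax(theta[s])

    # Critic update: move V[s] toward the TD target.
    V[s] += alpha_critic * delta

    # Actor update: policy gradient weighted by the TD error.
    # For a softmax policy, d log pi(a|s) / d theta[s][b] = 1{b == a} - pi[b].
    for b in range(len(theta[s])):
        grad_log_pi = (1.0 if b == a else 0.0) - pi[b]
        theta[s][b] += alpha_actor * delta * grad_log_pi

    return delta
```

Calling `actor_critic_step(theta, V, s, a, r, s_next, done)` once per observed transition updates both tables in place. A positive TD error raises the probability of the action just taken, and a negative one lowers it.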