DeepRL2.2A, Proximal Policy Optimization for Continuous Control.
From Wulfram Gerstner
Proximal Policy Optimization (PPO) algorithms take a gradient step of the largest possible size. This video explains what that means.
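As a rough illustration of how a PPO-style update can limit the step it takes, here is a minimal sketch of the widely used clipped surrogate objective for a one-dimensional Gaussian policy. It assumes the clipped variant of PPO, which may not be the formulation presented in the video. The function names, the `eps` parameter, and its default of 0.2 are illustrative choices, not taken from the lecture.

```python
import math


def gaussian_log_prob(action, mean, std):
    """Log-density of a 1-D Gaussian policy evaluated at `action`."""
    z = (action - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def clipped_surrogate(new_log_probs, old_log_probs, advantages, eps=0.2):
    """Mean clipped surrogate objective (to be maximized).

    Assumption: this is the clipped PPO variant; `eps` bounds how far the
    probability ratio new/old may move before further change stops paying off.
    """
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)               # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))  # keep ratio in [1-eps, 1+eps]
        total += min(ratio * adv, clipped * adv)         # pessimistic of the two terms
    return total / len(advantages)
```

With a positive advantage, the objective stops growing once the ratio exceeds `1 + eps`, so the gradient gives no incentive to move the policy further in that direction. With a negative advantage and a ratio above `1 + eps`, the unclipped term is kept, so the penalty for an overly large step is not hidden.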