DeepRL2.2A, Proximal Policy Optimization for Continuous Control.

Proximal Policy Optimization (PPO) algorithms take the largest policy-gradient step they can while still keeping the updated policy close to the previous one. This video explains what that means.
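As a rough illustration of how a PPO update limits its step size, here is a minimal sketch of the clipped surrogate objective used by the PPO-Clip variant. The description above does not say which variant the video covers, so this choice is an assumption, as are the function name `ppo_clip_objective`, its argument names, and the default `clip_eps=0.2`.

```python
import math

def ppo_clip_objective(logp_new, logp_old, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a batch (hypothetical helper).

    logp_new / logp_old: log-probabilities of the taken actions under the
    updated and the previous policy. advantages: advantage estimates.
    """
    total = 0.0
    for ln, lo, adv in zip(logp_new, logp_old, advantages):
        # Probability ratio between the updated and the previous policy.
        ratio = math.exp(ln - lo)
        # Restrict the ratio to the interval [1 - eps, 1 + eps].
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Taking the minimum removes any reward for pushing the ratio
        # beyond the clip range in the direction the advantage favors.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Maximizing this quantity by gradient ascent lets the policy improve freely while the ratio stays inside the clip range. Once the ratio leaves that range in the favorable direction the objective goes flat, which is one way to read "the largest step that is still safe".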