DeepRL1.4B, How do eligibility traces arise in policy gradient algorithms?
From Wulfram Gerstner
Optimizing the return with a policy gradient method in a multi-step environment naturally leads to eligibility traces. A few important mathematical steps are sketched here.
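As a rough illustration of how a trace can appear, the sketch below computes a single-episode policy gradient estimate in two ways: once by weighting each log-policy gradient with the discounted rewards that follow it, and once by keeping a running sum of log-policy gradients (the eligibility trace) and multiplying it by each discounted reward as it arrives. The tabular softmax policy, the hand-written episode, and all function names are assumptions made for this example; the notation and the exact form of the trace in the video may differ.

```python
import math


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def zeros_like(theta):
    return [[0.0] * len(row) for row in theta]


def grad_log_policy(theta, state, action):
    """Gradient of log pi(action | state) for a tabular softmax policy.

    Only the row of the visited state is non-zero:
    d log pi(a|s) / d theta[s][b] = 1[a == b] - pi(b|s).
    """
    probs = softmax(theta[state])
    grad = zeros_like(theta)
    for b, p in enumerate(probs):
        grad[state][b] = (1.0 if b == action else 0.0) - p
    return grad


def gradient_reward_to_go(theta, episode, gamma):
    """sum_k grad log pi(a_k|s_k) * sum_{t >= k} gamma^t r_t (needs the whole episode)."""
    total = zeros_like(theta)
    for k, (s, a, _) in enumerate(episode):
        future = sum(gamma ** t * episode[t][2] for t in range(k, len(episode)))
        g = grad_log_policy(theta, s, a)
        for i, row in enumerate(g):
            for j, value in enumerate(row):
                total[i][j] += value * future
    return total


def gradient_with_trace(theta, episode, gamma):
    """sum_t gamma^t r_t z_t with trace z_t = z_{t-1} + grad log pi(a_t|s_t).

    The trace remembers which parameters influenced past actions, so each
    reward can be credited online, one step at a time.
    """
    trace = zeros_like(theta)
    total = zeros_like(theta)
    for t, (s, a, r) in enumerate(episode):
        g = grad_log_policy(theta, s, a)
        for i, row in enumerate(g):
            for j, value in enumerate(row):
                trace[i][j] += value
                total[i][j] += gamma ** t * r * trace[i][j]
    return total


# Hypothetical example: 2 states, 2 actions, episode of (state, action, reward) steps.
theta = [[0.1, -0.2], [0.0, 0.3]]
episode = [(0, 1, 0.0), (1, 0, 1.0), (0, 0, -0.5), (1, 1, 2.0)]

g_future = gradient_reward_to_go(theta, episode, gamma=0.9)
g_trace = gradient_with_trace(theta, episode, gamma=0.9)
```

Both functions return the same estimate up to floating-point rounding, because swapping the order of the two sums turns "each action is credited with all later rewards" into "each reward is credited to all earlier actions". The second form is the one that can be updated step by step without storing the episode.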