04ReinforcementLearning3.4, Log-likelihood trick: from batch to online

From Wulfram Gerstner  
