Merge pull request openai#63 from marcinic/patch-1
Changed use to using
jachiam authored Nov 29, 2018
2 parents f67828a + 4cb53a2 commit 8b92b8a
Showing 1 changed file with 1 addition and 1 deletion.
docs/algorithms/vpg.rst: 1 addition, 1 deletion
@@ -40,7 +40,7 @@ The policy gradient algorithm works by updating policy parameters via stochastic

     \theta_{k+1} = \theta_k + \alpha \nabla_{\theta} J(\pi_{\theta_k})

-Policy gradient implementations typically compute advantage function estimates based on the infinite-horizon discounted return, despite otherwise use the finite-horizon undiscounted policy gradient formula.
+Policy gradient implementations typically compute advantage function estimates based on the infinite-horizon discounted return, despite otherwise using the finite-horizon undiscounted policy gradient formula.

 Exploration vs. Exploitation
 ----------------------------
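The corrected sentence describes a real quirk of practical implementations: the update rule above ascends the finite-horizon undiscounted objective, while the advantage weights that enter the gradient estimate are usually computed from infinite-horizon discounted returns. As a minimal sketch of that update step for a discrete-action policy, assuming PyTorch; policy, optimizer, obs, acts, and advs are hypothetical stand-ins for a policy network, its optimizer, and batched rollout data, not names from the Spinning Up code:

import torch
from torch.distributions import Categorical

def vpg_update(policy, optimizer, obs, acts, advs):
    # One stochastic gradient ascent step on J(pi_theta):
    #   theta_{k+1} = theta_k + alpha * grad_theta J(pi_{theta_k})
    logits = policy(obs)                               # action logits per state
    logp = Categorical(logits=logits).log_prob(acts)   # log pi_theta(a|s)
    # advs: advantage estimates; in practice these come from the
    # infinite-horizon discounted return (the mismatch the diff notes).
    loss = -(logp * advs).mean()                       # minimizing -J ascends J
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

The step size alpha lives inside the optimizer, e.g. torch.optim.Adam(policy.parameters(), lr=alpha).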
