
Commit

Changed use to using
marcinic authored Nov 28, 2018
1 parent f67828a commit 4cb53a2
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion docs/algorithms/vpg.rst
@@ -40,7 +40,7 @@ The policy gradient algorithm works by updating policy parameters via stochastic
\theta_{k+1} = \theta_k + \alpha \nabla_{\theta} J(\pi_{\theta_k})
-Policy gradient implementations typically compute advantage function estimates based on the infinite-horizon discounted return, despite otherwise use the finite-horizon undiscounted policy gradient formula.
+Policy gradient implementations typically compute advantage function estimates based on the infinite-horizon discounted return, despite otherwise using the finite-horizon undiscounted policy gradient formula.

Exploration vs. Exploitation
----------------------------
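For context on the sentence this commit touches, here is a minimal sketch of the update the surrounding docs describe: one stochastic gradient ascent step on J(pi_theta), with advantage estimates built from infinite-horizon discounted reward-to-go even though the gradient formula itself is the finite-horizon undiscounted one. This is not the Spinning Up implementation; the network, sizes, and helper names (policy, discounted_rtg, vpg_update) are illustrative assumptions for a discrete-action policy.

```python
import torch

# Illustrative discrete-action policy; architecture and sizes are assumptions,
# not taken from the Spinning Up source.
policy = torch.nn.Sequential(
    torch.nn.Linear(4, 32), torch.nn.Tanh(), torch.nn.Linear(32, 2)
)
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-2)

def discounted_rtg(rewards, gamma=0.99):
    # Infinite-horizon discounted reward-to-go, used here as a crude
    # advantage estimate, matching the sentence changed in this commit.
    out, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        out.append(running)
    return torch.tensor(list(reversed(out)), dtype=torch.float32)

def vpg_update(obs, actions, advantages):
    # One step of theta_{k+1} = theta_k + alpha * grad_theta J(pi_theta),
    # implemented as gradient descent on -J.
    logp = torch.distributions.Categorical(logits=policy(obs)).log_prob(actions)
    loss = -(logp * advantages).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

A call like vpg_update(obs, acts, discounted_rtg(rews)) would then perform the update rule shown in the diff context above; a real implementation would subtract a baseline from the returns to reduce variance.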
