From 66327be8996f15e06d5959ce3eee27e80dc702dc Mon Sep 17 00:00:00 2001
From: Alexis David Jacq
Date: Tue, 27 Feb 2018 17:01:01 +0100
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 32bc330..c148de6 100644
--- a/README.md
+++ b/README.md
@@ -4,7 +4,7 @@ Using PPO with clip loss (from https://arxiv.org/pdf/1707.06347.pdf).

 I finally fixed what was wrong with the gradient descent step, using previous log-prob from rollout batches. At least ppo.py is fixed, the rest is going to be corrected as well very soon.

-On the following example I was not patient enough to wait for million iterations, I just wanted to check if the model is properly learning:
+In the following example I was not patient enough to wait for million iterations, I just wanted to check if the model is properly learning:

 Progress of single PPO:
 -----------------------