PPO Breakout Score #988

eaplatanios · 2019-08-13T17:37:01Z

I tried running the PPO2 example on Breakout on my MacBook, without modifying anything in the scripts or configurations, and I am only able to reach a score of 19.6. Why is that? Is there an implementation bug, or do I need to tune PPO differently on my machine to get up to 400?

DanielTakeshi · 2019-08-13T23:18:04Z

You might need to be more specific in your request. See, for example, issue #938. When you get to a score of 19.6, how many environment steps does that correspond to?

eaplatanios · 2019-08-15T23:07:53Z

@DanielTakeshi Sorry for not clarifying. I am executing the provided run.py script directly with default arguments. This sets nenv = 6 based on my CPU and runs for 10^6 steps. I haven't modified anything, so running that script with nenv = 6 set explicitly should give you the same result. The command I use to run the experiment is:

python3.7 -m baselines.run --alg=ppo2 --env=BreakoutNoFrameskip-v4

christopherhesse · 2019-10-25T22:47:42Z

Looking at the published graphs (http://htmlpreview.github.io/?https://github.com/openai/baselines/blob/master/benchmarks_atari10M.htm), Breakout doesn't get to 400 until 1e7 steps. Have you tried training that long?
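
To put the two step budgets side by side, here is a rough sketch of how many PPO updates each one buys with nenv = 6. It assumes a rollout length of nsteps = 128 per environment (I believe this is the ppo2 Atari default, but check your own configuration) and that the number of updates is the step budget divided by nenv * nsteps. The function name is made up for illustration. A longer run can be requested with the `--num_timesteps` argument of baselines.run, for example `--num_timesteps=1e7`.

```python
def ppo_update_count(total_timesteps, nenv, nsteps=128):
    """Approximate number of PPO updates for a given environment-step budget.

    nsteps=128 is an assumption (rollout length per environment);
    each update consumes nenv * nsteps environment steps.
    """
    batch_size = nenv * nsteps
    return int(total_timesteps) // batch_size


# Default budget from the run above (10^6 steps) vs. the 1e7 budget in the graphs.
default_run = ppo_update_count(1e6, nenv=6)
long_run = ppo_update_count(1e7, nenv=6)

print("updates at 1e6 steps:", default_run)
print("updates at 1e7 steps:", long_run)
```

Under these assumptions the default run performs roughly a tenth of the updates behind the published 1e7-step curve, so a score near 20 at 10^6 steps does not by itself point to a bug.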
