Reinforcement learning with Microsoft CNTK.
Heading information has been added to the input features, which makes training more efficient.
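As a rough illustration of what adding heading information to the input features can look like, here is a minimal sketch. It assumes four possible headings encoded as a one-hot vector and appended to the existing features. The names `HEADINGS`, `encode_heading`, and `build_features` are hypothetical, and the actual encoding in src/train.py may differ.

```python
# Hypothetical set and order of headings; the real encoding may differ.
HEADINGS = ("up", "right", "down", "left")


def encode_heading(heading):
    """Return a one-hot list with a 1.0 at the position of the given heading."""
    if heading not in HEADINGS:
        raise ValueError("unknown heading: %r" % (heading,))
    return [1.0 if h == heading else 0.0 for h in HEADINGS]


def build_features(base_features, heading):
    """Append the one-hot heading to a copy of the existing feature list."""
    return list(base_features) + encode_heading(heading)


# Two placeholder base features plus the heading "down".
print(build_features([0.5, 0.25], "down"))
# [0.5, 0.25, 0.0, 0.0, 1.0, 0.0]
```

A one-hot encoding avoids implying an ordering between directions, which a single integer code would.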
The pretrained model now scores about 7.7 points per game on average.
Old log (before adding heading information):
Episode: 10000, Average reward and score for episode: -0.424200, 0.076.
Episode: 20000, Average reward and score for episode: -0.417600, 0.082.
Episode: 30000, Average reward and score for episode: -0.415300, 0.085.
Episode: 40000, Average reward and score for episode: -0.415300, 0.085.
Episode: 50000, Average reward and score for episode: -0.416800, 0.083.
......
Episode: 2520000, Average reward and score for episode: 7.137500, 7.638.
Episode: 2530000, Average reward and score for episode: 7.141700, 7.642.
Episode: 2540000, Average reward and score for episode: 7.214600, 7.715.
Episode: 2550000, Average reward and score for episode: 7.213900, 7.714.
Episode: 2560000, Average reward and score for episode: 7.105500, 7.606.
New log (with heading information):
Episode: 600000, Average reward and score for episode: 6.429200, 6.929.
Episode: 610000, Average reward and score for episode: 6.677300, 7.177.
Episode: 620000, Average reward and score for episode: 6.735800, 7.236.
Episode: 630000, Average reward and score for episode: 6.844000, 7.344.
Episode: 640000, Average reward and score for episode: 6.891000, 7.391.
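To compare runs without reading the logs by eye, the lines can be parsed into numbers. The following is a minimal sketch that assumes the log format shown above. `LOG_LINE` and `parse_log` are hypothetical helpers, not part of the repository.

```python
import re

# Matches lines such as:
# "Episode: 10000, Average reward and score for episode: -0.424200, 0.076."
LOG_LINE = re.compile(
    r"Episode:\s*(\d+), Average reward and score for episode:\s*"
    r"(-?\d+(?:\.\d+)?),\s*(-?\d+(?:\.\d+)?)"
)


def parse_log(text):
    """Return a list of (episode, average_reward, average_score) tuples.

    Lines that do not match the log format (for example "......") are skipped.
    """
    records = []
    for line in text.splitlines():
        match = LOG_LINE.search(line)
        if match:
            records.append(
                (int(match.group(1)), float(match.group(2)), float(match.group(3)))
            )
    return records


new_log = """\
Episode: 630000, Average reward and score for episode: 6.844000, 7.344.
Episode: 640000, Average reward and score for episode: 6.891000, 7.391.
"""

episode, reward, score = parse_log(new_log)[-1]
print(episode, reward, score)  # 640000 6.891 7.391
```

Applied to the excerpts above, the last parsed entries are a score of 7.606 at episode 2,560,000 for the old log and 7.391 at episode 640,000 for the new one.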
Visual Studio Code: an awesome editor.
src/train.py: trains the model for 640k episodes.
src/load.py: loads the pretrained model and continues training.
src/pref.py: loads the pretrained model and shows how it acts (requires pygame).
CNTK (CPU-only build).