
smayru/Averaged-DQN


Averaged-DQN

A Chainer implementation of Averaged-DQN. This code is partly based on here.

Abstract

By averaging over the latest k sets of parameters when estimating the Q-function, Averaged-DQN stabilizes performance. If k is 1, this is essentially the same as standard DQN.
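The averaging idea can be sketched in a few lines of plain Python. This is only an illustration, not the code in averaged_dqn.py: it assumes the average is taken over the Q-value estimates produced by the latest k parameter snapshots, and `linear_q`, `AveragedQ`, and the toy weights are hypothetical names and values introduced here.

```python
from collections import deque


def linear_q(params, state):
    """Toy Q-function (hypothetical stand-in for a network):
    one weight vector per action, Q(s, a) = dot(weights[a], s)."""
    return [sum(w * s for w, s in zip(weights, state)) for weights in params]


class AveragedQ:
    """Keeps the latest k parameter snapshots and averages their Q-values."""

    def __init__(self, k):
        # deque(maxlen=k) drops the oldest snapshot automatically.
        self.snapshots = deque(maxlen=k)

    def push(self, params):
        self.snapshots.append(params)

    def q_values(self, state):
        # Evaluate every stored snapshot, then average per action.
        per_snapshot = [linear_q(p, state) for p in self.snapshots]
        n = len(per_snapshot)
        return [sum(qs) / n for qs in zip(*per_snapshot)]


state = [1.0, 2.0]

avg = AveragedQ(k=2)
avg.push([[1.0, 0.0], [0.0, 1.0]])  # snapshot 1: Q = [1.0, 2.0]
avg.push([[3.0, 0.0], [0.0, 1.0]])  # snapshot 2: Q = [3.0, 2.0]
q_two = avg.q_values(state)         # average of snapshots 1 and 2

avg.push([[5.0, 0.0], [0.0, 1.0]])  # snapshot 3 evicts snapshot 1
q_after_evict = avg.q_values(state)

# With k=1 only the newest snapshot is used, as in standard DQN.
single = AveragedQ(k=1)
single.push([[1.0, 0.0], [0.0, 1.0]])
single.push([[3.0, 0.0], [0.0, 1.0]])
q_single = single.q_values(state)
```

With k=1 the buffer holds only the newest snapshot, which matches the remark that the method then reduces to standard DQN.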

How to use

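python averaged_dqn.py --K=k --Episode=episode

Here, k is the number of latest parameter sets to average and episode is the number of episodes to run.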

Analysis

I checked the estimation error of the Q-function while varying the value of k.

| k=1 | k=2 | k=3 | k=5 | k=10 |
|---|---|---|---|---|
| 53.98 | 10.27 | 1.43 | 1.42 | 0.69 |

Increasing the value of k reduces the estimation error, from 53.98 at k=1 to 0.69 at k=10.
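One intuition for this trend is that averaging k noisy estimates lowers their variance. The sketch below is a toy simulation, not the analysis in the notebook: it assumes each estimate is the true value plus independent Gaussian noise, which real DQN estimates do not strictly satisfy, and the function name and parameters are hypothetical.

```python
import random
import statistics


def averaged_estimate_variance(k, noise_std=1.0, trials=5000, seed=0):
    """Variance of the mean of k noisy estimates of a fixed true Q-value.

    Assumption: the k estimates carry independent Gaussian noise.
    """
    rng = random.Random(seed)
    true_q = 0.0
    estimates = []
    for _ in range(trials):
        samples = [true_q + rng.gauss(0.0, noise_std) for _ in range(k)]
        estimates.append(sum(samples) / k)
    return statistics.pvariance(estimates)


variances = {k: averaged_estimate_variance(k) for k in (1, 2, 3, 5, 10)}
for k, v in variances.items():
    print(f"k={k:2d}  variance of averaged estimate = {v:.3f}")
```

Under this independence assumption the variance shrinks roughly as 1/k, so the printed values should fall as k grows.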

Next, I checked the average reward for each episode.

| k=1 | k=2 | k=3 | k=5 | k=10 |
|---|---|---|---|---|
| 152.36 | 151.85 | 149.69 | 165.04 | 130.29 |

Setting k to 5 gives the best performance (165.04), while k=10 gives the lowest average reward (130.29).

The details are described in averaged_dqn_analysis.ipynb.
