# Q-learning
Using reinforcement learning to play snake and similar games.
An environment simulates the game. It accepts an action and returns the resulting screen together with a reward: -1 if the player dies, 0 if nothing happens, and 1 if the player scores.
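As a rough illustration of that interface, here is a minimal sketch of an environment for a toy "catch" game. The class name `CatchEnv`, the grid size, the `reset`/`step` methods and the `done` flag are all assumptions made for this example, not the repository's actual code. Missing the ball is treated as the player dying.

```python
import random


class CatchEnv:
    """Hypothetical toy game: a ball falls one row per step, the paddle sits on the bottom row."""

    ACTIONS = (-1, 0, 1)  # move left, stay, move right

    def __init__(self, width=5, height=5, seed=None):
        self.width = width
        self.height = height
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        """Start a new episode and return the first screen."""
        self.ball_row = 0
        self.ball_col = self.rng.randrange(self.width)
        self.paddle_col = self.width // 2
        return self.screen()

    def screen(self):
        """Render the game as a height x width grid of 0/1 cells."""
        grid = [[0] * self.width for _ in range(self.height)]
        grid[self.ball_row][self.ball_col] = 1
        grid[self.height - 1][self.paddle_col] = 1
        return grid

    def step(self, action_index):
        """Apply an action and return (screen, reward, done)."""
        move = self.ACTIONS[action_index]
        self.paddle_col = min(self.width - 1, max(0, self.paddle_col + move))
        self.ball_row += 1
        if self.ball_row < self.height - 1:
            return self.screen(), 0, False  # nothing happened yet
        caught = self.ball_col == self.paddle_col
        # +1 when the player scores, -1 when the player "dies" (misses the ball)
        return self.screen(), (1 if caught else -1), True
```

A player would call `reset()` once per episode and then `step(action_index)` repeatedly until `done` is true.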
A player backed by a neural network chooses actions and learns from the environment's responses using Q-learning, treated as a special case of advantage learning. The network should learn which actions yield the highest value.
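To make the learning rule concrete, the following sketch shows a standard Q-learning update with epsilon-greedy action selection. It uses a plain dictionary as a lookup table in place of the neural network, purely to keep the example short. The function names, the table layout and the default `alpha` and `gamma` values are assumptions for illustration.

```python
import random


def choose_action(q_values, epsilon, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise pick the best-valued action."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def q_update(q_table, state, action, reward, next_state, done,
             n_actions, alpha=0.1, gamma=0.9):
    """Move Q(state, action) towards reward + gamma * max_a' Q(next_state, a')."""
    current = q_table.setdefault(state, [0.0] * n_actions)
    # A terminal transition has no future value to bootstrap from.
    next_best = 0.0 if done else max(q_table.get(next_state, [0.0] * n_actions))
    target = reward + gamma * next_best
    current[action] += alpha * (target - current[action])
    return current[action]
```

With a network, the same `target` would typically serve as the regression target for the output that corresponds to the chosen action.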
# Techniques used
* Advantage learning, specialized to Q-learning
* Double Q-learning
* (Prioritized) memory replay
* Progressive discount rate growth
* Progressive exploration rate growth
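Of the techniques listed above, double Q-learning is easy to show in isolation. The sketch below follows the common two-estimator variant, where one set of values selects the next action and a second set evaluates it. Whether this repository uses exactly that variant is an assumption, and the function name and arguments are hypothetical.

```python
def double_q_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Learning target that decouples action selection from action evaluation.

    next_q_online: values of the next state from the estimator being trained
    next_q_target: values of the next state from the second (slower) estimator
    """
    if done:
        return float(reward)  # no future value after a terminal step
    # Select the action with the first estimator...
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # ...but evaluate it with the second one.
    return reward + gamma * next_q_target[best_action]
```

For example, `double_q_target(1, [0.2, 0.9], [0.5, 0.1], gamma=0.5)` uses the value 0.1 of the action picked by the first estimator, whereas a plain maximum over the second estimator would use 0.5. This decoupling is what tends to reduce overestimated values.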
# Results
For such a simple game, the player should be able to learn to keep playing for much longer, but the approach is clearly working:
[Neural network playing catch][catchgame]
# More info
[Demystifying Deep Reinforcement Learning](https://www.intelnervana.com/demystifying-deep-reinforcement-learning/)