-
Notifications
You must be signed in to change notification settings - Fork 1
/
Copy pathNOTES.txt
22 lines (16 loc) · 1.01 KB
/
NOTES.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
PR idea notes for tensorforce:
had to convert my input to float32 from int.
had to write batchnorm and permute
can't control ordering for conv2d
hard to visualize the network architecture and understand it / inputs / outputs of each layer
have to reshape in order to get my data to work with conv2d (adding channel)
when there's a problem, hard to know which spot in the network spec was the issue
why are you not using keras or tf?
ConvDQN learning issues
TODO: Find better ways to debug the network to see what is happening.
TODO: add an environment without auction - just draft.
use self play and win / loss rewards only rather than in game rewards for winning a bid
Maybe the problem is that it cannot see the relationship between its action and the reward because of the
simultaneous action selection, done by the random opponents.
Next target - find an environment it CAN learn. Make it harder and harder from there. Be very gradual.
If it stops being learnable, try to go in between the last one it could learn.