Batched mcts #147

Open
wants to merge 167 commits into
base: master
167 commits
c81b3ba
Add private/ in .gitignore
Whojo Jun 29, 2022
7147ca5
Add tests on oracle of SimpleMctsTests.jl
Whojo Jun 29, 2022
390c40e
Add tests on `completed_qvalues()` in SimpleMctsTests.jl
Whojo Jun 29, 2022
28113e4
Add test for `explore()` in SimpleMctsTests.jl
Whojo Jun 29, 2022
5912561
fixup! Add test for `explore()` in SimpleMctsTests.jl
Whojo Jun 29, 2022
18e6b93
Add test on `gumbel_explore()` in SimpleMctsTests.jl
Whojo Jun 29, 2022
83950e5
fixup! Add test on `gumbel_explore()` in SimpleMctsTests.jl
Whojo Jun 29, 2022
2387f16
fixup! fixup! Add test on `gumbel_explore()` in SimpleMctsTests.jl
Whojo Jun 29, 2022
115df4c
Homogeneise interface between SimpleMcts and BatchedMctsAos
Whojo Jun 29, 2022
6321f56
Add inference test in BatchedMctsAosTests.jl
Whojo Jun 29, 2022
e9d33bb
fixup! Add inference test in BatchedMctsAosTests.jl
Whojo Jun 29, 2022
cbb0e7d
fixup! fixup! Add inference test in BatchedMctsAosTests.jl
Whojo Jun 29, 2022
a2a1a77
Add test on `explore` in BatchedMctsAosTests.jl
Whojo Jun 29, 2022
8f70518
Few renamings for convenience
Whojo Jun 29, 2022
5db9817
Add test on exploration part of gumbel_explore in SimpleMcts
Whojo Jul 3, 2022
7fb8041
Fix issue with terminal nodes on valid_actions list
Whojo Jul 3, 2022
865f7fa
Remove useless test
Whojo Jul 3, 2022
4434011
Correctly save children and bacpropagate in BatchedMctsAos
Whojo Jul 4, 2022
aa97bee
Add tests in BatchedMctsAosTests
Whojo Jul 4, 2022
1a15149
Add oracle for BatchedMctsAos
Whojo Jul 5, 2022
8eb1b41
Add num_simulations arguments to uniform_mcts_tic_tac_toe
Whojo Jul 5, 2022
c66772c
Add validity check on prior move evaluation by oracles in BatchedMctsAos
Whojo Jul 7, 2022
c3cef3d
Fix completed_qvalues calculation
Whojo Jul 8, 2022
21960d7
Removed RolloutOracle from BatchedMctsAos
Whojo Jul 8, 2022
8a0ddd5
Remove `valid_actions`
Whojo Jul 8, 2022
0cb8dad
fixup! Removed RolloutOracle from BatchedMctsAos
Whojo Jul 9, 2022
f5218de
Return, in `completed_qvalues`, -Inf for invalid actions
Whojo Jul 11, 2022
a3cd59d
Merge remote-tracking branch 'Jonathan-laurent/AlphaZero.jl/master' i…
Whojo Jul 11, 2022
457400b
Remove deadcode in BatchedMctsAosTests
Whojo Jul 13, 2022
c2da628
Add assert for action validity in BatchedMctsAos
Whojo Jul 14, 2022
4abf7bd
Fix type stability in BatchedMctsAos
Whojo Jul 16, 2022
42229fa
Fix Blue Style issue
Jul 16, 2022
240f80d
SimpleMcts now stops exploration at the first unexplored node
Whojo Jul 16, 2022
6313cc1
Homogenize number of simulations between SimpleMcts and BatchedMctsAos
Whojo Jul 17, 2022
ce649a9
Correct total_rewards on leaf node of BatchedMctsAos
Whojo Jul 17, 2022
1bf890b
Add equivalence test on BatchedMctsAosTests
Whojo Jul 17, 2022
e4d6498
Add `Gumbel_explore` on BatchedMctsAos
Whojo Jul 17, 2022
36f12bd
Fix in Gumbel on BatchMctsAos + add tests
Whojo Jul 18, 2022
501eba3
Add compilation test for Gumbel on BatchMctsAos
Whojo Jul 18, 2022
c8bc4fd
Refacto Gumbel on BatchMctsAos
Whojo Jul 18, 2022
0f9acc9
Refacto sorting in Gumbel on BatchMctsAos
Whojo Jul 18, 2022
2278bfe
Refacto considered matrix in Gumbel on BatchMctsAos
Whojo Jul 18, 2022
387ef11
Add code squeleton for BatchedMcts
Whojo Jul 21, 2022
95e4d15
Add a few tests + fix
Whojo Jul 25, 2022
7ab1d5c
Removed `terminal` from the returned named-tuple of `init_fn`
Whojo Jul 25, 2022
78ee0a0
Merge branch 'Add-Mcts-Tests' into BatchedMcts
Whojo Jul 25, 2022
704f281
Add exploration test for BatchedMcts
Whojo Jul 25, 2022
8ba50c5
Add equivalence tests in BatchedMcts
Whojo Jul 25, 2022
05c0f78
Improve GPU support on BatchedMcts.jl
Jul 27, 2022
813c54f
WIP: Gumbel on BatchedMcts
Whojo Jul 27, 2022
d798353
Add action to the tuple returned in `select!` of BatchedMcts
Whojo Jul 27, 2022
2b4374c
Add tests for Gumbel explore
Whojo Jul 27, 2022
79a5c2d
Fix gumbel_explore of BatchedMcts
Whojo Jul 28, 2022
dfa00b2
Replace `typeof(...[1])` by `eltype(...)`
Jul 29, 2022
c6cc425
Remove `aids` argument from `check_oracles`
Whojo Jul 28, 2022
b169f8c
Check action size coherence in `check_oracle`
Whojo Jul 29, 2022
ae33493
Use global variable to improve readibility
Whojo Jul 29, 2022
fc70563
fixup! Check action size coherence in `check_oracle`
Whojo Jul 29, 2022
1f1a59e
fixup! fixup! Check action size coherence in `check_oracle`
Whojo Jul 29, 2022
69322f2
Experiment with MCTX formulas & remove useless comments
Whojo Jul 29, 2022
0b5141e
Adapt BatchedMCts's `explore` to compile on GPU
Aug 2, 2022
e017a92
Adapt BatchedMcts's `gumbel_explore` to compile on GPU
Aug 2, 2022
6abc086
Merge branch 'BatchedMcts' of github.com:Whojo/AlphaZero.jl into Batc…
Whojo Aug 2, 2022
03cffda
Merge branch 'Add-Mcts-Tests' into Gumbel_on_BatchMctsAos
Whojo Aug 2, 2022
0082e65
Removed GPU test for gumbel_explore of BatchedAos
Whojo Aug 2, 2022
bbe37fd
Merge branch 'Gumbel_on_BatchMctsAos' into benchmark_mcts
Whojo Aug 2, 2022
cc2f217
Removed useless dependencies
Whojo Aug 3, 2022
e2da1e7
Fixed typos in documentation of BatchedMcts
Whojo Aug 3, 2022
7d32e35
Formated BatchedMcts + refacto `valid_actions` in `UniformTicTacToeEn…
Whojo Aug 3, 2022
db72f97
Add a small documentation for `Tree` in BatchedMcts
Whojo Aug 3, 2022
f32bdde
Replace a loop with eachindex
Whojo Aug 3, 2022
3d8e0f5
WIP: convert `eval!` to GPU
Whojo Aug 3, 2022
9a0d0e6
Merge branch 'BatchedMcts' of github.com:Whojo/AlphaZero.jl into Batc…
Whojo Aug 3, 2022
f5926cb
Adapt `transition_fn` for GPU
Whojo Aug 4, 2022
715149e
Merge branch 'BatchedMcts' into benchmark_mcts
Whojo Aug 4, 2022
cfffb3a
Copy directly `gumbel` and `table_of_considered_visits` to GPU in Bat…
Whojo Aug 4, 2022
f0a448f
Included the dimensions of `internal_states` in `tree.state`
Whojo Aug 5, 2022
40f8749
Add usage documentation
Whojo Aug 6, 2022
ac59571
fixup! Add usage documentation
Whojo Aug 6, 2022
132895c
fixup! Add usage documentation
Whojo Aug 6, 2022
bda7125
Fix `tree.state` access on GPU
Whojo Aug 6, 2022
f716eac
Add comments to `EnvOracle`
Whojo Aug 7, 2022
37887f1
Add comments for `UniformTicTacToeEnvOracle`
Whojo Aug 7, 2022
4ec1f43
Add comments to `Policy` of BatchedMcts
Whojo Aug 7, 2022
f56e682
Add comments for `Tree` in BatchedMcts
Whojo Aug 7, 2022
0d496a9
Add docstring for utility functions of MCTS
Whojo Aug 8, 2022
006e97b
fixup! Add docstring for utility functions of MCTS
Whojo Aug 8, 2022
cbdfd81
Add MCTS documentation
Whojo Aug 9, 2022
f16bd14
Add a comment about doubly-linked tree
Whojo Aug 9, 2022
c71a61b
Add Gumbel documentation
Whojo Aug 9, 2022
ecf7028
Add a part on namming conventions in the documentation
Whojo Aug 9, 2022
b294a8e
Correct typos and grammar error.
Whojo Aug 9, 2022
7dd0ecc
fixup! Correct typos and grammar error.
Whojo Aug 9, 2022
30b4979
`transition_fn` on GPU at 100%
Whojo Aug 13, 2022
f2a6270
Change `parent_frontier` to Matrix
Whojo Aug 13, 2022
a8fa4a3
Replace boolean indexing by direct int indexing
Whojo Aug 13, 2022
124e229
Add @inbounds in `eval!`
Whojo Aug 13, 2022
458f2bd
Improved Data preparation performance in `eval!` of BatchedMcs
Whojo Aug 13, 2022
fc41318
Minor typos fixes
Whojo Aug 20, 2022
7f97caa
Add utility `completed_qvalues`
Whojo Aug 20, 2022
36f0aae
Add GameHistory
Whojo Aug 21, 2022
f7500f2
Add ReplayBuffer
Whojo Aug 21, 2022
eaf26ce
Add TrainableEnvOracle
Whojo Aug 21, 2022
ca5923b
Add Train.jl
Whojo Aug 21, 2022
e755014
Add `make_image` to BitwiseTicTacToe
Whojo Aug 22, 2022
d2ad581
fixup! Add TrainableEnvOracle
Whojo Aug 22, 2022
f394ec2
Rename `train` to `train_network`
Whojo Aug 22, 2022
2895810
fixup! Add `make_image` to BitwiseTicTacToe
Whojo Aug 22, 2022
ce0e09d
fixup! Add `make_image` to BitwiseTicTacToe
Whojo Aug 22, 2022
1f85b5f
First working MuZero (without DL)
Whojo Aug 23, 2022
4fc69c2
Refacto `validate_prior` with `l1_normalise`
Whojo Aug 25, 2022
e362624
Fixed "Usage" documentation of `BatchedMcts`
Whojo Aug 26, 2022
bd25e88
fixup! Refacto `validate_prior` with `l1_normalise`
Whojo Aug 26, 2022
d171929
fixup! Add `make_image` to BitwiseTicTacToe
Whojo Aug 26, 2022
c368ab3
Group batches in `Train.jl`
Whojo Aug 26, 2022
adef141
Validate prior after `transition_fn`
Whojo Aug 26, 2022
ec90bf6
Removed terminated environment from self-play
Whojo Aug 28, 2022
351f581
WIP: DL part of `update_weights`
Whojo Aug 28, 2022
19d5347
Update `BatchedMcts` Documentation on `value_prior` & `policy_prior`
Whojo Aug 28, 2022
4bc385e
fixup! Refacto `validate_prior` with `l1_normalise`
Whojo Aug 28, 2022
589c937
Invalid actions in `completed_qvalues` now returns `-1`
Whojo Aug 28, 2022
0fcccea
Rescale `completed_qvalues` to [0; 1]
Whojo Aug 28, 2022
c1ca400
Revert "Validate prior after `transition_fn`"
Whojo Aug 29, 2022
1bf70cd
Rename a few variable for clarity
Whojo Aug 29, 2022
5bbb4f7
MCTS's `explore` manipulates states instead of envs
Whojo Aug 30, 2022
4cd729b
Correct issue with multiple dimensions state in `BatchedMcts.jl`
Whojo Aug 30, 2022
ff862e3
WIP: inference in get_env_oracle
Whojo Aug 30, 2022
5bcf553
`Explore` in `Train.jl` is based on environments (not states)
Whojo Aug 31, 2022
36d869a
Add logging information to `Train.jl`
Whojo Aug 31, 2022
9ecd1bd
fixup! `Explore` in `Train.jl` is based on environments (not states)
Whojo Aug 31, 2022
82148e7
Add a dummy-run on TicTacToe_MuZero script
Whojo Sep 1, 2022
b41a43e
fixup! fixup! `Explore` in `Train.jl` is based on environments (not s…
Whojo Sep 1, 2022
ab6670e
Refactored dummy run of TicTacToe_MuZero script
Whojo Sep 1, 2022
ef2355b
Refacto BitwiseTicTacToe's vectorize_state
Whojo Sep 2, 2022
dce9f1d
Improved MuZero.jl's `transition_fn` performance
Whojo Sep 2, 2022
c07bd66
fixup! Add logging information to `Train.jl`
Whojo Sep 7, 2022
067d7f1
Log losses with `@info`
Whojo Sep 7, 2022
25c1798
fixup! Log losses with `@info`
Whojo Sep 7, 2022
c466bb3
Fixed a typo in a `create_tree` assert
Whojo Sep 8, 2022
de44286
Add a win-rates metrics logging to `Train.jl`
Whojo Sep 8, 2022
f4fd3c8
MuZero's `get_env_oracle` now return the real invalid actions in `ini…
Whojo Sep 11, 2022
a23f258
Temporarily set `invalid_actions_value` to -2 before `rescale` inside…
Whojo Sep 11, 2022
024399b
Add seed in `TicTacToe_MuZero`
Whojo Sep 11, 2022
5f0d77d
Updated description of `BatchedMcts.jl`
Whojo Sep 11, 2022
e322d3c
Replace use of `mcts` in kernel inside `BatchedMcts.jl`
Whojo Sep 11, 2022
1afd001
Code Style
Whojo Sep 11, 2022
b95c400
Factored `Policy` & `EnvOracle` separatly from `BatchedMcts.jl`
Whojo Sep 12, 2022
4d58e66
fixup! Factored `Policy` & `EnvOracle` separatly from `BatchedMcts.jl`
Whojo Sep 12, 2022
b90737a
Temporary removing `gumbel_explore` from AoS
Whojo Sep 12, 2022
ba89bae
fixup! Factored `Policy` & `EnvOracle` separatly from `BatchedMcts.jl`
Whojo Sep 12, 2022
bd14a97
Add `EnvOracle` to `BatchedMctsAos.jl`
Whojo Sep 12, 2022
1abfcfa
Removed `mcts` from kernels in `BatchedMctsAos.jl`
Whojo Sep 13, 2022
e145030
Move `get_considered_visits_table` to `BatchedMctsUtility`
Whojo Sep 13, 2022
32e7c66
Add `gumbel_explore` to `BatchedMctsAos`
Whojo Sep 13, 2022
86b76d3
Fix GPU Compilation
Whojo Sep 15, 2022
bf82e9e
fixup! Fix GPU Compilation
Whojo Sep 19, 2022
89dc43b
Removed MuZero related code for PR
Whojo Sep 19, 2022
90338dd
BatchedMcts validate_prior on CuArray
Whojo Sep 24, 2022
65c5574
Merge branch 'BatchedMctsAos' into 2_batched_mcts_PR
Whojo Sep 24, 2022
3d43a8e
Revert "BatchedMcts validate_prior on CuArray"
Whojo Sep 25, 2022
9f37342
Handle embedding in BatchedMctsAos's internal_states
Whojo Sep 25, 2022
13d9b56
Removed scalar indexing inside `BatchedMctsAos`'s `eval_state!`
Whojo Sep 25, 2022
33dcc5a
fixup! Removed scalar indexing inside `BatchedMctsAos`'s `eval_state!`
Whojo Sep 25, 2022
32747c0
Merge branch 'BatchedMctsAos' into 2_batched_mcts_PR
Whojo Sep 25, 2022
1528fcb
Adapt `validate_prior` to process `CuArray`
Whojo Sep 26, 2022
157d7a6
`vectorize_state` GPU friendly
Whojo Sep 26, 2022
60239a0
Merge branch 'BatchedMctsAos' into 2_batched_mcts_PR
Whojo Sep 26, 2022
1 change: 1 addition & 0 deletions .gitignore
@@ -9,6 +9,7 @@
 /docs/build/
 /docs/site/
 
+private
 sessions
 archive
 
4 changes: 4 additions & 0 deletions redesign/Project.toml
@@ -7,13 +7,17 @@ version = "0.1.0"
Adapt = "79e6a3ab-5dfb-504d-930d-738a2a938a0e"
CUDA = "052768ef-5323-5732-b1bb-66c8b64840ba"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
EllipsisNotation = "da5c29d0-fa7d-589e-88eb-ea29b0a81949"
Flux = "587475ba-b771-5e3f-ad9e-33799f191a9c"
JET = "c3a54625-cd67-489e-a8e7-0a5a0ff4e31b"
ParameterSchedulers = "d7d3b36b-41b8-4d0d-a2bf-768c6151755e"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
Reexport = "189a3867-3050-52da-a836-e630ba90ab69"
ReinforcementLearningBase = "e575027e-6cd6-5018-9292-cdc6200d2b44"
ReinforcementLearningEnvironments = "25e41dd2-4622-11e9-1641-f1adca772921"
Setfield = "efcf1570-3423-57d1-acb7-fd33fddbac46"
StaticArrays = "90137ffa-7385-5640-81b9-e52037218182"
Statistics = "10745b16-79ce-11e8-11f9-7d13ad32a3b2"
StatsBase = "2913bbd2-ae8a-5f71-8c99-4fb6c76f3a91"
Test = "8dfed614-e22c-5e08-85e1-65c5234f0b40"
Zygote = "e88e6eb3-aa80-5325-afca-941959d7151f"
4 changes: 3 additions & 1 deletion redesign/src/BatchedEnvs.jl
@@ -3,7 +3,7 @@ Interface for batchable environements that can be run on the GPU.
 """
 module BatchedEnvs
 
-export num_actions, valid_action, act, terminated
+export num_actions, valid_action, act, terminated, vectorize_state
 
 function num_actions end
 
@@ -13,4 +13,6 @@ function act end
 
 function terminated end
 
+function vectorize_state end
+
 end
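
The new `vectorize_state` stub only declares a generic function; each environment is expected to add its own method. As a rough sketch of what that could look like, the snippet below re-declares a stand-in `BatchedEnvs` module and implements the interface for a toy environment. The `Countdown` type, every method signature, and the `Float32` vector encoding are assumptions made for illustration. The diff does not show the signatures used by the real environments, and `function valid_action end` is assumed to sit in the lines hidden between the two hunks.

```julia
# Stand-in for redesign/src/BatchedEnvs.jl after this change (assumed layout).
module BatchedEnvs
export num_actions, valid_action, act, terminated, vectorize_state
function num_actions end
function valid_action end
function act end
function terminated end
function vectorize_state end
end

using .BatchedEnvs

# Hypothetical immutable environment: a counter that terminates at zero.
struct Countdown
    remaining::Int
end

# Hypothetical signatures; the real interface may differ.
BatchedEnvs.num_actions(::Countdown) = 2
BatchedEnvs.valid_action(env::Countdown, a) = 1 <= a <= min(2, env.remaining)
BatchedEnvs.act(env::Countdown, a) = Countdown(env.remaining - a)
BatchedEnvs.terminated(env::Countdown) = env.remaining == 0

# Encode the state as a flat Float32 vector, a common input format for a network.
BatchedEnvs.vectorize_state(env::Countdown) = Float32[env.remaining]

env = act(Countdown(3), 2)   # take 2 from 3, leaving 1
x = vectorize_state(env)     # one-element Float32 vector
```

Declaring the function with an empty body keeps the interface module free of any concrete environment, so methods can be added wherever an environment type is defined.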