Releases: jonathan-laurent/AlphaZero.jl
v0.5.4
AlphaZero v0.5.4
Closed issues:
- Deprecate Util.mapreduce in favor of something more standard (#54)
- Use Base.Logging and ProgressLogging (#55)
- How to use multiple GPUs on a single node (#69)
- Cloud computing (#80)
- Iterative vs continuous learning (#81)
- Does Alpha Zero require a static representation of a scenario (#83)
- Does it make sense to attempt to apply AlphaZero to "Agricola" (#100)
- Multiplayer capability (#101)
- Issue while running this in local (#107)
- The strength of the Mancala bot (#110)
- 6 dependencies errored (#112)
- Issue with dummy_run() (#114)
- StackOverflowError (during training) (#116)
- To continue a training (#118)
- How important is GI.vectorize_state function? (#119)
- When exploring a position, what these abbreviations mean? (#121)
- Cloud service for AlphaZero.jl (#122)
- Number of network parameters (#126)
- Scripts.explore (#136)
- How to disable benchmarks? (#137)
- A log of played games during training (#138)
- Which hyperparameters? (#139)
- Cannot run sample (#143)
- GPU vs CPU (#144)
- MCTS.RolloutOracle(gspec) (#145)
- num_filters=128 (#150)
- NVIDIA GeForce GTX 1650 isn't good? (#151)
- What's the best OS for AlphaZero.jl ? (#153)
- Does it work with Tesla? (#154)
Merged pull requests:
- CompatHelper: bump compat for Distributions to 0.25, (keep existing compat) (#98) (@github-actions[bot])
- CompatHelper: bump compat for Flux to 0.13, (keep existing compat) (#108) (@github-actions[bot])
- Fix typo in readme (#109) (@LilithHafner)
- Update report.jl (#111) (@gwario)
- CompatHelper: bump compat for Setfield to 1, (keep existing compat) (#128) (@github-actions[bot])
- Fix parameter access (#130) (@gwario)
- CompatHelper: bump compat for LoggingExtras to 1, (keep existing compat) (#152) (@github-actions[bot])
v0.5.3
AlphaZero v0.5.3
Closed issues:
- CUDA Error (#57)
- τ=0.5 errors (#67)
- Using continuous rewards (i.e., non ternary games) (#77)
- Support for singleplayer games (#79)
- Debugging in VS Code (#84)
- Desired Type hierarchy for adding GNN's (#85)
- MCTS.explore! must be called before MCTS.policy (#86)
Merged pull requests:
- Extend documentation in CommonRLInterface (#70) (@johannes-fischer)
- Fix error with discounting in RolloutOracle (#73) (@johannes-fischer)
- Replace mkdir by mkpath (#74) (@johannes-fischer)
- Use joinpath to make code more robust on Windows machines (#75) (@johannes-fischer)
- Update experiment.md (#82) (@yutaizhou)
- Fix memory analysis (#89) (@johannes-fischer)
- call batch on vectors (not generators) (#91) (@CarloLucibello)
- add CompatHelper (#92) (@CarloLucibello)
- CompatHelper: bump compat for Setfield to 0.8, (keep existing compat) (#94) (@github-actions[bot])
- CompatHelper: bump compat for ExprTools to 0.1, (keep existing compat) (#95) (@github-actions[bot])
- CompatHelper: bump compat for Documenter to 0.27, (keep existing compat) (#96) (@github-actions[bot])
- CompatHelper: bump compat for Distributions to 0.25, (keep existing compat) (#97) (@github-actions[bot])
v0.5.2
AlphaZero v0.5.2
Closed issues:
- Support for OpenSpiel games? (#15)
- How to use AlphaZero.jl for Openspiel games? (#46)
- Current status of Multi-threading MCTS Benchmarking? (#56)
- Performance Docs (#58)
- isprobvec(p) error? (#59)
- Do these readout look correct? (#60)
- Benchmark Questions? (#61)
- Does Scripts.play("connect-four") cheat? (#62)
- isprobvec(p) whenever using Benchmark.NetworkOnly(τ=0.5) (#63)
- How exactly does Alphazero's MCTS work? (#64)
- Any idea what's causing this? (#65)
Merged pull requests:
- OpenSpiel.jl support (#68) (@michelangelo21)
v0.5.1
AlphaZero v0.5.1
Closed issues:
- API discussion (#4)
- self play takes more and more time (#41)
- Supervised learning (#48)
- MCTS Optimization for sparse actions (#49)
- Training on the cloud / multiple instances / clusters (#50)
- Any Tips for per-player tracking? (#51)
- Sanity Checks (#52)
- Speed issues? (#53)
Merged pull requests:
- Mancala - fixed set_state!() (#44) (@michelangelo21)
- Invert temperature in formula (documentation) (#45) (@johannes-fischer)
v0.5.0
AlphaZero v0.5.0
- Improved the inference server so that it is now possible to keep MCTS workers
  running while a batch of requests is being processed by the GPU. Concretely,
  this translates into SimParams now having two separate num_workers and
  batch_size parameters.
- The inference server is now spawned on a separate thread to ensure minimal latency.
Together, the two aforementioned improvements result in a 30% global speedup on the
connect-four benchmark.
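The idea behind separating the number of workers from the batch size can be illustrated with a minimal sketch. This is not AlphaZero.jl's implementation (which is written in Julia); it is a conceptual Python model under stated assumptions: fake_network, inference_server, worker and run are hypothetical names, and the "network" simply doubles its inputs. A server thread collects pending requests into batches of at most batch_size, and because num_workers can exceed batch_size, some workers can submit new requests while a batch is being evaluated.

```python
import queue
import threading
from concurrent.futures import Future


def fake_network(batch):
    # Hypothetical stand-in for a batched GPU forward pass: doubles each input.
    return [2 * x for x in batch]


def inference_server(requests, batch_size, batch_sizes_seen):
    """Serve requests in batches of at most batch_size until a None sentinel arrives."""
    done = False
    while not done:
        item = requests.get()  # block until at least one request is pending
        if item is None:
            break
        batch = [item]
        # Grab whatever else is already queued, up to batch_size, without waiting.
        while len(batch) < batch_size:
            try:
                nxt = requests.get_nowait()
            except queue.Empty:
                break
            if nxt is None:
                done = True
                break
            batch.append(nxt)
        outputs = fake_network([x for x, _ in batch])
        batch_sizes_seen.append(len(batch))
        for (_, fut), y in zip(batch, outputs):
            fut.set_result(y)  # wake up the worker waiting on this request


def worker(worker_id, requests, num_queries, results):
    # Each worker submits one request at a time and blocks until it is answered.
    total = 0
    for step in range(num_queries):
        fut = Future()
        requests.put((worker_id + step, fut))
        total += fut.result()
    results[worker_id] = total


def run(num_workers=8, batch_size=4, num_queries=5):
    requests = queue.Queue()
    batch_sizes_seen = []
    results = {}
    server = threading.Thread(
        target=inference_server, args=(requests, batch_size, batch_sizes_seen)
    )
    server.start()
    workers = [
        threading.Thread(target=worker, args=(i, requests, num_queries, results))
        for i in range(num_workers)
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    requests.put(None)  # sentinel: tell the server to stop
    server.join()
    return results, batch_sizes_seen
```

Calling run(num_workers=8, batch_size=4) serves every request in batches of at most four, while the remaining workers stay free to queue their next request. The actual batch sizes depend on thread scheduling, so only the upper bound is guaranteed.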
v0.4.0
AlphaZero v0.4.0
This release brings many new features to AlphaZero.jl including:
- Added support for CommonRLInterface.jl.
- Added a grid-world MDP example illustrating this new interface.
- Added support for distributed training: it is now as easy to train an agent on
  a cluster of machines as on a single computer.
- Replaced the async MCTS implementation by a more straightforward synchronous
  implementation. Network inference requests are now batched across game simulations.
- Added the Experiment and Scripts modules to simplify common tasks.
See CHANGELOG.md for details.
Closed issues:
- Connect Four training must be restarted about every 24 hours due to an OOM error (#1)
- The Flux backend is currently broken (#2)
- Importation of training parameters from JSON is broken (#3)
- UndefVarError: lib not defined when training a connect four agent (#5)
- Possibility to skip initial benchmark (#6)
- Assertion error during apply_symmetry (#7)
- Checkpoint evaluation randomly fails (#8)
- MDP Version (#9)
- Suggestion: replace Oracle with just a function (#10)
- @unimplemented (#11)
- Some issues with installing the package (#12)
- Register package with General registry (#13)
- Missing repository's website (#16)
- fail to explore (#17)
- CuDNN error (#18)
- using AlphaZero (#19)
- UndefVarError: lib not defined (#20)
- LoadError: CUBLASError (#21)
- Error building Knet (#22)
- LoadError: InitError: CUDA.jl does not yet support CUDA with nvdisasm 11.1.74; (#23)
- CuDNN error 8 on Ubuntu 18.04, Julia 1.5.2 (#24)
- Stateful Game-structs throw errors (#25)
- LSTM support (#28)
- CUDA vs CUDAnative? (#29)
- Embed trained network in javascript web app for browser-based inference? (#30)
- Connect Four iteration training time is taking a long time (#31)
- Question about symmetries (#32)
- Question about function test_symmetry (#33)
- Migrate neural net agents across AlphaZero.jl instances? (#34)
- Can a game know its players' types? (#35)
- Exploit several CPU (#36)
- Exploit multiple GPUs (#37)
- Enumerating actions without state (#38)
- fatal: Remote branch v0.4.0 not found in upstream origin (#39)
Merged pull requests:
- Mancala (#42) (@michelangelo21)