Releases: Eclectic-Sheep/sheeprl
v0.3.0
v0.3.0 Release Notes
This new release introduces hydra as the default configuration manager. In particular, it fixes #74 and, as a consequence, #75, since the `cnn_keys` and `mlp_keys` can now be specified separately for the encoder and the decoder.
The changes are mainly the following:
- Dreamer-V3 initialization directly follows Hafner's implementation (adapted from https://github.com/NM512/dreamerv3-torch/blob/main/tools.py)
- All `args.py` files and the `HFArgumentParser` have been removed. Configs are now specified under the `sheeprl/configs` folder, and hydra is the default configuration manager
- Every environment wrapper is instantiated directly through `hydra.utils.instantiate` inside the `make_env` or `make_dict_env` method: in this way one can easily modify a wrapper, passing whatever parameters are needed to customize the env. Every wrapper must take as input the `id` parameter, which must be specified in the relative config
- Every optimizer is instantiated directly through `hydra.utils.instantiate` and can be modified through the CLI when running an experiment
- `howto/configs.md` has been added, which explains how the configs are organized inside the repo
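To illustrate the `_target_`-based instantiation that hydra performs for wrappers and optimizers, here is a minimal, dependency-free stand-in for `hydra.utils.instantiate` (the config below is a toy example, not one of the repo's actual configs):

```python
import importlib

def instantiate(config, **overrides):
    """Minimal stand-in for hydra.utils.instantiate: import the object
    named by `_target_` and call it with the remaining keys as kwargs.
    Keyword arguments emulate CLI overrides at experiment run time."""
    module_name, _, attr_name = config["_target_"].rpartition(".")
    target = getattr(importlib.import_module(module_name), attr_name)
    kwargs = {k: v for k, v in config.items() if k != "_target_"}
    kwargs.update(overrides)
    return target(**kwargs)

# Toy config in the same spirit as the ones under sheeprl/configs
# (datetime.timedelta stands in for an env wrapper or optimizer class).
cfg = {"_target_": "datetime.timedelta", "days": 2}
td = instantiate(cfg)           # td.days == 2
td = instantiate(cfg, days=5)   # emulating a CLI override: td.days == 5
```

With the real hydra, the same override would be passed on the command line (e.g. `optimizer.lr=1e-4`-style dotted paths) rather than as Python kwargs.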
v0.2.2
v0.2.1
v0.2.1 Release Notes
- Added the Dreamer-V3 algorithm from https://arxiv.org/abs/2301.04104
- Added the `RestartOnException` wrapper, which recreates and restarts the environments whenever something bad happens during `step` or `reset`. This has been added only to the Dreamer-V3 algorithm
- Renamed classes and functions (in particular the `Player` classes for both Dreamer-V1/V2)
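The idea behind `RestartOnException` can be sketched as follows; this is a simplified, gymnasium-free version, and the real wrapper's interface and return values may differ:

```python
class RestartOnException:
    """Simplified sketch: if step or reset raises, rebuild the
    environment from its factory and continue from a fresh reset."""

    def __init__(self, env_factory):
        self._factory = env_factory
        self._env = env_factory()

    def reset(self):
        try:
            return self._env.reset()
        except Exception:
            self._env = self._factory()
            return self._env.reset()

    def step(self, action):
        try:
            return self._env.step(action)
        except Exception:
            self._env = self._factory()
            obs = self._env.reset()
            # Report a truncated episode so the algorithm restarts cleanly.
            return obs, 0.0, False, True, {}

class _FlakyEnv:
    """Toy env whose first step always crashes."""
    crashes = 1
    def reset(self):
        return 0
    def step(self, action):
        if _FlakyEnv.crashes:
            _FlakyEnv.crashes -= 1
            raise RuntimeError("simulator died")
        return 1, 0.0, False, False, {}

env = RestartOnException(_FlakyEnv)
env.reset()
obs, reward, terminated, truncated, info = env.step(0)  # crash -> restart
recovered = env.step(0)                                  # succeeds normally
```

Crashing once during `step` yields a truncated transition from the freshly reset environment instead of killing the training run.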
v0.2
v0.2 Release notes
- Added `DiambraWrapper`
- Added a multi-encoder/decoder to all the algorithms except DroQ, SAC and PPO Recurrent
- Added multi-discrete support to PPO, Dreamer-V1 and P2E-DV1
- Modified the `make_env` function so that agents can be trained on environments that return both pixel-like and vector-like observations
- Modified the `ReplayBuffer` class to handle multiple observations
- Updated the howtos
- Fixed #66
- Logger creation moved to `sheeprl.utils.logger`
- Env creation moved to `sheeprl.utils.env`
- PPO is now a single-folder algorithm (removed the `ppo_pixel` and `ppo_continuous` folders); `sac_pixel` has been renamed to `sac_ae`
- Added support for `gymnasium==0.29.0`, `mujoco>=2.3.3` and `dm_control>=1.0.12`
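The multi-encoder idea above, splitting a dict observation between image and vector branches, can be sketched like this; the `cnn_keys`/`mlp_keys` names come from the release notes, but the routing below is illustrative and does not reproduce sheeprl's actual tensor handling:

```python
def route_observation(obs, cnn_keys, mlp_keys):
    """Split a dict observation into inputs for a CNN encoder
    (pixel-like keys) and an MLP encoder (vector-like keys).
    Vector observations are flattened and concatenated."""
    cnn_inputs = {k: obs[k] for k in cnn_keys}
    mlp_input = [x for k in mlp_keys for x in obs[k]]
    return cnn_inputs, mlp_input

# A toy mixed observation: a tiny 2x2 "image" and a 3-dim state vector.
obs = {
    "rgb": [[0, 0], [0, 0]],
    "state": [0.1, 0.2, 0.3],
}
cnn_inputs, mlp_input = route_observation(obs, cnn_keys=["rgb"], mlp_keys=["state"])
```

An environment returning only one observation type would simply leave one of the key lists empty, which is also what makes the encoder/decoder key separation in v0.3.0 possible.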
v0.1
v0.1 Release notes
Algorithms implemented:
- Dreamer-V1 (https://arxiv.org/abs/1912.01603)
- Dreamer-V2 (https://arxiv.org/abs/2010.02193)
- Plan2Explore Dreamer-V1-based (https://arxiv.org/abs/2005.05960)
- Plan2Explore Dreamer-V2-based (https://arxiv.org/abs/2005.05960)
- DroQ (https://arxiv.org/abs/2110.02034)
- PPO (https://arxiv.org/abs/1707.06347)
- PPO Recurrent (https://arxiv.org/abs/2205.11104)
- SAC (https://arxiv.org/abs/1812.05905)
- SAC-AE (https://arxiv.org/abs/1910.01741)