Feature/episode buffer np #121

Merged
belerico merged 11 commits into feature/buffer-np from feature/episode_buffer_np
Oct 12, 2023

Conversation

michele-milesi
Member

Summary

Describe the purpose of the pull request, including:

  • Add EpisodeBuffer Numpy
  • Add tests
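
As a rough illustration of the kind of structure a NumPy-backed episode buffer might have, here is a minimal sketch. The class name `EpisodeBufferSketch`, its parameters, and the `dones` key are assumptions made for this example; they are not the API added by this pull request. The sketch stores whole episodes as dictionaries of arrays, evicts the oldest episodes when the step capacity is exceeded, and samples fixed-length sequences.

```python
import numpy as np


class EpisodeBufferSketch:
    """Hypothetical episode buffer: stores whole episodes as dicts of NumPy arrays."""

    def __init__(self, buffer_size, minimum_episode_length, seed=None):
        self.buffer_size = buffer_size  # maximum total number of stored steps
        self.minimum_episode_length = minimum_episode_length
        self.episodes = []  # list of episode dicts, oldest first
        self.rng = np.random.default_rng(seed)

    def __len__(self):
        # Total number of steps currently stored across all episodes
        return sum(len(ep["dones"]) for ep in self.episodes)

    def add(self, episode):
        length = len(episode["dones"])
        if length < self.minimum_episode_length:
            raise ValueError("episode is shorter than minimum_episode_length")
        if length > self.buffer_size:
            raise ValueError("episode is longer than buffer_size")
        # Evict the oldest episodes until the new one fits
        while len(self) + length > self.buffer_size:
            self.episodes.pop(0)
        self.episodes.append({k: np.asarray(v) for k, v in episode.items()})

    def sample(self, batch_size, sequence_length):
        # Only episodes that can hold a full sequence are eligible
        valid = [ep for ep in self.episodes if len(ep["dones"]) >= sequence_length]
        if not valid:
            raise RuntimeError("no stored episode is long enough to sample from")
        batch = {k: [] for k in valid[0]}
        for _ in range(batch_size):
            ep = valid[self.rng.integers(len(valid))]
            start = self.rng.integers(len(ep["dones"]) - sequence_length + 1)
            for k, v in ep.items():
                batch[k].append(v[start:start + sequence_length])
        # Each output has shape (sequence_length, batch_size, *feature_dims)
        return {k: np.stack(v, axis=1) for k, v in batch.items()}
```

Under these assumptions, `sample(batch_size=6, sequence_length=3)` on episodes with 3-dimensional observations returns an `obs` array of shape `(3, 6, 3)`.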

Type of Change

Please select the one relevant option below:

  • New feature (non-breaking change that adds functionality)

Checklist

Please confirm that the following tasks have been completed:

  • I have tested my changes locally and they work as expected. (Please describe the tests you performed.)
  • I have added unit tests for my changes, or updated existing tests if necessary.
  • I have updated the documentation, if applicable.
  • I have installed pre-commit and run locally for my code changes.

belerico merged commit b134b4f into feature/buffer-np Oct 12, 2023
belerico deleted the feature/episode_buffer_np branch October 12, 2023 08:03
belerico added a commit that referenced this pull request Dec 19, 2023
* Add first PPO numpy buffer implementation

* Add distribution cfg to agent

* No need for tensordict

* Add SAC numpy

* Improve sample_next_obs

* Add DV1 with numpy buffer

* Too many reshapes

* Add Sequential and EnvIndependent np buffers

* Fewer reshapes

* Faster indexing + from_numpy parameter

* Dreamer-V2 numpy

* Fix buffer add

* Better indexing

* Fix indexes to sample

* Fix metrics when they are nan

* Fix reshape when bootstrapping + fix normalization

* Guard timer metrics

* np.intp for indexing

* Change dtype after creating the tensor

* Fix buf[key] after __getstate__ is called upon checkpoint

* Securely close fd on __getstate__()

* Add MemmapArray

* Add __len__ function

* Fix len

* Better array setter and __del__ now controls ownership

* Do not transfer ownership upon array setter

* Add properties

* Feature/episode buffer np (#121)

* feat: added episode buffer numpy

* fix: memmap episode buffer numpy

* fix: checkpoint when memmap=True EpisodeBufferNumpy

* fix: memmap episode buffer np

* tests: added tests for episode buffer np

* feat: update episode buffer, added MemmapArray

* Fix not use self._obs_keys

* Sample only if n > 0

* Fix shapes

* feat: added possibility to specify sequence length in sample() + added possibility to add data only to some env

* tests: update episode buffer numpy tests

* tests: added replay buffer np tests

* tests: added sequential replay buffer np tests

* fix: env independent replay buffer name

* fix: replay buffer + add tests

* Safely release buffer on Windows

* Safely delete memmaps

* Del buffer

* Safer array setter

* Add Memmap.from_array

* Fix ReplayBuffer __setitem__

* fix: sac_np sample

* tests: update tests

* tests: update

* fix: sequential replay buffer sample clone

* Add tests + Fix MemmapArray on Windows

* Add tests to run only on Linux

* Fix tests

* Fix skip test on Windows

* Dreamer-V2 with EpisodeBuffer np

* Add user warning if file exists when creating a new MemmapArray

* feat: added dreamer v3 np

* Add docstrings + Fix array setter if shapes differ

* Fix tests

* Add docstring

* Docstrings

* fix: sample of env independent buffer

* Fix locked tensordict

* Add configs

* feat: update np algorithms with new specifications

* fix: mypy

* PokemonRed env from https://github.com/PWhiddy/PokemonRedExperiments/blob/master/baselines/red_gym_env.py

* Update dreamer_v3 with main

* Update dreamer_v2 with main

* Update dreamer_v1 with main

* Update ppo with main

* Update sac with main

* Amend numpy to torch dtype and back dicts

* feat: added np callback

* fix: np callback

* feat: add support functions in np checkpoint callback

* feat: added droq np

* feat: added ppo recurrent np

* feat: added sac-ae np

* Update dreamer algos with main

* feat: added p2e dv1 np

* feat: added p2e dv2 np

* feat: add p2e dv3 np

* feat: added ppo decoupled np

* feat: add sac decoupled

* np.tanh instead of torch.tanh

* feat: from tensordict to buffers np

* from td to np

* exclude mlflow from tests

* No more tensordict

* Updated howto

* Fix tests

* .cpu().numpy() just one time

* Removed old cfgs

* Convert all when hydra instantiating

* convert all on instantiate

* [skip-ci] Removed pokemon files

* fix: git merge related errors

* Fix get absolute path

* Amend dreamer-v3 pokemon config

---------

Co-authored-by: michele-milesi <[email protected]>
Co-authored-by: Michele Milesi <[email protected]>
belerico added a commit that referenced this pull request Dec 19, 2023
* feat: added user choice from as_tensor and from_numpy in sample_tensors and to_tensor

---------

Co-authored-by: belerico <[email protected]>
Co-authored-by: belerico_t <[email protected]>
belerico added a commit that referenced this pull request Dec 19, 2023
* feat: added keep_last parameter

* docs: update

* Removed dict from config

---------

Co-authored-by: belerico <[email protected]>
Co-authored-by: belerico_t <[email protected]>
2 participants