Refactor to use tyro #424

Merged 44 commits on Nov 28, 2023
cd4851e
Refactor to use tyro
vwxyzjn Oct 16, 2023
b97d54f
push
vwxyzjn Oct 16, 2023
b87a015
psuh
vwxyzjn Oct 16, 2023
896f346
refactor
vwxyzjn Oct 16, 2023
6220645
fix pre-commit
vwxyzjn Oct 16, 2023
adbf836
fix pre-commit
vwxyzjn Oct 16, 2023
8af1e13
fix commend
vwxyzjn Oct 16, 2023
0b61550
Merge branch 'master' into refactor-tyro
sdpkjc Oct 16, 2023
96a56b8
refactor
vwxyzjn Oct 16, 2023
a8795a9
Merge branch 'refactor-tyro' of https://github.com/vwxyzjn/cleanrl in…
vwxyzjn Oct 16, 2023
cb6b47a
update poetry
vwxyzjn Oct 16, 2023
cfeedb0
fix test case
vwxyzjn Oct 16, 2023
9c0959c
quick fix
vwxyzjn Oct 16, 2023
5f3f716
fix
vwxyzjn Oct 17, 2023
08f4392
update optuna
vwxyzjn Oct 17, 2023
de6c829
quick change
vwxyzjn Oct 17, 2023
b09e088
fix ppg
vwxyzjn Oct 17, 2023
e92cf57
quick fix
vwxyzjn Oct 17, 2023
57b05fb
fix optuna
vwxyzjn Oct 17, 2023
17f49db
quick change
vwxyzjn Oct 17, 2023
cbbdc8b
fix
vwxyzjn Oct 17, 2023
e69b317
quick change
vwxyzjn Oct 17, 2023
f83a218
quick change
vwxyzjn Oct 17, 2023
86e6275
fix bug in multi-gpu
vwxyzjn Nov 8, 2023
bf5368a
refactor benchmark, support slurm
vwxyzjn Nov 8, 2023
aec360b
remove mujoco_py stuff
vwxyzjn Nov 9, 2023
46efc25
add slurm template
vwxyzjn Nov 9, 2023
072eafb
pre-commit
vwxyzjn Nov 9, 2023
b2542e0
update ddpg docs
vwxyzjn Nov 9, 2023
33a5609
update td3 docs
vwxyzjn Nov 9, 2023
4d8c3da
update sac
vwxyzjn Nov 9, 2023
70702cf
bug fix
vwxyzjn Nov 13, 2023
4c09502
Merge branch 'refactor-tyro' of https://github.com/vwxyzjn/cleanrl in…
vwxyzjn Nov 13, 2023
60b71f7
update docs
vwxyzjn Nov 27, 2023
7a96de2
update ppo docs
vwxyzjn Nov 27, 2023
89846df
bump version
vwxyzjn Nov 27, 2023
4f0dc48
bump version
vwxyzjn Nov 27, 2023
d821748
bump test cases
vwxyzjn Nov 27, 2023
7880155
add benchmark utility docs
vwxyzjn Nov 27, 2023
50ec155
bump test
vwxyzjn Nov 27, 2023
940595a
fix #418
vwxyzjn Nov 27, 2023
b0caf45
update requirements.txt
vwxyzjn Nov 27, 2023
aaf7dd0
test
vwxyzjn Nov 27, 2023
2fb4814
add numpy
vwxyzjn Nov 28, 2023
2 changes: 1 addition & 1 deletion .github/pull_request_template.md
@@ -22,7 +22,7 @@
 If you need to run benchmark experiments for a performance-impacting changes:

 - [ ] I have contacted @vwxyzjn to obtain access to the [openrlbenchmark W&B team](https://wandb.ai/openrlbenchmark).
-- [ ] I have used the [benchmark utility](/get-started/benchmark-utility/) to submit the tracked experiments to the [openrlbenchmark/cleanrl](https://wandb.ai/openrlbenchmark/cleanrl) W&B project, optionally with `--capture-video`.
+- [ ] I have used the [benchmark utility](/get-started/benchmark-utility/) to submit the tracked experiments to the [openrlbenchmark/cleanrl](https://wandb.ai/openrlbenchmark/cleanrl) W&B project, optionally with `--capture_video`.
 - [ ] I have performed RLops with `python -m openrlbenchmark.rlops`.
 - For new feature or bug fix:
 - [ ] I have used the RLops utility to understand the performance impact of the changes and confirmed there is no regression.
54 changes: 16 additions & 38 deletions .github/workflows/tests.yaml
@@ -1,10 +1,5 @@
 name: tests
 on:
-push:
-paths-ignore:
-- '**/README.md'
-- 'docs/**/*'
-- 'cloud/**/*'
 pull_request:
 paths-ignore:
 - '**/README.md'
@@ -15,8 +10,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04, macos-latest, windows-latest]
 runs-on: ${{ matrix.os }}
 steps:
@@ -58,8 +53,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04, macos-latest, windows-latest]
 runs-on: ${{ matrix.os }}
 steps:
@@ -94,8 +89,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04, macos-latest, windows-latest]
 runs-on: ${{ matrix.os }}
 steps:
@@ -120,8 +115,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04]
 runs-on: ${{ matrix.os }}
 steps:
@@ -180,8 +175,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04]
 runs-on: ${{ matrix.os }}
 steps:
@@ -194,29 +189,12 @@ jobs:
 with:
 poetry-version: ${{ matrix.poetry-version }}

-# mujoco_py tests
-- name: Install dependencies
-run: poetry install -E "pytest mujoco_py mujoco jax"
-- name: Run gymnasium migration dependencies
-run: poetry run pip install "stable_baselines3==2.0.0a1"
-- name: Downgrade setuptools
-run: poetry run pip install setuptools==59.5.0
-- name: install mujoco_py dependencies
-run: |
-sudo apt-get update && sudo apt-get -y install wget unzip software-properties-common \
-libgl1-mesa-dev \
-libgl1-mesa-glx \
-libglew-dev \
-libosmesa6-dev patchelf
-- name: Run mujoco_py tests
-run: poetry run pytest tests/test_mujoco_py.py
-
 test-envpool-envs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04]
 runs-on: ${{ matrix.os }}
 steps:
@@ -241,8 +219,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04]
 runs-on: ${{ matrix.os }}
 steps:
@@ -267,8 +245,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04]
 runs-on: ${{ matrix.os }}
 steps:
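The matrix changes in this workflow multiply the number of CI jobs, since one job runs per combination of matrix values. The sketch below is plain Python, not GitHub Actions itself; it counts the combinations for the jobs that run on all three operating systems. It also illustrates a likely reason the versions are now quoted, which is an inference rather than something the PR states: a YAML parser reads an unquoted `3.10` as a float, which drops the trailing zero.

```python
from itertools import product

# Matrix values copied from the updated workflow.
python_versions = ["3.8", "3.9", "3.10"]
poetry_versions = ["1.7"]
operating_systems = ["ubuntu-22.04", "macos-latest", "windows-latest"]

# One CI job is created per combination of matrix values.
jobs = list(product(python_versions, poetry_versions, operating_systems))
print(len(jobs))  # 9

# An unquoted 3.10 is a float, so it would be read as version 3.1.
print(str(float("3.10")))  # 3.1
```

Jobs restricted to `ubuntu-22.04` get three combinations instead of nine.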
4 changes: 2 additions & 2 deletions .github/workflows/utils_test.yaml
@@ -15,8 +15,8 @@ jobs:
 strategy:
 fail-fast: false
 matrix:
-python-version: [3.8]
-poetry-version: [1.3.1]
+python-version: ["3.8", "3.9", "3.10"]
+poetry-version: ["1.7"]
 os: [ubuntu-22.04]
 runs-on: ${{ matrix.os }}
 steps:
2 changes: 2 additions & 0 deletions .gitignore
@@ -1,3 +1,5 @@
+slurm
+.aim
 runs
 balance_bot.xml
 cleanrl/ppo_continuous_action_isaacgym/isaacgym/examples
4 changes: 0 additions & 4 deletions .pre-commit-config.yaml
@@ -56,10 +56,6 @@ repos:
 name: poetry-export requirements-dm_control.txt
 args: ["--without-hashes", "-o", "requirements/requirements-dm_control.txt", "-E", "dm_control"]
 stages: [manual]
-- id: poetry-export
-name: poetry-export requirements-mujoco_py.txt
-args: ["--without-hashes", "-o", "requirements/requirements-mujoco_py.txt", "-E", "mujoco_py"]
-stages: [manual]
 - id: poetry-export
 name: poetry-export requirements-procgen.txt
 args: ["--without-hashes", "-o", "requirements/requirements-procgen.txt", "-E", "procgen"]
5 changes: 5 additions & 0 deletions README.md
@@ -191,3 +191,8 @@ If you use CleanRL in your work, please cite our technical [paper](https://www.j
url = {http://jmlr.org/papers/v23/21-1342.html}
}
```
+
+
+## Acknowledgement
+
+We thank [Hugging Face](https://huggingface.co/)'s cluster for providing GPU computational resources to this project.
12 changes: 6 additions & 6 deletions benchmark/c51.sh
@@ -1,29 +1,29 @@
 poetry install
 OMP_NUM_THREADS=1 xvfb-run -a poetry run python -m cleanrl_utils.benchmark \
 --env-ids CartPole-v1 Acrobot-v1 MountainCar-v0 \
---command "poetry run python cleanrl/c51.py --cuda False --track --capture-video" \
+--command "poetry run python cleanrl/c51.py --no_cuda --track --capture_video" \
 --num-seeds 3 \
 --workers 9

 poetry install -E atari
 OMP_NUM_THREADS=1 xvfb-run -a poetry run python -m cleanrl_utils.benchmark \
 --env-ids PongNoFrameskip-v4 BeamRiderNoFrameskip-v4 BreakoutNoFrameskip-v4 \
---command "poetry run python cleanrl/c51_atari.py --track --capture-video" \
+--command "poetry run python cleanrl/c51_atari.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1

 poetry install -E "jax"
-poetry run pip install --upgrade "jax[cuda]==0.3.17" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+poetry run pip install --upgrade "jax[cuda11_cudnn82]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 CUDA_VISIBLE_DEVICES=-1 xvfb-run -a python -m cleanrl_utils.benchmark \
 --env-ids CartPole-v1 Acrobot-v1 MountainCar-v0 \
---command "poetry run python cleanrl/c51_jax.py --track --capture-video" \
+--command "poetry run python cleanrl/c51_jax.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1

 poetry install -E "atari jax"
-poetry run pip install --upgrade "jax[cuda]==0.3.17" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+poetry run pip install --upgrade "jax[cuda11_cudnn82]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 xvfb-run -a python -m cleanrl_utils.benchmark \
 --env-ids PongNoFrameskip-v4 BeamRiderNoFrameskip-v4 BreakoutNoFrameskip-v4 \
---command "poetry run python cleanrl/c51_atari_jax.py --track --capture-video" \
+--command "poetry run python cleanrl/c51_atari_jax.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1
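The flag changes in this script follow two patterns: `--cuda False` becomes the negated flag `--no_cuda`, and `--capture-video` becomes `--capture_video`. As a rough illustration, the sketch below applies those two rewrites to an old-style command string. `migrate_command`, `FLAG_RENAMES`, and `BOOL_FLAGS` are hypothetical names invented for this example; they are not part of CleanRL or tyro, and the tables only cover the renames visible in this diff.

```python
import shlex

# Renames observed in this diff; every other flag passes through unchanged.
FLAG_RENAMES = {"--capture-video": "--capture_video"}
# Old-style boolean options that took an explicit True/False value,
# mapped to their (enabled, disabled) flag spellings.
BOOL_FLAGS = {"--cuda": ("--cuda", "--no_cuda")}


def migrate_command(command: str) -> str:
    """Rewrite an old-style command line into the new flag style."""
    tokens = shlex.split(command)
    out = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        has_value = i + 1 < len(tokens) and tokens[i + 1] in ("True", "False")
        if token in BOOL_FLAGS and has_value:
            enabled, disabled = BOOL_FLAGS[token]
            out.append(enabled if tokens[i + 1] == "True" else disabled)
            i += 2  # skip the flag and its True/False value
            continue
        out.append(FLAG_RENAMES.get(token, token))
        i += 1
    return shlex.join(out)


old = "poetry run python cleanrl/c51.py --cuda False --track --capture-video"
print(migrate_command(old))
```

For the command above this produces the same string as the updated `--command` argument for `cleanrl/c51.py`.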
21 changes: 21 additions & 0 deletions benchmark/cleanrl_1gpu.slurm_template
@@ -0,0 +1,21 @@
+#!/bin/bash
+#SBATCH --job-name=low-priority
+#SBATCH --partition=production-cluster
+#SBATCH --gpus-per-task={{gpus_per_task}}
+#SBATCH --cpus-per-gpu={{cpus_per_gpu}}
+#SBATCH --ntasks={{ntasks}}
+#SBATCH --output=slurm/logs/%x_%j.out
+#SBATCH --array={{array}}
+#SBATCH --mem-per-cpu=12G
+#SBATCH --exclude=ip-26-0-146-[33,100,122-123,149,183,212,249],ip-26-0-147-[6,94,120,141],ip-26-0-152-[71,101,119,178,186,207,211],ip-26-0-153-[6,62,112,132,166,251],ip-26-0-154-[38,65],ip-26-0-155-[164,174,187,217],ip-26-0-156-[13,40],ip-26-0-157-27
+##SBATCH --nodelist=ip-26-0-147-204
+{{nodes}}
+
+env_ids={{env_ids}}
+seeds={{seeds}}
+env_id=${env_ids[$SLURM_ARRAY_TASK_ID / {{len_seeds}}]}
+seed=${seeds[$SLURM_ARRAY_TASK_ID % {{len_seeds}}]}
+
+echo "Running task $SLURM_ARRAY_TASK_ID with env_id: $env_id and seed: $seed"
+
+srun {{command}} --env-id $env_id --seed $seed #
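The template maps each Slurm array task to one (environment, seed) pair: integer division of the task ID by the number of seeds selects the environment, and the remainder selects the seed. A minimal Python sketch of the same arithmetic follows. `select_job` is a hypothetical helper name, and the seed values are made up for illustration.

```python
def select_job(task_id, env_ids, seeds):
    """Mirror the template's bash arithmetic for one array task."""
    env_id = env_ids[task_id // len(seeds)]  # integer division picks the environment
    seed = seeds[task_id % len(seeds)]       # remainder picks the seed
    return env_id, seed


env_ids = ["HalfCheetah-v4", "Walker2d-v4"]  # illustrative subset
seeds = [1, 2, 3]                            # hypothetical seed values

# Every task ID in the array range maps to a distinct pair.
for task_id in range(len(env_ids) * len(seeds)):
    print(task_id, select_job(task_id, env_ids, seeds))
```

With this layout the array needs `len(env_ids) * len(seeds)` task IDs, starting at 0, to cover every pair exactly once.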
32 changes: 19 additions & 13 deletions benchmark/ddpg.sh
@@ -1,16 +1,22 @@
-poetry install -E "mujoco_py"
-python -c "import mujoco_py"
-xvfb-run -a python -m cleanrl_utils.benchmark \
---env-ids HalfCheetah-v2 Walker2d-v2 Hopper-v2 InvertedPendulum-v2 Humanoid-v2 Pusher-v2 \
---command "poetry run python cleanrl/ddpg_continuous_action.py --track --capture-video" \
+poetry install -E "mujoco"
+python -m cleanrl_utils.benchmark \
+--env-ids HalfCheetah-v4 Walker2d-v4 Hopper-v4 InvertedPendulum-v4 Humanoid-v4 Pusher-v4 \
+--command "poetry run python cleanrl/ddpg_continuous_action.py --track" \
 --num-seeds 3 \
---workers 1
+--workers 18 \
+--slurm-gpus-per-task 1 \
+--slurm-ntasks 1 \
+--slurm-total-cpus 10 \
+--slurm-template-path benchmark/cleanrl_1gpu.slurm_template

-poetry install -E "mujoco_py jax"
-poetry run pip install --upgrade "jax[cuda]==0.3.17" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
-poetry run python -c "import mujoco_py"
-xvfb-run -a poetry run python -m cleanrl_utils.benchmark \
---env-ids HalfCheetah-v2 Walker2d-v2 Hopper-v2 \
---command "poetry run python cleanrl/ddpg_continuous_action_jax.py --track --capture-video" \
+poetry install -E "mujoco jax"
+poetry run pip install --upgrade "jax[cuda11_cudnn82]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+poetry run python -m cleanrl_utils.benchmark \
+--env-ids HalfCheetah-v4 Walker2d-v4 Hopper-v4 InvertedPendulum-v4 Humanoid-v4 Pusher-v4 \
+--command "poetry run python cleanrl/ddpg_continuous_action_jax.py --track" \
 --num-seeds 3 \
---workers 1
+--workers 18 \
+--slurm-gpus-per-task 1 \
+--slurm-ntasks 1 \
+--slurm-total-cpus 10 \
+--slurm-template-path benchmark/cleanrl_1gpu.slurm_template
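The `--slurm-*` flags presumably supply values for the `{{...}}` placeholders in `benchmark/cleanrl_1gpu.slurm_template`. The diff does not show how `cleanrl_utils.benchmark` does this, so the sketch below is a guess at the mechanism, not the utility's actual code. In particular, how `cpus_per_gpu` and `array` are derived, and the seed values, are assumptions. `render` is a hypothetical helper.

```python
def render(template: str, values: dict) -> str:
    """Replace each {{key}} placeholder with its value."""
    for key, value in values.items():
        template = template.replace("{{" + key + "}}", str(value))
    return template


# Fragment of benchmark/cleanrl_1gpu.slurm_template.
TEMPLATE = """#SBATCH --gpus-per-task={{gpus_per_task}}
#SBATCH --cpus-per-gpu={{cpus_per_gpu}}
#SBATCH --ntasks={{ntasks}}
#SBATCH --array={{array}}
env_ids={{env_ids}}
seeds={{seeds}}
"""

env_ids = ["HalfCheetah-v4", "Walker2d-v4", "Hopper-v4",
           "InvertedPendulum-v4", "Humanoid-v4", "Pusher-v4"]
seeds = [1, 2, 3]  # hypothetical values for --num-seeds 3
workers, gpus_per_task, ntasks, total_cpus = 18, 1, 1, 10
num_jobs = len(env_ids) * len(seeds)

values = {
    "gpus_per_task": gpus_per_task,
    # Assumption: the CPU budget is split evenly across the requested GPUs.
    "cpus_per_gpu": total_cpus // (gpus_per_task * ntasks),
    "ntasks": ntasks,
    # Assumption: one array task per (env, seed) pair; Slurm's "%N" suffix
    # caps how many array tasks run at once.
    "array": f"0-{num_jobs - 1}%{workers}",
    "env_ids": "(" + " ".join(env_ids) + ")",        # bash array literal
    "seeds": "(" + " ".join(map(str, seeds)) + ")",  # bash array literal
}
rendered = render(TEMPLATE, values)
print(rendered)
```

Six environments and three seeds give 18 array tasks, which matches `--workers 18` if all tasks are meant to run concurrently.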
20 changes: 20 additions & 0 deletions benchmark/ddpg_plot.sh
@@ -0,0 +1,20 @@
+python -m openrlbenchmark.rlops \
+--filters '?we=openrlbenchmark&wpn=cleanrl&ceik=env_id&cen=exp_name&metric=charts/episodic_return' \
+'ddpg_continuous_action?tag=pr-424' \
+--env-ids HalfCheetah-v4 Walker2d-v4 Hopper-v4 InvertedPendulum-v4 Humanoid-v4 Pusher-v4 \
+--no-check-empty-runs \
+--pc.ncols 3 \
+--pc.ncols-legend 2 \
+--output-filename benchmark/cleanrl/ddpg \
+--scan-history
+
+python -m openrlbenchmark.rlops \
+--filters '?we=openrlbenchmark&wpn=cleanrl&ceik=env_id&cen=exp_name&metric=charts/episodic_return' \
+'ddpg_continuous_action?tag=pr-424' \
+'ddpg_continuous_action_jax?tag=pr-424' \
+--env-ids HalfCheetah-v4 Walker2d-v4 Hopper-v4 InvertedPendulum-v4 Humanoid-v4 Pusher-v4 \
+--no-check-empty-runs \
+--pc.ncols 3 \
+--pc.ncols-legend 2 \
+--output-filename benchmark/cleanrl/ddpg_jax \
+--scan-history
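The `--filters` arguments are written as URL-style query strings: an optional name, a `?`, then `key=value` pairs joined by `&`. The sketch below only shows how such a string splits apart using the standard library. It is not how `openrlbenchmark.rlops` parses it internally, and `parse_filter` is a hypothetical helper. The keys `we` and `wpn` appear to name the W&B entity and project, and `ceik` and `cen` the config keys for environment ID and experiment name; that reading comes from the values alone and is not confirmed by this diff.

```python
from urllib.parse import parse_qs


def parse_filter(spec: str):
    """Split 'name?key=value&...' into (name, {key: value})."""
    name, _, query = spec.partition("?")
    params = {key: vals[0] for key, vals in parse_qs(query).items()}
    return name, params


shared = "?we=openrlbenchmark&wpn=cleanrl&ceik=env_id&cen=exp_name&metric=charts/episodic_return"
print(parse_filter(shared))
print(parse_filter("ddpg_continuous_action?tag=pr-424"))
```

The first filter has an empty name and carries the shared settings; the later ones each name an experiment and restrict it to runs tagged `pr-424`.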
12 changes: 6 additions & 6 deletions benchmark/dqn.sh
@@ -1,29 +1,29 @@
 poetry install
 OMP_NUM_THREADS=1 xvfb-run -a poetry run python -m cleanrl_utils.benchmark \
 --env-ids CartPole-v1 Acrobot-v1 MountainCar-v0 \
---command "poetry run python cleanrl/dqn.py --cuda False --track --capture-video" \
+--command "poetry run python cleanrl/dqn.py --no_cuda --track --capture_video" \
 --num-seeds 3 \
 --workers 9

 poetry install -E atari
 OMP_NUM_THREADS=1 xvfb-run -a poetry run python -m cleanrl_utils.benchmark \
 --env-ids PongNoFrameskip-v4 BeamRiderNoFrameskip-v4 BreakoutNoFrameskip-v4 \
---command "poetry run python cleanrl/dqn_atari.py --track --capture-video" \
+--command "poetry run python cleanrl/dqn_atari.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1

 poetry install -E jax
-poetry run pip install --upgrade "jax[cuda]==0.3.17" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+poetry run pip install --upgrade "jax[cuda11_cudnn82]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 xvfb-run -a python -m cleanrl_utils.benchmark \
 --env-ids CartPole-v1 Acrobot-v1 MountainCar-v0 \
---command "poetry run python cleanrl/dqn_jax.py --track --capture-video" \
+--command "poetry run python cleanrl/dqn_jax.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1

 poetry install -E "atari jax"
-poetry run pip install --upgrade "jax[cuda]==0.3.17" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+poetry run pip install --upgrade "jax[cuda11_cudnn82]==0.4.8" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 xvfb-run -a python -m cleanrl_utils.benchmark \
 --env-ids PongNoFrameskip-v4 BeamRiderNoFrameskip-v4 BreakoutNoFrameskip-v4 \
---command "poetry run python cleanrl/dqn_atari_jax.py --track --capture-video" \
+--command "poetry run python cleanrl/dqn_atari_jax.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1
2 changes: 1 addition & 1 deletion benchmark/ppg.sh
@@ -3,6 +3,6 @@
 poetry install -E procgen
 xvfb-run -a poetry run python -m cleanrl_utils.benchmark \
 --env-ids starpilot bossfight bigfish \
---command "poetry run python cleanrl/ppg_procgen.py --track --capture-video" \
+--command "poetry run python cleanrl/ppg_procgen.py --track --capture_video" \
 --num-seeds 3 \
 --workers 1