Add Gymnasium support (#1327)
* Fix failing set_env test

* Fix test failing due to deprecation of env.seed

* Adjust mean reward threshold in failing test

* Fix her test failing due to rng

* Change seed and revert reward threshold to 90

* Pin gym version

* Make VecEnv compatible with gym seeding change

* Revert change to VecEnv reset signature

* Change subprocenv seed cmd to call reset instead

* Fix type check

* Add backward compat

* Add `compat_gym_seed` helper

* Add goal env checks in env_checker

* Add docs on HER requirements for envs

* Capture user warning in test with inverted box space

* Update ale-py version

* Fix randint

* Allow noop_max to be zero

* Update changelog

* Update docker image

* Update doc conda env and dockerfile

* Custom envs should not have any warnings

* Fix test for numpy >= 1.21

* Add check for vectorized compute reward

* Bump to gym 0.24

* Fix gym default step docstring

* Test downgrading gym

* Revert "Test downgrading gym"

This reverts commit 0072b77.

* Fix protobuf error

* Fix in dependencies

* Fix protobuf dep

* Use newest version of cartpole

* Update gym

* Fix warning

* Loosen required scipy version

* Scipy no longer needed

* Try gym 0.25

* Silence warnings from gym

* Filter warnings during tests

* Update doc

* Update requirements

* Add gym 26 compat in vec env

* Fixes in envs and tests for gym 0.26+

* Enforce gym 0.26 api

* format

* Fix formatting

* Fix dependencies

* Fix syntax

* Cleanup doc and warnings

* Faster tests

* Higher budget for HER perf test (revert prev change)

* Fixes and update doc

* Fix doc build

* Fix breaking change

* Fixes for rendering

* Rename variables in monitor

* update render method for gym 0.26 API

backwards compatible (mode argument is allowed) while using the gym 0.26 API (render mode is determined at environment creation)

* update tests and docs to new gym render API

* undo removal of render modes metadata check

* set rgb_array as default render mode for gym.make

* undo changes & raise warning if not 'rgb_array'

* Fix type check

* Remove recursion and fix type checking

* Remove hacks for protobuf and gym 0.24

* Fix type annotations

* reuse existing render_mode attribute

* return tiled images for 'human' render mode

* Allow to use opencv for human render, fix typos

* Add warning when using non-zero start with Discrete (fixes #1197)

* Fix type checking

* Bug fixes and handle more cases

* Throw proper warnings

* Update test

* Fix new metadata name

* Ignore numpy warnings

* Fixes in vec recorder

* Global ignore

* Filter local warning too

* Monkey patch not needed for gym 26

* Add doc of VecEnv vs Gym API

* Add render test

* Fix return type

* Update VecEnv vs Gym API doc

* Fix for custom render mode

* Fix return type

* Fix type checking

* check test env test_buffer

* skip render check

* check env test_dict_env

* test_env test_gae

* check envs in remaining tests

* Update tests

* Add warning for Discrete action space with non-zero (#1295)

* Fix atari annotation

* ignore get_action_meanings [attr-defined]

* Fix mypy issues

* Add patch for gym/gymnasium transition

* Switch to gymnasium

* Rely on signature instead of version

* More patches

* Type ignore because of Farama-Foundation/Gymnasium#39

* Fix doc build

* Fix pytype errors

* Fix atari requirement

* Update env checker due to change in dtype for Discrete

* Fix type hint

* Convert spaces for saved models

* Ignore pytype

* Remove gitlab CI

* Disable pytype for convert space

* Fix undefined info

* Fix undefined info

* Upgrade shimmy

* Fix wrappers type annotation (need PR from Gymnasium)

* Fix gymnasium dependency

* Fix dependency declaration

* Cap pygame version for python 3.7

* Point to master branch (v0.28.0)

* Fix: use main not master branch

* Rename done to terminated

* Fix pygame dependency for python 3.7

* Rename gym to gymnasium

* Update Gymnasium

* Fix test

* Fix tests

* Forks don't have access to private variables

* Fix linter warnings

* Update read the doc env

* Fix env checker for GoalEnv

* Fix import

* Update env checker (more info) and fix dtype

* Use micromamba for Docker

* Update dependencies

* Clarify VecEnv doc

* Fix Gymnasium version

* Copy file only after mamba install

* [ci skip] Update docker doc

* Polish code

* Reformat

* Remove deprecated features

* Ignore warning

* Update doc

* Update examples and changelog

* Fix type annotation bundle (SAC, TD3, A2C, PPO, base class) (#1436)

* Fix SAC type hints, improve DQN ones

* Fix A2C and TD3 type hints

* Fix PPO type hints

* Fix on-policy type hints

* Fix base class type annotation, do not use defaults

* Update version

* Disable mypy for python 3.7

* Rename Gym26StepReturn

* Update continuous critic type annotation

* Fix pytype complaint

---------

Co-authored-by: Carlos Luis
Co-authored-by: Quentin Gallouédec
Co-authored-by: Thomas Lips
Co-authored-by: tlips
Co-authored-by: tlpss
Co-authored-by: Quentin GALLOUÉDEC
7 people authored Apr 14, 2023
1 parent 15c9daa commit 40e0b9d
Showing 94 changed files with 1,333 additions and 733 deletions.
9 changes: 5 additions & 4 deletions .github/ISSUE_TEMPLATE/custom_env.yml
@@ -49,15 +49,16 @@ body:
         self.observation_space = spaces.Box(low=-np.inf, high=np.inf, shape=(14,))
         self.action_space = spaces.Box(low=-1, high=1, shape=(6,))
-    def reset(self):
-        return self.observation_space.sample()
+    def reset(self, seed=None):
+        return self.observation_space.sample(), {}
     def step(self, action):
         obs = self.observation_space.sample()
         reward = 1.0
-        done = False
+        terminated = False
+        truncated = False
         info = {}
-        return obs, reward, done, info
+        return obs, reward, terminated, truncated, info
 env = CustomEnv()
 check_env(env)
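
The hunk above changes two signatures: `reset()` now accepts a seed and returns `(obs, info)`, and `step()` returns five values with `done` split into `terminated` and `truncated`. A minimal sketch of that contract, assuming nothing beyond the standard library: `ToyEnv` and `run_episode` are hypothetical names used only for illustration and are not part of Stable-Baselines3 or Gymnasium, and the plain class stands in for a real `gymnasium.Env` subclass.

```python
import random
from typing import Any, Dict, Optional, Tuple

Obs = Tuple[float, ...]


class ToyEnv:
    """Hypothetical, dependency-free stand-in for a gymnasium.Env subclass."""

    def __init__(self, max_steps: int = 5) -> None:
        self.max_steps = max_steps
        self._steps = 0
        self._rng = random.Random()

    def reset(self, seed: Optional[int] = None) -> Tuple[Obs, Dict[str, Any]]:
        # New-style API: seeding happens through reset(), which returns (obs, info).
        if seed is not None:
            self._rng = random.Random(seed)
        self._steps = 0
        return self._sample(), {}

    def step(self, action: float) -> Tuple[Obs, float, bool, bool, Dict[str, Any]]:
        # New-style API: `done` is split into `terminated` and `truncated`.
        self._steps += 1
        terminated = False  # this toy task has no terminal state
        truncated = self._steps >= self.max_steps  # time limit reached
        return self._sample(), 1.0, terminated, truncated, {}

    def _sample(self) -> Obs:
        return (self._rng.uniform(-1.0, 1.0),)


def run_episode(env: ToyEnv, seed: int) -> float:
    """Roll out one episode and return the summed reward."""
    obs, info = env.reset(seed=seed)
    total = 0.0
    while True:
        obs, reward, terminated, truncated, info = env.step(0.0)
        total += reward
        # An episode ends when either flag is set.
        if terminated or truncated:
            return total
```

Code written against the old four-value `step()` typically needs only the `terminated or truncated` check shown in `run_episode` to recover the former `done` flag.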
2 changes: 2 additions & 0 deletions .github/workflows/ci.yml
@@ -55,6 +55,8 @@ jobs:
       - name: Type check
         run: |
           make type
+        # skip mypy type check for python3.7 (result is different to all other versions)
+        if: "!(matrix.python-version == '3.7')"
       - name: Test with pytest
         run: |
           make pytest
38 changes: 11 additions & 27 deletions Dockerfile
@@ -1,41 +1,25 @@
 ARG PARENT_IMAGE
 FROM $PARENT_IMAGE
 ARG PYTORCH_DEPS=cpuonly
-ARG PYTHON_VERSION=3.7
+ARG PYTHON_VERSION=3.8
+ARG MAMBA_DOCKERFILE_ACTIVATE=1 # (otherwise python will not be found)

-RUN apt-get update && apt-get install -y --no-install-recommends \
-    build-essential \
-    cmake \
-    git \
-    curl \
-    ca-certificates \
-    libjpeg-dev \
-    libpng-dev \
-    libglib2.0-0 && \
-    rm -rf /var/lib/apt/lists/*
+# Install micromamba env and dependencies
+RUN micromamba install -n base -y python=$PYTHON_VERSION \
+    pytorch $PYTORCH_DEPS -c conda-forge -c pytorch -c nvidia && \
+    micromamba clean --all --yes

-# Install Anaconda and dependencies
-RUN curl -o ~/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
-    chmod +x ~/miniconda.sh && \
-    ~/miniconda.sh -b -p /opt/conda && \
-    rm ~/miniconda.sh && \
-    /opt/conda/bin/conda install -y python=$PYTHON_VERSION numpy pyyaml scipy ipython mkl mkl-include && \
-    /opt/conda/bin/conda install -y pytorch $PYTORCH_DEPS -c pytorch && \
-    /opt/conda/bin/conda clean -ya
-ENV PATH /opt/conda/bin:$PATH
-
-ENV CODE_DIR /root/code
+ENV CODE_DIR /home/$MAMBA_USER

 # Copy setup file only to install dependencies
-COPY ./setup.py ${CODE_DIR}/stable-baselines3/setup.py
-COPY ./stable_baselines3/version.txt ${CODE_DIR}/stable-baselines3/stable_baselines3/version.txt
+COPY --chown=$MAMBA_USER:$MAMBA_USER ./setup.py ${CODE_DIR}/stable-baselines3/setup.py
+COPY --chown=$MAMBA_USER:$MAMBA_USER ./stable_baselines3/version.txt ${CODE_DIR}/stable-baselines3/stable_baselines3/version.txt

-RUN \
-    cd ${CODE_DIR}/stable-baselines3 3&& \
+RUN cd ${CODE_DIR}/stable-baselines3 && \
     pip install -e .[extra,tests,docs] && \
     # Use headless version for docker
     pip uninstall -y opencv-python && \
     pip install opencv-python-headless && \
-    rm -rf $HOME/.cache/pip
+    pip cache purge

 CMD /bin/bash
6 changes: 6 additions & 0 deletions Makefile
@@ -10,6 +10,12 @@ pytype:
 mypy:
 	mypy ${LINT_PATHS}

+missing-annotations:
+	mypy --disallow-untyped-calls --disallow-untyped-defs --ignore-missing-imports stable_baselines3
+
+# missing docstrings
+# pylint -d R,C,W,E -e C0116 stable_baselines3 -j 4
+
 type: pytype mypy

 lint:
6 changes: 3 additions & 3 deletions docs/conda_env.yml
@@ -4,11 +4,11 @@ channels:
   - defaults
 dependencies:
   - cpuonly=1.0=0
-  - pip=21.1
+  - pip=22.3.1
   - python=3.7
-  - pytorch=1.11=py3.7_cpu_0
+  - pytorch=1.11.0=py3.7_cpu_0
   - pip:
-    - gym==0.21
+    - gymnasium
     - cloudpickle
     - opencv-python-headless
     - pandas
10 changes: 5 additions & 5 deletions docs/guide/callbacks.rst
@@ -210,7 +210,7 @@ It will save the best model if ``best_model_save_path`` folder is specified and

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import EvalCallback
@@ -260,7 +260,7 @@ Alternatively, you can pass directly a list of callbacks to the ``learn()`` meth

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import CallbackList, CheckpointCallback, EvalCallback
@@ -290,7 +290,7 @@ It must be used with the :ref:`EvalCallback` and use the event triggered by a ne

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnRewardThreshold
@@ -322,7 +322,7 @@ An :ref:`EventCallback` that will trigger its child callback every ``n_steps`` t

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   from stable_baselines3 import PPO
   from stable_baselines3.common.callbacks import CheckpointCallback, EveryNTimesteps
@@ -379,7 +379,7 @@ It must be used with the :ref:`EvalCallback` and use the event triggered after e

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   from stable_baselines3 import SAC
   from stable_baselines3.common.callbacks import EvalCallback, StopTrainingOnNoModelImprovement
6 changes: 3 additions & 3 deletions docs/guide/checking_nan.rst
@@ -100,8 +100,8 @@ It will monitor the actions, observations, and rewards, indicating what action o

 .. code-block:: python
-  import gym
-  from gym import spaces
+  import gymnasium as gym
+  from gymnasium import spaces
   import numpy as np
   from stable_baselines3 import PPO
@@ -129,7 +129,7 @@ It will monitor the actions, observations, and rewards, indicating what action o
       def reset(self):
           return [0.0]
-      def render(self, mode="human", close=False):
+      def render(self, close=False):
           pass
   # Create environment
8 changes: 4 additions & 4 deletions docs/guide/custom_env.rst
@@ -26,9 +26,9 @@ That is to say, your environment must implement the following methods (and inher

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   import numpy as np
-  from gym import spaces
+  from gymnasium import spaces
   class CustomEnv(gym.Env):
@@ -54,7 +54,7 @@ That is to say, your environment must implement the following methods (and inher
           ...
           return observation # reward, done, info can't be included
-      def render(self, mode="human"):
+      def render(self):
           ...
       def close(self):
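
Dropping the `mode` argument from `render()` follows the note in the commit message that the render mode is determined at environment creation. A minimal sketch of that pattern, assuming a plain Python class in place of a real `gymnasium.Env`: `RenderableEnv` is a hypothetical name, and the returned nested list is only a placeholder for an image array.

```python
from typing import List, Optional


class RenderableEnv:
    """Hypothetical stand-in for an environment; not a real gymnasium.Env."""

    # Mirrors the `render_modes` metadata key used by the newer API.
    metadata = {"render_modes": ["human", "rgb_array"]}

    def __init__(self, render_mode: Optional[str] = None) -> None:
        # The mode is fixed once, at construction time.
        if render_mode is not None and render_mode not in self.metadata["render_modes"]:
            raise ValueError(f"Unsupported render_mode: {render_mode!r}")
        self.render_mode = render_mode

    def render(self) -> Optional[List[List[int]]]:
        # No `mode` parameter: behaviour depends on self.render_mode only.
        if self.render_mode == "rgb_array":
            return [[0, 0, 0]]  # placeholder for an image array
        if self.render_mode == "human":
            print("drawing frame")
        return None
```

With this layout a caller picks the mode once, for example `RenderableEnv(render_mode="rgb_array")`, and every later `render()` call takes no arguments.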
@@ -91,7 +91,7 @@ Optionally, you can also register the environment with gym, that will allow you

 .. code-block:: python
-  from gym.envs.registration import register
+  from gymnasium.envs.registration import register
   # Example for the CartPole environment
   register(
       # unique identifier for the env `name-version`
8 changes: 4 additions & 4 deletions docs/guide/custom_policy.rst
@@ -101,7 +101,7 @@ using ``policy_kwargs`` parameter:
 .. code-block:: python
-  import gym
+  import gymnasium as gym
   import torch as th
   from stable_baselines3 import PPO
@@ -143,7 +143,7 @@ that derives from ``BaseFeaturesExtractor`` and then pass it to the model when t
   import torch as th
   import torch.nn as nn
-  from gym import spaces
+  from gymnasium import spaces
   from stable_baselines3 import PPO
   from stable_baselines3.common.torch_layers import BaseFeaturesExtractor
@@ -208,7 +208,7 @@ downsampling and "vector" with a single linear layer.

 .. code-block:: python
-  import gym
+  import gymnasium as gym
   import torch as th
   from torch import nn
@@ -308,7 +308,7 @@ If your task requires even more granular control over the policy/value architect
   from typing import Callable, Dict, List, Optional, Tuple, Type, Union
-  from gym import spaces
+  from gymnasium import spaces
   import torch as th
   from torch import nn