Farama-Foundation · elliottower · Apr 26, 2023 · Apr 8, 2023
diff --git a/docs/README.md b/docs/README.md
@@ -6,7 +6,7 @@ For more information about how to contribute to the documentation go to our [CON
 
 ## Editing an environment page
 
-Environemnts' documentation can be found at the top of the file python file where the environment is declared, for example, the documentation for the chess environment can be at [/pettingzoo/classic/chess/chess.py](https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/classic/chess/chess.py)
+Environments' documentation can be found at the top of the Python file where the environment is declared. For example, the documentation for the chess environment can be found at [/pettingzoo/classic/chess/chess.py](https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/classic/chess/chess.py)
 
 To generate the environments pages you need to execute the `docs/_scripts/gen_envs_mds.py` script:
 

diff --git a/docs/api/aec.md b/docs/api/aec.md
@@ -1,3 +1,8 @@
+---
+title: AEC
+---
+
+
 # AEC API
 
 By default, PettingZoo models games as [*Agent Environment Cycle*](https://arxiv.org/abs/2009.13051) (AEC) environments. This allows it to support any type of game multi-agent RL can consider.
@@ -7,6 +12,22 @@ By default, PettingZoo models games as [*Agent Environment Cycle*](https://arxiv
 AEC environments can be interacted with as follows:
 
 ``` python
+from pettingzoo.classic import rps_v2
+env = rps_v2.env(render_mode="human")
+
+env.reset()
+for agent in env.agent_iter():
+    observation, reward, termination, truncation, info = env.last()
+    if termination or truncation:
+        action = None
+    else:
+        action = env.action_space(agent).sample()  # this is where you would insert your policy
+    env.step(action)
+env.close()
+```
+
+Note: for environments with illegal actions in the action space, actions can be sampled according to an action mask as follows:
+``` python 
 from pettingzoo.classic import chess_v5
 env = chess_v5.env(render_mode="human")
 
@@ -19,6 +40,7 @@ for agent in env.agent_iter():
         action = env.action_space(agent).sample(observation["action_mask"])  # this is where you would insert your policy
     env.step(action)
 env.close()
+
 ```
 
 ## AECEnv

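Note on the action-mask example in `docs/api/aec.md`: passing `observation["action_mask"]` to `sample()` restricts sampling to legal actions. As a rough illustration of that idea only (this is not Gymnasium's or PettingZoo's implementation, and `sample_legal_action` is a hypothetical helper), masked sampling over a discrete action space can be sketched with the standard library, assuming the mask is a sequence of 0/1 flags indexed by action:

```python
import random


def sample_legal_action(action_mask, rng=random):
    """Pick uniformly among the indices whose mask entry is 1 (legal actions)."""
    legal = [index for index, allowed in enumerate(action_mask) if allowed]
    if not legal:
        raise ValueError("action mask marks no action as legal")
    return rng.choice(legal)


# Hypothetical mask: actions 1, 3 and 4 are legal, 0 and 2 are not.
mask = [0, 1, 0, 1, 1]
action = sample_legal_action(mask)
```

A trained policy would replace the uniform choice, but it should still respect the same mask.
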
diff --git a/docs/api/parallel.md b/docs/api/parallel.md
@@ -1,3 +1,8 @@
+---
+title: Parallel
+---
+
+
 # Parallel API
 
 In addition to the main API, we have a secondary parallel API for environments where all agents have simultaneous actions and observations. An environment with parallel API support can be created via `<game>.parallel_env()`. This API is based around the paradigm of *Partially Observable Stochastic Games* (POSGs) and the details are similar to [RLLib's MultiAgent environment specification](https://docs.ray.io/en/latest/rllib-env.html#multi-agent-and-hierarchical), except we allow for different observation and action spaces between the agents.

diff --git a/docs/api/pz_wrappers.md b/docs/api/pz_wrappers.md
diff --git a/docs/api/utils.md b/docs/api/utils.md
@@ -1,3 +1,8 @@
+---
+title: Utils
+---
+
+
 # Utils
 
 PettingZoo has some utilities to help make simple interactions with the environment trivial to implement. Utilities which are designed to help make environments easier to develop are in the developer documentation.

diff --git a/docs/api/wrappers.md b/docs/api/wrappers.md
@@ -0,0 +1,27 @@
+---
+title: Wrappers
+---
+
+# Wrappers
+
+## Using Wrappers
+
+A wrapper is an environment transformation that takes in an environment as input, and outputs a new environment that is similar to the input environment, but with some transformation or validation applied. 
+
+The following wrappers can be used with PettingZoo environments:
+
+
+
+[PettingZoo Wrappers](/api/wrappers/pz_wrappers/) include [conversion wrappers](/api/wrappers/pz_wrappers#conversion-wrappers) to convert between the [AEC](/api/aec/) and [Parallel](/api/parallel/) APIs, and a set of simple [utility wrappers](/api/wrappers/pz_wrappers#utility-wrappers) which provide input validation and other convenient reusable logic.
+
+[Supersuit Wrappers](/api/wrappers/supersuit_wrappers/) include commonly used pre-processing functions such as frame-stacking and color reduction, compatible with both PettingZoo and Gymnasium.
+
+[Shimmy Compatibility Wrappers](/api/wrappers/shimmy_wrappers/) allow commonly used external reinforcement learning environments to be used with PettingZoo and Gymnasium. 
+
+
+```{toctree}
+:hidden:
+wrappers/pz_wrappers
+wrappers/supersuit_wrappers
+wrappers/shimmy_wrappers
+```
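Note on the wrapper definition in `docs/api/wrappers.md`: the pattern of "environment in, transformed environment out" can be sketched without any library. Everything below is hypothetical (`ToyEnv` and `ToyClipWrapper` are not PettingZoo classes) and only assumes an environment with integer actions and a `step` method. The wrapper clips out-of-range actions before forwarding them:

```python
class ToyEnv:
    """Hypothetical environment accepting integer actions in [0, n_actions)."""

    def __init__(self, n_actions=3):
        self.n_actions = n_actions
        self.last_action = None

    def step(self, action):
        if not 0 <= action < self.n_actions:
            raise ValueError(f"action {action} is out of bounds")
        self.last_action = action


class ToyClipWrapper:
    """Hypothetical wrapper: clips actions into range, then delegates to the wrapped env."""

    def __init__(self, env):
        self.env = env

    def step(self, action):
        clipped = max(0, min(action, self.env.n_actions - 1))
        self.env.step(clipped)

    def __getattr__(self, name):
        # Only called for attributes the wrapper lacks; forward them to the wrapped env.
        return getattr(self.env, name)


env = ToyClipWrapper(ToyEnv(n_actions=3))
env.step(7)  # forwarded to the wrapped env as action 2
```

The wrapped object keeps the same interface as the original, which is why wrappers can be stacked.
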
diff --git a/docs/api/wrappers/pz_wrappers.md b/docs/api/wrappers/pz_wrappers.md
@@ -0,0 +1,108 @@
+---
+title: PettingZoo Wrappers
+---
+
+# PettingZoo Wrappers
+
+PettingZoo includes the following types of wrappers: 
+* [Conversion Wrappers](#conversion-wrappers): wrappers for converting environments between the [AEC](/api/aec/) and [Parallel](/api/parallel/) APIs
+* [Utility Wrappers](#utility-wrappers): a set of wrappers which provide convenient reusable logic, such as enforcing turn order or clipping out-of-bounds actions.
+
+## Conversion wrappers
+
+### AEC to Parallel
+
+```{eval-rst}
+.. currentmodule:: pettingzoo.utils.conversions
+
+.. automodule:: pettingzoo.utils.conversions
+   :members: aec_to_parallel
+   :undoc-members:
+```
+
+An environment can be converted from an AEC environment to a parallel environment with the `aec_to_parallel` wrapper shown below. Note that this wrapper makes the following assumptions about the underlying environment:
+
+1. The environment steps in a cycle, i.e. it steps through every live agent in order.
+2. The environment does not update the observations of the agents except at the end of a cycle.
+
+Most parallel environments in PettingZoo only allocate rewards at the end of a cycle. In these environments, the reward schemes of the AEC API and the parallel API are equivalent. If an AEC environment does allocate rewards within a cycle, then the rewards will be allocated at different timesteps in the AEC environment and the Parallel environment. In particular, the AEC environment will allocate all rewards from one time the agent steps to the next time it steps, while the Parallel environment will allocate all rewards from when the first agent stepped to when the last agent stepped.
+
+To convert an AEC environment into a parallel environment:
+``` python
+from pettingzoo.utils.conversions import aec_to_parallel
+from pettingzoo.butterfly import pistonball_v6
+env = pistonball_v6.env()
+env = aec_to_parallel(env)
+```
+
+### Parallel to AEC
+
+```{eval-rst}
+.. currentmodule:: pettingzoo.utils.conversions
+
+.. automodule:: pettingzoo.utils.conversions
+   :members: parallel_to_aec
+   :undoc-members:
+```
+
+Any parallel environment can be efficiently converted to an AEC environment with the `parallel_to_aec` wrapper.
+
+To convert a parallel environment into an AEC environment:
+``` python
+from pettingzoo.utils import parallel_to_aec
+from pettingzoo.butterfly import pistonball_v6
+env = pistonball_v6.parallel_env()
+env = parallel_to_aec(env)
+```
+
+
+## Utility Wrappers
+
+We wanted our PettingZoo environments to be both easy to use and easy to implement. To support both goals, we provide a set of simple wrappers which handle input validation and other convenient reusable logic.
+
+You can apply these wrappers to your environment as in the examples below:
+
+To wrap a Parallel environment:
+```python
+from pettingzoo.utils import CaptureStdoutWrapper
+from pettingzoo.butterfly import pistonball_v6
+parallel_env = pistonball_v6.parallel_env()
+parallel_env = CaptureStdoutWrapper(parallel_env)
+
+observations = parallel_env.reset()
+
+while parallel_env.agents:
+    actions = {agent: parallel_env.action_space(agent).sample() for agent in parallel_env.agents}  # this is where you would insert your policy
+    observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
+```
+
+To wrap an AEC environment:
+```python
+from pettingzoo.utils import TerminateIllegalWrapper
+from pettingzoo.classic import rps_v2
+env = rps_v2.env()
+env = TerminateIllegalWrapper(env, illegal_reward=-1)
+
+env.reset()
+for agent in env.agent_iter():
+    observation, reward, termination, truncation, info = env.last()
+    if termination or truncation:
+        action = None
+    else:
+        action = env.action_space(agent).sample()  # this is where you would insert your policy
+    env.step(action)
+env.close()
+```
+Note: Most AEC environments include TerminateIllegalWrapper in their initialization, so this code does not change the environment's behavior.
+
+```{eval-rst}
+.. currentmodule:: pettingzoo.utils.wrappers
+
+.. autoclass:: BaseWrapper
+.. autoclass:: TerminateIllegalWrapper
+.. autoclass:: CaptureStdoutWrapper
+.. autoclass:: AssertOutOfBoundsWrapper
+.. autoclass:: ClipOutOfBoundsWrapper
+.. autoclass:: OrderEnforcingWrapper
+
+```
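Note on the reward-timing paragraph in the "AEC to Parallel" section of `docs/api/wrappers/pz_wrappers.md`: the difference between the two reward windows is easier to see on a toy trace. The sketch below is not PettingZoo code. The `steps` data and both helper functions are hypothetical, and it assumes two agents acting in a fixed cycle, with each step possibly emitting rewards to any agent:

```python
# Hypothetical trace: (acting agent, rewards emitted by that step).
steps = [
    ("a", {"a": 0, "b": 1}),
    ("b", {"a": 2, "b": 5}),  # "b" is rewarded within the cycle, during its own step
    ("a", {"a": 0, "b": 3}),
    ("b", {"a": 4, "b": 0}),
]
agents = ["a", "b"]


def parallel_view(steps, agents):
    """Sum rewards over each full cycle (first agent's step through last agent's step)."""
    per_cycle = []
    for start in range(0, len(steps), len(agents)):
        totals = {agent: 0 for agent in agents}
        for _, rewards in steps[start:start + len(agents)]:
            for agent, reward in rewards.items():
                totals[agent] += reward
        per_cycle.append(totals)
    return per_cycle


def aec_view(steps, agents):
    """Record what each agent has accumulated since its previous step when its turn arrives."""
    cumulative = {agent: 0 for agent in agents}
    seen = {agent: [] for agent in agents}
    for actor, rewards in steps:
        seen[actor].append(cumulative[actor])  # reward observed just before acting
        cumulative[actor] = 0                  # the window restarts when the agent steps
        for agent, reward in rewards.items():
            cumulative[agent] += reward
    return seen


parallel_rewards = parallel_view(steps, agents)  # "b" gets 6, then 3
aec_rewards = aec_view(steps, agents)            # "b" sees 1, then 8
```

In this trace agent `b` collects the same rewards under both views, but they are grouped into different timesteps because `b` is rewarded in the middle of a cycle.
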
diff --git a/docs/api/shimmy_wrappers.md → docs/api/wrappers/shimmy_wrappers.md b/docs/api/shimmy_wrappers.md → docs/api/wrappers/shimmy_wrappers.md
@@ -37,9 +37,7 @@ while env.agents:
     actions = {agent: env.action_space(agent).sample() for agent in env.agents}  # this is where you would insert your policy
     observations, rewards, terminations, truncations, infos = env.step(actions)
 ```
-For more information, see [Shimmy DM Control Multi-Agent documentation](https://shimmy.farama.org/contents/dm_multi/)
 
----
 
 To load an OpenSpiel game of [backgammon](https://github.com/deepmind/open_spiel/blob/master/docs/games.md#backgammon):
 ```python
@@ -58,11 +56,8 @@ for agent in env.agent_iter():
         action = env.action_space(agent).sample(info["action_mask"])  # this is where you would insert your policy
     env.step(action)
     env.render()
-
 ```
-For more information, see [Shimmy OpenSpiel documentation](https://shimmy.farama.org/contents/open_spiel/)
 
----
 
 To load a Melting Pot [prisoner's dilemma in the matrix](https://github.com/deepmind/meltingpot/blob/main/docs/substrate_scenario_details.md#prisoners-dilemma-in-the-matrix) substrate:
 
@@ -77,12 +72,13 @@ while env.agents:
 env.close()
 ```
 
-For more information, see [Shimmy Melting Pot documentation](https://shimmy.farama.org/contents/meltingpot/)
+
+For more information, see [Shimmy documentation](https://shimmy.farama.org).
 
 ## Multi-Agent Compatibility Wrappers:
 ```{eval-rst}
 - :external:py:class:`shimmy.dm_control_multiagent_compatibility.DmControlMultiAgentCompatibilityV0`
-- :external:py:class:`shimmy.openspiel_compatibility.OpenspielCompatibilityV0`
+- :external:py:class:`shimmy.openspiel_compatibility.OpenSpielCompatibilityV0`
 - :external:py:class:`shimmy.meltingpot_compatibility.MeltingPotCompatibilityV0`
 ```
 
@@ -92,10 +88,10 @@ If you use this in your research, please cite:
 
 ```
 @software{shimmy2022github,
-  author = {{Jun Jet Tai, Mark Towers} and Elliot Tower and Jordan Terry},
-  title = {Shimmy: Gymnasium and Pettingzoo Wrappers for Commonly Used Environments},
-  url = {http://github.com/Farama-Foundation/Shimmy},
-  version = {0.2.0},
+  author = {{Jun Jet Tai, Mark Towers, Elliot Tower} and Jordan Terry},
+  title = {Shimmy: Gymnasium and PettingZoo Wrappers for Commonly Used Environments},
+  url = {https://github.com/Farama-Foundation/Shimmy},
+  version = {1.0.0},
   year = {2022},
-}```
+}
 ```
diff --git a/docs/api/supersuit_wrappers.md → docs/api/wrappers/supersuit_wrappers.md b/docs/api/supersuit_wrappers.md → docs/api/wrappers/supersuit_wrappers.md
@@ -4,22 +4,15 @@ title: Supersuit Wrappers
 
 # Supersuit Wrappers
 
-PettingZoo include wrappers via the SuperSuit companion package (`pip install supersuit`). These can be applied to both AECEnv and ParallelEnv environments. Using it to convert space invaders to have a grey scale observation space and stack the last 4 frames looks like:
+The [SuperSuit](https://github.com/Farama-Foundation/SuperSuit) companion package (`pip install supersuit`) includes a collection of pre-processing functions which can be applied to both [AEC](/api/aec/) and [Parallel](/api/parallel/) environments.
+
+To convert [space invaders](https://pettingzoo.farama.org/environments/atari/space_invaders/) to a greyscale observation space and stack the last 4 frames:
 
 ``` python
-import gymnasium as gym
+from pettingzoo.atari import space_invaders_v2
 from supersuit import color_reduction_v0, frame_stack_v1
 
-env = gym.make('SpaceInvaders-v0')
-
-env = frame_stack_v1(color_reduction_v0(env, 'full'), 4)
-```
-
-Similarly, using SuperSuit with PettingZoo environments looks like
-
-``` python
-from pettingzoo.butterfly import pistonball_v0
-env = pistonball_v0.env()
+env = space_invaders_v2.env()
 
 env = frame_stack_v1(color_reduction_v0(env, 'full'), 4)
 ```

diff --git a/docs/index.md b/docs/index.md
@@ -19,9 +19,7 @@ content/environment_tests
 
 api/aec
 api/parallel
-api/pz_wrappers
-api/supersuit_wrappers
-api/shimmy_wrappers
+api/wrappers
 api/utils
 ```
 

diff --git a/pettingzoo/utils/conversions.py b/pettingzoo/utils/conversions.py
@@ -45,6 +45,11 @@ def aec_fn(**kwargs):
 
 
 def aec_to_parallel(aec_env):
+    """Converts an AEC environment to a parallel environment.
+
+    In the case of an existing parallel environment wrapped using a `parallel_to_aec_wrapper`, this function will return the original parallel environment.
+    Otherwise, it will apply the `aec_to_parallel_wrapper` to convert the environment.
+    """
     if isinstance(aec_env, OrderEnforcingWrapper) and isinstance(
         aec_env.env, parallel_to_aec_wrapper
     ):
@@ -55,6 +60,11 @@ def aec_to_parallel(aec_env):
 
 
 def parallel_to_aec(par_env):
+    """Converts a parallel environment to an AEC environment.
+
+    In the case of an existing AEC environment wrapped using an `aec_to_parallel_wrapper`, this function will return the original AEC environment.
+    Otherwise, it will apply the `parallel_to_aec_wrapper` to convert the environment.
+    """
     if isinstance(par_env, aec_to_parallel_wrapper):
         return par_env.aec_env
     else:
@@ -86,6 +96,8 @@ def from_parallel(par_env):
 
 
 class aec_to_parallel_wrapper(ParallelEnv):
+    """Converts an AEC environment into a Parallel environment."""
+
     def __init__(self, aec_env):
         assert aec_env.metadata.get("is_parallelizable", False), (
             "Converting from an AEC environment to a parallel environment "
@@ -205,6 +217,8 @@ def close(self):
 
 
 class parallel_to_aec_wrapper(AECEnv):
+    """Converts a parallel environment into an AEC environment."""
+
     def __init__(self, parallel_env):
         self.env = parallel_env
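
Note on the new docstrings in `pettingzoo/utils/conversions.py`: both conversion functions unwrap an already-converted environment instead of stacking a second wrapper. A minimal sketch of that round-trip behaviour follows. The `Toy*` classes and `toy_*` functions are hypothetical stand-ins, not the PettingZoo implementation, and the `OrderEnforcingWrapper` check from the real `aec_to_parallel` is omitted:

```python
class ToyParallelEnv:
    """Stand-in for a parallel environment."""


class ToyParallelToAEC:
    """Stand-in for a parallel-to-AEC wrapper; keeps the wrapped env on `.env`."""

    def __init__(self, parallel_env):
        self.env = parallel_env


class ToyAECToParallel:
    """Stand-in for an AEC-to-parallel wrapper; keeps the wrapped env on `.aec_env`."""

    def __init__(self, aec_env):
        self.aec_env = aec_env


def toy_aec_to_parallel(aec_env):
    # If the input is a wrapped parallel env, return the original instead of re-wrapping.
    if isinstance(aec_env, ToyParallelToAEC):
        return aec_env.env
    return ToyAECToParallel(aec_env)


def toy_parallel_to_aec(par_env):
    # Mirror image: unwrap a wrapped AEC env, otherwise wrap the parallel env.
    if isinstance(par_env, ToyAECToParallel):
        return par_env.aec_env
    return ToyParallelToAEC(par_env)


original = ToyParallelEnv()
round_trip = toy_aec_to_parallel(toy_parallel_to_aec(original))  # the same object as `original`
```
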