Update Wrappers Documentation #942

Merged
2 changes: 1 addition & 1 deletion docs/README.md
@@ -6,7 +6,7 @@ For more information about how to contribute to the documentation go to our [CON

## Editing an environment page

Environemnts' documentation can be found at the top of the file python file where the environment is declared, for example, the documentation for the chess environment can be at [/pettingzoo/classic/chess/chess.py](https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/classic/chess/chess.py)
Environments' documentation can be found at the top of the Python file where the environment is declared. For example, the documentation for the chess environment can be found at [/pettingzoo/classic/chess/chess.py](https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/classic/chess/chess.py)

To generate the environments pages you need to execute the `docs/_scripts/gen_envs_mds.py` script:

22 changes: 22 additions & 0 deletions docs/api/aec.md
@@ -1,3 +1,8 @@
---
title: AEC
---


# AEC API

By default, PettingZoo models games as [*Agent Environment Cycle*](https://arxiv.org/abs/2009.13051) (AEC) environments. This allows it to support any type of game multi-agent RL can consider.
@@ -7,6 +12,22 @@ By default, PettingZoo models games as [*Agent Environment Cycle*](https://arxiv
AEC environments can be interacted with as follows:

``` python
from pettingzoo.classic import rps_v2
env = rps_v2.env(render_mode="human")

env.reset()
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None
    else:
        action = env.action_space(agent).sample()  # this is where you would insert your policy
    env.step(action)
env.close()
```
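
To make the control flow of this loop concrete, here is a minimal, dependency-free sketch of an AEC-style environment. `ToyAECEnv`, `run_episode`, and the fixed number of moves per agent are hypothetical and not part of PettingZoo; only the method names (`reset`, `agent_iter`, `last`, `step`) mirror the API used above. The sketch assumes that a truncated agent must be stepped once with `None` before it is removed.

```python
import random


class ToyAECEnv:
    """Hypothetical two-agent, turn-based environment (not a PettingZoo class)."""

    def __init__(self, max_moves=3):
        self.possible_agents = ["player_0", "player_1"]
        self.max_moves = max_moves

    def reset(self):
        self.agents = list(self.possible_agents)
        self.agent_selection = self.agents[0]
        self.moves = {agent: 0 for agent in self.agents}
        self.truncations = {agent: False for agent in self.agents}

    def agent_iter(self):
        # Yield the agent whose turn it is until no agents remain.
        while self.agents:
            yield self.agent_selection

    def last(self):
        agent = self.agent_selection
        # observation, reward, termination, truncation, info
        return self.moves[agent], 0.0, False, self.truncations[agent], {}

    def step(self, action):
        agent = self.agent_selection
        if self.truncations[agent]:
            assert action is None, "a finished agent must pass None"
            self.agents.remove(agent)
        else:
            self.moves[agent] += 1
            self.truncations[agent] = self.moves[agent] >= self.max_moves
        # Hand the turn to the next agent that is still alive, if any.
        start = self.possible_agents.index(agent)
        for offset in range(1, len(self.possible_agents) + 1):
            candidate = self.possible_agents[(start + offset) % len(self.possible_agents)]
            if candidate in self.agents:
                self.agent_selection = candidate
                break


def run_episode(env, seed=0):
    """Drive the environment with a random policy and record (agent, action) pairs."""
    rng = random.Random(seed)
    log = []
    env.reset()
    for agent in env.agent_iter():
        observation, reward, termination, truncation, info = env.last()
        if termination or truncation:
            action = None
        else:
            action = rng.choice(["rock", "paper", "scissors"])
        log.append((agent, action))
        env.step(action)
    return log


log = run_episode(ToyAECEnv(max_moves=3))
```

In this toy setup each agent makes three real moves and then one final `None` step, so the loop ends once both agents have been removed.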

Note: for environments with illegal actions in the action space, actions can be sampled according to an action mask as follows:
``` python
from pettingzoo.classic import chess_v5
env = chess_v5.env(render_mode="human")

@@ -19,6 +40,7 @@ for agent in env.agent_iter():
        action = env.action_space(agent).sample(observation["action_mask"])  # this is where you would insert your policy
    env.step(action)
env.close()

```
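
As a rough illustration of what masked sampling does, the sketch below draws uniformly from the actions whose mask entry is 1. `sample_masked` is a hypothetical helper written with the standard library only. It is not the Gymnasium implementation, and it assumes the mask is a flat sequence of 0/1 flags, one per discrete action.

```python
import random


def sample_masked(action_mask, rng=random):
    """Return a uniformly random index whose mask entry is 1 (assumed 0/1 flags)."""
    legal_actions = [index for index, flag in enumerate(action_mask) if flag == 1]
    if not legal_actions:
        raise ValueError("action_mask marks no action as legal")
    return rng.choice(legal_actions)


# Hypothetical mask in which only actions 1 and 4 are legal.
mask = [0, 1, 0, 0, 1]
action = sample_masked(mask, random.Random(0))
```

A trained policy would replace the uniform choice, but it would still need to restrict itself to the legal indices.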

## AECEnv
5 changes: 5 additions & 0 deletions docs/api/parallel.md
@@ -1,3 +1,8 @@
---
title: Parallel
---


# Parallel API

In addition to the main API, we have a secondary parallel API for environments where all agents have simultaneous actions and observations. An environment with parallel API support can be created via `<game>.parallel_env()`. This API is based around the paradigm of *Partially Observable Stochastic Games* (POSGs) and the details are similar to [RLLib's MultiAgent environment specification](https://docs.ray.io/en/latest/rllib-env.html#multi-agent-and-hierarchical), except we allow for different observation and action spaces between the agents.
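
The dictionary-based shape of this API can be sketched with a toy environment. `ToyParallelEnv` below is a hypothetical stand-in, not a PettingZoo class: every call to `step` takes one action per live agent and returns one entry per agent in each result dictionary. The `reset` return value here (observations only) is an assumption of this sketch.

```python
import random


class ToyParallelEnv:
    """Hypothetical environment in which all agents act at once."""

    def __init__(self, max_cycles=3):
        self.possible_agents = ["piston_0", "piston_1"]
        self.max_cycles = max_cycles

    def reset(self):
        self.agents = list(self.possible_agents)
        self.cycle = 0
        return {agent: 0 for agent in self.agents}

    def step(self, actions):
        self.cycle += 1
        done = self.cycle >= self.max_cycles
        observations = {agent: self.cycle for agent in self.agents}
        rewards = {agent: float(actions[agent]) for agent in self.agents}
        terminations = {agent: False for agent in self.agents}
        truncations = {agent: done for agent in self.agents}
        infos = {agent: {} for agent in self.agents}
        if done:
            self.agents = []  # an empty agent list ends the `while env.agents` loop
        return observations, rewards, terminations, truncations, infos


rng = random.Random(0)
env = ToyParallelEnv()
observations = env.reset()
cycles = 0
while env.agents:
    actions = {agent: rng.choice([0, 1]) for agent in env.agents}
    observations, rewards, terminations, truncations, infos = env.step(actions)
    cycles += 1
```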
55 changes: 0 additions & 55 deletions docs/api/pz_wrappers.md

This file was deleted.

5 changes: 5 additions & 0 deletions docs/api/utils.md
@@ -1,3 +1,8 @@
---
title: Utils
---


# Utils

PettingZoo has some utilities to help make simple interactions with the environment trivial to implement. Utilities which are designed to help make environments easier to develop are in the developer documentation.
27 changes: 27 additions & 0 deletions docs/api/wrappers.md
@@ -0,0 +1,27 @@
---
title: Wrapper
---

# Wrappers

## Using Wrappers

A wrapper is an environment transformation that takes in an environment as input, and outputs a new environment that is similar to the input environment, but with some transformation or validation applied.
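
The idea can be sketched without any PettingZoo imports. In the example below, `CountingEnv` and `ClipActionWrapper` are hypothetical stand-ins, not library classes: the wrapper takes an environment, forwards calls to it, and clips out-of-range actions before they reach it.

```python
class CountingEnv:
    """Hypothetical environment that accepts integer actions in [0, 2]."""

    def __init__(self):
        self.received = []

    def step(self, action):
        assert 0 <= action <= 2, "action out of bounds"
        self.received.append(action)


class ClipActionWrapper:
    """Hypothetical wrapper: same interface as the wrapped env, with actions clipped."""

    def __init__(self, env, low=0, high=2):
        self.env = env
        self.low = low
        self.high = high

    def step(self, action):
        clipped = max(self.low, min(self.high, action))
        return self.env.step(clipped)

    def __getattr__(self, name):
        # Anything the wrapper does not define is delegated to the wrapped env.
        return getattr(self.env, name)


env = ClipActionWrapper(CountingEnv())
for raw_action in (-1, 1, 7):
    env.step(raw_action)
```

Because the wrapper exposes the same interface as the environment it wraps, several wrappers can be stacked in the same way.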

The following wrappers can be used with PettingZoo environments:



[PettingZoo Wrappers](/api/wrappers/pz_wrappers/) include [conversion wrappers](/api/wrappers/pz_wrappers#conversion-wrappers) to convert between the [AEC](/api/aec/) and [Parallel](/api/parallel/) APIs, and a set of simple [utility wrappers](/api/wrappers/pz_wrappers#utility-wrappers) which provide input validation and other convenient reusable logic.

[Supersuit Wrappers](/api/wrappers/supersuit_wrappers/) include commonly used pre-processing functions such as frame-stacking and color reduction, compatible with both PettingZoo and Gymnasium.

[Shimmy Compatibility Wrappers](/api/wrappers/shimmy_wrappers/) allow commonly used external reinforcement learning environments to be used with PettingZoo and Gymnasium.


```{toctree}
:hidden:
wrappers/pz_wrappers
wrappers/supersuit_wrappers
wrappers/shimmy_wrappers
```
108 changes: 108 additions & 0 deletions docs/api/wrappers/pz_wrappers.md
@@ -0,0 +1,108 @@
---
title: PettingZoo Wrappers
---

# PettingZoo Wrappers

PettingZoo includes the following types of wrappers:
* [Conversion Wrappers](#conversion-wrappers): wrappers for converting environments between the [AEC](/api/aec/) and [Parallel](/api/parallel/) APIs
* [Utility Wrappers](#utility-wrappers): a set of wrappers which provide convenient reusable logic, such as enforcing turn order or clipping out-of-bounds actions.

## Conversion wrappers

### AEC to Parallel

```{eval-rst}
.. currentmodule:: pettingzoo.utils.conversions

.. automodule:: pettingzoo.utils.conversions
   :members: aec_to_parallel
   :undoc-members:
```

An environment can be converted from an AEC environment to a parallel environment with the `aec_to_parallel` wrapper shown below. Note that this wrapper makes the following assumptions about the underlying environment:

1. The environment steps in a cycle, i.e. it steps through every live agent in order.
2. The environment does not update the observations of the agents except at the end of a cycle.

Most parallel environments in PettingZoo only allocate rewards at the end of a cycle. In these environments, the reward schemes of the AEC API and the parallel API are equivalent. If an AEC environment does allocate rewards within a cycle, then the rewards will be allocated at different timesteps in the AEC environment and the Parallel environment. In particular, the AEC environment will allocate all rewards from one time the agent steps to the next time it steps, while the Parallel environment will allocate all rewards from when the first agent stepped to when the last agent stepped.
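
The difference in timing can be worked through on a small, made-up reward stream. The sketch below is a simplified model, not PettingZoo code: `aec_rewards`, `parallel_rewards`, and the event list are hypothetical, and it assumes that each agent has a running reward total that is reported and reset when that agent steps.

```python
def aec_rewards(events, agent):
    """Rewards `agent` sees at each of its turns, plus whatever is left at the end."""
    observed, pending = [], 0
    for stepping_agent, rewards in events:
        if stepping_agent == agent:
            observed.append(pending)  # reported just before the agent acts
            pending = 0
        pending += rewards.get(agent, 0)
    observed.append(pending)  # accumulated after the agent's final step
    return observed


def parallel_rewards(events, agent, agents_per_cycle):
    """Rewards `agent` sees per cycle: first agent's step through last agent's step."""
    totals = []
    for start in range(0, len(events), agents_per_cycle):
        cycle = events[start:start + agents_per_cycle]
        totals.append(sum(rewards.get(agent, 0) for _, rewards in cycle))
    return totals


# Two cycles of agents "a" then "b"; every step hands some reward to "b".
events = [
    ("a", {"b": 1}),
    ("b", {"b": 10}),
    ("a", {"b": 100}),
    ("b", {"b": 1000}),
]
aec_view = aec_rewards(events, "b")
parallel_view = parallel_rewards(events, "b", agents_per_cycle=2)
```

Under these assumptions the AEC view for `b` groups the rewards as 1, 110, 1000, while the Parallel view groups them as 11, 1100. The totals match, but the timesteps at which the rewards are reported do not.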

To convert an AEC environment into a parallel environment:
``` python
from pettingzoo.utils.conversions import aec_to_parallel
from pettingzoo.butterfly import pistonball_v6
env = pistonball_v6.env()
env = aec_to_parallel(env)
```

### Parallel to AEC

```{eval-rst}
.. currentmodule:: pettingzoo.utils.conversions

.. automodule:: pettingzoo.utils.conversions
   :members: parallel_to_aec
   :undoc-members:
```

Any parallel environment can be efficiently converted to an AEC environment with the `parallel_to_aec` wrapper.

To convert a parallel environment into an AEC environment:
``` python
from pettingzoo.utils import parallel_to_aec
from pettingzoo.butterfly import pistonball_v6
env = pistonball_v6.parallel_env()
env = parallel_to_aec(env)
```
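
The `parallel_to_aec` source in this change returns `par_env.aec_env` when it is handed an `aec_to_parallel_wrapper`, so a round trip does not stack two conversions. The sketch below mimics that branch with hypothetical stub classes (`AecToParallelStub`, `ParallelToAecStub`) instead of the real wrappers. The fallback branch is a simplification and may not match the library's actual return value.

```python
class AecToParallelStub:
    """Hypothetical stand-in for `aec_to_parallel_wrapper`; keeps the wrapped env."""

    def __init__(self, aec_env):
        self.aec_env = aec_env


class ParallelToAecStub:
    """Hypothetical stand-in for `parallel_to_aec_wrapper`."""

    def __init__(self, parallel_env):
        self.env = parallel_env


def parallel_to_aec_sketch(par_env):
    # Unwrap instead of stacking a second conversion on top of the first.
    if isinstance(par_env, AecToParallelStub):
        return par_env.aec_env
    return ParallelToAecStub(par_env)


original_aec_env = object()  # placeholder for a real AEC environment
as_parallel = AecToParallelStub(original_aec_env)
round_trip = parallel_to_aec_sketch(as_parallel)
```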


## Utility Wrappers

We wanted our PettingZoo environments to be both easy to use and easy to implement. To combine these goals, we have a set of simple wrappers which provide input validation and other convenient reusable logic.

You can apply these wrappers to your environment in a manner similar to the examples below:

To wrap a Parallel environment:
```python
from pettingzoo.utils import CaptureStdoutWrapper
from pettingzoo.butterfly import pistonball_v6
# parallel_env() returns the Parallel API version; env() would return an AEC environment
parallel_env = pistonball_v6.parallel_env()
parallel_env = CaptureStdoutWrapper(parallel_env)

observations = parallel_env.reset()

while parallel_env.agents:
    actions = {agent: parallel_env.action_space(agent).sample() for agent in parallel_env.agents}  # this is where you would insert your policy
    observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
```

To wrap an AEC environment:
```python
from pettingzoo.utils import TerminateIllegalWrapper
from pettingzoo.classic import rps_v2
env = rps_v2.env()
env = TerminateIllegalWrapper(env, illegal_reward=-1)

env.reset()
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None
    else:
        action = env.action_space(agent).sample()  # this is where you would insert your policy
    env.step(action)
env.close()
```
Note: Most AEC environments include TerminateIllegalWrapper in their initialization, so this code does not change the environment's behavior.

```{eval-rst}
.. currentmodule:: pettingzoo.utils.wrappers

.. autoclass:: BaseWrapper
.. autoclass:: TerminateIllegalWrapper
.. autoclass:: CaptureStdoutWrapper
.. autoclass:: AssertOutOfBoundsWrapper
.. autoclass:: ClipOutOfBoundsWrapper
.. autoclass:: OrderEnforcingWrapper

```
Original file line number Diff line number Diff line change
@@ -37,9 +37,7 @@ while env.agents:
    actions = {agent: env.action_space(agent).sample() for agent in env.agents}  # this is where you would insert your policy
    observations, rewards, terminations, truncations, infos = env.step(actions)
```
For more information, see [Shimmy DM Control Multi-Agent documentation](https://shimmy.farama.org/contents/dm_multi/)

---

To load an OpenSpiel game of [backgammon](https://github.com/deepmind/open_spiel/blob/master/docs/games.md#backgammon):
```python
@@ -58,11 +56,8 @@ for agent in env.agent_iter():
action = env.action_space(agent).sample(info["action_mask"]) # this is where you would insert your policy
env.step(action)
env.render()

```
For more information, see [Shimmy OpenSpiel documentation](https://shimmy.farama.org/contents/open_spiel/)

---

To load a Melting Pot [prisoner's dilemma in the matrix](https://github.com/deepmind/meltingpot/blob/main/docs/substrate_scenario_details.md#prisoners-dilemma-in-the-matrix) substrate:

@@ -77,12 +72,13 @@ while env.agents:
env.close()
```

For more information, see [Shimmy Melting Pot documentation](https://shimmy.farama.org/contents/meltingpot/)

For more information, see [Shimmy documentation](https://shimmy.farama.org).

## Multi-Agent Compatibility Wrappers:
```{eval-rst}
- :external:py:class:`shimmy.dm_control_multiagent_compatibility.DmControlMultiAgentCompatibilityV0`
- :external:py:class:`shimmy.openspiel_compatibility.OpenspielCompatibilityV0`
- :external:py:class:`shimmy.openspiel_compatibility.OpenSpielCompatibilityV0`
- :external:py:class:`shimmy.meltingpot_compatibility.MeltingPotCompatibilityV0`
```

@@ -92,10 +88,10 @@ If you use this in your research, please cite:

```
@software{shimmy2022github,
author = {{Jun Jet Tai, Mark Towers} and Elliot Tower and Jordan Terry},
title = {Shimmy: Gymnasium and Pettingzoo Wrappers for Commonly Used Environments},
url = {http://github.com/Farama-Foundation/Shimmy},
version = {0.2.0},
author = {{Jun Jet Tai, Mark Towers, Elliot Tower} and Jordan Terry},
title = {Shimmy: Gymnasium and PettingZoo Wrappers for Commonly Used Environments},
url = {https://github.com/Farama-Foundation/Shimmy},
version = {1.0.0},
year = {2022},
}```
}
```
Original file line number Diff line number Diff line change
@@ -4,22 +4,15 @@ title: Supersuit Wrappers

# Supersuit Wrappers

PettingZoo include wrappers via the SuperSuit companion package (`pip install supersuit`). These can be applied to both AECEnv and ParallelEnv environments. Using it to convert space invaders to have a grey scale observation space and stack the last 4 frames looks like:
The [SuperSuit](https://github.com/Farama-Foundation/SuperSuit) companion package (`pip install supersuit`) includes a collection of pre-processing functions which can be applied to both [AEC](/api/aec/) and [Parallel](/api/parallel/) environments.

To convert [space invaders](https://pettingzoo.farama.org/environments/atari/space_invaders/) to a greyscale observation space and stack the last 4 frames:

``` python
import gymnasium as gym
from pettingzoo.atari import space_invaders_v2
from supersuit import color_reduction_v0, frame_stack_v1

env = gym.make('SpaceInvaders-v0')

env = frame_stack_v1(color_reduction_v0(env, 'full'), 4)
```

Similarly, using SuperSuit with PettingZoo environments looks like

``` python
from pettingzoo.butterfly import pistonball_v0
env = pistonball_v0.env()
env = space_invaders_v2.env()

env = frame_stack_v1(color_reduction_v0(env, 'full'), 4)
```
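
To give a feel for what these two pre-processing steps compute, here is a pure-Python sketch that does not use SuperSuit. `to_greyscale` and `FrameStack` are hypothetical helpers. The luminance weights (0.299, 0.587, 0.114) and the choice to start the stack with blank frames are assumptions of this sketch and may differ from what `color_reduction_v0(env, 'full')` and `frame_stack_v1` actually do.

```python
from collections import deque


def to_greyscale(frame):
    """Collapse an H x W grid of (r, g, b) tuples into H x W luminance values."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in frame
    ]


class FrameStack:
    """Keep the most recent `size` frames, oldest first."""

    def __init__(self, size, blank_frame):
        # Assumption: the stack starts out filled with blank frames.
        self.frames = deque([blank_frame] * size, maxlen=size)

    def push(self, frame):
        self.frames.append(frame)  # the oldest frame is dropped automatically
        return list(self.frames)


blank = [[0, 0]]
stack = FrameStack(4, blank)
for level in (10, 20, 30, 40, 50):
    rgb_frame = [[(level, level, level), (0, 0, 0)]]  # one row, two pixels
    stacked = stack.push(to_greyscale(rgb_frame))
```

After five frames have been pushed, only the last four remain in the stack, which is the observation a policy would receive.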
4 changes: 1 addition & 3 deletions docs/index.md
@@ -19,9 +19,7 @@ content/environment_tests

api/aec
api/parallel
api/pz_wrappers
api/supersuit_wrappers
api/shimmy_wrappers
api/wrappers
api/utils
```

14 changes: 14 additions & 0 deletions pettingzoo/utils/conversions.py
@@ -45,6 +45,11 @@ def aec_fn(**kwargs):


def aec_to_parallel(aec_env):
    """Converts an aec environment to a parallel environment.

    In the case of an existing parallel environment wrapped using a `parallel_to_aec_wrapper`, this function will return the original parallel environment.
    Otherwise, it will apply the `aec_to_parallel_wrapper` to convert the environment.
    """
    if isinstance(aec_env, OrderEnforcingWrapper) and isinstance(
        aec_env.env, parallel_to_aec_wrapper
    ):
@@ -55,6 +60,11 @@ def aec_to_parallel(aec_env):


def parallel_to_aec(par_env):
    """Converts a parallel environment to an AEC environment.

    In the case of an existing AEC environment wrapped using an `aec_to_parallel_wrapper`, this function will return the original AEC environment.
    Otherwise, it will apply the `parallel_to_aec_wrapper` to convert the environment.
    """
    if isinstance(par_env, aec_to_parallel_wrapper):
        return par_env.aec_env
    else:
@@ -86,6 +96,8 @@ def from_parallel(par_env):


class aec_to_parallel_wrapper(ParallelEnv):
    """Converts an AEC environment into a Parallel environment."""

    def __init__(self, aec_env):
        assert aec_env.metadata.get("is_parallelizable", False), (
            "Converting from an AEC environment to a parallel environment "
@@ -205,6 +217,8 @@ def close(self):


class parallel_to_aec_wrapper(AECEnv):
    """Converts a parallel environment into an AEC environment."""

    def __init__(self, parallel_env):
        self.env = parallel_env
