
Commit: Update wording for SB3 tutorial (#1027)

elliottower authored Jul 15, 2023
1 parent: c7841cb · commit: 3daf8c6
Showing 3 changed files with 7 additions and 7 deletions.
docs/tutorials/sb3/connect_four.md (3 changes: 1 addition & 2 deletions)
@@ -8,6 +8,7 @@ This tutorial shows how to train agents using Maskable [Proximal Policy Optimi

It creates a custom Wrapper to convert to a [Gymnasium](https://gymnasium.farama.org/)-like environment which is compatible with [SB3 action masking](https://sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html).

+After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).

```{eval-rst}
.. note::
@@ -21,8 +22,6 @@ It creates a custom Wrapper to convert to a [Gymnasium](https://gymnasium.farama
This wrapper assumes that the action space and observation space are the same for each agent; this assumption may not hold for custom environments.
```

-After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
-

## Environment Setup
To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
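The connect_four.md wording above refers to SB3 action masking and to saving and loading trained models from disk. A minimal sketch of that workflow is shown below; it is not the tutorial's script. It uses sb3_contrib's generic `ActionMasker` wrapper around a plain Gymnasium environment instead of the tutorial's custom PettingZoo Connect Four wrapper, and the `Taxi-v3` environment, the all-valid `mask_fn`, and the save path are illustrative assumptions.

```python
import numpy as np
import gymnasium as gym
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker


def mask_fn(env):
    # Illustrative mask: every discrete action is allowed.
    # A Connect Four wrapper would instead mark full columns as invalid.
    return np.ones(env.action_space.n, dtype=bool)


env = gym.make("Taxi-v3")         # stand-in for a wrapped PettingZoo env
env = ActionMasker(env, mask_fn)  # exposes action masks to MaskablePPO

model = MaskablePPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=2_048)

# Save the trained model to disk and load it back, as described above.
model.save("action_mask_demo")
model = MaskablePPO.load("action_mask_demo")

obs, _ = env.reset()
action, _ = model.predict(obs, action_masks=mask_fn(env))
print("sampled action:", action)
```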
docs/tutorials/sb3/kaz.md (8 changes: 4 additions & 4 deletions)
@@ -6,9 +6,9 @@ title: "SB3: PPO for Knights-Archers-Zombies"

This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Knights-Archers-Zombies](https://pettingzoo.farama.org/environments/butterfly/knights_archers_zombies/) environment ([AEC](https://pettingzoo.farama.org/api/aec/)).

-It converts the environment into a Parallel environment and uses SuperSuit to create vectorized environments, leveraging multithreading to speed up training.
+We use SuperSuit to create vectorized environments, leveraging multithreading to speed up training (see SB3's [vector environments documentation](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html)).

-After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
+After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [model saving documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html)).

```{eval-rst}
.. note::
@@ -17,9 +17,9 @@ After training and evaluation, this script will launch a demo game using human r
```

```{eval-rst}
-.. warning::
+.. note::
-Because this environment allows agents to spawn and die, it requires using SuperSuit's Black Death wrapper, which provides blank observations to dead agents, rather than removing them from the environment.
+This environment allows agents to spawn and die, so it requires using SuperSuit's Black Death wrapper, which provides blank observations to dead agents rather than removing them from the environment.
```


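The kaz.md edits above mention SuperSuit vectorized environments and the Black Death wrapper. A minimal sketch of that pipeline, under assumed settings, is below; the environment keyword arguments, the number of vectorized copies, the timestep budget, and the save path are placeholders rather than the tutorial's actual values.

```python
import supersuit as ss
from stable_baselines3 import PPO
from pettingzoo.butterfly import knights_archers_zombies_v10

# Parallel KAZ env (vector observations assumed, suitable for an MLP policy).
env = knights_archers_zombies_v10.parallel_env(vector_state=True)

# Agents can spawn and die, so dead agents receive blank observations
# instead of being removed (SuperSuit's Black Death wrapper).
env = ss.black_death_v3(env)

# Convert the PettingZoo parallel env into SB3-style vectorized envs and
# run several copies concurrently to speed up data collection.
env = ss.pettingzoo_env_to_vec_env_v1(env)
env = ss.concat_vec_envs_v1(env, 8, num_cpus=1, base_class="stable_baselines3")

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)

model.save("kaz_ppo_demo")  # reload later with PPO.load("kaz_ppo_demo")
env.close()
```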
docs/tutorials/sb3/waterworld.md (3 changes: 2 additions & 1 deletion)
@@ -6,8 +6,9 @@ title: "SB3: PPO for Waterworld (Parallel)"

This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Waterworld](https://pettingzoo.farama.org/environments/sisl/waterworld/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).

-After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
+We use SuperSuit to create vectorized environments, leveraging multithreading to speed up training (see SB3's [vector environments documentation](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html)).

+After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [model saving documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html)).
```{eval-rst}
.. note::
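The waterworld.md edits above describe the same pattern for a Parallel environment: vectorized training with SuperSuit, saving and reloading the model, then a human-rendered demo game. A minimal sketch under assumed settings follows; the number of vectorized copies, the timestep budget, and the file name are placeholders, not the tutorial's values.

```python
import supersuit as ss
from stable_baselines3 import PPO
from pettingzoo.sisl import waterworld_v4

# Training: vectorize the parallel env and run several copies concurrently.
env = waterworld_v4.parallel_env()
env = ss.pettingzoo_env_to_vec_env_v1(env)
env = ss.concat_vec_envs_v1(env, 4, num_cpus=1, base_class="stable_baselines3")

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=10_000)
model.save("waterworld_ppo_demo")
env.close()

# Demo: reload the model from disk and play one game with human rendering.
model = PPO.load("waterworld_ppo_demo")
demo_env = waterworld_v4.parallel_env(render_mode="human")
observations, infos = demo_env.reset()
while demo_env.agents:
    actions = {
        agent: model.predict(observations[agent], deterministic=True)[0]
        for agent in demo_env.agents
    }
    observations, rewards, terminations, truncations, infos = demo_env.step(actions)
demo_env.close()
```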

