From 3daf8c6db5be082618b6d70fc2c59133b41b3204 Mon Sep 17 00:00:00 2001
From: Elliot Tower
Date: Sat, 15 Jul 2023 01:57:49 -0400
Subject: [PATCH] Update wording for SB3 tutorial (#1027)

---
 docs/tutorials/sb3/connect_four.md | 3 +--
 docs/tutorials/sb3/kaz.md          | 8 ++++----
 docs/tutorials/sb3/waterworld.md   | 3 ++-
 3 files changed, 7 insertions(+), 7 deletions(-)

diff --git a/docs/tutorials/sb3/connect_four.md b/docs/tutorials/sb3/connect_four.md
index 3042b4a27..ed5ee7a24 100644
--- a/docs/tutorials/sb3/connect_four.md
+++ b/docs/tutorials/sb3/connect_four.md
@@ -8,6 +8,7 @@ This tutorial shows how to train a agents using Maskable [Proximal Policy Optimi
 
 It creates a custom Wrapper to convert to a [Gymnasium](https://gymnasium.farama.org/)-like environment which is compatible with [SB3 action masking](https://sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html).
 
+After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
 
 ```{eval-rst}
 .. note::
@@ -21,8 +22,6 @@ It creates a custom Wrapper to convert to a [Gymnasium](https://gymnasium.farama.org/)-like environment which is compatible with [SB3 action masking](https://sb3-contrib.readthedocs.io/en/master/modules/ppo_mask.html).
     This wrapper assumes that the action space and observation space is the same for each agent, this assumption may not hold for custom environments.
 ```
 
-After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
-
 ## Environment Setup
 
 To follow this tutorial, you will need to install the dependencies shown below. It is recommended to use a newly-created virtual environment to avoid dependency conflicts.
diff --git a/docs/tutorials/sb3/kaz.md b/docs/tutorials/sb3/kaz.md
index c15f055c2..e6522caf9 100644
--- a/docs/tutorials/sb3/kaz.md
+++ b/docs/tutorials/sb3/kaz.md
@@ -6,9 +6,9 @@ title: "SB3: PPO for Knights-Archers-Zombies"
 
 This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Knights-Archers-Zombies](https://pettingzoo.farama.org/environments/butterfly/knights_archers_zombies/) environment ([AEC](https://pettingzoo.farama.org/api/aec/)).
 
-It converts the environment into a Parallel environment and uses SuperSuit to create vectorized environments, leveraging multithreading to speed up training.
+We use SuperSuit to create vectorized environments, leveraging multithreading to speed up training (see SB3's [vector environments documentation](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html)).
 
-After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
+After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [model saving documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html)).
 
 ```{eval-rst}
 .. note::
@@ -17,9 +17,9 @@ After training and evaluation, this script will launch a demo game using human r
 ```
 
 ```{eval-rst}
-.. warning::
+.. note::
 
-   Because this environment allows agents to spawn and die, it requires using SuperSuit's Black Death wrapper, which provides blank observations to dead agents, rather than removing them from the environment.
+   This environment allows agents to spawn and die, so it requires using SuperSuit's Black Death wrapper, which provides blank observations to dead agents rather than removing them from the environment.
 ```
 
diff --git a/docs/tutorials/sb3/waterworld.md b/docs/tutorials/sb3/waterworld.md
index 9101d5703..6e39d2b04 100644
--- a/docs/tutorials/sb3/waterworld.md
+++ b/docs/tutorials/sb3/waterworld.md
@@ -6,8 +6,9 @@ title: "SB3: PPO for Waterworld (Parallel)"
 
 This tutorial shows how to train agents using [Proximal Policy Optimization](https://stable-baselines3.readthedocs.io/en/master/modules/ppo.html) (PPO) on the [Waterworld](https://pettingzoo.farama.org/environments/sisl/waterworld/) environment ([Parallel](https://pettingzoo.farama.org/api/parallel/)).
 
-After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html) for more information).
+We use SuperSuit to create vectorized environments, leveraging multithreading to speed up training (see SB3's [vector environments documentation](https://stable-baselines3.readthedocs.io/en/master/guide/vec_envs.html)).
 
+After training and evaluation, this script will launch a demo game using human rendering. Trained models are saved and loaded from disk (see SB3's [model saving documentation](https://stable-baselines3.readthedocs.io/en/master/guide/save_format.html)).
 
 ```{eval-rst}
 .. note::
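For orientation beyond the diff itself: the connect_four.md file this patch touches is a tutorial built around SB3 action masking. Below is a minimal sketch of the sb3-contrib MaskablePPO API that the tutorial's custom wrapper targets. The CartPole placeholder environment and the allow-everything mask function are illustrative assumptions, not content from the patch or the tutorial.

```python
# Sketch: action masking with sb3-contrib's MaskablePPO.
# The mask function here allows every action; a real environment
# (e.g. Connect Four) would compute legal moves instead.
import numpy as np
import gymnasium as gym
from sb3_contrib import MaskablePPO
from sb3_contrib.common.wrappers import ActionMasker

def mask_fn(env):
    # Return a boolean vector: True where the action is currently legal.
    return np.ones(env.action_space.n, dtype=bool)

env = gym.make("CartPole-v1")     # placeholder single-agent env
env = ActionMasker(env, mask_fn)  # attaches a mask to every step
model = MaskablePPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=5_000)
```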
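The kaz.md and waterworld.md hunks both add a sentence about SuperSuit vectorization, and all three files mention saving and loading trained models. A hedged sketch of what that pipeline typically looks like for a Parallel PettingZoo environment follows; the worker counts, timestep budget, and save path are assumptions for illustration.

```python
# Sketch: SuperSuit vectorization of a PettingZoo Parallel env for SB3,
# followed by the save/load calls the tutorials reference.
import supersuit as ss
from stable_baselines3 import PPO
from pettingzoo.sisl import waterworld_v4

env = waterworld_v4.parallel_env()

# Treat the multi-agent env as several copies of one single-agent env.
env = ss.pettingzoo_env_to_vec_env_v1(env)
# Stack 8 copies across 4 workers, exposed as an SB3-compatible VecEnv.
env = ss.concat_vec_envs_v1(env, 8, num_cpus=4, base_class="stable_baselines3")

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)

model.save("ppo_waterworld")        # written to disk as ppo_waterworld.zip
model = PPO.load("ppo_waterworld")  # restored later for evaluation or demo
```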
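The note rewritten in the kaz.md hunk concerns SuperSuit's Black Death wrapper. A short sketch of where it sits relative to vectorization; the KAZ settings shown are assumptions, and the wrapper version suffix may differ across SuperSuit releases.

```python
# Sketch: Black Death wrapper for Knights-Archers-Zombies.
# Dead agents keep emitting blank observations instead of being removed,
# so the agent count stays fixed, which vectorization requires.
import supersuit as ss
from pettingzoo.butterfly import knights_archers_zombies_v10

env = knights_archers_zombies_v10.parallel_env()
env = ss.black_death_v3(env)  # must wrap before vectorizing
env = ss.pettingzoo_env_to_vec_env_v1(env)
env = ss.concat_vec_envs_v1(env, 4, num_cpus=2, base_class="stable_baselines3")
```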