Docs Update 2 #817

Merged · 8 commits · Oct 8, 2022
4 changes: 2 additions & 2 deletions README.md

````diff
@@ -33,14 +33,14 @@ Get started with PettingZoo by following [the PettingZoo tutorial](https://petti

 PettingZoo models environments as [*Agent Environment Cycle* (AEC) games](https://arxiv.org/pdf/2009.14471.pdf), in order to cleanly support all types of multi-agent RL environments under one API and to minimize the potential for certain classes of common bugs.

-Using environments in PettingZoo is very similar to Gym, i.e. you initialize an environment via:
+Using environments in PettingZoo is very similar to Gymnasium, i.e. you initialize an environment via:

 ```python
 from pettingzoo.butterfly import pistonball_v6
 env = pistonball_v6.env()
 ```

-Environments can be interacted with in a manner very similar to Gym:
+Environments can be interacted with in a manner very similar to Gymnasium:

 ```python
 env.reset()
````
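The second code block above is cut off by the diff view. As a point of reference, here is a minimal sketch of the full interaction loop, assuming the five-tuple return from `last()` introduced around this docs update and substituting random actions for a real policy:

```python
# A sketch of the truncated interaction loop, assuming last() returns the
# five-tuple (observation, reward, termination, truncation, info); the
# random action stands in for a real policy.
from pettingzoo.butterfly import pistonball_v6

env = pistonball_v6.env()
env.reset()
for agent in env.agent_iter():
    observation, reward, termination, truncation, info = env.last()
    if termination or truncation:
        action = None  # finished agents must receive None
    else:
        action = env.action_space(agent).sample()
    env.step(action)
env.close()
```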
86 changes: 47 additions & 39 deletions docs/api/core.md

````diff
@@ -1,83 +1,91 @@
 # Core API
 
+## AECEnv
+
 ```{eval-rst}
 .. currentmodule:: pettingzoo.utils.env
 
 .. autoclass:: AECEnv
 
-    .. py:attribute:: agents
-
-        A list of the names of all current agents, typically integers. These may be changed as an environment progresses (i.e. agents can be added or removed).
-
-        :type: list[AgentID]
-
-    .. py:attribute:: num_agents
-
-        The length of the agents list.
-
-        :type: int
-
-    .. py:attribute:: possible_agents
-
-        A list of all possible_agents the environment could generate. Equivalent to the list of agents in the observation and action spaces. This cannot be changed through play or resetting.
-
-        :type: list[AgentID]
-
-    .. py:attribute:: max_num_agents
-
-        The length of the possible_agents list.
-
-        :type: int
-
-    .. py:attribute:: agent_selection
-
-        An attribute of the environment corresponding to the currently selected agent that an action can be taken for.
-
-        :type: AgentID
-
-    .. py:attribute:: dones
-
-        A dict of the done state of every current agent at the time called, keyed by name. `last()` accesses this attribute. Note that agents can be added or removed from this dict. The returned dict looks like::
-
-            dones = {0:[first agent done state], 1:[second agent done state] ... n-1:[nth agent done state]}
-
-        :type: Dict[AgentID, bool]
-
-    .. py:attribute:: rewards
-
-        A dict of the rewards of every current agent at the time called, keyed by name. Rewards are the instantaneous rewards generated after the last step. Note that agents can be added or removed from this attribute. `last()` does not directly access this attribute; rather, the returned reward is stored in an internal variable. The rewards structure looks like::
-
-            {0:[first agent reward], 1:[second agent reward] ... n-1:[nth agent reward]}
-
-        :type: Dict[AgentID, float]
-
-    .. py:attribute:: infos
-
-        A dict of info for each current agent, keyed by name. Each agent's info is also a dict. Note that agents can be added or removed from this attribute. `last()` accesses this attribute. The returned dict looks like::
-
-            infos = {0:[first agent info], 1:[second agent info] ... n-1:[nth agent info]}
-
-        :type: Dict[AgentID, Dict[str, Any]]
-
-    .. py:attribute:: observation_spaces
-
-        A dict of the observation spaces of every agent, keyed by name. This cannot be changed through play or resetting.
-
-        :type: Dict[AgentID, gym.spaces.Space]
-
-    .. py:attribute:: action_spaces
-
-        A dict of the action spaces of every agent, keyed by name. This cannot be changed through play or resetting.
-
-        :type: Dict[AgentID, gym.spaces.Space]
-
-    .. automethod:: step
-    .. automethod:: reset
-    .. automethod:: observe
-    .. automethod:: render
-    .. automethod:: seed
-    .. automethod:: close
+```
+
+### Attributes
+
+```{eval-rst}
+.. autoattribute:: AECEnv.agents
+
+    A list of the names of all current agents, typically integers. These may be changed as an environment progresses (i.e. agents can be added or removed).
+
+    :type: List[AgentID]
+
+.. autoattribute:: AECEnv.num_agents
+
+    The length of the agents list.
+
+.. autoattribute:: AECEnv.possible_agents
+
+    A list of all possible_agents the environment could generate. Equivalent to the list of agents in the observation and action spaces. This cannot be changed through play or resetting.
+
+    :type: List[AgentID]
+
+.. autoattribute:: AECEnv.max_num_agents
+
+    The length of the possible_agents list.
+
+.. autoattribute:: AECEnv.agent_selection
+
+    An attribute of the environment corresponding to the currently selected agent that an action can be taken for.
+
+    :type: AgentID
+
+.. autoattribute:: AECEnv.dones
+
+    A dict of the done state of every current agent at the time called, keyed by name. `last()` accesses this attribute. Note that agents can be added or removed from this dict. The returned dict looks like::
+
+        dones = {0:[first agent done state], 1:[second agent done state] ... n-1:[nth agent done state]}
+
+    :type: Dict[AgentID, bool]
+
+.. autoattribute:: AECEnv.rewards
+
+    A dict of the rewards of every current agent at the time called, keyed by name. Rewards are the instantaneous rewards generated after the last step. Note that agents can be added or removed from this attribute. `last()` does not directly access this attribute; rather, the returned reward is stored in an internal variable. The rewards structure looks like::
+
+        {0:[first agent reward], 1:[second agent reward] ... n-1:[nth agent reward]}
+
+    :type: Dict[AgentID, float]
+
+.. autoattribute:: AECEnv.infos
+
+    A dict of info for each current agent, keyed by name. Each agent's info is also a dict. Note that agents can be added or removed from this attribute. `last()` accesses this attribute. The returned dict looks like::
+
+        infos = {0:[first agent info], 1:[second agent info] ... n-1:[nth agent info]}
+
+    :type: Dict[AgentID, Dict[str, Any]]
+
+.. autoattribute:: AECEnv.observation_spaces
+
+    A dict of the observation spaces of every agent, keyed by name. This cannot be changed through play or resetting.
+
+    :type: Dict[AgentID, gymnasium.spaces.Space]
+
+.. autoattribute:: AECEnv.action_spaces
+
+    A dict of the action spaces of every agent, keyed by name. This cannot be changed through play or resetting.
+
+    :type: Dict[AgentID, gymnasium.spaces.Space]
+```
+
+### Methods
+
+```{eval-rst}
+.. automethod:: AECEnv.step
+.. automethod:: AECEnv.reset
+.. automethod:: AECEnv.observe
+.. automethod:: AECEnv.render
+.. automethod:: AECEnv.seed
+.. automethod:: AECEnv.close
+
 ```
````
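Taken together, the attributes above drive a turn-based loop centered on `agent_selection`. A short sketch of inspecting them at runtime, using `tictactoe_v3` purely as an assumed example environment:

```python
# Illustrative only: tictactoe_v3 is an assumed example environment, and the
# printed values depend on the game.
from pettingzoo.classic import tictactoe_v3

env = tictactoe_v3.env()
env.reset()
print(env.possible_agents)        # all agents the env could ever generate
print(env.num_agents)             # length of the current agents list
agent = env.agent_selection       # the agent whose turn it is right now
print(env.action_spaces[agent])   # that agent's action space
print(env.rewards)                # instantaneous per-agent rewards
print(env.infos[agent])           # that agent's info dict
observation = env.observe(agent)  # observe() can be called for any agent
env.close()
```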

4 changes: 3 additions & 1 deletion docs/api/parallel.md

````diff
@@ -2,7 +2,7 @@

 In addition to the main API, we have a secondary parallel API for environments where all agents have simultaneous actions and observations. An environment with parallel API support can be created via `<game>.parallel_env()`. This API is based around the paradigm of *Partially Observable Stochastic Games* (POSGs) and the details are similar to [RLlib's MultiAgent environment specification](https://docs.ray.io/en/latest/rllib-env.html#multi-agent-and-hierarchical), except we allow for different observation and action spaces between the agents.

-### Example Usage
+## Example Usage

 Environments can be interacted with as follows:

@@ -15,6 +15,8 @@ for step in range(max_cycles):
     observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
 ```

+## ParallelEnv
+
 ```{eval-rst}
 .. currentmodule:: pettingzoo.utils.env
````
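The usage example in this diff is truncated. A fuller sketch of the parallel loop, assuming `reset()` returns only observations (as the excerpted assignment implies) and substituting random actions for a real policy:

```python
# A sketch of the full parallel loop excerpted above; random actions stand
# in for a real policy.
from pettingzoo.butterfly import pistonball_v6

parallel_env = pistonball_v6.parallel_env()
observations = parallel_env.reset()
max_cycles = 500
for step in range(max_cycles):
    # one action per live agent, keyed by agent name
    actions = {agent: parallel_env.action_space(agent).sample()
               for agent in parallel_env.agents}
    observations, rewards, terminations, truncations, infos = parallel_env.step(actions)
    if not parallel_env.agents:  # every agent terminated or truncated
        break
parallel_env.close()
```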
17 changes: 11 additions & 6 deletions docs/api/pz_wrappers.md

````diff
@@ -33,12 +33,17 @@ env = from_parallel(env)

 We wanted our PettingZoo environments to be both easy to use and easy to implement. To combine these goals, we have a set of simple wrappers which provide input validation and other convenient reusable logic.

-* `BaseWrapper`: All AECEnv wrappers should inherit from this base class.
-* `TerminateIllegalWrapper`: Handles illegal-move logic for classic games.
-* `CaptureStdoutWrapper`: Takes an environment which prints to the terminal and gives it an `ansi` render mode, where it captures the terminal output and returns it as a string instead.
-* `AssertOutOfBoundsWrapper`: Asserts if the action given to `step` is outside of the action space. Applied in PettingZoo environments with discrete action spaces.
-* `ClipOutOfBoundsWrapper`: Clips the input action to fit in the continuous action space (emitting a warning if it does so). Applied to continuous environments in PettingZoo.
-* `OrderEnforcingWrapper`: Gives a sensible error message if function calls or attribute accesses happen in a disallowed order, for example if `step()` is called before `reset()`, the `.dones` attribute is accessed before `reset()`, or `seed()` is called and then `step()` is used before `reset()` is called again (reset must be called after seed). Applied to all PettingZoo environments.
+```{eval-rst}
+.. currentmodule:: pettingzoo.utils.wrappers
+
+.. autoclass:: BaseWrapper
+.. autoclass:: TerminateIllegalWrapper
+.. autoclass:: CaptureStdoutWrapper
+.. autoclass:: AssertOutOfBoundsWrapper
+.. autoclass:: ClipOutOfBoundsWrapper
+.. autoclass:: OrderEnforcingWrapper
+
+```

 You can apply these wrappers to your environment in a similar manner to the below example:
````
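The example referenced by the last context line is cut off below this hunk. A sketch of wrapping an environment by hand, assuming `raw_env()` exposes the unwrapped environment and choosing `ClipOutOfBoundsWrapper` because pistonball's action space is continuous:

```python
# A sketch of manual wrapping; raw_env() as the unwrapped constructor is an
# assumption (bundled environments typically ship pre-wrapped via env()).
from pettingzoo.butterfly import pistonball_v6
from pettingzoo.utils import wrappers

env = pistonball_v6.raw_env()
env = wrappers.ClipOutOfBoundsWrapper(env)  # continuous action space
env = wrappers.OrderEnforcingWrapper(env)   # enforce reset-before-step, etc.
```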