Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
1 parent c082f1a, commit bb7ae0b
Showing 15 changed files with 411 additions and 2 deletions.
@@ -0,0 +1,47 @@
---
title: "(WIP) Creating Environments: Repository Structure"
---

# (WIP) Creating Environments: Repository Structure

## Introduction

Welcome to the first of five short tutorials that guide you through creating your own PettingZoo environment, from conception to deployment.

We will be creating a parallel environment, meaning that all agents act simultaneously.

Before thinking about the environment logic, we should understand the structure of environment repositories.

## Tree structure
Environment repositories are usually laid out using the following structure:

    Custom-Environment
    ├── custom-environment
    │   ├── env
    │   │   └── custom_environment.py
    │   └── custom_environment_v0.py
    ├── README.md
    └── requirements.txt

- `/custom-environment/env` is where your environment will be stored, along with any helper functions (in the case of a complicated environment).
- `/custom-environment/custom_environment_v0.py` is a file that imports the environment. The version suffix in the file name (`_v0`) is used for environment version control.
- `/README.md` is a file used to describe your environment.
- `/requirements.txt` is a file used to keep track of your environment dependencies. At the very least, `pettingzoo` should be in there. **Please pin the version of every dependency via `==`**.

### Advanced: Additional (optional) files
The above file structure is minimal. A more deployment-ready environment would include
- `/docs/` for documentation,
- `/setup.py` for packaging,
- `/custom-environment/__init__.py` for deprecation handling, and
- GitHub Actions for continuous integration of environment tests.

Implementing these is outside the scope of this tutorial.

## Skeleton code
The entirety of your environment logic is stored within `/custom-environment/env`.

```{eval-rst}
.. literalinclude:: ../../../tutorials/EnvironmentCreation/1-SkeletonCreation.py
   :language: python
   :caption: /custom-environment/env/custom_environment.py
```
@@ -0,0 +1,23 @@
---
title: "(WIP) Creating Environments: Environment Logic"
---

# (WIP) Creating Environments: Environment Logic

## Introduction

Now that we have a basic understanding of the structure of environment repositories, we can start thinking about the fun part - environment logic!

For this tutorial, we will be creating a two-player game consisting of a prisoner, trying to escape, and a guard, trying to catch the prisoner. This game will be played on a 7x7 grid, where:
- the prisoner starts in the top left corner,
- the guard starts in the bottom right corner,
- the escape door is randomly placed in the middle of the grid, and
- both the prisoner and the guard can move in any of the four cardinal directions (up, down, left, right).

## Code

```{eval-rst}
.. literalinclude:: ../../../tutorials/EnvironmentCreation/2-AddingGameLogic.py
   :language: python
   :caption: /custom-environment/env/custom_environment.py
```
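
The movement rule from the game description (four cardinal directions on a 7x7 grid) can be sketched in isolation. This is a minimal sketch rather than the tutorial's code: `GRID_SIZE`, `MOVES`, and `move` are hypothetical names, and the action numbering (0 = left, 1 = right, 2 = up, 3 = down) is an assumption. A move that would leave the grid simply has no effect.

```python
GRID_SIZE = 7  # the game is played on a 7x7 grid, so coordinates run from 0 to 6

# Assumed action encoding: action id -> (dx, dy). y grows downwards, so the
# top left corner is (0, 0) and the bottom right corner is (6, 6).
MOVES = {0: (-1, 0), 1: (1, 0), 2: (0, -1), 3: (0, 1)}


def move(x, y, action):
    """Return the new (x, y); a move that would leave the grid has no effect."""
    dx, dy = MOVES[action]
    new_x, new_y = x + dx, y + dy
    if 0 <= new_x < GRID_SIZE and 0 <= new_y < GRID_SIZE:
        return new_x, new_y
    return x, y


if __name__ == "__main__":
    print(move(0, 0, 0))  # prisoner in the top left corner tries to move left
    print(move(6, 6, 2))  # guard in the bottom right corner moves up
```

Keeping the bounds check in one helper means the prisoner and the guard share a single rule instead of two parallel `if`/`elif` chains.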
@@ -0,0 +1,20 @@
---
title: "(WIP) Creating Environments: Action Masking"
---

# (WIP) Creating Environments: Action Masking

## Introduction

In many environments, it is natural for some actions to be invalid at certain times. For example, in a game of chess, it is impossible to move a pawn forward if the square directly in front of it is occupied. In PettingZoo, we can use action masking to prevent invalid actions from being taken.

Action masking is a more natural way of handling invalid actions than letting an invalid action have no effect, which is how we handled bumping into walls in the previous tutorial.

## Code

```{eval-rst}
.. literalinclude:: ../../../tutorials/EnvironmentCreation/3-ActionMasking.py
   :language: python
   :caption: /custom-environment/env/custom_environment.py
   :lines: -147
```
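
To make the idea of a mask concrete, here is a minimal sketch of how one could be computed for a piece on the 7x7 grid. It is not taken from `3-ActionMasking.py`: `action_mask` is a hypothetical helper, the action order (0 = left, 1 = right, 2 = up, 3 = down) is an assumption, and how the mask is handed to the agent is left out.

```python
GRID_SIZE = 7  # 7x7 grid, coordinates run from 0 to 6


def action_mask(x, y):
    """Return a list with one entry per action (1 = allowed, 0 = masked).

    Assumed action order: 0 = left, 1 = right, 2 = up, 3 = down.
    """
    return [
        int(x > 0),              # left is valid unless on the left edge
        int(x < GRID_SIZE - 1),  # right is valid unless on the right edge
        int(y > 0),              # up is valid unless on the top edge
        int(y < GRID_SIZE - 1),  # down is valid unless on the bottom edge
    ]


if __name__ == "__main__":
    mask = action_mask(0, 0)  # top left corner
    valid_actions = [a for a, allowed in enumerate(mask) if allowed]
    print(mask, valid_actions)
```

A policy would then sample only from the actions whose mask entry is 1, so walking into a wall is never attempted in the first place.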
docs/tutorials/environmentcreation/4-testing-your-environment.md (19 additions, 0 deletions)
@@ -0,0 +1,19 @@
---
title: "(WIP) Creating Environments: Testing Your Environment"
---

# (WIP) Creating Environments: Testing Your Environment

## Introduction

Now that our environment is complete, we can test it to make sure it works as intended. PettingZoo has a built-in test suite that you can run against your environment.

## Code
(Add this code below the rest of the code in the file.)

```{eval-rst}
.. literalinclude:: ../../../tutorials/EnvironmentCreation/3-ActionMasking.py
   :language: python
   :caption: /custom-environment/env/custom_environment.py
   :lines: 148-
```
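
PettingZoo's own checks live in `pettingzoo.test` (for parallel environments, `parallel_api_test`). The sketch below does not use them. It only illustrates the kind of property such a test looks at: roll an episode forward with random actions and assert that `step()` returns five dictionaries keyed by the live agents. `random_rollout_check` and `StubEnv` are hypothetical names introduced for illustration, and `StubEnv` is a stand-in, not the tutorial's environment.

```python
import random


class StubEnv:
    """Hypothetical stand-in exposing the parallel-style reset/step interface."""

    possible_agents = ["prisoner", "guard"]

    def reset(self, seed=None, options=None):
        self.agents = list(self.possible_agents)
        self.timestep = 0
        return {a: 0 for a in self.agents}

    def step(self, actions):
        self.timestep += 1
        done = self.timestep >= 5  # truncate the episode after five steps
        observations = {a: self.timestep for a in self.agents}
        rewards = {a: 0 for a in self.agents}
        terminations = {a: False for a in self.agents}
        truncations = {a: done for a in self.agents}
        infos = {a: {} for a in self.agents}
        if done:
            self.agents = []  # no live agents once the episode is over
        return observations, rewards, terminations, truncations, infos


def random_rollout_check(env, num_actions=4, max_steps=1000):
    """Play one episode with random actions; return the number of steps taken."""
    env.reset()
    steps = 0
    while env.agents and steps < max_steps:
        live = set(env.agents)
        actions = {a: random.randrange(num_actions) for a in live}
        results = env.step(actions)
        assert len(results) == 5, "step() should return five dictionaries"
        for d in results:
            assert set(d) == live, "every dictionary must be keyed by agent"
        steps += 1
    return steps


if __name__ == "__main__":
    print(random_rollout_check(StubEnv()))
```

The `max_steps` cap is there so the check still finishes on an environment that never empties `env.agents`.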
Empty file.
@@ -0,0 +1,21 @@
from pettingzoo.utils.env import ParallelEnv


class CustomEnvironment(ParallelEnv):
    def __init__(self):
        pass

    def reset(self, seed=None, return_info=False, options=None):
        pass

    def step(self, actions):
        pass

    def render(self):
        pass

    def observation_space(self, agent):
        return self.observation_spaces[agent]

    def action_space(self, agent):
        return self.action_spaces[agent]
@@ -0,0 +1,114 @@
import functools
import random
from copy import copy

import numpy as np
from gymnasium.spaces import Discrete, MultiDiscrete

from pettingzoo.utils.env import ParallelEnv


class CustomEnvironment(ParallelEnv):
    def __init__(self):
        self.escape_y = None
        self.escape_x = None
        self.guard_y = None
        self.guard_x = None
        self.prisoner_y = None
        self.prisoner_x = None
        self.timestep = None
        self.possible_agents = ["prisoner", "guard"]

    def reset(self, seed=None, return_info=False, options=None):
        self.agents = copy(self.possible_agents)
        self.timestep = 0

        self.prisoner_x = 0
        self.prisoner_y = 0

        self.guard_x = 6  # bottom right corner; grid indices run from 0 to 6
        self.guard_y = 6

        self.escape_x = random.randint(2, 5)
        self.escape_y = random.randint(2, 5)

        observations = {
            a: (
                self.prisoner_x + 7 * self.prisoner_y,
                self.guard_x + 7 * self.guard_y,
                self.escape_x + 7 * self.escape_y,
            )
            for a in self.agents
        }
        return observations

    def step(self, actions):
        # Execute actions
        prisoner_action = actions["prisoner"]
        guard_action = actions["guard"]

        if prisoner_action == 0 and self.prisoner_x > 0:
            self.prisoner_x -= 1
        elif prisoner_action == 1 and self.prisoner_x < 6:
            self.prisoner_x += 1
        elif prisoner_action == 2 and self.prisoner_y > 0:
            self.prisoner_y -= 1
        elif prisoner_action == 3 and self.prisoner_y < 6:
            self.prisoner_y += 1

        if guard_action == 0 and self.guard_x > 0:
            self.guard_x -= 1
        elif guard_action == 1 and self.guard_x < 6:
            self.guard_x += 1
        elif guard_action == 2 and self.guard_y > 0:
            self.guard_y -= 1
        elif guard_action == 3 and self.guard_y < 6:
            self.guard_y += 1

        # Check termination conditions
        terminations = {a: False for a in self.agents}
        rewards = {a: 0 for a in self.agents}
        if self.prisoner_x == self.guard_x and self.prisoner_y == self.guard_y:
            rewards = {"prisoner": -1, "guard": 1}
            terminations = {a: True for a in self.agents}

        elif self.prisoner_x == self.escape_x and self.prisoner_y == self.escape_y:
            rewards = {"prisoner": 1, "guard": -1}
            terminations = {a: True for a in self.agents}

        # Check truncation conditions (overwrites termination conditions)
        truncations = {a: False for a in self.agents}
        if self.timestep > 100:
            rewards = {"prisoner": 0, "guard": 0}
            truncations = {"prisoner": True, "guard": True}
        self.timestep += 1

        # Get observations
        observations = {
            a: (
                self.prisoner_x + 7 * self.prisoner_y,
                self.guard_x + 7 * self.guard_y,
                self.escape_x + 7 * self.escape_y,
            )
            for a in self.agents
        }

        # Get dummy infos (not used in this example)
        infos = {a: {} for a in self.agents}

        return observations, rewards, terminations, truncations, infos

    def render(self):
        grid = np.full((7, 7), " ")  # string grid; a float array cannot hold "P"
        grid[self.prisoner_y, self.prisoner_x] = "P"
        grid[self.guard_y, self.guard_x] = "G"
        grid[self.escape_y, self.escape_x] = "E"
        print(f"{grid} \n")

    @functools.lru_cache(maxsize=None)
    def observation_space(self, agent):
        return MultiDiscrete([7 * 7] * 3)  # positions encode to 0..48, i.e. 49 values

    @functools.lru_cache(maxsize=None)
    def action_space(self, agent):
        return Discrete(4)
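
The observations in this file flatten each (x, y) position into a single integer, `x + 7 * y`. A minimal sketch of that encoding and its inverse follows; `encode` and `decode` are hypothetical helper names that do not appear in the tutorial code. On a 7x7 grid the encoded value runs from 0 to 48, which is 49 distinct values per observation component.

```python
GRID_SIZE = 7  # coordinates run from 0 to 6 on each axis


def encode(x, y):
    """Flatten a grid position into one integer, row by row."""
    return x + GRID_SIZE * y


def decode(index):
    """Invert encode(): recover (x, y) from the flattened index."""
    y, x = divmod(index, GRID_SIZE)
    return x, y


if __name__ == "__main__":
    print(encode(6, 6))  # bottom right corner, the largest encoded value
    print(decode(encode(2, 5)))
```

Counting the encoded values this way is a quick check that the declared observation space is large enough for every reachable position.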