
[Bug Report] classic.chess: Mismatch between observation shape in documentation and in code #922

Closed
1 task done
x0wllaar opened this issue Mar 27, 2023 · 8 comments
Labels
bug Something isn't working

Comments

@x0wllaar

Describe the bug

The documentation (both on the website and in the code) says that for the Chess environment, the default observation is an 8x8x20 tensor containing a "snapshot" representation of the current board state, with no history of previous board states.

However, with PettingZoo 1.22.3 and the chess v5 environment, the actual observation shape is 8x8x111, and, as far as I understand from the code, it contains the previous board states. As far as I can tell from the code, it is also impossible to turn this behavior off and go back to the 8x8x20 board representation.

Code example

from pettingzoo.classic import chess_v5

env = chess_v5.env()
env.reset()
obs = env.unwrapped.observe("player_0")["observation"]
print(obs.shape)
# Expected: (8, 8, 20)
# Got: (8, 8, 111)

System info

PettingZoo was installed from pip.

Version of PettingZoo: 1.22.3

OS: Rocky Linux 9 in Docker, kernel Linux 8dee67faa3f8 6.2.2-x64v3-xanmod1 #0~20230303.0f2ddc7 SMP PREEMPT_DYNAMIC Sat Mar 4 00:56:43 UTC x86_64 x86_64 x86_64 GNU/Linux

Python version: Python 3.9.14

Additional context

I would love this issue to be resolved, or to have a method to get the 8x8x20 tensors back: in my application, the RL agent will not have access to previous board states; it should learn to play using only the "snapshot" board representation, with no history.
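
In the meantime, a possible workaround sketch. This assumes the newest 13-channel board frame sits at the front of the history stack, i.e. in channels 7-19; that layout is an assumption, not something the docs confirm:

from pettingzoo.classic import chess_v5

env = chess_v5.env()
env.reset()
obs = env.unwrapped.observe("player_0")["observation"]  # (8, 8, 111)

# Assumption: channels 0-6 are the static planes and channels 7-19 hold the
# most recent 13-channel board frame (newest frame stored first). If the
# newest frame is actually stored last, use obs[:, :, -13:] for the history
# slice instead.
snapshot = obs[:, :, :20]
print(snapshot.shape)  # (8, 8, 20)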

Checklist

  • I have checked that there is no similar issue in the repo
@x0wllaar added the bug (Something isn't working) label on Mar 27, 2023
@jacob975
Contributor

jacob975 commented Apr 3, 2023

Same here. I also found that the returned observation is not oriented to the current (or specified) agent.

Code Example

import numpy as np

from pettingzoo.classic import chess_v5

env = chess_v5.env(render_mode='ansi')
env.reset()
print(env.render())

action_mask = env.observe('player_0')['action_mask']
observation = env.observe('player_0')['observation']
table = observation[:, :, 7]  # the pawn channel
print("In the view of Player 0")
print(table)

possible_actions = np.where(action_mask > 0)[0]
print(possible_actions)
action = 77  # one of the legal opening moves
print(action)
env.step(action)

# After an action
print("------------------")
print(env.render())
observation = env.observe('player_0')['observation']
table = observation[:, :, 7]
print("In the view of Player 0")
print(table)
observation = env.observe('player_1')['observation']
table = observation[:, :, 7]
print("In the view of Player 1")
print(table)

The Output

r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . . . . .
. . . . . . . .
P P P P P P P P
R N B Q K B N R
In the view of Player 0
[[False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]]
[  77   85  643  645  661  669 1245 1253 1829 1837 2413 2421 2997 3005
 3563 3565 3581 3589 4165 4173]
77
------------------
r n b q k b n r
p p p p p p p p
. . . . . . . .
. . . . . . . .
. . . . . . . .
P . . . . . . .
. P P P P P P P
R N B Q K B N R
In the view of Player 0
[[False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False  True]
 [ True  True  True  True  True  True  True False]
 [False False False False False False False False]]
In the view of Player 1
[[False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False False]
 [False False False False False False False  True]
 [ True  True  True  True  True  True  True False]
 [False False False False False False False False]]

By the way, the returned observation seems to repeat channels 7-19 eight times, which is why the observation ends up with 7 + 8 × 13 = 111 channels instead of 20.

@Tientjie-san

Facing the same issue; it looks like some environments and/or their documentation are outdated.

@benblack769
Contributor

Hmm. After a quick look, it is probably a problem with both the documentation and the implementation.

The documentation is out of date: we decided to switch to AlphaZero-style frame stacking a long time ago, and the documentation never caught up. This can be fixed by noting that channels 7-19 are repeated 8 times, storing the game history for each player.

I think the main error in the implementation is this line

https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/classic/chess/chess.py#L266

current_agent used to be an integer, either 0 or 1, so the boolean condition here https://github.com/Farama-Foundation/PettingZoo/blob/master/pettingzoo/classic/chess/chess_utils.py#L205 made sense.

Now, it is a string, so that condition does not work.

The second error is that the observe function does not flip the board_history depending on which agent gets the observation, so the off-turn agent gets a wrongly oriented observation.
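
A minimal sketch of that type bug (the variable name comes from the linked lines; the surrounding logic is paraphrased, not copied verbatim from the repo):

# chess_utils.py effectively gates the board flip on the truthiness of the
# agent identifier, roughly: `if current_agent: mirror_board(...)`.

# Old behavior: current_agent was the integer 0 or 1, so only black (1) flipped.
print(bool(0), bool(1))                    # False True

# New behavior: current_agent is "player_0" or "player_1", and any non-empty
# string is truthy, so the flip branch now runs for both agents.
print(bool("player_0"), bool("player_1"))  # True True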

@elliottower
Contributor

@Tientjie-san @jacob975 would either of you be willing to submit a PR fixing these issues? Appreciate you raising this and making us aware of the bugs. Otherwise, let us know and we can have somebody else do it.

@jacob975
Contributor

jacob975 commented May 3, 2023

@elliottower I am interested in fixing the orientation issues. @benblack769 thank you for your suggestion. I would suggest not applying the mirror in raw_env.step, because it might mess up the board history. Instead, we can add a few lines in raw_env.observe to mirror and transform the board history:

class raw_env(AECEnv):
    ...
    def observe(self, agent):
        ...
        observation = np.dstack((observation[:, :, :7], self.board_history))  # (8, 8, 111)
        # For player_1 we need to reorient the observation:
        if self.possible_agents.index(agent):
            # Section 1: mirror the board vertically
            observation = np.flip(observation, axis=0)
            # Section 2: in each of the 8 stacked frames, swap the 6 white
            # piece channels with the 6 black piece channels
            for i in range(1, 9):
                tmp = observation[..., 13 * i - 6 : 13 * i].copy()
                observation[..., 13 * i - 6 : 13 * i] = observation[..., 13 * i : 13 * i + 6]
                observation[..., 13 * i : 13 * i + 6] = tmp
        ...
        return {"observation": observation, "action_mask": action_mask}

As for raw_env.render(), it always renders the board from player 0's view. Although this is not consistent with the documentation, I think it is better to leave it as is, because the output of this function is for humans, not RL agents. Maybe we should add more description of this to the documentation.

@elliottower
Contributor

Definitely agree the rendering should be consistent, but I personally don't know enough about how other implementations do chess to say whether swapping the observations is a good idea. I believe the comments in that file say the swapping is there so self-play agents can learn better, because the pieces always start at the bottom. You could look into other libraries or papers to check (maybe @benblack769 has more insight): OpenSpiel has chess, for example, and RLlib has a LeelaChessZero implementation and their own chess implementation, I think.

@elliottower
Contributor

@jacob975 if you're still interested in fixing this, it would be great if you could join the Discord and shoot me a DM so we can coordinate.

@elliottower
Contributor

Fixed in #1004
