[Bug]: PPO doesn't work correctly with MultiDiscrete action spaces with "start" parameter #1718
Status: Closed
Labels: custom gym env, documentation, help wanted
🐛 Bug
Hello!
I am trying to run the PPO algorithm with one of the environments we have created in Sinergym.
The point is that I have defined a MultiDiscrete action space (which, according to the documentation, is supported), but the actions taken do not respect the "start" parameter of the space definition.
As can be seen in the traceback, the last action variable should be an integer between 25 and 35, but it takes values from 0 to 10.
I am not including the Sinergym setup itself, so as not to add complexity to the report; the problem is simpler than that and can be seen in the traceback.
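For illustration, here is a minimal self-contained sketch of the reported behaviour, not the actual Sinergym environment: the environment name and the nvec/start values below are hypothetical, and it assumes a Gymnasium version in which MultiDiscrete accepts a `start` argument.

```python
# Minimal sketch (hypothetical env, not Sinergym): the last action component
# is declared to live in [25, 35] via start=[0, 0, 25], assuming a Gymnasium
# version where MultiDiscrete accepts `start`.
import gymnasium as gym
import numpy as np
from gymnasium import spaces
from stable_baselines3 import PPO


class StartOffsetEnv(gym.Env):
    """Toy environment whose last action component should lie in [25, 35]."""

    def __init__(self):
        super().__init__()
        self.observation_space = spaces.Box(low=-1.0, high=1.0, shape=(4,), dtype=np.float32)
        # Three discrete components; the third is declared to take values 25..35.
        self.action_space = spaces.MultiDiscrete([10, 10, 11], start=[0, 0, 25])

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        return self.observation_space.sample(), {}

    def step(self, action):
        last = int(action[-1])
        # With the declared space, `last` should be in [25, 35]; in practice
        # PPO appears to emit values in [0, 10], ignoring the start offset.
        if not 25 <= last <= 35:
            raise ValueError(f"last action component {last} is outside [25, 35]")
        return self.observation_space.sample(), 0.0, False, False, {}


model = PPO("MlpPolicy", StartOffsetEnv(), n_steps=64, verbose=0)
model.learn(total_timesteps=128)  # fails almost immediately during rollout collection
```

Until "start" is supported (or documented as unsupported), one possible workaround is to expose a 0-based MultiDiscrete to the agent and shift the actions back inside an ActionWrapper. This is only a sketch and assumes the space stores its offsets in a `start` attribute:

```python
# Possible workaround sketch: shift PPO's 0-based actions by the declared
# start offsets before they reach the wrapped environment.
import gymnasium as gym
import numpy as np


class AddStartOffset(gym.ActionWrapper):
    def __init__(self, env):
        super().__init__(env)
        self._start = np.asarray(env.action_space.start)
        # Expose a 0-based space to the agent so PPO's samples stay valid.
        self.action_space = gym.spaces.MultiDiscrete(env.action_space.nvec)

    def action(self, action):
        # Map the agent's 0-based action back into the env's offset range.
        return np.asarray(action) + self._start
```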
Code example
No response
Relevant log output / Error message
System Info
Checklist