-
-
Notifications
You must be signed in to change notification settings - Fork 429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update hanabi to use shimmy[openspiel] instead of Hanabi Learning Env #933
Conversation
The render tests work with the most recent version of shimmy where I enabled rendering for openspiel envs, once that is released this will pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some minor questions, but other than that fairly sane.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some general comments, but otherwise LGTM.
if isinstance(observation_0, dict) and "observation" in observation_0: | ||
observation_0 = observation_0["observation"] | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks hacky but, I guess it works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise I would've had to change like 4 different other calls which use observation_0 and expect it to be an obs, rather than a dict. This is a bit unclear to understand but what it does I think is fine, it results in errors otherwise.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Edit: once tests pass.
Given the complexity of the PR, there seems to be several different PRs happening here, updating |
Yea makes sense, I was thinking the same and believe I mentioned it in the pz dev channel about how it should probably be split unless we want to make an exception. |
It is simpler 90% of the time to have several prs if you want to fix several issues unless you are changing a total of 20 lines or less |
Closing in favor of separate PRs |
Description
This PR updates hanabi to use OpenSpiel's Hanabi implementation rather than hanabi learning environment. This implementation is in C++ and has been updated as recently as 4 weeks ago, whereas hanabi learning environment has not been updated since 2021. There is a flag in openspiel to use the hanabi learning environment but we don't have it enabled as there's not much point. The hanabi learning environment seems to be very limited from what I can tell, and this is a much more full featured implementation. It is also listed as thoroughly tested under openspiel's available games list: https://openspiel.readthedocs.io/en/latest/games.html.
The previous hanabi implementation (v4) used very different ways of doing things from all of the other games in pettingzoo.classic, which makes sense as it was last updated in pettingzoo v1.8.0.
This PR also updates the API test and render test code to properly handle action masking from pettingzoo's classic games, contained in obs['action_mask']. Previously, the code used numpy to randomly sample from the action space, but this led to errors if the action mask has zero valid actions, whereas my code using action_space(agent).sample(mask) works how we expect it to, generating a valid action in the space, which then causes the environment to terminate (as the action is illegal). I did some other cleanups but the tests are still passing, mostly just code style changes and accounting for the possibility of an action mask.
Unfortunately I was not able to get hanabi's seed test to work, which is bizarre because in Shimmy the openspiel seed tests actually do work but the API test doesn't. I think the API test is much more important to get working and I spent the majority of my time on that, but I can delay merging this and see if it's possible to fix the errors. The seed tests in PettingZoo should probably be updated as well, they are very confusing to debug and totally unlike the testing used in gymnasium and which we did in shimmy to test pettingzoo environments, which I think was much cleaner (adapted gymnasium's seed test code).
I think if I do update the seed tests to match these other repos they may end up passing for Hanabi, but I'm not exactly sure what the current seed tests do in comparison to it (wouldn't want to change it and remove functionality), so maybe someone could help figure that out before I make those changes.
Fixes # (issue), Depends on # (pull request)
Type of change
Screenshots
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)pytest -v
and no errors are present.pytest -v
has generated that are related to my code to the best of my knowledge.