fix: have access to terminal_observation in the infos. #233
@elliottower Do you mind having a look at this? This is quite important since if you don't have the correct terminal obs, you can't do bootstrapping correctly.
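To illustrate why the terminal observation matters for bootstrapping, here is a minimal sketch. It assumes a hypothetical value function `value_fn` and a one-step TD target; `td_target` and its arguments are illustrative names, not part of SuperSuit. When an episode is truncated and the vec env auto-resets, the observation returned by the step is already the first observation of the next episode, so the bootstrap value has to come from the true final observation instead.

```python
import numpy as np


def td_target(reward, terminated, truncated, next_obs, terminal_obs, value_fn, gamma=0.99):
    """One-step TD target for a single transition from an auto-resetting env.

    next_obs is what the step returned (post-reset if the episode ended);
    terminal_obs is the true final observation of the episode, if any.
    """
    if terminated:
        # True terminal state: there is no future return to bootstrap from.
        return reward
    # On truncation, bootstrap from the true final observation, not the reset one.
    bootstrap_obs = terminal_obs if truncated else next_obs
    return reward + gamma * value_fn(bootstrap_obs)


def value_fn(obs):
    # Hypothetical value function, for illustration only: sum of the observation.
    return float(np.sum(obs))


reset_obs = np.zeros(2)            # observation after the auto-reset
final_obs = np.array([1.0, 2.0])   # true last observation of the truncated episode

# Correct: bootstrap from the terminal observation.
correct = td_target(1.0, False, True, reset_obs, final_obs, value_fn, gamma=0.5)
# Naive: bootstrap from the post-reset observation, which belongs to a new episode.
naive = 1.0 + 0.5 * value_fn(reset_obs)
```

In this toy setup the two targets differ, which is the error that accumulates when the terminal observation is unavailable.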
Hey @KaleabTessera, sorry about this, will take a look now. Just approved the CI workflows.
Going to get confirmation from another Farama dev who has more experience with this type of thing than me, because I'm not 100% confident in my ability to check correctness for this.
I see a few errors in the CI, but I'm not sure they have anything to do with the changes you've made. Testing it locally myself to confirm and get a better idea (may be something with PettingZoo instead, not sure).
Fixed the bug on PettingZoo's end and queued up the CI to run (will take a little while as the PZ CI is running as well). I think it should all pass now.
Oh right, I need to do a PettingZoo release for this to pass. This will have to wait a few days, as we are waiting on the AgileRL tutorial bugs to be fixed.
PZ release is out in case you didn't see, so this should be unblocked now. Approved the workflows just now.
Thanks @elliottower! I added this code to ensure that a seed is used even when reset is called here. Not sure if it is necessary, what do you think? A similar thing is done in Stable Baselines' vec env.
That sounds reasonable to me, going to get input from another dev to see if it makes sense to them.
Hi, I don't think it makes sense in this case since the underlying env (the parallel PZ env) is only a single entity.
Jet (@jjshoots) also said he thought this shouldn't be implemented, here's a screenshot of what he said.
@@ -52,6 +52,13 @@ def step_wait(self):
        return self.step(self._saved_actions)

    def reset(self, seed=None, options=None):
        if seed is None:
I'm a bit confused: won't this pass the same seed to all of the environments, and therefore do the opposite of what you want?
Even so, this should be part of a second PR, not this one, if possible.
I don't believe this will create the same seed. Each time np.random.randint is called, np updates its internal state, meaning that a new seed is created in the next call - docs explaining this.
This would only create the same seed for all envs if np.random.seed is used in each process to set them all to have the same seed.
Nonetheless, I agree this should be removed from this PR.
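The behaviour described above can be checked with a small sketch. The seed value 0 and the upper bound are arbitrary choices for illustration, not taken from the PR:

```python
import numpy as np

HIGH = 2**31 - 1  # arbitrary upper bound for illustrative seeds

# Successive calls advance NumPy's global RNG state, so each draw differs.
np.random.seed(0)
seeds_a = [int(np.random.randint(0, HIGH)) for _ in range(4)]

# Re-seeding the global state once reproduces the same sequence of draws.
np.random.seed(0)
seeds_b = [int(np.random.randint(0, HIGH)) for _ in range(4)]

# Re-seeding before every draw (e.g. every process calling np.random.seed
# with the same value) yields the same "seed" every time.
same = []
for _ in range(4):
    np.random.seed(0)
    same.append(int(np.random.randint(0, HIGH)))
```

Here `seeds_a` contains distinct values, while `same` repeats a single value, which is the only case in which all envs would end up with an identical seed.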
So I removed the manual seeding. I agree it should be in another PR and maybe it is not even useful. I still think there is a possible issue for seeding in this scenario:
I think the vec env should create a new seed deterministically, similar to how JAX handles random numbers. I think that is why Stable Baselines' vec env ensures that a new seed is created or used.
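One way to derive new seeds deterministically from a single root seed, loosely in the spirit of JAX key splitting, is NumPy's `SeedSequence`. This is only a sketch of the idea, not what SuperSuit or Stable Baselines does; `derive_env_seeds` is a hypothetical helper:

```python
import numpy as np


def derive_env_seeds(root_seed, num_seeds):
    """Deterministically derive integer seeds from one root seed."""
    # spawn() creates independent child sequences from the root sequence.
    children = np.random.SeedSequence(root_seed).spawn(num_seeds)
    # generate_state(1) yields one 32-bit word per child; use it as the seed.
    return [int(child.generate_state(1)[0]) for child in children]


seeds = derive_env_seeds(42, 4)
```

Calling `derive_env_seeds` again with the same root seed gives the same list, so runs stay reproducible without relying on global RNG state.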
Normally the environment should handle this: the first
I agree with what @ffelten is saying. That was what I was getting at in the first place, but Florian put better words to it.
This makes sense, thanks @ffelten @jjshoots! I double-checked the base env I was using and this was the case 👍 So likely this is not an issue if the base env handles seeding reasonably.
Thanks for the input, guys. Looks like there's still a pytest failure, and I'm not 100% sure why.
Looks like the error is thrown by pettingzoo's The tests passed locally when I ran them. My package versions, p3.10 + :
I also downgraded my numpy to I don't think it is related to this PR, since the
Thanks for looking into this, I'll try re-running the tests to see if it works. Weird that it passed on one Python version but not the other as well.
Looks like it passed when I re-ran it. Going to re-run on Python 3.9 to check, but it's unrelated to this PR, so not a big deal.
Closes #232.
This allows access to terminal_observation, which previously wasn't possible when using pettingzoo_env_to_vec_env_v1/MarkovVectorEnv.
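As a rough sketch of how a consumer might use this, assuming `infos` is a list with one dict per agent and that the key is `terminal_observation` as the PR title suggests (the exact layout is an assumption, and `bootstrap_observations` is a hypothetical helper operating on stand-in data rather than a real env):

```python
import numpy as np


def bootstrap_observations(next_obs, infos):
    """Return a copy of next_obs with terminal observations swapped in where present."""
    out = np.array(next_obs, copy=True)
    for i, info in enumerate(infos):
        terminal_obs = info.get("terminal_observation")
        if terminal_obs is not None:
            # This slot was auto-reset; use the true final observation instead.
            out[i] = terminal_obs
    return out


# Stand-in data: slot 1 finished its episode on this step, slot 0 did not.
next_obs = np.zeros((2, 3))
infos = [{}, {"terminal_observation": np.ones(3)}]
fixed = bootstrap_observations(next_obs, infos)
```

The returned array can then be fed to the value function when computing bootstrap targets, while `next_obs` is left untouched for acting in the new episode.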