Refactor mujoco envs to support dynamic arguments #1304
This looks neat to me! I am a little hesitant about whether the original mujoco envs are reproduced exactly. Could you add some unit tests to that effect? For instance, instead of replacing the old mujoco envs, you could create -v3 versions of them and add unit tests that compare v2 and v3. If all of them pass, we can create a separate PR removing the non-configurable versions.
d78f1c6 to 671d82a
@pzhokhov thanks for the comments. I moved the new environments under v3 and added test cases to make sure that each new environment matches the corresponding old one.
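A v2-versus-v3 comparison of the kind discussed above could look roughly like the following. This is only a sketch under stated assumptions: `ToyEnvV2`, `ToyEnvV3`, and `rollouts_match` are hypothetical stand-ins written for illustration, not the tests added in this PR, and they avoid MuJoCo entirely so the snippet runs anywhere.

```python
import math


class ToyEnvV2:
    """Hypothetical 'old' env with hard-coded constants (not gym code)."""

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, action):
        self.x += 0.05 * action
        reward = action - 0.1 * action ** 2
        done = abs(self.x) > 1.0
        return self.x, reward, done, {}


class ToyEnvV3:
    """Hypothetical 'new' env: same dynamics, constants exposed as init args."""

    def __init__(self, ctrl_cost_weight=0.1, terminate_when_unhealthy=True):
        self._ctrl_cost_weight = ctrl_cost_weight
        self._terminate_when_unhealthy = terminate_when_unhealthy

    def reset(self):
        self.x = 0.0
        return self.x

    def step(self, action):
        self.x += 0.05 * action
        reward = action - self._ctrl_cost_weight * action ** 2
        done = self._terminate_when_unhealthy and abs(self.x) > 1.0
        return self.x, reward, done, {}


def rollouts_match(old_env, new_env, actions, tol=1e-12):
    """Feed the same actions to both envs and compare every transition."""
    if not math.isclose(old_env.reset(), new_env.reset(), abs_tol=tol):
        return False
    for action in actions:
        old_obs, old_reward, old_done, _ = old_env.step(action)
        new_obs, new_reward, new_done, _ = new_env.step(action)
        if old_done != new_done:
            return False
        if not math.isclose(old_obs, new_obs, abs_tol=tol):
            return False
        if not math.isclose(old_reward, new_reward, abs_tol=tol):
            return False
        if old_done:
            break
    return True
```

With default arguments the two toy envs should agree step for step, while changing an init argument such as `ctrl_cost_weight` should make the comparison fail. For real MuJoCo envs, the simulator state and any random seeds would also need to be aligned before comparing.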
@hartikainen there were some issues with the tests requiring mujoco; I have added the respective skip modifiers and made a new PR: #1328. Unfortunately, I squash-merged your branch instead of merging it, which removed the commit history. Feel free to either update your PR with the changes (the only changes are in
671d82a to f84c1f3
@pzhokhov I merged your changes into this branch and at the same time rebased this one from master. Let me know if there's anything else that needs to change.
Awesome, thanks! Merging.
Nice, thanks @pzhokhov!
* Refactor gym envs to support dynamic arguments
* Fix viewer setup lookat configuration
* Add xml_file argument for mujoco envs
* Move refactored mujoco envs to their own _v3.py files
* Revert "Add xml_file argument for mujoco envs" This reverts commit 4a3a74c.
* Revert "Fix viewer setup lookat configuration" This reverts commit 62b4bcf.
* Revert "Refactor gym envs to support dynamic arguments" This reverts commit b2a439f.
* Fix v3 SwimmerEnv info
* Regiter v3 mujoco environments
* Implement v2 to v3 conversion test
* Add extra step info the v3 environments
* polish the new unit tests a little bit
I'm glad to see that gym finally supports dynamic parameterization of the environments (#1301).
We've been using custom parameterizable versions of the basic mujoco environments in our softlearning codebase for a while now, and I thought I would share those environments here as well. The implementations are imo a bit more readable and they support parameterization of the environments via init args. For example, certain environments can be made to not terminate and all the rewards and costs can be tuned via init arguments. With the default parameters, these environments are exact copies of the original environments (I've verified the step and reset functions by testing them against the old implementations).
Currently, the only refactored environments are Ant, Hopper, HalfCheetah, Humanoid, Swimmer, and Walker2d. I'm happy to refactor the rest of the mujoco environments if these seem useful.
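To illustrate what parameterization via init args means here, below is a minimal sketch. It assumes a toy 1-D environment instead of a MuJoCo model; the class `ToyRunnerEnv` and the argument names (`forward_reward_weight`, `ctrl_cost_weight`, `healthy_reward`, `terminate_when_unhealthy`, `healthy_x_range`) are chosen for illustration and are not taken from the code in this PR.

```python
class ToyRunnerEnv:
    """Hypothetical 1-D stand-in for a locomotion env (not gym code).

    Reward weights and the termination rule are plain constructor
    arguments, so variants need no subclassing or source edits.
    """

    def __init__(self,
                 forward_reward_weight=1.0,
                 ctrl_cost_weight=0.1,
                 healthy_reward=1.0,
                 terminate_when_unhealthy=True,
                 healthy_x_range=(-1.0, 1.0),
                 dt=0.05):
        self._forward_reward_weight = forward_reward_weight
        self._ctrl_cost_weight = ctrl_cost_weight
        self._healthy_reward = healthy_reward
        self._terminate_when_unhealthy = terminate_when_unhealthy
        self._healthy_x_range = healthy_x_range
        self._dt = dt
        self.reset()

    @property
    def is_healthy(self):
        low, high = self._healthy_x_range
        return low <= self._x <= high

    def reset(self):
        self._x = 0.0
        return self._x

    def step(self, action):
        x_before = self._x
        self._x += action * self._dt
        x_velocity = (self._x - x_before) / self._dt

        # Each reward term is scaled by its own init argument.
        forward_reward = self._forward_reward_weight * x_velocity
        ctrl_cost = self._ctrl_cost_weight * action ** 2
        healthy_reward = self._healthy_reward if self.is_healthy else 0.0
        reward = forward_reward + healthy_reward - ctrl_cost

        # Termination can be switched off entirely via an init argument.
        done = self._terminate_when_unhealthy and not self.is_healthy
        info = {
            "forward_reward": forward_reward,
            "ctrl_cost": ctrl_cost,
            "x_position": self._x,
        }
        return self._x, reward, done, info
```

For example, `ToyRunnerEnv(terminate_when_unhealthy=False)` would give a non-terminating variant, and `ToyRunnerEnv(ctrl_cost_weight=0.0)` would drop the control penalty, while `ToyRunnerEnv()` keeps the default behavior.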