-
Notifications
You must be signed in to change notification settings - Fork 8.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change mujoco_py bindings for mujoco Deepmind bindings #2762
Conversation
@@ -14,19 +14,25 @@ | |||
"accept-rom-license": ["autorom[accept-rom-license]~=0.4.2"], | |||
"box2d": ["box2d-py==2.3.5", "pygame==2.1.0"], | |||
"classic_control": ["pygame==2.1.0"], | |||
"mujoco": ["mujoco_py>=1.50, <2.0"], | |||
"mujoco_py": ["mujoco_py<2.2,>=2.1"], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have mild concerns around gym
declaring a dependency on both mujoco
and mujoco-py
. In principle, this isn't a problem as long as only one gets imported, but if someone tries to import both it's likely that they'll run into inscrutable errors. Would you consider declaring mujoco-py
as an optional dependency (extras_require
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ignore me: this is already in extras_require
.
if self._mujoco_bindings.__name__ != "mujoco_py": | ||
self._mujoco_bindings.mj_rnePostConstraint(self.model, self.data) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not commented out as in ab38034, which causes regression in Ant-v4. See https://wandb.ai/costa-huang/cleanRL/reports/-4-19-MuJoCo-v4-vs-v2-CleanRL-s-PPO--VmlldzoxODYzODM0, https://github.com/vwxyzjn/validate-new-gym-mujoco-envs
Our previous benchmark with ab38034 fixes the performance in Ant-v4 as shown below, but I don’t know how it affects performance in other envs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Introducing bugs on purpose to avoid regression of performance makes no sense to me. It seems way more appropriate to just remove the contact forces from the observation space and reward, instead of carrying out computations pretending it is doing what it is supposed to do. It is just utterly confusing for actual roboticist willing to use mujoco with gym.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with @duburcqa
Doesn't this change make the benchmark compatible with MuJoCo < 2.0? Or do all those three (mujoco-py with MuJoCo < 2.0, mujoco-py with MuJoCo >= 2.0, and mujoco) produce different results?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Theoretically, for this change in particular it is the same as mujoco < 2.0, but there are probably other internal changes in mujoco binary so I doubt the result would be exactly the same as before.
gym/envs/mujoco/mujoco_rendering.py
Outdated
else: | ||
for name, import_func in _ALL_RENDERERS.items(): | ||
try: | ||
self.opengl_context = _ALL_RENDERERS["osmesa"](width, height) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this use name
instead of osmesa
?
Or
self.opengl_context = _ALL_RENDERERS["osmesa"](width, height) | |
self.opengl_context = import_func(width, height) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right. Thank you for pointing this out.
gym/envs/mujoco/walker2d_v4.py
Outdated
@@ -0,0 +1,278 @@ | |||
from turtle import distance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is an unused import for walker2d.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove this. Thanks!
bc2d9aa
to
ff8e3fb
Compare
ff8e3fb
to
cb7e1a3
Compare
Thanks for all the work on this! Will this be in gym 0.24.0 and is there an ETA for that release? |
* update new mujoco bindings * optional ctc_force ant-v4 * force changes * contact force weight * add ctc force range * mujoco v3 skip test * doc Ant-v4 * inverted pendulum limit control
cb7e1a3
to
9c97414
Compare
:) @jkterry1 can probably address this question better. |
This PR is ready to be merged. Last additional change:
|
@rfernand2 Yes, there is one other PR that needs approval and merging then after this PR is merged then v0.24.0 will be released. |
I know that people spent a great deal of effort validating the v4 envs and this may well be a bit late now, but please consider pinning to |
@saran-t Has there been any significant changes since |
@pseudo-rnd-thoughts No. I ran the alignment test and it seems that |
Description
This PR is the continuation of #2595. The PR updates the python bindings for the mujoco environments. The new v4 versions of the mujoco environments now use the new mujoco python bindings from DeepMind https://pypi.org/project/mujoco/.
Fixes # (issue)
This PR also fixes the contact force issue in the Ant env (only v4) at #1541
Type of change
Please delete options that are not relevant.
v2-v4 benchmark by @vwxyzjn
@vwxyzjn provided benchmarks between old mujoco environments (v2) and the newer (v4). Results are provided here #2595 (comment)
Checklist:
pre-commit
checks withpre-commit run --all-files
(seeCONTRIBUTING.md
instructions to set it up)