Mujoco 1.0 #733

adamlerer · 2017-09-27T15:33:00Z

Some people have been asking for mujoco-1.0 (1.5.0.x) integration with gym (e.g. for headless rendering). This got it working for us.

… with boxer.

rhaps0dy · 2017-10-16T17:57:45Z

Thank you, very useful! I'm using your code with the latest MuJoCo.

machinaut · 2017-10-21T17:30:03Z

gym/envs/mujoco/mujoco_env.py

-        high = bounds[:, 1]
-        self.action_space = spaces.Box(low, high)
+        bounds = self.model.actuator_ctrlrange
+        # I'm not sure why bounds is at least sometimes None ... bug?


@adamlerer / @yayitsamyzhang could you share examples of when this happens? Possibly a bug.

adamlerer · 2017-10-28

Unfortunately I've lost track of when this happened. You could try setting it to None and then run an environment and see what happens!

machinaut · 2017-10-28

It's constructed by generated code (here in the most recent version: https://github.com/openai/mujoco-py/blob/master/mujoco_py/generated/wrappers.pxi#L1285).

Wrapper is defined here: https://github.com/openai/mujoco-py/blob/master/mujoco_py/generated/wrappers.pxi#L3824

So it looks like we set it to None instead of a numpy array with 0 as one of the dimensions (which would just be empty anyways).

This would be true for models without any actuators (so all the ctrl related arrays are size 0).

I'm alright with this handling -- maybe update the comment to say that it's None when there are no actuators.

If you wanted to, you could check sim.model.nu == 0, which is the same thing (no actuators in the model).
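A minimal sketch of the handling discussed above, assuming `ctrlrange` stands in for `model.actuator_ctrlrange` and `nu` for `model.nu`. The helper name `action_bounds` is hypothetical and not part of gym or mujoco-py; it uses plain lists so it runs without any dependencies.

```python
def action_bounds(ctrlrange, nu):
    """Split per-actuator (low, high) pairs into two lists.

    ctrlrange: sequence of (low, high) pairs, or None when the model
               has no actuators (assumed stand-in for actuator_ctrlrange).
    nu:        number of actuators (assumed stand-in for model.nu).
    """
    if nu == 0 or ctrlrange is None:
        # No actuators: return empty bounds instead of indexing into None.
        return [], []
    if len(ctrlrange) != nu:
        raise ValueError("expected %d control ranges, got %d" % (nu, len(ctrlrange)))
    low = [float(lo) for lo, _ in ctrlrange]
    high = [float(hi) for _, hi in ctrlrange]
    return low, high


# A model without actuators yields an empty action space.
print(action_bounds(None, 0))
# A model with two actuators yields one bound per actuator.
print(action_bounds([(-1.0, 1.0), (-0.4, 0.4)], 2))
```

Checking `nu == 0` first makes the "no actuators" case explicit, which is what the updated comment in the diff would document.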

machinaut

This looks great! I'm looking forward to trying it out.

Given that this changes the underlying simulator, the environment version suffixes (Humanoid-v1, etc.) need to be bumped to show that the environment has changed.

If any of y'all are able to do it, it'd be good to try training policies for all of the old environments, and verify they work well on the new environments and vice versa.
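For the version bump mentioned above, here is a small sketch of deriving the next environment ID from the current one. It assumes IDs follow the `Name-vN` pattern seen in `Humanoid-v1`; the helper `bump_env_version` is hypothetical and is not a gym function.

```python
import re


def bump_env_version(env_id):
    """Return env_id with its trailing -vN suffix incremented by one."""
    match = re.fullmatch(r"(?P<name>.+)-v(?P<version>\d+)", env_id)
    if match is None:
        raise ValueError("not a versioned environment id: %r" % env_id)
    next_version = int(match.group("version")) + 1
    return "{}-v{}".format(match.group("name"), next_version)


print(bump_env_version("Humanoid-v1"))
```

Registering the bumped ID alongside the old one would let policies trained on either simulator version be compared under distinct names.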

machinaut · 2017-10-21T17:31:50Z

gym/envs/mujoco/mujoco_env.py

-        self.model.forward()
+        sim_state = self.sim.get_state()
+        sim_state.qpos[:] = qpos
+        sim_state.qvel[:] = qvel


I'm pretty sure what we want to do here is:

        self.data.qpos[:] = qpos
        self.data.qvel[:] = qvel

The state is a separate thing, and if you set those parameters on it they'll just get thrown away.
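The point about the state being a separate object can be illustrated with a toy model. `ToySim` below is a hypothetical stand-in, not mujoco-py; it only assumes, as the comment above says, that `get_state()` hands back a snapshot rather than a view of the live data.

```python
class ToySim:
    """Hypothetical simulator whose get_state() returns a copy."""

    def __init__(self, qpos, qvel):
        self._qpos = list(qpos)
        self._qvel = list(qvel)

    def get_state(self):
        # Copies: edits to the returned lists do not reach the live state.
        return {"qpos": list(self._qpos), "qvel": list(self._qvel)}

    def set_state(self, state):
        # Writing the snapshot back is what actually updates the simulator.
        self._qpos = list(state["qpos"])
        self._qvel = list(state["qvel"])


sim = ToySim([0.0, 0.0], [0.0, 0.0])

snapshot = sim.get_state()
snapshot["qpos"][:] = [1.0, 2.0]      # modifies the snapshot only
unchanged = sim.get_state()["qpos"]   # live state has not moved

sim.set_state(snapshot)               # explicit write-back
updated = sim.get_state()["qpos"]

print(unchanged, updated)
```

Under that assumption, either writing directly into the live arrays (as suggested above) or writing the modified snapshot back would take effect, while editing the snapshot alone would not.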

matthiasplappert · 2018-01-24T19:27:39Z

This is addressed in #834.

Adam Lerer and others added 6 commits September 27, 2017 00:11

mujoco-1.0 integration. Warning: some stuff is hacky, and only tested with boxer.

4b64334

Fix some outstanding problems for mujoco-1.0

b0e91cc

mujoco working on GPU

c3a31cc

Set resolution

e33703d

fixed viewer for mujoco-1.0

b5f34c4

add render flag for headless

c48018a

machinaut self-requested a review October 21, 2017 17:26

machinaut reviewed Oct 21, 2017


matthiasplappert closed this Jan 24, 2018
