
[Proposal] Mujoco-v5 #91

Closed
1 task done
Kallinteris-Andreas opened this issue Jan 9, 2023 · 22 comments

@Kallinteris-Andreas
Collaborator

Kallinteris-Andreas commented Jan 9, 2023

Proposal 0

After Gymnasium/envs/mujoco gets moved here, create a new revision (v5) that fixes the constant-zero observations in the following domains:

Humanoid (+standup): Farama-Foundation/Gymnasium#204, Farama-Foundation/Gymnasium#504
Ant: Farama-Foundation/Gymnasium#204, Farama-Foundation/Gymnasium#214
Reacher: Farama-Foundation/Gymnasium#220
InvertedDoublePendulum: Farama-Foundation/Gymnasium#228, Farama-Foundation/Gymnasium#500
InvertedPendulum: Farama-Foundation/Gymnasium#500

Note: I have written a quick test that fails only in the above environments:

import gymnasium as gym
import pytest

_MUJOCO_GYM_ENVIRONMENTS = [  # Note: this could be made dynamic, I was just too lazy to do it
    "Ant-v4",
    "HalfCheetah-v4",
    "Hopper-v4",
    "HumanoidStandup-v4",
    "Humanoid-v4",
    "Reacher-v4",
    "Swimmer-v4",
    "Pusher-v4",
    "Walker2d-v4",
    "InvertedPendulum-v4",
    "InvertedDoublePendulum-v4",
]

@pytest.mark.parametrize("env_name", _MUJOCO_GYM_ENVIRONMENTS)
def test_zeros_in_observation(env_name: str):
    """Asserts that all environments contain valid (non-zero) observations.

    This test was created because some MuJoCo environments pre version 5 had
    parts of the observation space that were always zero.
    """
    if env_name == "Ant-v4":
        # contact forces are only part of the observation when enabled
        env = gym.make(env_name, use_contact_forces=True)
    else:
        env = gym.make(env_name)
    env.action_space.seed(0)
    env.reset(seed=0)
    observation, _, _, _, _ = env.step(env.action_space.sample())
    assert (observation != 0).all()

Proposal 1

Register some new environments in the suite from MaMuJoCo, like the coupled half cheetah: https://github.com/Kallinteris-Andreas/Gymnasium-Robotics-Kalli/blob/main/gymnasium_robotics/envs/multiagent_mujoco/coupled_half_cheetah.py#L27
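
A minimal sketch of what registering such an environment could look like, assuming a hypothetical CoupledHalfCheetahEnv class and module path (the id, entry_point, and episode settings here are illustrative, not a final API):

import gymnasium as gym

# Hypothetical registration; the id and entry_point are placeholders.
gym.register(
    id="CoupledHalfCheetah-v5",
    entry_point="gymnasium.envs.mujoco.coupled_half_cheetah:CoupledHalfCheetahEnv",
    max_episode_steps=1000,
)

env = gym.make("CoupledHalfCheetah-v5")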

Proposal 2

Update the Hopper and Walker2d models so they no longer require coordinate="global": google-deepmind/mujoco#833

And fix the Walker2d feet: Farama-Foundation/Gymnasium#477
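
For reference, MuJoCo can re-serialize a loaded model itself, and the saved XML is always written in local coordinates. A minimal sketch with the official Python bindings (assuming the hopper.xml path; note that, as the rest of this thread shows, such auto-generated models are not guaranteed to behave identically):

import mujoco

# Load the old model (declared with coordinate="global"); the compiler
# resolves all frames, and the re-saved XML uses local coordinates.
model = mujoco.MjModel.from_xml_path("hopper.xml")
mujoco.mj_saveLastXML("hopper_local.xml", model)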

Checklist

  • I have checked that there is no similar issue in the repo (required)
@rodrigodelazcano
Member

Can you elaborate more on Proposal 1 and how it would work? I don't know if you mean to register MaMuJoCo environments with Gymnasium IDs.

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented Jan 13, 2023

I mean adding them to the suite of available environments.

@pseudo-rnd-thoughts
Member

I'm unconvinced by the test showing this to be an issue.
It is quite possible for a couple of the observation elements to be zero for only the first N time steps.
Could you upgrade the test to show that, from the reset obs and for the first (100) time steps, the same elements keep a fixed value (it doesn't have to be zero)?

If we make this change, we will need to train agents to ensure there is no performance regression as a result of the change.

It is a shame we can't do v4.1 as the environment number
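
A possible upgraded check along those lines (a sketch only; the helper name and the 100-step horizon are illustrative):

import gymnasium as gym
import numpy as np

def constant_observation_mask(env_name: str, num_steps: int = 100) -> np.ndarray:
    """Boolean mask of observation elements that never leave their reset value."""
    env = gym.make(env_name)
    env.action_space.seed(0)
    obs, _ = env.reset(seed=0)
    reference = obs.copy()
    mask = np.ones_like(reference, dtype=bool)
    for _ in range(num_steps):
        obs, _, terminated, truncated, _ = env.step(env.action_space.sample())
        mask &= obs == reference
        if terminated or truncated:
            env.reset()
    return mask

# A healthy environment should have no permanently stuck elements:
# assert not constant_observation_mask("Ant-v4").any()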

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented Feb 20, 2023

No, the test is indeed not good enough to prove the point (and it does fail after the fix to the Humanoid).

The analysis done in the linked issues proves it: I have run tests for millions of steps, with those elements staying zero (it is in the other issues).

@pseudo-rnd-thoughts
Member

Could this include newly compiled versions of all of the models, so they have the same format?

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented May 2, 2023

Could this include newly compiled versions of all of the models, so they have the same format?

I would rather not; I am not sure I can assert identical behavior, because they add autolimits="true"
(and I see no benefit in it other than aesthetics).

@Kallinteris-Andreas
Collaborator Author

@pseudo-rnd-thoughts Here is why I do not trust mujoco/compile:

Figure_1 (Hopper training curves)
v4 is the existing Hopper-v4 model (blue)
v5 is my hand-ported XML model (orange)
v5_gen is the auto-generated XML model from mujoco/compile (green)

Notice that for the first 20k steps v4 and v5 overlap (closely enough that the difference cannot be seen in the graph) and then floating-point errors slowly accumulate,
while v5_gen behaves differently from the beginning.

@pseudo-rnd-thoughts
Member

@pseudo-rnd-thoughts Here is why I do not trust mujoco/compile:

Figure_1 (Hopper training curves)

Very interesting, thanks for doing that.
DeepMind never claimed that compile produces identical models, but this is very different, particularly in training; we should check whether this is expected.
One response could be that we have only run this once, and over several runs the differences might average out.

@saran-t

saran-t commented May 5, 2023

@pseudo-rnd-thoughts Here is why I do not trust mujoco/compile:

Figure_1 (Hopper training curves) v4 is the existing Hopper-v4 model (blue) v5 is my hand-ported XML model (orange) v5_gen is the auto-generated XML model from mujoco/compile (green)

Notice that for the first 20k steps v4 and v5 overlap (closely enough that the difference cannot be seen in the graph) and then floating-point errors slowly accumulate, while v5_gen behaves differently from the beginning.

Can you please provide us with the complete model for each of the three cases?

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented May 6, 2023

Here is a test with 10 runs of 200k steps each:
Figure_1 (Hopper training curves, 10 runs)
v4 is the existing Hopper-v4 model (blue)
v5 is my hand-ported XML model (orange)
v5_gen is the auto-generated XML model from mujoco/compile (green)

Shaded areas show mins and maxes.

Notice that for the first 20k steps v4 and v5 overlap (closely enough that the difference cannot be seen in the graph) and then floating-point errors slowly accumulate while staying in the same ballpark,
whereas v5_gen behaves differently from the beginning.

@saran-t

@saran-t

saran-t commented May 6, 2023

Please try the auto-converted XML in https://colab.research.google.com/drive/1slY_8RlzzRffDQhLt3uqazyD_OU9pd2r#scrollTo=lH94cPjK4SKZ and let me know how that works.

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented May 6, 2023

@saran-t
This is identical to:
https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml
(other than the numbers being printed in full precision)

You can use a service like https://www.diffchecker.com/text-compare/ to compare them.
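
Locally, the same comparison can be done with the Python standard library (the filenames here are placeholders for the two XMLs being compared):

import difflib

with open("hopper_new_gen.xml") as a, open("hopper_colab_gen.xml") as b:
    diff = difflib.unified_diff(
        a.readlines(), b.readlines(),
        fromfile="hopper_new_gen.xml", tofile="hopper_colab_gen.xml",
    )
print("".join(diff))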

@saran-t

saran-t commented May 6, 2023

(other than the numbers being printed in full precision)

That is literally the whole point? The only difference that you're seeing is just the tail end of the precision. You can see at the bottom of the Colab that the auto-converted XMLs have bit-identical results.

To be clear, what I'm saying is you should just replace your existing XML with https://colab.research.google.com/drive/1slY_8RlzzRffDQhLt3uqazyD_OU9pd2r#scrollTo=s4DDElgo4_d6 (which is the absolute minimal diff that results in absolutely no change whatsoever under MuJoCo 2.3.3).

@Kallinteris-Andreas
Collaborator Author

@saran-t OK, I will test it.

I am under the impression that the only change was printing the numbers at full precision:

>>> import numpy
>>> numpy.array(-0.19999999999999996, dtype=numpy.float32) == numpy.array(-0.2, dtype=numpy.float32)
True

I will test your manually made XML.

@saran-t

saran-t commented May 6, 2023

Your problem there is float32. At float64 those two numbers are different (https://evanw.github.io/float-toy/).
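
For example, repeating the comparison above at float64:

>>> import numpy
>>> numpy.array(-0.19999999999999996, dtype=numpy.float64) == numpy.array(-0.2, dtype=numpy.float64)
False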

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented May 7, 2023

Figure_1 (Hopper training curves, four models)
v4 is the existing Hopper-v4 model (blue): https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper.xml
v5 is my hand-ported XML model (orange): https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new.xml
v5_gen is the auto-generated XML model from mujoco/compile (green): https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_new_gen.xml
@saran-t's manually transcribed model (red): https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/hopper_saran_t_trans.xml

Notice that for the first 20k steps v4 and v5 overlap (closely enough that the difference cannot be seen in the graph) and then floating-point errors slowly accumulate while staying in the same ballpark,
whereas v5_gen and @saran-t's manually transcribed model behave differently from the beginning.

@saran-t

saran-t commented May 7, 2023

Not really sure how that can be the case, since I've verified that the model is literally identical (bit-perfect) to the v4 XML...

@Kallinteris-Andreas
Collaborator Author

I am double-checking, this weirds me out too.

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented May 8, 2023

I re-ran the benchmark (I had PyTorch-related issues):
Figure_1 (Hopper training curves, re-run)

(The other models behave differently; I am not showing them.)
As you can see, @saran-t's model is identical to hopper-v4.xml.
On top of that, I ran a test of 1000 episodes with 4000 simulation steps per episode (with random actions), and the behavior is 100% identical between the two models (https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/test_hop.py); a sketch in that spirit is after the questions below.

@saran-t

  1. Can you please do the same with walker2d.xml https://github.com/Farama-Foundation/Gymnasium/blob/main/gymnasium/envs/mujoco/assets/walker2d.xml? Never mind, I did it myself: https://github.com/Kallinteris-Andreas/gym-mjc-v5-model-validation/blob/main/walker2d_ported.xml
  2. In your expert opinion, does the model you made ALWAYS have identical behavior to the previous model on mujoco==2.3.3?
  3. In your expert opinion, does the model you made ALWAYS have identical behavior to the previous model on 2.1.3<=mujoco<=2.3.3?
  4. In your expert opinion, does the model you made ALWAYS have identical behavior to the previous model on mujoco-py>=1.5?

Questions 2-4 are there to determine whether we can change the model for environment versions v2, v3, and v4.
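
For reference, a minimal sketch of such an equivalence check with the official mujoco Python bindings (in the spirit of the linked test_hop.py; the control range and step count are illustrative, and both XMLs are assumed to have the same actuator count):

import mujoco
import numpy as np

def identical_trajectories(xml_a: str, xml_b: str, num_steps: int = 4000, seed: int = 0) -> bool:
    """Step two models with the same random controls and compare states exactly."""
    model_a = mujoco.MjModel.from_xml_path(xml_a)
    model_b = mujoco.MjModel.from_xml_path(xml_b)
    data_a, data_b = mujoco.MjData(model_a), mujoco.MjData(model_b)
    rng = np.random.default_rng(seed)
    for _ in range(num_steps):
        ctrl = rng.uniform(-1.0, 1.0, size=model_a.nu)
        data_a.ctrl[:] = ctrl
        data_b.ctrl[:] = ctrl
        mujoco.mj_step(model_a, data_a)
        mujoco.mj_step(model_b, data_b)
        if not (np.array_equal(data_a.qpos, data_b.qpos)
                and np.array_equal(data_a.qvel, data_b.qvel)):
            return False
    return True

# e.g. identical_trajectories("hopper.xml", "hopper_saran_t_trans.xml")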

@pseudo-rnd-thoughts
Member

When this is done, I think we should write a blog post, as this will be a better way of publishing the changes, and for users to cite them, than a PR.
Additionally, it would be easier to include all of the results there.
@Kallinteris-Andreas Would you be interested in this?

@Kallinteris-Andreas
Collaborator Author

Kallinteris-Andreas commented May 15, 2023

Sure, where would the blog post be published? (farama.org?)

In the meantime, the active changelog is here: #104
