
Add MaMuJoCo (Multi-agent mujoco) Environments #53

Merged · 53 commits merged into Farama-Foundation:main on Jan 9, 2023

Conversation

@Kallinteris-Andreas (Collaborator) commented Nov 22, 2022

MaMuJoCo was first introduced in "FACMAC: Factored Multi-Agent Centralised Policy Gradients"

I consider this version of the code to be:

  • almost feature complete
  • bug free (at least, I have written extensive tests)
  • documentation: I have written a fair amount (more is needed), but I am still not sure how the docs should be structured, e.g. do we need 1 page per task, or 1 page for the Gymnasium/MuJoCo tasks and 1 per new task?

Requirements: (I have not added them to setup.py, because it is not obvious to me how they should be packaged; should it work with just pip install gymnasium-robotics[MaMuJoCo], for example?)
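For illustration, a minimal sketch of what that optional extra could look like in setup.py (the extra's name and the pinned dependencies are assumptions, not a decided packaging):

# Hypothetical setup.py fragment: expose the MaMuJoCo dependencies as an
# optional extra so that `pip install gymnasium-robotics[mamujoco]` works.
from setuptools import setup

setup(
    name="gymnasium-robotics",
    # ... the existing metadata ...
    extras_require={
        "mamujoco": ["pettingzoo>=1.22.0", "jinja2>=3.0.0"],
    },
)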

demo (feel free to try other scenarios/agent_configurations)

import numpy
from gymnasium_robotics import mamujoco_v0

if __name__ == "__main__":
    env = mamujoco_v0.parallel_env(scenario='Ant', agent_conf='2x4', agent_obsk=0, render_mode=None)
    # env = mamujoco_v0.parallel_env(scenario='Humanoid', agent_conf='9|8', agent_obsk=0, render_mode=None)
    # env = mamujoco_v0.parallel_env(scenario='Reacher', agent_conf='2x1', agent_obsk=1, render_mode=None)
    # env = mamujoco_v0.parallel_env(scenario='coupled_half_cheetah', agent_conf='1p1', agent_obsk=1, render_mode=None)
    # env = mamujoco_v0.parallel_env(scenario='Swimmer', agent_conf='2x1', agent_obsk=0, render_mode='human')
    # env = mamujoco_v0.parallel_env(scenario='manyagent_swimmer', agent_conf='2x1', agent_obsk=0, render_mode='human')
    # env = mamujoco_v0.parallel_env(scenario='coupled_half_cheetah', agent_conf='1p1', agent_obsk=0, render_mode='human')

    n_episodes = 1
    debug_step = 0

    for e in range(n_episodes):
        obs = env.reset()
        terminated = {'agent_0': False}
        truncated = {'agent_0': False}
        episode_reward = 0

        while not terminated['agent_0'] and not truncated['agent_0']:
            state = env.state()

            actions = {}
            for agent_id in env.agents:
                avail_actions = env.action_space(agent_id)
                action = numpy.random.uniform(avail_actions.low[0], avail_actions.high[0], avail_actions.shape[0])
                actions[str(agent_id)] = action

            obs, reward, terminated, truncated, info = env.step(actions)
            print(reward)
            episode_reward += reward['agent_0']

        print("Total reward in episode {} = {}".format(e, episode_reward))
    env.close()

Notes:

  • The environments are fully deterministic
  • Does not include versioning (-v0); this will be added right before it is ready for inclusion in the project
  • Tested only on x64 Linux with py3.7, py3.8, py3.9, py3.10 and py3.11 (I do not have the option to test on macOS & ARM)
  • Documentation is not complete; I need some help with deciding the structure (since there are effectively a lot of domains)
  • Has passed black, isort and flake8 (in pre-commit)
  • Not sure if it belongs in this repo, or whether it would be better as part of PettingZoo (your call)
  • This is my first PR into a 'serious' repo, so please feel free to dish out any criticism

TODO (not by me)

  • add the Apache license to the environments

@pseudo-rnd-thoughts (Member) left a comment

This is a quick review: could you fix the pre-commit? There also seem to be a number of type hint issues.
For the implementation, I believe that in PettingZoo each environment needs a versioned class, as a pettingzoo.make doesn't currently exist. Therefore, I'm uncertain whether the current implementation using MaMujoco(scenario="...") makes sense.
Additionally, could you provide a class structure, as I'm uncertain how the many agent swimmer environment relates to the MaMujoco class.

Review comments (now resolved) on:
  • gymnasium_robotics/envs/multiagent_mujoco/mujoco_multi.py
  • tests/envs/MaMuJoCo/test_MaMuJoCo.py
@Kallinteris-Andreas (Collaborator, Author) commented Dec 15, 2022

  1. pre-commit is mostly fixed

  2. do you want it to be importable with from gymnasium_robotics.environments.multi_agent_mujoco_v0 import MultiAgentMujocoEnv (or something similar)?

  3. the reason that MultiAgentMujocoEnv(scenario=..., agent_conf=..., agent_obsk=..., ...) is this way is that any different combination of these arguments results in a different task (with different observation/action spaces); a small sketch follows at the end of this comment

  4. Class Structure (oh man, buckle up for this one)

There is one class for MaMuJoCo (MultiAgentMujocoEnv); it holds an instance of a single agent MuJoCo environment
and is responsible for handling its factorization.

The single agent MuJoCo environment may be either one of the Gymnasium/MuJoCo environments or one of the three new single agent environment classes:

  • coupled_half_cheetah.py (Yes, this is a single agent class)
  • manyagent_ant.py (Yes, this is a single agent class)
  • manyagent_swimmer.py (Yes, this is a single agent class)

Also, how should I name the agents in self.possible_agents? I currently have them named '0', '1', '2'.
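To illustrate points 3 and 4 (and the agent naming question), a small sketch using the parallel_env API from the demo above; agent_conf='4x2' is assumed here to be another supported Ant partition, and the printed spaces are simply whatever the factorization produces:

from gymnasium_robotics import mamujoco_v0

# Two different (scenario, agent_conf, agent_obsk) combinations yield two
# different tasks, each with its own agent partitioning and per-agent spaces.
env_a = mamujoco_v0.parallel_env(scenario='Ant', agent_conf='2x4', agent_obsk=0, render_mode=None)
env_b = mamujoco_v0.parallel_env(scenario='Ant', agent_conf='4x2', agent_obsk=1, render_mode=None)

for env in (env_a, env_b):
    env.reset()
    print(env.possible_agents)  # the agent names exposed by the factorization
    for agent_id in env.possible_agents:
        print(agent_id, env.observation_space(agent_id), env.action_space(agent_id))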

@Kallinteris-Andreas marked this pull request as draft on December 15, 2022 21:13
@pseudo-rnd-thoughts (Member) left a comment

Thanks for the changes, this is looking good. I don't understand the PettingZoo environment structure so we will need @jjshoots or @WillDudley to review.
In addition, there are several PettingZoo API tests that exist which we could use

Review comments (now resolved) on:
  • gymnasium_robotics/envs/multiagent_mujoco/manyagent_ant.py
  • gymnasium_robotics/envs/multiagent_mujoco/mujoco_multi.py
if self.agent_obsk is None:
    return {self.possible_agents[0]: global_state}

class data_struct:

@pseudo-rnd-thoughts (Member):

I would move this outside the MaMujoco class to be a global class

@Kallinteris-Andreas (Collaborator, Author):

Why? This is a function-specific data struct.

@pseudo-rnd-thoughts (Member) commented Dec 20, 2022:

Ok, looking at the function in more detail. Why does the class exist? Can we not have the class attributes as normal variables?

@Kallinteris-Andreas (Collaborator, Author):

No, we can not pass all 122 arguments normally ;-)

obsk.build_obs is written this way to be extensible: it supports all possible MuJoCo environments (and could potentially support future Brax-based environments with some modifications).
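For context, a generic sketch of the pattern under discussion: bundling the many per-call values into one small, function-specific container instead of threading them all through as separate parameters (the names and fields below are hypothetical; the real obsk.build_obs bundles far more fields):

import numpy as np

class MuJoCoData:
    """Function-specific bundle of whatever fields the current backend exposes."""

    def __init__(self, qpos, qvel, cfrc_ext=None):
        self.qpos = qpos
        self.qvel = qvel
        self.cfrc_ext = cfrc_ext  # optional: not every environment provides it

def build_obs(data: MuJoCoData) -> np.ndarray:
    # Only the fields that exist are concatenated, which keeps the signature
    # stable when new environments (or future Brax backends) add fields.
    parts = [data.qpos.flatten(), data.qvel.flatten()]
    if data.cfrc_ext is not None:
        parts.append(data.cfrc_ext.flatten())
    return np.concatenate(parts)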

Review comments (now resolved) on:
  • gymnasium_robotics/envs/multiagent_mujoco/obsk.py
  • tests/envs/MaMuJoCo/test_MaMuJoCo.py
@Kallinteris-Andreas (Collaborator, Author) commented:

I would like to request a review of the docs; I have currently written the index page for MaMuJoCo and the Ant page.

If you are satisfied with the Ant page, I will make pages for the other environments following the same template:

$ git clone https://github.com/Kallinteris-Andreas/Gymnasium-Robotics-Kalli.git
$ cd Gymnasium-Robotics-Kalli/docs/
$ pip install .. && make dirhtml
$ cd _build/dirhtml/envs/MaMuJoCo
$ firefox index.html  # or use your preferred browser

@pseudo-rnd-thoughts @rodrigodelazcano

@pseudo-rnd-thoughts (Member) commented:

Could you fix the testing so that it passes all CI checks? Then I will make another pass over the code.

@jjshoots (Member) left a comment

From a pettingzoo perspective, LGTM. Seems like it's taking the single agent MJC envs and splitting the robot up into multiple segments as different agents and representing it as a ParallelEnv. From that perspective everything looks sound to me.

@pseudo-rnd-thoughts changed the title from "New MaMuJoCo Enviroments (WIP)" to "Add MaMuJoCo (Multi-agent mujoco) Environments" on Jan 4, 2023
@pseudo-rnd-thoughts (Member) left a comment

This looks like it is very close to being done.

A couple of last comments:

  1. I think you have already answered this, but why do two of the .xml files have a .xml.template extension?
  2. Do we have tests that the new mujoco environments work as standard Gym mujoco envs?
  3. Do we need copyright comments, as listed in the PR todo list? If so, the comment should exist at the top of every copied file, e.g. https://github.com/Farama-Foundation/Gymnasium/blob/5d67eae4fbf699b84a20ec056f98802dd30268f3/gymnasium/utils/env_checker.py#L3 and https://github.com/Farama-Foundation/Gymnasium/blob/5d67eae4fbf699b84a20ec056f98802dd30268f3/gymnasium/utils/env_checker.py#L14

@rodrigodelazcano Do you have any last comments?

@rodrigodelazcano (Member) commented:

It looks great, thank you @Kallinteris-Andreas.

We need to add the license as Mark mentioned. They use Apache 2.0, so you'll have to add the original author in each file and a small description of any changes. Also, add the LICENSE.md to the multiagent_mujoco directory: https://github.com/schroederdewitt/multiagent_mujoco/blob/master/LICENSE.
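For example, a rough sketch of the kind of per-file header meant here (the wording, year and attribution are placeholders, not a legal recommendation):

# Copyright <year> <original multiagent_mujoco authors>
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
# This file is a modified version of the original
# https://github.com/schroederdewitt/multiagent_mujoco code.
# Changes: <short description of the modifications made for Gymnasium-Robotics>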

Another thing I would like to do in a future PR is to get rid of the additional .xml and .template files, so that we don't add the extra jinja dependency. We can do this by importing the original Gymnasium .xml and editing it with ElementTree or dm_control's pymjcf.
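A rough sketch of that idea, assuming the standard MuJoCo swimmer layout (the asset path and the element/attribute values below are assumptions for illustration, and a real version would also have to add matching <motor> actuators):

import xml.etree.ElementTree as ET

def build_many_segment_swimmer(base_xml_path, n_segs, out_path):
    # Load the original single-agent swimmer model shipped with Gymnasium
    # and append extra body segments to the kinematic chain, instead of
    # rendering a jinja template.
    tree = ET.parse(base_xml_path)
    worldbody = tree.getroot().find("worldbody")
    parent = worldbody.find("body")          # the torso / first segment
    while parent.find("body") is not None:   # walk to the last existing segment
        parent = parent.find("body")

    for i in range(n_segs):
        segment = ET.SubElement(parent, "body", name=f"extra_seg_{i}", pos="-1 0 0")
        ET.SubElement(segment, "geom", type="capsule", fromto="0 0 0 -1 0 0", size="0.1")
        ET.SubElement(segment, "joint", name=f"extra_rot_{i}", type="hinge",
                      axis="0 0 1", range="-100 100")
        parent = segment

    tree.write(out_path)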

Regarding 2, @pseudo-rnd-thoughts, I think these tests are already being done by https://github.com/Farama-Foundation/Gymnasium-Robotics/blob/main/tests/test_envs.py

@Kallinteris-Andreas (Collaborator, Author) commented:

@pseudo-rnd-thoughts

  1. The .xml.template files are named that way because they are templates, e.g. many_segment_swimmer.xml.template is a template for swimmer-like environments; it can have any arbitrary number of segments (with n_segs=3 representing Gymnasium/MuJoCo/Swimmer). See the rendering sketch after this list.
  2. No, that can not be tested without them being registered (nor is there a reason to test that if they are not registered).
  3. OK, thanks, I will fix that.
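For reference, a minimal sketch of how such a template might be rendered (the variable name follows the n_segs mentioned above; the output filename is made up):

import jinja2

# Render the .xml.template into a concrete MJCF model; n_segs=3 would
# correspond to the standard Gymnasium/MuJoCo Swimmer.
with open("many_segment_swimmer.xml.template") as f:
    template = jinja2.Template(f.read())

with open("many_segment_swimmer_3segs.xml", "w") as f:
    f.write(template.render(n_segs=3))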

@rodrigodelazcano
I do not think we can get rid of coupled_half_cheetah.xml easily, because it adds a tendon

@rodrigodelazcano (Member) commented:

@Kallinteris-Andreas, I'm suggesting something similar to this:

@rodrigodelazcano (Member) commented:

Also, this is ready to be merged; thank you for hanging in there, @Kallinteris-Andreas. It will be merged after a new release with the D4RL environments is made soon :)

@Kallinteris-Andreas (Collaborator, Author) commented:

I have fixed all the pyright reportGeneralTypeIssues I could; there is a class of them remaining that originates from Gymnasium's own reportGeneralTypeIssues, which I can not fix.

Also, I have added all the required license things.

@rodrigodelazcano (Member) commented:

@Kallinteris-Andreas can you also rebase to the latest commit in the main repo?

@rodrigodelazcano merged commit 9f0add4 into Farama-Foundation:main on Jan 9, 2023
@pseudo-rnd-thoughts (Member) commented:

@Kallinteris-Andreas nice job on this; sorry it took a while. I'm looking forward to your next contributions.
