Decomposing Temporal Equilibrium Strategy for Coordinated Distributed Multi-Agent Reinforcement Learning (AAAI 2024)
The increasing demands for system complexity and robustness have prompted the integration of temporal logic into Multi-Agent Reinforcement Learning (MARL) to address tasks with non-Markovian properties. However, incorporating non-Markovian properties introduces additional computational complexity, as agents are required to integrate historical data into their decision-making process. Moreover, optimizing strategies within a multi-agent environment presents significant challenges due to the exponential growth of the state space with the number of agents. In this study, we introduce an innovative hierarchical MARL framework that synthesizes temporal equilibrium strategies through parity games and subsequently encodes them as individual reward machines for MARL coordination. More specifically, we reduce the strategy synthesis problem to an emptiness problem concerning parity games with optimized states and transitions. Following this synthesis step, the temporal equilibrium strategy is decomposed into individual reward machines for decentralized MARL. Theoretical proofs are provided to verify the consistency of the Nash equilibrium between the parallel composition of the decomposed strategies and the original strategy. Empirical evidence confirms the efficacy of the proposed synthesis technique, showcasing its ability to reduce the state space compared to EVE. Furthermore, our study highlights the superior performance of the distributed MARL paradigm over centralized approaches when deploying decomposed strategies.
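The decomposed strategies are deployed as individual reward machines: finite-state machines over high-level events whose transitions emit rewards. As a rough, illustrative sketch only (the class and the toy task below are our own naming, not MATEA's API), an individual agent's reward machine can be thought of as follows:

```python
# Illustrative sketch of a reward machine: a finite-state machine over
# high-level events that emits a reward on each transition.
# Names and structure are assumptions for exposition, not MATEA's API.

class RewardMachine:
    def __init__(self, initial_state, transitions, terminal_states):
        # transitions: {(rm_state, event): (next_rm_state, reward)}
        self.initial_state = initial_state
        self.transitions = transitions
        self.terminal_states = terminal_states
        self.current = initial_state

    def step(self, event):
        """Advance on a high-level event and return the emitted reward."""
        if (self.current, event) in self.transitions:
            self.current, reward = self.transitions[(self.current, event)]
            return reward
        return 0.0  # event is irrelevant in the current state

    def is_done(self):
        return self.current in self.terminal_states


# Toy non-Markovian task for one agent: reach a checkpoint, then the goal.
rm = RewardMachine(
    initial_state="u0",
    transitions={
        ("u0", "at_checkpoint"): ("u1", 0.0),
        ("u1", "at_goal"): ("u_acc", 1.0),
    },
    terminal_states={"u_acc"},
)
```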
MATEA has been tested on Ubuntu 18.04 and macOS Monterey.
The code has the following requirements:
- Python 3.6 or 3.7
- NumPy
- OpenAI Gym
- OpenAI Baselines
- OPAM (https://opam.ocaml.org/doc/Install.html) + OCaml version 4.03.x or later (https://ocaml.org/docs/install.html).
To install OPAM (along with OCaml):
- Ubuntu
sudo apt-get install m4
sudo wget https://raw.github.com/ocaml/opam/master/shell/opam_installer.sh -O - | sh -s /usr/local/bin
echo "y" | opam init
eval `opam config env`
- Cairo (https://cairographics.org/download/) or from source code (https://cairographics.org/releases/). To install Cairo:
- Ubuntu
sudo apt-get install libcairo2-dev
sudo apt-get install python-cairo
- IGraph version 0.7 (http://igraph.org/python/)
- You need to have a C/C++ compiler installed on your machine. To install IGraph:
- Ubuntu
sudo apt-get install python-igraph
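With the dependencies above installed, a quick sanity check (our own suggestion, not part of MATEA) is to confirm the Python packages import cleanly:

```python
# Quick sanity check that the Python dependencies are importable
# (our own suggestion, not part of MATEA).
import numpy
import gym
import igraph

print("numpy ", numpy.__version__)
print("gym   ", gym.__version__)
print("igraph", igraph.__version__)
```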
From inside the synthesis folder, execute the following command:
$ python main.py [path/name of the file] [options]
List of optional arguments:
- -d : Option to draw the synthesized strategies
- -v : Option to record the performance of the tool
- -s : Option to save the synthesized strategy as well as the decomposed strategies for MARL to use
Examples:
- Generate the synthesized strategy and decomposed strategies for the 2 agent environment, and draw the synthesized strategy and the decomposed strategies:
$ python main.py ../examples/cop_2agent -d
- Record the time taken for the 2 agent environment (i.e., record the performance of the tool):
$ python main.py ../examples/cop_2agent -v
The generated synthesized strategy and decomposed strategies that satisfy the Nash equilibrium can be found in the folder results.
From inside the marl folder, execute the following command:
$ python run.py --alg=<name of the algorithm> --env=<environment_id> [additional options]
Examples:
- Decentralized MARL training with the 2 agent environment:
$ python run.py --alg=maqlearning --env=MACraft-2agentdcent-v0 --num_timesteps=1e7 --gamma=0.9 --log_path=../aaai/2agent/dcent/M1/1 --ma --num_agent=2 --dcent
- Centralized MARL training with the 2 agent environment:
$ python run.py --alg=maqlearning --env=MACraft-2agentcent-v0 --num_timesteps=1e7 --gamma=0.9 --log_path=../aaai/2agent/dcent/M1/1 --ma --num_agent=2
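Conceptually, the decentralized run (--dcent) lets each agent learn over the cross product of its environment state and its own reward machine state, without access to the other agents' machines. The loop below is a minimal sketch of that idea; the environment interface, event labels, and hyper-parameters are illustrative assumptions, not the interfaces used by run.py (each reward machine is assumed to expose an initial_state and a transitions dict, as in the illustrative class near the top of this README).

```python
# Minimal sketch of decentralized Q-learning over (env state, own RM state)
# pairs. All names below (env interface, per-agent event labels,
# hyper-parameters) are illustrative assumptions, not run.py's API.
import random
from collections import defaultdict

def train_decentralized(env, reward_machines, episodes=1000,
                        alpha=0.1, gamma=0.9, epsilon=0.1):
    n = len(reward_machines)                    # number of agents
    Q = [defaultdict(float) for _ in range(n)]  # one Q-table per agent

    for _ in range(episodes):
        states = env.reset()                    # assumed: list of per-agent states
        rm_states = [rm.initial_state for rm in reward_machines]
        done = False
        while not done:
            actions = []
            for i in range(n):
                s = (states[i], rm_states[i])
                if random.random() < epsilon:
                    actions.append(random.randrange(env.n_actions))
                else:
                    actions.append(max(range(env.n_actions),
                                       key=lambda a: Q[i][(s, a)]))
            # assumed: step returns next states, per-agent event labels, done flag
            next_states, events, done = env.step(actions)
            for i in range(n):
                # Each agent advances only its own reward machine and receives
                # the reward emitted by that transition.
                next_rm, reward = reward_machines[i].transitions.get(
                    (rm_states[i], events[i]), (rm_states[i], 0.0))
                s, s2 = (states[i], rm_states[i]), (next_states[i], next_rm)
                best_next = max(Q[i][(s2, a)] for a in range(env.n_actions))
                Q[i][(s, actions[i])] += alpha * (
                    reward + gamma * best_next - Q[i][(s, actions[i])])
                rm_states[i] = next_rm
            states = next_states
    return Q
```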
The scripts used for the paper are inside the folder marl/scripts. From inside that folder, execute the following command:
$ ./[name of the file]
Example:
- To run the script for the 2 agent environment:
$ ./run_2agent.sh
The results generated by MARL with the synthesized strategies and decomposed strategies can be found in the folder marl/results.
Finally, note that we included code that allows you to manually play each environment. Go to the reward_machines folder and run the following command:
$ python play.py --env=<environment_id>
Examples:
- Play one of the 2 agent environments:
$ python play.py --env=MACraft-2agentdcent-v0
where <environment_id> can be found in the folder marl/reward_machines/envs/__init__.py.
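The environment IDs follow the standard OpenAI Gym registration pattern. The snippet below only sketches what such a registration typically looks like; the entry_point and step limit are placeholders, so check marl/reward_machines/envs/__init__.py for the actual values:

```python
# Hypothetical sketch of how an ID such as MACraft-2agentdcent-v0 is
# registered with Gym; entry_point and max_episode_steps are placeholders,
# not the repo's actual values.
from gym.envs.registration import register

register(
    id='MACraft-2agentdcent-v0',
    entry_point='envs.craft:MultiAgentCraftEnv',  # placeholder module path
    max_episode_steps=1000,                       # placeholder limit
)

# Once registered, an environment can be created by its ID:
# import gym
# env = gym.make('MACraft-2agentdcent-v0')
```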
Several files in our implementation adapt code from the following projects:
EVE: https://github.com/eve-mas/eve-parity
Reward Machines: https://github.com/RodrigoToroIcarte/reward_machines
We thank the authors for their work.
If you use the MATEA tool, please cite the following work:
- MATEA [PDF]
@inproceedings{zhu2024decomposing,
title={Decomposing Temporal Equilibrium Strategy for Coordinated Distributed Multi-Agent Reinforcement Learning},
author={Zhu, Chenyang and Si, Wen and Zhu, Jinyu and Jiang, Zhihao},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
volume={38},
number={16},
pages={17618--17627},
year={2024}
}