Official Implementation of UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers (ICLR 2021 spotlight)
The framework is inherited from PyMARL. UPDeT is written in PyTorch and uses SMAC as its environment.
```
pip install -r requirements.txt
bash install_sc2.sh
```
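If the installation succeeded, a quick smoke test along the following lines should run without errors. This is a minimal sketch using SMAC's documented API (`StarCraft2Env`, `get_env_info`, `get_avail_agent_actions`, `step`); it assumes `install_sc2.sh` placed the StarCraft II binary and maps where SMAC expects them.

```python
# Smoke test for the SMAC installation: run one episode of random actions.
import numpy as np
from smac.env import StarCraft2Env

env = StarCraft2Env(map_name="3m")
env_info = env.get_env_info()
print("n_agents:", env_info["n_agents"], "n_actions:", env_info["n_actions"])

env.reset()
terminated = False
while not terminated:
    actions = []
    for agent_id in range(env_info["n_agents"]):
        avail = env.get_avail_agent_actions(agent_id)
        # Pick a random available action for each agent.
        actions.append(np.random.choice(np.nonzero(avail)[0]))
    reward, terminated, info = env.step(actions)
env.close()
```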
Before training your own transformer-based multi-agent model, there are a few things to note:
- Currently, this repository supports marine-based battle scenarios, e.g. `3m`, `8m`, `5m_vs_6m`.
- If you are interested in training a different unit type, carefully modify the `Transformer Parameters` block in `src/config/default.yaml` and revise the `_build_input_transformer` function in `basic_controller.py` (a hypothetical sketch of the idea appears after this list).
- Before running an experiment, check the agent type in the `Agent Parameters` block in `src/config/default.yaml`.
- This repository contains two new transformer-based agents from the UPDeT paper:
  - Standard UPDeT
  - Aggregation Transformer
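For orientation, the general idea behind an input-building function for a transformer-based agent is to split the flat SMAC observation vector into fixed-size per-entity segments (own features, allies, enemies) and stack them into a token sequence that self-attention can consume. The sketch below illustrates this; the segment names, dimensions, and function signature are illustrative assumptions, not the repository's actual `_build_input_transformer`.

```python
import torch
import torch.nn.functional as F

def build_entity_tokens(obs, own_dim, n_allies, ally_dim, n_enemies, enemy_dim):
    """Hypothetical sketch: reshape a flat per-agent observation into one
    token per entity (self, each ally, each enemy), zero-padded to a common
    width so a transformer can attend over them."""
    batch = obs.size(0)
    token_dim = max(own_dim, ally_dim, enemy_dim)

    # Own features: a single token.
    own = obs[:, :own_dim]
    tokens = [F.pad(own, (0, token_dim - own_dim))]

    # Ally features: n_allies tokens of ally_dim each.
    offset = own_dim
    allies = obs[:, offset:offset + n_allies * ally_dim].view(batch, n_allies, ally_dim)
    tokens += [F.pad(allies[:, i], (0, token_dim - ally_dim)) for i in range(n_allies)]

    # Enemy features: n_enemies tokens of enemy_dim each.
    offset += n_allies * ally_dim
    enemies = obs[:, offset:offset + n_enemies * enemy_dim].view(batch, n_enemies, enemy_dim)
    tokens += [F.pad(enemies[:, i], (0, token_dim - enemy_dim)) for i in range(n_enemies)]

    return torch.stack(tokens, dim=1)  # (batch, 1 + n_allies + n_enemies, token_dim)

# Example with made-up dimensions: own=4, 2 allies x 5, 3 enemies x 5.
obs = torch.randn(32, 4 + 2 * 5 + 3 * 5)
tokens = build_entity_tokens(obs, 4, 2, 5, 3, 5)  # -> (32, 6, 5)
```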
```
python3 src/main.py --config=vdn --env-config=sc2 with env_args.map_name=5m_vs_6m
```
All results will be stored in the `Results/` folder.
Surpass the GRU baseline on hard `5m_vs_6m` with:
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
- VDN: Value-Decomposition Networks For Cooperative Multi-Agent Learning
- QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning
Zero-shot generalization to different tasks:
- Result on `7m-5m-3m` transfer learning.
Note: Only UPDeT can be deployed to other scenarios without changing the model's architecture.
For more details, please refer to the UPDeT paper.
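The intuition is that self-attention operates on a variable-length set of entity tokens, so the same weights apply unchanged when the number of units differs between scenarios, whereas a policy with an input layer sized to one fixed observation vector (e.g. a GRU or MLP) must be re-instantiated. A minimal illustration of this property (not the repository's model; layer sizes and entity counts are arbitrary):

```python
import torch
import torch.nn as nn

token_dim, d_model = 5, 32
embed = nn.Linear(token_dim, d_model)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d_model, nhead=4, batch_first=True),
    num_layers=2,
)

# The same transformer weights process scenarios with different entity counts.
for n_entities in (8, 4):
    tokens = torch.randn(1, n_entities, token_dim)
    out = encoder(embed(tokens))        # (1, n_entities, d_model)
    print(out.shape)

# A fixed-size input layer, by contrast, is tied to one scenario:
mlp = nn.Linear(8 * token_dim, d_model)  # sized for 8 entities only
# mlp(torch.randn(1, 4 * token_dim))     # would raise a shape mismatch error
```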
```
@article{hu2021updet,
  title={UPDeT: Universal Multi-agent Reinforcement Learning via Policy Decoupling with Transformers},
  author={Hu, Siyi and Zhu, Fengda and Chang, Xiaojun and Liang, Xiaodan},
  journal={arXiv preprint arXiv:2101.08001},
  year={2021}
}
```
The MIT License