SigmaRL: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning
Note
- Check out our recent work CBF-MARL! It uses a learning-based, less conservative distance metric to categorize safety margins between agents and integrates it into Control Barrier Functions (CBFs) to guarantee safety in MARL.
- Check out our recent work XP-MARL! It augments MARL with learning-based auxiliary prioritization to address non-stationarity.
This repository provides the full code of SigmaRL, a Sample-efficient and generalizable multi-agent Reinforcement Learning (MARL) framework for motion planning of Connected and Automated Vehicles (CAVs).
SigmaRL is a decentralized MARL framework designed for motion planning of CAVs. We use VMAS, a vectorized differentiable simulator designed for efficient MARL benchmarking, as our simulator and customize our own RL environment. The first scenario in Fig. 1 mirrors the real-world conditions of our Cyber-Physical Mobility Lab (CPM Lab). We also support maps handcrafted in JOSM, an open-source editor for OpenStreetMap. Below you will find detailed guidance to create your OWN maps.
(a) CPM scenario.
(b) Intersection scenario.
(c) On-ramp scenario.
(d) "Roundabout" scenario.
Figure 1: Demonstrating the generalization of SigmaRL (speed x2). Only the intersection part of the CPM scenario (the middle part in Fig. 1(a)) is used for training. All other scenarios are completely unseen. See our SigmaRL paper for more details.
Figure 2: We use an auxiliary MARL to learn dynamic priority assignments to address non-stationarity. Higher-priority agents communicate their actions (depicted by the colored lines) to lower-priority agents to stabilize the environment. See our XP-MARL paper for more details.
Figure 3: Demonstrating the safety and reduced conservatism of our MTV-based safety margin. In the overtaking scenario, while the traditional approach fails to overtake due to excessive conservatism (see (a)), ours succeeds (see (b)). Note that in the overtaking scenario, the slow-moving vehicle
Currently, SigmaRL supports Python versions 3.9 and 3.10 and is OS independent (Windows/macOS/Linux). It's recommended to use a virtual environment. For example, if you are using conda:
conda create -n sigmarl python=3.10
conda activate sigmarl
We recommend installing sigmarl from source:
- Clone the repository
git clone https://github.com/bassamlab/SigmaRL.git
cd SigmaRL
pip install -e .
- (Optional) Verify the installation by first launching your Python interpreter in a terminal:
python
Then run the following lines, which should show the version of the installed sigmarl:
import sigmarl
print(sigmarl.__version__)
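Since only Python 3.9 and 3.10 are listed as supported, it can help to check the interpreter version before installing. The snippet below is a minimal sketch of such a check; the helper name `is_supported_python` and the constant `SUPPORTED_PYTHON` are illustrative and not part of sigmarl.

```python
import sys

# Versions named in this README; adjust if the supported range changes.
SUPPORTED_PYTHON = {(3, 9), (3, 10)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) pair is one of the supported versions."""
    if version_info is None:
        version_info = sys.version_info
    return (version_info[0], version_info[1]) in SUPPORTED_PYTHON


if not is_supported_python():
    # Only a warning: the package may still install on other versions.
    print(
        f"Warning: Python {sys.version_info[0]}.{sys.version_info[1]} "
        "is not among the versions listed as supported."
    )
```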
Run main_training.py. During training, all intermediate models that perform better than the saved one are saved automatically. You can also retrain or refine a trained model by setting the parameter is_continue_train in the file sigmarl/config.json to true. The saved model will then be loaded for a new training process.
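The parameter can of course be edited by hand. For scripted runs, the sketch below shows one way to flip it programmatically. It assumes that is_continue_train is a top-level key of the JSON file, which this README does not confirm; the helper `set_json_parameter` is hypothetical and not part of sigmarl.

```python
import json
from pathlib import Path


def set_json_parameter(config_path, key, value):
    """Set one top-level key in a JSON file and write the file back.

    Assumption: the key already exists at the top level of the file.
    A KeyError is raised otherwise, so typos do not silently add new keys.
    """
    path = Path(config_path)
    with path.open("r", encoding="utf-8") as f:
        config = json.load(f)
    if key not in config:
        raise KeyError(f"'{key}' not found in {path}")
    config[key] = value
    with path.open("w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return config


# Example call (path and key taken from the text above):
# set_json_parameter("sigmarl/config.json", "is_continue_train", True)
```

Note that writing the file back with `json.dump` normalizes its indentation, so the diff may touch more lines than the one parameter.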
sigmarl/scenarios/road_traffic.py defines the RL environment, such as the observation function and reward function. It also provides an interactive interface that visualizes the environment. To open the interface, simply run this file. You can use the arrow keys to control agents and the tab key to switch between agents. Adjust the parameter scenario_type to choose a scenario. All available scenarios are listed in the variable SCENARIOS in sigmarl/constants.py. Before training, it is recommended to use the interactive interface to check whether the environment is as expected.
After training, run main_testing.py to test your model. You may need to adjust the parameter path therein so that it points to the folder where the target model was saved.
Note: If the path to a saved model changes, you need to update the value of where_to_save in the corresponding JSON file as well.
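If a model folder has been moved, the update described in the note can be scripted. The following is a sketch under two assumptions that this README does not confirm: the JSON file sits directly inside the model folder, and where_to_save is a top-level key holding the folder path. The helper `update_where_to_save` is hypothetical.

```python
import json
from pathlib import Path


def update_where_to_save(model_dir):
    """Point `where_to_save` at `model_dir` in every JSON file found there.

    Returns the names of the files that were changed. Files without a
    top-level `where_to_save` key are left untouched.
    """
    model_dir = Path(model_dir)
    updated = []
    for json_file in sorted(model_dir.glob("*.json")):
        with json_file.open("r", encoding="utf-8") as f:
            data = json.load(f)
        if isinstance(data, dict) and "where_to_save" in data:
            data["where_to_save"] = str(model_dir)
            with json_file.open("w", encoding="utf-8") as f:
                json.dump(data, f, indent=4)
            updated.append(json_file.name)
    return updated


# Example call with a placeholder path:
# update_where_to_save("path/to/moved/model/folder")
```

Check the stored value after running it, since the expected path format (for example, a trailing slash) may differ from what `str(model_dir)` produces.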
We support maps customized in JOSM, an open-source editor for OpenStreetMap. Follow these steps:
- Install and open JOSM, click the green download button
- Zoom in and find an empty area (as empty as possible)
- Select the area by drawing a rectangle
- Click "Download"
- Now you will see a new window. Make sure there is no element. Otherwise, redo the above steps.
- Customize lanes. Note that all lanes you draw are considered center lines. You do not need to draw left and right boundaries, since they will be determined automatically later by our script with a given width.
- Save the osm file and store it at sigmarl/assets/maps. Give it a name.
- Go to sigmarl/constants.py and create a new dictionary for it. You should at least give the values for the keys map_path, lane_width, and scale.
- Go to sigmarl/parse_osm.py. Adjust the parameter scenario_type and run it.
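To illustrate why drawing center lines is enough, the sketch below derives left and right boundaries from a center line and a lane width by shifting each point half a lane width along the local normal. This is a generic geometric approach written for illustration; it is not taken from sigmarl/parse_osm.py, and the function name `offset_boundaries` is hypothetical.

```python
import math


def offset_boundaries(center_line, lane_width):
    """Return (left, right) boundary polylines for a center-line polyline.

    center_line: list of (x, y) points ordered along the travel direction.
    lane_width:  full lane width; each boundary lies half of it away.
    """
    if len(center_line) < 2:
        raise ValueError("A center line needs at least two points.")
    half = lane_width / 2.0
    last = len(center_line) - 1
    left, right = [], []
    for i, (x, y) in enumerate(center_line):
        # Estimate the local direction from the neighbouring points
        # (one-sided at the two ends, central difference in between).
        x0, y0 = center_line[max(i - 1, 0)]
        x1, y1 = center_line[min(i + 1, last)]
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy)
        if norm == 0.0:
            raise ValueError("Neighbouring points coincide; direction undefined.")
        # Unit normal pointing to the left of the travel direction.
        nx, ny = -dy / norm, dx / norm
        left.append((x + half * nx, y + half * ny))
        right.append((x - half * nx, y - half * ny))
    return left, right


# A straight lane along the x-axis with an illustrative width of 0.3.
left, right = offset_boundaries([(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)], 0.3)
```

A simple point-wise offset like this can self-intersect on tight curves, so a production script would need extra handling there.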
- [2024-11-15] Check out our recent work CBF-MARL! It uses a learning-based, less conservative distance metric to quantify safety margins between agents and integrates it into Control Barrier Functions (CBFs) to guarantee safety in MARL.
- [2024-09-15] Check out our recent work XP-MARL! It augments MARL with learning-based auxiliary prioritization to address non-stationarity.
- [2024-08-14] We support customized maps in OpenStreetMap now (see here)!
- [2024-07-10] Our CPM Scenario is now available as an MARL benchmark scenario in VMAS (see here)!
- [2024-07-10] Our work SigmaRL was accepted by the 27th IEEE International Conference on Intelligent Transportation Systems (IEEE ITSC 2024)!
We would be grateful if you would refer to the papers below if you find this repository helpful.
- BibTeX
  @inproceedings{xu2024sigmarl,
    title={{{SigmaRL}}: A Sample-Efficient and Generalizable Multi-Agent Reinforcement Learning Framework for Motion Planning},
    author={Xu, Jianye and Hu, Pan and Alrifaee, Bassam},
    booktitle={2024 IEEE 27th International Conference on Intelligent Transportation Systems (ITSC), in press},
    year={2024},
    organization={IEEE}
  }
- Reproduce Experimental Results in the Paper:
  - Check out the corresponding tag using git checkout 1.2.0
  - Go to this page and download the zip file itsc24.zip. Unzip it and copy the whole folder to the checkpoints folder at the root of this repository. The structure should look like this: root/checkpoints/itsc24/.
  - Run sigmarl/evaluation_itsc24.py.
  You can also run testing_mappo_cavs.py to intuitively evaluate the trained models. Adjust the parameter path therein to specify the folder where the target model was saved. Note: The evaluation results you get may deviate from the paper since we have meticulously adjusted the performance metrics.
- BibTeX
  @article{xu2024xp,
    title={{{XP-MARL}}: Auxiliary Prioritization in Multi-Agent Reinforcement Learning to Address Non-Stationarity},
    author={Xu, Jianye and Sobhy, Omar and Alrifaee, Bassam},
    journal={arXiv preprint arXiv:2409.11852},
    year={2024},
  }
- Reproduce Experimental Results in the Paper:
  - Check out the corresponding tag using git checkout 1.2.0
  - Go to this page and download the zip file icra25.zip. Unzip it and copy the whole folder to the checkpoints folder at the root of this repository. The structure should look like this: root/checkpoints/icra25/.
  - Run sigmarl/evaluation_icra25.py.
  You can also run testing_mappo_cavs.py to intuitively evaluate the trained models. Adjust the parameter path therein to specify the folder where the target model was saved.
- BibTeX
  @article{xu2024learning,
    title={Learning-Based Control Barrier Function with Provably Safe Guarantees: Reducing Conservatism with Heading-Aware Safety Margin},
    author={Xu, Jianye and Alrifaee, Bassam},
    journal={arXiv preprint arXiv:2411.08999},
    year={2024},
  }
- Reproduce Experimental Results in the Paper:
  - Go to this page and download the zip file ecc25.zip. Unzip it and copy the whole folder to the checkpoints folder at the root of this repository. The structure should look like this: root/checkpoints/ecc25/.
  - Run sigmarl/evaluation_ecc25.py.
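Before running one of the evaluation scripts, it may be worth confirming that the unzipped folders ended up where the steps above expect them. The snippet below is a small sketch of such a check; the folder names come from the steps above, while the helper `missing_checkpoint_folders` is hypothetical and not part of sigmarl.

```python
from pathlib import Path

# Folder names taken from the reproduction steps above.
CHECKPOINT_FOLDERS = ("itsc24", "icra25", "ecc25")


def missing_checkpoint_folders(repo_root, names=CHECKPOINT_FOLDERS):
    """Return the names that are not directories under <repo_root>/checkpoints."""
    checkpoints = Path(repo_root) / "checkpoints"
    return [name for name in names if not (checkpoints / name).is_dir()]


# Example: run from the root of the repository.
missing = missing_checkpoint_folders(".")
if missing:
    print("Missing checkpoint folders:", ", ".join(missing))
```

Only the folders for the paper you want to reproduce need to be present, so a non-empty result is not necessarily an error.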
- Effective observation design
  - Image-based representation of observations
  - Historic observations
  - Attention mechanism
- Improve safety
  - Integrating Control Barrier Functions (CBFs)
    - Proof of concept with two agents (see the CBF-MARL paper here)
  - Integrating Model Predictive Control (MPC)
- Address non-stationarity
  - Integrating prioritization (see the XP-MARL paper here)
- Misc
  - OpenStreetMap support (see guidance here)
  - Contribute our CPM scenario as an MARL benchmark scenario in VMAS (see news here)
  - Update to the latest versions of Torch, TorchRL, and VMAS
  - Support Python 3.11+
This research was supported by the Bundesministerium für Digitales und Verkehr (German Federal Ministry for Digital and Transport) within the project "Harmonizing Mobility" (grant number 19FS2035A).