Skip to content

A MARL PPO implementation with tf-agents, configured for the MultiCarRacing-v0 Gym environment.

Notifications You must be signed in to change notification settings

rmsander/marl_ppo

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

13 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Multi-Agent Proximal Policy Optimization with TF-Agents

Self Play Multi Agent Car Pushing Self Play Multi Agent Car Pushing Self Play Multi Agent Car Pushing

This repository contains a Multi-Agent Proximal Policy Optimization implementation with TensorFlow-Agents, configured for the MultiCarRacing-v0 Gym environment. To use this MARL framework, please see the sections below.

Installation

To install all the necessary dependencies for this repository, it is recommended to use conda and pip:

# Clone repo
git clone https://github.com/rmsander/marl_ppo.git
cd marl_ppo/install

# Create and activate conda environment
conda create -n marl_ppo python=3.7
conda activate marl_ppo

# Change to install directory and install packages with pip
cd install
pip3 install -r requirements.txt --no-cache-dir

Running the Environment

After installation, the environment can be tried out by running:

python3 envs/multi_car_racing.py

Running the Trainer

To run the trainer, please first edit the parameters in ppo/parameters.py. Then, once ready, run:

python3 ppo/ppo_marl.py

Loading Trained Policies

You can find trained policies for (i) Single-agent, (ii) Multi-Agent, and (iii) Self-Play within the ppo/ppo_policies directory. To load these policies for use in evaluation or pre-training, please see the utility functions and examples in ppo/load_policies.py.

Example: To load the trained, example self_play policy in ppo/ppo_policies/self_play, you can do so by running the following on command-line (from ./):

python3 ppo/load_policies.py -p ppo/ppo_policies/self_play/

Paper and Final Presentation

If you would like to learn more about the theoretical foundations and experiments of this approach, please find the paper in this repository under paper.pdf.

Citation

If you find these results or the multi-agent tutorial setup with tf_agents useful, please consider citing my paper:

@techreport{autonomousauto20,
author = {Sander, Ryan},
year = {2020},
month = {05},
pages = {},
title = {Emergent Autonomous Racing Via Multi-Agent Proximal Policy Optimization}
}

If you find the MultiCarRacing-v0 environment useful, please cite our CoRL 2020 paper: 

@inproceedings{SSG2020,
    title={Deep Latent Competition: Learning to Race Using Visual
      Control Policies in Latent Space},
    author={Wilko Schwarting and Tim Seyde and Igor Gilitschenski
      and Lucas Liebenwein and Ryan Sander and Sertac Karaman and Daniela Rus},
    booktitle={Conference on Robot Learning},
    year={2020}
}

About

A MARL PPO implementation with tf-agents, configured for the MultiCarRacing-v0 Gym environment.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages