OpenAI Gym environments for quadrotor UAV control. This repository implements both monolithic and modular reinforcement learning (RL) frameworks for the low-level control of a quadrotor unmanned aerial vehicle. A detailed explanation of these concepts can be found in this YouTube video. To better understand what deep RL can do, see OpenAI Spinning Up.
The repository is compatible with Python 3.11.3, Gymnasium 0.28.1, PyTorch 2.0.1, and NumPy 1.25.1. It is recommended to create an Anaconda environment with Python 3 (installation guide). Additionally, Visual Studio Code is recommended for efficient code editing.
- Open your Anaconda Prompt and install major packages.
conda install -c conda-forge gymnasium
conda install pytorch torchvision torchaudio pytorch-cuda=11.8 -c pytorch -c nvidia
conda install -c anaconda numpy
conda install -c conda-forge vpython
Refer to the official documentation for Gymnasium, PyTorch, NumPy, and VPython.
- Clone the repository.
git clone https://github.com/fdcl-gwu/gym-rotor-modularRL.git
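After installing the packages, a quick check along the following lines can confirm that the installed versions match the ones listed above. This is a minimal sketch and not part of the repository: the helper name `check_versions` is hypothetical, and it assumes the packages are registered under the distribution names `gymnasium`, `torch`, and `numpy`.

```python
from importlib import metadata

# Versions listed in this README. The distribution names are an assumption
# (PyTorch is distributed as "torch").
EXPECTED = {"gymnasium": "0.28.1", "torch": "2.0.1", "numpy": "1.25.1"}


def check_versions(expected=EXPECTED):
    """Return {package: (installed_version_or_None, expected_version)}."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[name] = (installed, wanted)
    return report


if __name__ == "__main__":
    for name, (installed, wanted) in check_versions().items():
        # Prefix match so that local build tags (e.g. "+cu118") still count as a match.
        ok = installed is not None and installed.startswith(wanted)
        print(f"{name}: installed={installed}, README lists {wanted} ({'ok' if ok else 'differs'})")
```

A differing version does not necessarily mean the code will fail; the versions above are simply the ones the repository states compatibility with.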
Consider a quadrotor UAV below,
The position and the velocity of the quadrotor are represented by
Two major training frameworks are provided for quadrotor low-level control tasks: (a) In a monolithic setting, a large end-to-end policy directly outputs total thrust and moments; (b) In modular frameworks, two modules collaborate to control the quadrotor: Module #1 and Module #2 specialize in translational control and yaw control, respectively.
| Env IDs | Description |
|---|---|
| Quad-v0 | This serves as the foundational environment for wrappers, where the state and action are represented as |
| CoupledWrapper | Wrapper for monolithic RL framework: the observation and action are given by |
| DecoupledWrapper | Wrapper for modular RL schemes: the observation and action for each agent are defined as |
where the error terms
There are three training frameworks for quadrotor control: NMP (Non-modular Monolithic Policy), DMP (Decentralized Modular Policies), and CMP (Centralized Modular Policies).
Note that our study investigates two inter-module communication strategies to optimize coordination across modules: centralized and decentralized coordination.
In the decentralized setting, namely DMP, modules independently learn their action value functions and policies without inter-agent synchronization.
In contrast, centralized methods, called CMP, introduce centralized critic networks to share information between modules during training.
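The difference between the two coordination strategies can be illustrated by what each module's critic receives as input. The sketch below is illustrative only and is not taken from the repository: the function `critic_inputs` is hypothetical, the vectors are plain Python lists, and the centralized case assumes the common approach of concatenating all modules' observations and actions.

```python
from typing import List, Sequence


def critic_inputs(framework: str,
                  obs_n: Sequence[Sequence[float]],
                  act_n: Sequence[Sequence[float]]) -> List[List[float]]:
    """Build one critic input vector per module (hypothetical helper)."""
    if framework == "DMP":
        # Decentralized: each critic sees only its own observation and action.
        return [list(o) + list(a) for o, a in zip(obs_n, act_n)]
    if framework == "CMP":
        # Centralized: every critic sees the joint observation and joint action
        # (assumed here to be a simple concatenation).
        joint = [x for o in obs_n for x in o] + [x for a in act_n for x in a]
        return [list(joint) for _ in obs_n]
    raise ValueError(f"unsupported framework for modular critics: {framework}")


# Toy example with two modules and made-up dimensions.
obs_n = [[0.1, 0.2], [0.3]]
act_n = [[1.0], [2.0]]
dmp = critic_inputs("DMP", obs_n, act_n)  # per-module inputs of different sizes
cmp_ = critic_inputs("CMP", obs_n, act_n)  # identical joint input for each critic
```

In both cases each module's policy acts on its own observation; under this sketch, only the critic inputs used during training differ.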
You can adjust the hyperparameters in args_parse.py to fine-tune the models.
To train the agent with one of the frameworks, use the corresponding command:
# Non-modular monolithic framework
python3 main.py --framework NMP
# Centralized modular framework
python3 main.py --framework CMP
# Decentralized modular framework
python3 main.py --framework DMP
The trained model is saved in the models folder (e.g. NMP_632.0k_steps_agent_0_789).
First, modify the total_steps value in main.py so that it matches the saved model you want to evaluate. For example, for the NMP scheme:
# Load trained models for evaluation:
if self.args.eval_model:
    if self.framework == "NMP":
        total_steps, agent_id = 632_000, 0  # edit 'total_steps' accordingly
        self.agent_n[agent_id].load(self.framework, total_steps, agent_id, self.seed)
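The example model name NMP_632.0k_steps_agent_0_789 appears to combine the framework, the step count in thousands, the agent index, and the seed. The sketch below reproduces that pattern so you can check which total_steps and seed values correspond to a saved model. The function `model_file_stem` is hypothetical and the naming rule is inferred from the single example above; the repository's actual load method may build the name differently.

```python
def model_file_stem(framework: str, total_steps: int, agent_id: int, seed: int) -> str:
    """Rebuild a model name following the pattern inferred from the README example."""
    # total_steps / 1000 is a float, so 632_000 becomes "632.0".
    return f"{framework}_{total_steps / 1000}k_steps_agent_{agent_id}_{seed}"


print(model_file_stem("NMP", 632_000, 0, 789))
```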
Next, run the following command:
python3 main.py --framework NMP --eval_model True --save_log True --render True --seed 789
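Flags such as --eval_model True pass the boolean as a string, which needs explicit conversion because argparse's `type=bool` would treat any non-empty string (including "False") as true. The following is a minimal sketch of one way to parse the flags shown in the command above. It is not the repository's args_parse.py: the `str2bool` helper and all default values are assumptions made for illustration.

```python
import argparse


def str2bool(value: str) -> bool:
    """Convert 'True'/'False'-style strings to bool (hypothetical helper)."""
    lowered = value.lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Sketch of the evaluation flags")
    # Only flags that appear in this README are listed; defaults are assumed.
    parser.add_argument("--framework", choices=["NMP", "DMP", "CMP"], default="NMP")
    parser.add_argument("--eval_model", type=str2bool, default=False)
    parser.add_argument("--save_log", type=str2bool, default=False)
    parser.add_argument("--render", type=str2bool, default=False)
    parser.add_argument("--seed", type=int, default=0)
    return parser


cmd = "--framework NMP --eval_model True --save_log True --render True --seed 789"
args = build_parser().parse_args(cmd.split())
```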
When testing the trained models, you can save the flight data using the --save_log True flag. The data is then saved to the results folder in a file named with the current date and time (e.g. NMP_log_20250114_163953.dat).
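The example log name suggests a framework prefix followed by a timestamp. The sketch below builds a name of that shape, assuming the timestamp format is year, month, day, then hour, minute, second. That format is inferred from the example rather than confirmed by the source, and the function `log_file_name` is hypothetical.

```python
from datetime import datetime
from typing import Optional


def log_file_name(framework: str, now: Optional[datetime] = None) -> str:
    """Build a log file name like '<framework>_log_<YYYYmmdd_HHMMSS>.dat'."""
    now = now or datetime.now()  # fall back to the current local time
    return f"{framework}_log_{now.strftime('%Y%m%d_%H%M%S')}.dat"


# Reproduces the example name for 14 January 2025, 16:39:53.
example = log_file_name("NMP", datetime(2025, 1, 14, 16, 39, 53))
```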
To visualize the flight data, open draw_plot.py and update the file_name accordingly, e.g., file_name = 'NMP_log_20250114_163953'.
Lastly, run the plotting script,
python3 draw_plot.py --framework NMP
At slower desired yaw rates, such as
If you find this work useful and would like to cite it, please give credit to our work:
@article{yu2024modular,
title={Modular Reinforcement Learning for a Quadrotor UAV with Decoupled Yaw Control},
author={Yu, Beomyeol and Lee, Taeyoung},
journal={IEEE Robotics and Automation Letters},
year={2024},
publisher={IEEE}}
@inproceedings{yu2024multi,
title={Multi-Agent Reinforcement Learning for the Low-Level Control of a Quadrotor UAV},
author={Yu, Beomyeol and Lee, Taeyoung},
booktitle={2024 American Control Conference (ACC)},
pages={1537--1542},
year={2024},
organization={IEEE}}