This repository provides the code implementation of our RA-L paper here. We developed an Autonomous Surface Vehicle (ASV) decision-making and control policy based on Actor Critic Implicit Quantile Networks (AC-IQN) and integrated it into a navigation system that operates in congested multi-vehicle environments, where wind and wave disturbances affect ASV motion and perception. The performance of our approach is shown in the video here.
If you find this repository useful, please cite our paper:
author={Lin, Xi and Szenher, Paul and Huang, Yewei and Englot, Brendan},
journal={IEEE Robotics and Automation Letters},
title={Distributional Reinforcement Learning Based Integrated Decision Making and Control for Autonomous Surface Vehicles},
This section describes how to prepare our proposed AC-IQN based policy and the five other RL based policies described in our paper. You can also skip this section and run VRX experiments with the provided pretrained models.
Run the following command to train an RL model.
Replace CONFIG_FILE with the path to the training config file. Example training config files are provided in the config directory.
The -P and -D flags are optional. -P specifies the number of processes to run in parallel when training multiple models. -D specifies the CPU or GPU to use for training.
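As a rough illustration of how these options fit together, the sketch below parses a config path plus the optional -P and -D flags with argparse. It is not the repository's actual argument parser: the positional config_file argument, the long option names, the default values, and the example path are all assumptions made for this example.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags described above."""
    parser = argparse.ArgumentParser(description="Train RL agents (illustrative sketch)")
    # Path to the training config file (CONFIG_FILE in the text above).
    # Passing it as a positional argument is an assumption.
    parser.add_argument("config_file", type=str)
    # -P: number of processes to run in parallel when training multiple models.
    parser.add_argument("-P", "--num-procs", type=int, default=1)
    # -D: device used for training; "cpu" and "cuda:0" follow PyTorch's device naming.
    parser.add_argument("-D", "--device", type=str, default="cpu")
    return parser


# "config/example.json" is a made-up path used only for illustration.
args = build_parser().parse_args(["config/example.json", "-P", "4", "-D", "cuda:0"])
print(args.config_file, args.num_procs, args.device)
```

Omitting both flags falls back to the assumed defaults of a single process on the CPU.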
We provide scripts for plotting learning performance and visualizing evaluation episodes of the RL agents you have trained.
To plot learning curves, set data_dir, seeds, and eval_agents in the plotting script according to the corresponding trained RL models, and run the command
To visualize an evaluation episode, set eval_id and episode_id in the visualization script according to the corresponding evaluation config file, and run the command
To convert a trained PyTorch model to a TorchScript file that can be used by the navigation system operating in the Gazebo environment, customize the conversion script as needed, and run the following command.
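For readers unfamiliar with TorchScript, the sketch below shows the general shape of such a conversion using torch.jit.trace. It is a minimal example under stated assumptions, not the repository's conversion script: TinyPolicy, its layer sizes, and the output file name are hypothetical stand-ins for a trained agent network.

```python
import os
import tempfile

import torch
import torch.nn as nn


class TinyPolicy(nn.Module):
    """Hypothetical stand-in for a trained policy network."""

    def __init__(self, state_dim: int = 8, action_dim: int = 2) -> None:
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 32), nn.ReLU(), nn.Linear(32, action_dim), nn.Tanh()
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)


model = TinyPolicy().eval()
example_state = torch.zeros(1, 8)

# Trace the model with a representative input to obtain a TorchScript module.
traced = torch.jit.trace(model, example_state)

# Save to a file that a LibTorch (C++) program can load with torch::jit::load.
out_path = os.path.join(tempfile.mkdtemp(), "tiny_policy.pt")
traced.save(out_path)

# Reload and check that the TorchScript module reproduces the original output.
reloaded = torch.jit.load(out_path)
with torch.no_grad():
    same = torch.allclose(model(example_state), reloaded(example_state))
print(same)
```

Tracing records only the operations executed for the example input, so a model with data-dependent control flow would need torch.jit.script instead.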
The Gazebo-based simulator VRX is used for simulation experiments. We developed new packages that implement the navigation system and added them to the original simulator. The files we created and added, as well as the files we modified from the original simulator, are listed in file_modification_note.txt. The simulation environment can be built as follows.
Install ROS 2 Humble and Gazebo Garden. Then install additional dependencies by running the following commands
sudo apt install python3-sdformat13 ros-humble-ros-gzgarden ros-humble-xacro
Download LibTorch from here and add it to the library path. Our implementation uses version 2.2.1+cpu.
export LD_LIBRARY_PATH=/path/to/libtorch/lib:$LD_LIBRARY_PATH
source ~/.bashrc
sudo ldconfig
In line 5 of vrx-2.3.2/action_planner/CMakeLists.txt, replace "/path/to/libtorch" with the location of your LibTorch installation:
list(APPEND CMAKE_PREFIX_PATH "/path/to/libtorch")
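Before building, it can help to confirm that the LibTorch setup above took effect. The sketch below is a small optional check, not part of the repository: it scans the entries of LD_LIBRARY_PATH for libtorch.so, the shared library shipped in LibTorch's lib directory on Linux. The function name find_libtorch_dirs is hypothetical.

```python
import os
from pathlib import Path
from typing import List, Optional


def find_libtorch_dirs(ld_library_path: Optional[str] = None) -> List[str]:
    """Return the LD_LIBRARY_PATH entries that contain libtorch.so."""
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for entry in ld_library_path.split(os.pathsep):
        # Skip empty entries produced by leading or trailing separators.
        if entry and (Path(entry) / "libtorch.so").is_file():
            hits.append(entry)
    return hits


if __name__ == "__main__":
    found = find_libtorch_dirs()
    print(found if found else "libtorch.so not found on LD_LIBRARY_PATH")
```

An empty result suggests that the export line was not applied in the current shell or that the path does not point at LibTorch's lib directory.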
Navigate to the root directory of this repo and run the following commands
mkdir -p vrx_ws/src
cp -r vrx-2.3.2/* vrx_ws/src
cd vrx_ws
source /opt/ros/humble/setup.bash
colcon build --merge-install
. install/setup.bash
cp src/ .
You may run a VRX experiment with customized experiment settings (vehicle initial poses, positions of goals and buoys), or run multiple VRX experiments with randomly generated experiment settings.
Before running experiment(s), navigate to the vrx_ws directory and customize the experiment settings as needed: (1) Specify method ("RL", "APF" or "MPC"). If using an RL agent ("AC-IQN", "IQN", "SAC", "DDPG", "Rainbow", or "DQN"), also specify the corresponding agent_type and model_path. (2) Set exp_result_file_dir to the directory that saves experiment results.
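The constraints among these settings can be summarized in a few lines of Python. The sketch below is only an illustration of the rules stated above and is not code from the repository: the function check_settings and the example paths are hypothetical, while the setting names and allowed values are taken from the text.

```python
from typing import Optional

METHODS = {"RL", "APF", "MPC"}
RL_AGENT_TYPES = {"AC-IQN", "IQN", "SAC", "DDPG", "Rainbow", "DQN"}


def check_settings(method: str,
                   exp_result_file_dir: str,
                   agent_type: Optional[str] = None,
                   model_path: Optional[str] = None) -> None:
    """Raise ValueError if the experiment settings are inconsistent."""
    if method not in METHODS:
        raise ValueError(f"method must be one of {sorted(METHODS)}, got {method!r}")
    if method == "RL":
        # RL agents additionally need an agent type and a model file.
        if agent_type not in RL_AGENT_TYPES:
            raise ValueError(f"agent_type must be one of {sorted(RL_AGENT_TYPES)}")
        if not model_path:
            raise ValueError("model_path is required when method is 'RL'")
    if not exp_result_file_dir:
        raise ValueError("exp_result_file_dir must be set")


# Example paths below are placeholders, not files shipped with the repository.
check_settings("RL", "results/", agent_type="AC-IQN", model_path="models/agent.pt")
check_settings("APF", "results/")
```

Both calls pass silently. Selecting "RL" without an agent_type or model_path raises a ValueError.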
We provide an example script that generates and saves the settings of a VRX experiment episode. In the script, set save_dir to the directory that saves experiment results. Then run the following command
Set run_with_exp_config to True, and exp_config_file to the path of the VRX experiment episode config file. Then run the experiment with the following command.
We provide a script that visualizes trajectories in the VRX experiment episode. In the script, set (1) episode_dir to the directory that saves experiment results, and (2) plot_steps to the moments (list indices in the result data) at which you would like to visualize vehicles' poses. Then run the following command
Set run_with_exp_config to False, and customize the eval_schedules parameters (Note: we don't use the vortex model in this work, and num_cores is always 0). Then run experiments with the following command.