mehdiboubnan/Deep-Reinforcement-Learning-applied-to-DOOM

Deep Reinforcement Learning Applied to DOOM

This is the final project for the Reinforcement Learning Course of the 2018/2019 MVA Master class.

This project was carried out by Mehdi Boubnan & Ayman Chaouki. It trains an agent to play several scenarios of the game DOOM with deep reinforcement learning methods, ranging from deep Q-learning and its enhancements (double Q-learning, a deep recurrent network with LSTM, the dueling architecture and prioritized replay) to Asynchronous Advantage Actor-Critic (A3C) and curiosity-driven learning.

You can take a look at our paper Deep reinforcement learning applied to Doom for more details about the algorithms and some empirical results.

Here are two examples of agents trained with A3C.

Getting Started

Prerequisites

  • An operating system on which VizDoom can be installed (there are some build problems on Ubuntu 16.04, for example); we use Ubuntu 18.04.
  • NVIDIA GPU with CUDA and cuDNN (for optimal performance with the deep Q-learning methods).
  • Python 3.6 (required to install TensorFlow).
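
A quick way to confirm the prerequisites is to check the interpreter version and whether the packages can be found. The sketch below is not part of the repository: the helper name check_environment is hypothetical, and the import names vizdoom, torch, tensorboardX and moviepy are assumed from the packages this README mentions.

```python
import importlib.util
import sys

# Import names assumed for the packages this README mentions.
REQUIRED_MODULES = ("vizdoom", "torch", "tensorboardX", "moviepy")


def check_environment(modules=REQUIRED_MODULES, min_python=(3, 6)):
    """Return (python_ok, {module_name: is_installed}) without importing the modules."""
    python_ok = sys.version_info[:2] >= min_python
    found = {name: importlib.util.find_spec(name) is not None for name in modules}
    return python_ok, found


if __name__ == "__main__":
    python_ok, found = check_environment()
    print("Python >= 3.6:", python_ok)
    for name, ok in found.items():
        print(f"{name}: {'found' if ok else 'missing'}")
```

Any module reported as missing can then be installed with the commands in the Installation section.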

Installation

  • Install PyTorch and torchvision.
conda install pytorch torchvision -c pytorch
  • Install tensorboard and tensorboardX.
pip install tensorboard
pip install tensorboardX
  • Install moviepy.
pip install moviepy
  • Clone this repo.
git clone https://github.com/Swirler/Deep-Reinforcement-Learning-applied-to-DOOM
cd Deep-Reinforcement-Learning-applied-to-DOOM

Deep Q Learning

cd "Deep Q Learning"

Directories

  • scenarios : Configurations and .wad files of the following scenarios (basic, deadly corridor and defend the center).
  • weights : The trained weights for each scenario are saved here.

Training

  • You can view training rewards, game variables and loss plots by running tensorboard --logdir runs and opening http://localhost:6006 in a browser.
  • Train a model with train.py , for example:
python train.py --scenario basic --window 1 --batch_size 32 --total_episodes 100 --lr 0.0001 --freq 20
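
To try several learning rates, the training command above can be generated programmatically. This is a minimal sketch, not part of the repository: build_train_command is a hypothetical helper, it uses only the flags shown in the example command, and the extra learning rates are illustrative values.

```python
import shlex


def build_train_command(scenario, lr, window=1, batch_size=32,
                        total_episodes=100, freq=20):
    """Assemble the argument list for train.py using the flags shown above."""
    return [
        "python", "train.py",
        "--scenario", scenario,
        "--window", str(window),
        "--batch_size", str(batch_size),
        "--total_episodes", str(total_episodes),
        "--lr", str(lr),
        "--freq", str(freq),
    ]


if __name__ == "__main__":
    # Illustrative learning rates; only 0.0001 comes from the example above.
    for lr in ("0.001", "0.0001", "0.00001"):
        cmd = build_train_command("basic", lr)
        print(" ".join(shlex.quote(arg) for arg in cmd))
```

Each printed line can be pasted into a shell, or the list can be passed to subprocess.run(cmd, check=True) from inside the "Deep Q Learning" directory.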

Testing

  • The previous command saves training weights in weights/basic/ every 20 episodes. Use the following command to watch your agent play:
python play.py --scenario basic --window 1 --weights weights/none_19.pth --total_episodes 20 --frame_skip 2
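
Because weights are written periodically, it can be handy to pick the most recent file automatically before calling play.py. The sketch below is not part of the repository: latest_checkpoint is a hypothetical helper, and the naming scheme "<prefix>_<episode>.pth" is an assumption inferred from the file name none_19.pth in the command above.

```python
import re
from pathlib import Path

# Assumed naming scheme "<prefix>_<episode>.pth", inferred from "none_19.pth".
CHECKPOINT_PATTERN = re.compile(r"^(?P<prefix>.+)_(?P<episode>\d+)\.pth$")


def latest_checkpoint(weights_dir):
    """Return the .pth file with the highest episode number, or None if there is none."""
    best_path, best_episode = None, -1
    for path in Path(weights_dir).glob("*.pth"):
        match = CHECKPOINT_PATTERN.match(path.name)
        if match is None:
            continue  # skip files that do not follow the assumed scheme
        episode = int(match.group("episode"))
        if episode > best_episode:
            best_path, best_episode = path, episode
    return best_path
```

The returned path can be passed as the value of --weights. Episode numbers are compared as integers, so a file ending in _119 ranks above one ending in _99.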

A3C & Curiosity

cd "A3C_Curiosity"

Directories

  • scenarios : Configurations and .wad files of the following scenarios (basic, deadly corridor, defend the center, defend the line and my way home).
  • saves : Models, TensorBoard summaries and worker GIFs produced during training are saved here.

Training

  • You can view training rewards, game variables and loss plots by running python utils/launch_tensorboard.py
  • Train a model with main.py , for example:
    • Deadly corridor with default parameters:
    python main.py --scenario deadly_corridor --actions all --num_workers 12 --max_episodes 1600
    
    • Basic with default parameters:
    python main.py --scenario basic --actions single --num_workers 12 --max_episodes 1200
    
    • Deadly corridor with default parameters with PPO:
    python main.py --use_ppo --scenario deadly_corridor --actions all --num_workers 12 --max_episodes 1600
    
    • Deadly corridor with default parameters with curiosity:
    python main.py --use_curiosity --scenario deadly_corridor --actions all --num_workers 12 --max_episodes 1600
    

See utils/args.py for more parameters.
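
The four commands above differ only in the scenario, the action set, the episode budget and an optional --use_ppo or --use_curiosity switch, so they can be assembled by a small helper. This is a sketch under stated assumptions, not part of the repository: build_a3c_command is a hypothetical name, and capping the number of workers at the machine's CPU count is our own assumption rather than something main.py is known to require.

```python
import os


def build_a3c_command(scenario, actions, max_episodes,
                      num_workers=12, variant=None):
    """Assemble a main.py training command from the flags shown above.

    variant is None for plain A3C, or "ppo" / "curiosity" to add the
    corresponding --use_* switch.
    """
    # Assumption: running more workers than CPU cores is rarely helpful.
    cpu_count = os.cpu_count() or 1
    workers = max(1, min(num_workers, cpu_count))

    cmd = ["python", "main.py"]
    if variant is not None:
        if variant not in ("ppo", "curiosity"):
            raise ValueError(f"unknown variant: {variant!r}")
        cmd.append(f"--use_{variant}")
    cmd += [
        "--scenario", scenario,
        "--actions", actions,
        "--num_workers", str(workers),
        "--max_episodes", str(max_episodes),
    ]
    return cmd
```

For example, build_a3c_command("deadly_corridor", "all", 1600, variant="curiosity") reproduces the curiosity command above on a machine with at least 12 cores; on a smaller machine only the --num_workers value changes.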

Testing

  • To watch your agent play using the most recently trained model, run:
python main.py --play --scenario deadly_corridor --actions all --play_episodes 10