This repository is for those who want to implement RL algorithms after reading the corresponding papers. Each algorithm is contained in a single file as a minimum working example, which lets you focus on the algorithm itself.
## Requirements
- python>=3.5
- pytorch>=0.4.0
- gym
- drawnow
## DQN
- Code mainly adopted from https://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html
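For orientation, here is a minimal sketch of the core Q-learning update, with a frozen target network and Huber loss (function and variable names are illustrative, not necessarily those used in this repo):

```python
import torch
import torch.nn.functional as F

def dqn_update(policy_net, target_net, optimizer, batch, gamma=0.99):
    # batch: tensors (state, action, reward, next_state, done)
    state, action, reward, next_state, done = batch
    # Q(s, a) for the actions actually taken
    q = policy_net(state).gather(1, action.unsqueeze(1)).squeeze(1)
    # bootstrap the target from the frozen target network
    with torch.no_grad():
        next_q = target_net(next_state).max(1)[0]
        target = reward + gamma * next_q * (1 - done)
    loss = F.smooth_l1_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```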
## REINFORCE
- Code mainly adopted from https://github.com/JamesChuanggg/pytorch-REINFORCE
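The essence of REINFORCE is weighting action log-probabilities by discounted returns; a minimal sketch (names illustrative):

```python
import torch

def reinforce_loss(log_probs, rewards, gamma=0.99):
    # log_probs: list of log pi(a_t | s_t) tensors from one episode
    returns, g = [], 0.0
    for r in reversed(rewards):  # discounted return-to-go
        g = r + gamma * g
        returns.insert(0, g)
    returns = torch.tensor(returns)
    # normalize returns as a simple variance-reduction trick
    returns = (returns - returns.mean()) / (returns.std() + 1e-8)
    return -(torch.stack(log_probs) * returns).sum()
```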
## A2C
- Code mainly adopted from https://gist.github.com/Ashioto/10ec680395db48ddac1ad848f5f7382c#file-actorcritic-py
- I just rewrote the code using PyTorch. As the original author mentions, this A2C code is extremely unstable; if it does not work, just kill the process and restart training :)
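For reference, a minimal sketch of the one-step advantage actor-critic loss (names illustrative):

```python
def a2c_loss(log_prob, value, reward, next_value, done, gamma=0.99):
    # TD target and advantage for a single transition
    target = reward + gamma * next_value.detach() * (1 - done)
    advantage = target - value
    actor_loss = -log_prob * advantage.detach()  # policy gradient term
    critic_loss = advantage.pow(2)               # value regression term
    return (actor_loss + critic_loss).mean()
```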
## DDPG
- Code mainly adopted from https://github.com/lirnli/OpenAI-gym-solutions/blob/master/Continuous_Deep_Deterministic_Policy_Gradient_Net/DDPG%20Class%20ver2.ipynb
- I just rewrote the code using PyTorch and fixed a small bug in the original code's actor update.
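For context, the standard DDPG actor update ascends the critic's estimate of the actor's own action; a minimal sketch of that step (names illustrative; this shows the textbook form, not a diff of the fix):

```python
def ddpg_actor_update(actor, critic, actor_optimizer, state):
    # deterministic policy gradient: maximize Q(s, actor(s)) w.r.t. the actor only
    actor_loss = -critic(state, actor(state)).mean()
    actor_optimizer.zero_grad()
    actor_loss.backward()
    actor_optimizer.step()
    return actor_loss.item()
```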
## Prioritized Experience Replay
- In the original paper, prioritized experience replay is implemented with a binary heap for efficient training. Given the small buffer size in our toy example, I simply store the samples in an array and sort it after every update, as sketched below.
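A minimal sketch of that array-based variant, assuming rank-based sampling probabilities (class and method names are illustrative):

```python
import random

class NaivePrioritizedBuffer:
    """Array-backed replay buffer; re-sorted after every priority update."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []  # list of [priority, transition], kept sorted descending

    def push(self, transition, priority=1.0):
        if len(self.data) >= self.capacity:
            self.data.pop()  # drop the tail (lowest priority after the last sort)
        self.data.append([priority, transition])

    def sample(self, batch_size):
        # rank-based prioritization: P(i) proportional to 1 / (rank + 1)
        weights = [1.0 / (rank + 1) for rank in range(len(self.data))]
        idx = random.choices(range(len(self.data)), weights=weights, k=batch_size)
        return idx, [self.data[i][1] for i in idx]

    def update_priorities(self, idx, priorities):
        for i, p in zip(idx, priorities):
            self.data[i][0] = p
        self.data.sort(key=lambda x: x[0], reverse=True)  # the sort-after-update step
```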
## TRPO
- Code mainly adopted from https://github.com/ikostrikov/pytorch-trpo
- This piece of code is a bit complicated; I've tried my best to fit everything (L-BFGS, conjugate gradient, line search, GAE, ...) into one file.
- ~300 lines, not so horrible, right?
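As an example of one of those pieces, conjugate gradient solves A x = b for the search direction using only Fisher-vector products A v, so the Fisher matrix is never formed explicitly; a minimal sketch (assuming an `fvp(v)` callable is provided):

```python
import torch

def conjugate_gradient(fvp, b, n_iters=10, tol=1e-10):
    # solve A x = b given only the matrix-vector product fvp(v) = A v
    x = torch.zeros_like(b)
    r = b.clone()  # residual (x starts at zero, so r = b)
    p = b.clone()  # search direction
    rs_old = r.dot(r)
    for _ in range(n_iters):
        Ap = fvp(p)
        alpha = rs_old / p.dot(Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r.dot(r)
        if rs_new < tol:
            break
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x
```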
## PPO
- Code mainly adopted from https://github.com/tpbarron/pytorch-ppo
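The heart of PPO is the clipped surrogate objective; a minimal sketch (names illustrative):

```python
import torch

def ppo_clip_loss(new_log_prob, old_log_prob, advantage, clip_eps=0.2):
    # probability ratio between the current and the data-collecting policy
    ratio = torch.exp(new_log_prob - old_log_prob.detach())
    unclipped = ratio * advantage
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantage
    # pessimistic bound: take the elementwise minimum, then minimize the negation
    return -torch.min(unclipped, clipped).mean()
```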
## ACER
- Code mainly adopted from https://github.com/dchetelat/acer
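ACER's central trick is truncated importance sampling with a bias-correction term; a rough sketch of the policy loss for discrete actions (all names are illustrative, and a full implementation also needs Retrace targets and the trust-region update):

```python
import torch

def acer_policy_loss(pi, mu, q, v, actions, q_ret, c=10.0):
    # pi, mu: current / behaviour policy probs (B, A); q: critic estimates (B, A)
    # v: state values (B,); actions: taken actions (B, 1); q_ret: Retrace targets (B,)
    rho = (pi / (mu + 1e-8)).detach()          # importance weights
    rho_a = rho.gather(1, actions).squeeze(1)  # weight of the taken action
    log_pi = torch.log(pi + 1e-8)
    log_pi_a = log_pi.gather(1, actions).squeeze(1)
    # truncated importance-sampling term on the sampled action
    loss = -(rho_a.clamp(max=c) * log_pi_a * (q_ret - v.detach())).mean()
    # bias-correction term, summed over all actions
    coef = (1.0 - c / rho).clamp(min=0.0) * pi.detach()
    loss = loss - (coef * log_pi * (q.detach() - v.detach().unsqueeze(1))).sum(1).mean()
    return loss
```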