GitHub - gopala-kr/DRL-Agents: research and implementations of Deep RL agents and their applications

RL Landscape

Source: eleurent/phd-bibliography

RL Agents Implementation

Value Optimization
- [QR-DQN]
- [DQN] - [Slides] [Code] [rainbow]
- [Bootstrapped DQN]
- [DDQN]
- [NEC]
- [MMC]
- [N-step Q Learning]
- [PAL]
- [Categorical DQN]
- [NAF]
Policy Optimization
- [Policy Gradient]
- [Actor Critic]
  - [DDPG] [Code]
    - [HAC DDPG]
    - [DDPG with HER]
  - [Clipped PPO]
  - [PPO]
General
- [DFP]
Imitation
- [Behavioural cloning]
- [Inverse Reinforcement Learning] [Code] [irl-imitation-code]
- [Generative Adversarial Imitation Learning]

Behavioral Cloning (BC) (code)

Hierarchical Reinforcement Learning Agents

Hierarchical Actor Critic (HAC) (code)

Memory Types

Exploration Techniques

E-Greedy (code)
Boltzmann (code)
Ornstein–Uhlenbeck process (code)
Normal Noise (code)
Truncated Normal Noise (code)
Bootstrapped Deep Q Network (code)
UCB Exploration via Q-Ensembles (UCB) (code)
Noisy Networks for Exploration (code)
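
As a rough illustration of three of the techniques listed above (E-Greedy, Boltzmann, and the Ornstein–Uhlenbeck process), here is a minimal standard-library sketch. It is not the code behind the linked implementations. The function names, the `OUNoise` class, and the default values for `theta`, `sigma`, and `dt` are assumptions chosen for illustration only.

```python
import math
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def boltzmann(q_values, temperature, rng):
    """Sample an action with probability proportional to exp(q / temperature)."""
    scaled = [q / temperature for q in q_values]
    top = max(scaled)  # subtract the max so exp() cannot overflow
    weights = [math.exp(s - top) for s in scaled]
    return rng.choices(range(len(q_values)), weights=weights)[0]


class OUNoise:
    """Euler discretisation of an Ornstein-Uhlenbeck process (hypothetical helper).

    Each step applies: x += theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
    """

    def __init__(self, size, mu=0.0, theta=0.15, sigma=0.2, dt=1e-2, x0=0.0, rng=None):
        self.mu, self.theta, self.sigma, self.dt = mu, theta, sigma, dt
        self.rng = rng or random.Random()
        self.state = [x0] * size

    def sample(self):
        """Advance the process one step and return a copy of the new state."""
        self.state = [
            x
            + self.theta * (self.mu - x) * self.dt
            + self.sigma * math.sqrt(self.dt) * self.rng.gauss(0.0, 1.0)
            for x in self.state
        ]
        return list(self.state)
```

In a discrete-action agent, `epsilon_greedy` or `boltzmann` would typically replace a plain argmax over the Q-values. For continuous actions, the output of `OUNoise.sample()` is usually added to the deterministic action of an actor such as DDPG.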

RL History

Temporal difference (TD) learning (1988)
Q‐learning (1989)
BayesRL (2002)
RMAX (2002)
CBPI (2002)
PEGASUS (2002)
Least‐Squares Policy Iteration (2003)
Fitted Q‐Iteration (2005)
GTD (2009)
UCRL (2010)
REPS (2010)
DQN (2013) - DeepMind

RL Environments

[Acrobot]
[Bike]
[Blackjack]
[Cartpole]
[ContextBandit]
[Continuous Chain]
[Corridor]
[Discrete Chain]
[Discretiser (for continuous environments)]
[Double Loop]
[Environment]
[Gridworld]
[Inventory management]
[Linear context bandit]
[Linear dynamic quadratic]
[Mountaincar (2d and 3d)]
[POMDP Maze]
[Optimistic Task]
[Puddleworld]
[Random MDPs]
[Riverswim]

RL Mechanisms

[Attention and Memory]
[Unsupervised learning]
- [GANs]
- [GQN]
- [UNREAL]
[Hierarchical RL]
- [FuNs]
- [Option-Critic]
- [STRAW]
- [h-DQN]
- [Stochastic Neural Networks]
[Multi-agent RL]
[Relational RL]
[Learning to Learn, a.k.a. Meta-Learning]
- [Few/One/Zero-shot Learning]
  - [MAML]
- [Transfer and Multi-Task Learning]
- [Learning to Optimize]
- [Learning to Reinforcement Learn]
- [Learning Combinatorial Optimization]
- [AutoML]

RL Games

Chinook (1997; 2007) for checkers
Deep Blue (2002) for chess
Logistello (1999) for Othello
TD-Gammon (1994) for backgammon
GIB (2001) for contract bridge
MoHex (2017) for Hex
DQN (2016) (2018) for Atari 2600 games
AlphaGo (2016a) and AlphaGo Zero (2017) for Go
AlphaZero (2017) for chess, shogi, and Go
Cepheus (2015), DeepStack (2017), and Libratus (2017a;b) for heads-up Texas Hold’em poker
Jaderberg et al. (2018) for Quake III Arena Capture the Flag
OpenAI Five for Dota 2 at 5v5, https://openai.com/five/
Zambaldi et al. (2018), Sun et al. (2018), and Pang et al. (2018) for StarCraft II

[Board Games]
- [Computer Go]
- [AlphaGo: Training pipeline with MCTS]
- [AlphaGo Zero]
- [Alpha Zero]
[Card Games]
- [DeepStack]
[Video Games]
- [Atari 2600 games]
- [StarCraft]
- [StarCraft II mini-games]
- [Quake III Arena]
- [Minecraft]
- [Super Smash Bros]
- [Doom]
- [ViZDoom]

DRL applied to Robotics

[Sim-to-Real]
- [MuJoCo]
[Imitation Learning]
[Value-based Learning]
[Policy-based Learning]
[Model-based Learning]
[Autonomous Driving Vehicles]

DRL applied to NLP

[Sequence Generation]
[Machine Translation]
[Dialogue Systems]

DRL applied to Vision

[Recognition]
[Motion Analysis]
[Scene Understanding]
[Vision + NLP]
[Visual Control]
[Interactive Perception]

References

Maintainer

Gopala KR / @gopala-kr
