
Code of the Paper "The Successful Ingredients of Policy Gradient Algorithms"


SvenGronauer/successful-ingredients-paper


Successful Ingredients of Policy Gradient Algorithms

This repository contains source code used to produce the results reported in:

The Successful Ingredients of Policy Gradient Algorithms
Sven Gronauer, Martin Gottwald, Klaus Diepold
Proceedings of the Thirtieth International Joint Conference on Artificial Intelligence, IJCAI 2021

For the technical appendix, please see: Technical Appendix PDF

Operating Systems

We tested this repository under macOS (Catalina) and Ubuntu Linux (20.04 LTS) with Python 3.7 and 3.8. We cannot guarantee that it works with other operating systems, distributions, or Python versions.

Major dependencies:

  • PyBullet (pybullet==3.0.6)
  • PyTorch (torch==1.6.0)
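
To compare a local setup against the tested versions above, a small check along the following lines may help. It is a sketch and not part of the repository: the helper names `installed_version` and `check_environment` are hypothetical, and the lookup relies on `importlib.metadata`, which is only available from Python 3.8 on. On Python 3.7 without the backport, every package is reported as `None`.

```python
import sys

# Versions pinned in this README.
PINNED = {"pybullet": "3.0.6", "torch": "1.6.0"}
TESTED_PYTHONS = {(3, 7), (3, 8)}


def installed_version(package):
    """Return the installed version string of `package`, or None if unknown."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python 3.8+
    except ImportError:
        return None  # Python 3.7: importlib.metadata is not in the stdlib
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_environment(lookup=installed_version, python=sys.version_info[:2]):
    """Return a list of mismatch descriptions; an empty list means all matched."""
    problems = []
    if tuple(python) not in TESTED_PYTHONS:
        problems.append("Python %d.%d is untested" % tuple(python))
    for package, wanted in PINNED.items():
        found = lookup(package)
        # Ignore local build suffixes such as "+cpu" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            problems.append("%s: expected %s, found %s" % (package, wanted, found))
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment matches the tested setup"]:
        print(line)
```

A mismatch does not necessarily mean the code will fail. It only indicates a departure from the configuration that was tested.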

Benchmarked Environments

Locomotion

  • HalfCheetahBulletEnv-v0
  • HopperBulletEnv-v0
  • AntBulletEnv-v0
  • Walker2DBulletEnv-v0
  • HumanoidBulletEnv-v0

Manipulation

  • ReacherBulletEnv-v0
  • PusherBulletEnv-v0
  • KukaBulletEnv-v0

Comments on Hardware

By default, each algorithm run is executed in a single thread (single learner setup). For each policy iteration, a batch of 32k transition samples is collected and then used for policy updates. We use Threadripper 3970X (64 threads) and 3990X (128 threads) CPUs, which offer a high degree of parallelism.
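
Since each run occupies a single thread, one way to use a many-core CPU is to start several independent runs side by side. The sketch below illustrates that idea and is not necessarily how the authors scheduled their experiments. `run_in_parallel` and `dry_run` are hypothetical helpers, and passing an environment ID string to `--env` is an assumption.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_in_parallel(commands, max_workers=None, runner=subprocess.call):
    """Run each command via `runner`, at most `max_workers` at a time.

    Returns the results (exit codes by default) in the order of `commands`.
    """
    if max_workers is None:
        max_workers = os.cpu_count() or 1
    # The pool threads only wait on child processes; the training work
    # happens in the separate processes started by `runner`.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(runner, commands))


def dry_run(command):
    """Print the command instead of executing it, and report success."""
    print(" ".join(command))
    return 0


if __name__ == "__main__":
    # Assumption: --env accepts an environment ID such as "HopperBulletEnv-v0".
    commands = [
        [sys.executable, "-m", "sipga.train", "--alg", alg, "--env", "HopperBulletEnv-v0"]
        for alg in ("ppo", "trpo")
    ]
    run_in_parallel(commands, max_workers=2, runner=dry_run)
```

Leaving out the `runner=dry_run` argument starts the actual processes through `subprocess.call`.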

Quick Start

Installation

    git clone https://github.com/SvenGronauer/successful-ingredients-paper.git
    cd successful-ingredients-paper
    pip install -e .

Usage

    python -m sipga.train --alg ALG --env ENV

where ALG is one of [iwpg, ppo, trpo, npg] and ENV can be any child class of OpenAI's gym.Env.
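
As an illustration of the command above, the following sketch lists one training command for each pairing of the four algorithms with the eight benchmarked environments. It assumes that `--env` takes the environment ID as a string, which the README does not state explicitly. `train_command` and `benchmark_commands` are hypothetical helpers, not part of `sipga`.

```python
import itertools

# Algorithm names accepted by --alg, as listed in this README.
ALGORITHMS = ["iwpg", "ppo", "trpo", "npg"]

# Benchmarked environments listed in this README.
ENVIRONMENTS = [
    "HalfCheetahBulletEnv-v0",
    "HopperBulletEnv-v0",
    "AntBulletEnv-v0",
    "Walker2DBulletEnv-v0",
    "HumanoidBulletEnv-v0",
    "ReacherBulletEnv-v0",
    "PusherBulletEnv-v0",
    "KukaBulletEnv-v0",
]


def train_command(alg, env):
    """Build the argument list for one training run."""
    if alg not in ALGORITHMS:
        raise ValueError("unknown algorithm: %r" % (alg,))
    # Assumption: --env accepts the environment ID string.
    return ["python", "-m", "sipga.train", "--alg", alg, "--env", env]


def benchmark_commands():
    """Return one command per (algorithm, environment) pair."""
    return [
        train_command(alg, env)
        for alg, env in itertools.product(ALGORITHMS, ENVIRONMENTS)
    ]


if __name__ == "__main__":
    # Print the commands rather than running them.
    for command in benchmark_commands():
        print(" ".join(command))
```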
