Reproduce PPO with PARL

This example reproduces the PPO deep reinforcement learning algorithm with PARL and reaches the same level of performance as the paper on the MuJoCo benchmarks.

Paper: PPO, introduced in "Proximal Policy Optimization Algorithms"
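For orientation, the clipped surrogate objective from the paper can be sketched in a few lines of plain Python. This is only an illustration of the formula, not the code used in this example or in PARL; the function name `clipped_surrogate` and its arguments are hypothetical, and `clip_eps=0.2` is assumed as the clipping range.

```python
import math


def clipped_surrogate(new_logp, old_logp, advantages, clip_eps=0.2):
    """Batch mean of min(r * A, clip(r, 1 - eps, 1 + eps) * A).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy. advantages: advantage estimates.
    All names here are illustrative, not PARL's API.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        # Probability ratio r = pi_new(a|s) / pi_old(a|s)
        ratio = math.exp(lp_new - lp_old)
        # Keep the ratio inside [1 - eps, 1 + eps]
        clipped = max(1.0 - clip_eps, min(ratio, 1.0 + clip_eps))
        # Pessimistic bound: take the smaller of the two surrogate terms
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

The objective is maximized during training. Taking the minimum removes the incentive to push the probability ratio outside the clipping range when doing so would increase the objective.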

MuJoCo games introduction

See here to learn more about MuJoCo games.

Benchmark result

How to use

Dependencies:

paddle>=2.0.0
parl>=2.0.2
gym==0.9.2
mujoco-py==0.5.7
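
Before training, it can help to confirm which of these packages are installed. The snippet below is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+). The distribution names are assumptions: in particular, `paddle` is assumed to be installed under the name `paddlepaddle`. The helper `installed_versions` is hypothetical and not part of this example's code.

```python
from importlib import metadata

# Requirements as listed above. Distribution names are assumptions:
# "paddle" is assumed to be installed as the "paddlepaddle" distribution.
REQUIREMENTS = {
    "paddlepaddle": ">=2.0.0",
    "parl": ">=2.0.2",
    "gym": "==0.9.2",
    "mujoco-py": "==0.5.7",
}


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    for name, version in installed_versions(REQUIREMENTS).items():
        status = version or "not installed"
        print(f"{name}: required {REQUIREMENTS[name]}, installed {status}")
```

The script only reports versions; comparing them against the requirement strings is left to the reader.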

Start Training:

# To train an agent for Hopper-v1 game
python train.py

# For more customized arguments
# python train.py --help
