Code for "Synthesis of Biologically Realistic Human Motion Using Joint Torque Actuation", SIGGRAPH 2019, Part 2
https://arxiv.org/abs/1904.13041
This repository contains examples applying the paper's techniques to deep RL locomotion training. For the techniques themselves (how to learn R and E), and for examples of using them in Optimal Control, go to the sibling repository: https://github.com/jyf588/lrle
Installation: (reference: https://github.com/DartEnv/dart-env/wiki)
git clone https://github.com/jyf588/lrle-rl-examples.git
cd lrle-rl-examples
git checkout master  # or release-old; see Usage below
1. Install Dart
On macOS (with Homebrew):
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew cask install xquartz
brew install dartsim --only-dependencies
brew install cmake
brew install ode # ignore this if it tells you ode has already been installed
On Ubuntu:
sudo apt-get install build-essential cmake pkg-config git
sudo apt-get install libeigen3-dev libassimp-dev libccd-dev libfcl-dev libboost-regex-dev libboost-system-dev
sudo apt-get install libopenscenegraph-dev
sudo apt-get install libbullet-dev
sudo apt-get install liburdfdom-dev
sudo apt-get install libnlopt-dev
sudo apt-get install libxi-dev libxmu-dev freeglut3-dev
sudo apt-get install libode-dev # ignore this if it tells you ode has already been installed
sudo apt-get install libtinyxml2-dev
sudo apt-get install libblacs-mpi-dev openmpi-bin
Note for Ubuntu 14.04 Trusty:
To correctly handle collisions for capsule shapes, ODE (Open Dynamics Engine) is required. However, libode-dev currently appears to be broken on Trusty when installed through apt-get. To use capsule shapes, please install ODE from source (see the ODE download page).
git clone git://github.com/dartsim/dart.git
cd dart
git checkout tags/v6.3.0
cp ../patches/lcp.cpp dart/external/odelcpsolver/lcp.cpp  # apply the patched ODE LCP solver shipped in this repository
mkdir build
cd build
cmake ..
make -j4
sudo make install
cd ..
cd ..
export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH
Note: If you have Nvidia drivers installed on your computer, you might need a fix similar to RobotLocomotion/drake#2087 to resolve an error during the Dart installation.
2. Install the Python environment
Install Anaconda first: https://www.anaconda.com, then:
conda create -n lrle-rl-env python=3.6
conda activate lrle-rl-env
Note: if you encounter permission-denied errors below, try using "sudo chown" to change the ownership of the offending file/folder. It is generally bad practice to "sudo install/setup" something into a conda env.
3. Install PyDart2, a Python binding for Dart
conda install swig
conda install pyqt=5
git clone https://github.com/sehoonha/pydart2.git
cd pydart2
cp ../patches/pydart2_draw.cpp pydart2/pydart2_draw.cpp
Modify setup.py: add a space before -framework Cocoa, and add CXX_FLAGS += '-stdlib=libc++ ' after the line CXX_FLAGS += '-framework GLUT ' (this is a temporary issue and should not be needed for long). A sketch of the edit is shown below.
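As a rough guide, the relevant lines in pydart2/setup.py might look like this after the edit. This is only a sketch, assuming the flags are accumulated in a CXX_FLAGS string as described above; the exact surrounding code in the file may differ:
CXX_FLAGS += ' -framework Cocoa '   # space added before -framework Cocoa
CXX_FLAGS += '-framework GLUT '
CXX_FLAGS += '-stdlib=libc++ '      # newly added line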
If you are using Xcode 10 on macOS Mojave, run the following line:
MACOSX_DEPLOYMENT_TARGET=10.9 python setup.py build build_ext
otherwise:
python setup.py build build_ext
And then:
python setup.py develop
export PYTHONPATH=$PWD:$PYTHONPATH
cd ..
4. Install DartEnv
The installation is similar to OpenAI Gym. To install, simply do:
cd dart-env
pip install -e .[dart]
cd ..
Use pip3 instead of pip if you plan to use python3.
5. Install Baselines
cd baselines
pip install -e .
cd ..
6. Install other dependencies
pip install keras
Note: I recommend installing Keras with pip rather than conda, since its dependency TensorFlow was also installed with pip when Baselines was installed.
conda install -c kidzik opensim
conda install matplotlib
Usage: (reference: https://github.com/VincentYu68/SymmetryCurriculumLocomotion)
There are two versions of the code in two separate branches. The master branch improves on the results reported in the SIGGRAPH paper in terms of motion quality by adding toes to the simulated human and using better reward shaping; curriculum training is no longer needed in this version.
Walking training:
cd baselines
LR+LE:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_walking --seed=some_number --HW_muscle_add_tor_limit=True --HW_muscle_add_energy_cost=True
LR:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_walking --seed=some_number --HW_muscle_add_tor_limit=True
BOX:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_walking --seed=some_number
AMTU:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_MD_walking --seed=some_number
Running training:
LR+LE:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_running --seed=some_number --HW_muscle_add_tor_limit=True --HW_muscle_add_energy_cost=True
LR:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_running --seed=some_number --HW_muscle_add_tor_limit=True
BOX:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_running --seed=some_number --HW_energy_weight=0.2 --HW_alive_pen=7.0
(The default parameters used for LR+LE cannot train a successful running policy for BOX, hence the modified energy weight and alive penalty above.)
AMTU:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_wtoe_MD_running --seed=some_number
Testing and visualizing trained policies:
python test_policy.py DartHumanWalker-v2 PATH_TO_POLICY
Or for AMTU walking/running
python test_policy.py DartHumanWalkerMD-v2 PATH_TO_POLICY
The agent learns to walk/run in several hundred iterations, but letting it train longer usually gives more natural gaits.
Note 1: if you want to learn about the implementation details, reading the following directories (instead of the whole repo) will suffice: baselines/baselines/ppo1/ and dart-env/gym/envs/dart/.
Note 2: because of how argparse handles boolean arguments (https://stackoverflow.com/questions/15008758/parsing-boolean-values-with-argparse), setting --HW_muscle_add_tor_limit=False will not work; omit the flag entirely to keep it disabled. A short illustration follows.
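A minimal illustration of the issue, assuming the flag is declared with type=bool (any non-empty string, including "False", is truthy in Python):
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--HW_muscle_add_tor_limit', type=bool, default=False)
args = parser.parse_args(['--HW_muscle_add_tor_limit=False'])
print(args.HW_muscle_add_tor_limit)  # prints True, because bool('False') is True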
The release-old branch is thus deprecated and left here only for reproducing the SIGGRAPH paper results. The training here uses a curriculum and follows exactly the same pipeline as in previous work: https://arxiv.org/abs/1801.08093
Walking training:
LR+LE:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_staged_learning --seed=xxx --HW_energy_weight=0.5 --HW_muscle_add_tor_limit=True --HW_muscle_add_energy_cost=True
LR:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_staged_learning --seed=xxx --HW_energy_weight=0.5 --HW_muscle_add_tor_limit=True
BOX:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_staged_learning --seed=xxx --HW_energy_weight=0.5
AMTU:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_MD_staged_learning --seed=xxx --HW_energy_weight=0.5
Note: setting --HW_energy_weight below 0.4 will usually result in a hopping motion.
Running training:
LR+LE:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_staged_learning --seed=xxx --HW_energy_weight=0.45 --HW_muscle_add_tor_limit=True --HW_muscle_add_energy_cost=True --HW_final_tv=3.5 --HW_tv_endtime=2.0
LR:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_staged_learning --seed=xxx --HW_energy_weight=0.45 --HW_muscle_add_tor_limit=True --HW_final_tv=3.5 --HW_tv_endtime=2.0
BOX:
mpirun -np 8 python -m baselines.ppo1.run_humanoid_staged_learning --seed=xxx --HW_energy_weight=0.45 --HW_final_tv=3.5 --HW_tv_endtime=2.0
AMTU: fails to learn running
Testing and visualizing trained policies: first change env.env.final_tv and env.env.tv_endtime in test_policy.py to match the values used during training (see the sketch after the commands below), then:
python test_policy.py DartHumanWalker-v1 PATH_TO_POLICY
Or (for AMTU):
python test_policy_MD.py DartHumanWalkerMD-v1 PATH_TO_POLICY
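For reference, a minimal sketch of the edit in test_policy.py, assuming the wrapped environment is reached via env.env as the note above suggests; the values should match those of your training run:
env.env.final_tv = 3.5     # the --HW_final_tv value used during training (running example above)
env.env.tv_endtime = 2.0   # the --HW_tv_endtime value used during training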