Inspired by daisatojp's implementation. The basic framework and the SAC implementation are mostly taken from OpenAI Spinning Up.
References:
MPO: link SAC: link RERPI: link
Python 3.9+ and a working MuJoCo installation are required. Optionally, create a conda environment with:
conda create -n myenv python=3.9
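Before installing, it can help to confirm that the two prerequisites are met. The following is a minimal sketch, not part of this repository: the helper name `check_prerequisites` is hypothetical, and the module names `mujoco` and `mujoco_py` are assumptions about which MuJoCo Python binding might be in use. It only checks that a binding can be located, not that MuJoCo actually runs.

```python
import importlib.util
import sys


def check_prerequisites(min_version=(3, 9), candidates=("mujoco", "mujoco_py")):
    """Report whether the interpreter is new enough and a MuJoCo binding is locatable.

    `candidates` lists assumed module names for MuJoCo bindings; adjust it to
    match the binding your setup uses.
    """
    # Compare only (major, minor) against the required minimum version.
    python_ok = sys.version_info[:2] >= tuple(min_version)
    # find_spec locates a top-level module without importing it.
    found = [name for name in candidates if importlib.util.find_spec(name) is not None]
    return {"python_ok": python_ok, "mujoco_bindings": found}


if __name__ == "__main__":
    report = check_prerequisites()
    print(f"Python >= 3.9: {report['python_ok']}")
    print(f"MuJoCo bindings found: {report['mujoco_bindings'] or 'none'}")
```

Run it inside the environment you plan to install into; an empty list of bindings suggests MuJoCo's Python package is not visible from that environment.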
Standard installation with pip:
git clone https://github.com/freiberg-roman/mpo.git
cd mpo
pip install -e ".[dev]"
Test the installation by running:
python -m mpo.examples.main algorithm=mpo q_learning=retrace overrides=pendulum