This folder contains the code implementation of the toy experiments in Section 4.1 of our paper.
- Launch a Docker container using the following commands:
```bash
# assume the current directory is the root of this repository
docker run --rm -it --gpus all --ipc=host -v ${PWD}:/app nvcr.io/nvidia/pytorch:20.12-py3
# inside the docker container, run:
cd /app
```
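(Optional) To confirm that the container can see the GPU before proceeding, you can run a quick check. This is a minimal sketch; since the `nvcr.io/nvidia/pytorch` image ships with PyTorch preinstalled, the snippet below should work out of the box:
```python
# Quick GPU sanity check inside the container.
import torch

print("PyTorch version:", torch.__version__)
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```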
- Install conda, create a conda environment named `meow`, and activate it:
```bash
conda create --name meow python=3.8 -y
source activate
conda activate meow
```
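(Optional) To confirm that the `meow` environment is active, you can check which interpreter Python reports. This is a minimal sketch and only verifies the basics:
```python
# Verify that the active interpreter belongs to the `meow` environment.
import sys

print(sys.executable)  # should point into the meow conda environment
print(sys.version)     # should report Python 3.8.x
```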
- Install the dependencies using the following command:
```bash
pip install -r requirements.txt
```
- Train MEow using the following command:
```bash
python train.py config=multigoal/meow.yaml
```
- Visualize MEow's sampling trajectories using the following command:
```bash
python plot_multigoal_value.py config=multigoal/meow.yaml path_load=ckpts/MultiGoal-v0/meow/base/1-seed0/best.pt
```
NOTE: Replace the value of `path_load` with the path to your checkpoint.
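If you are unsure whether a checkpoint file is usable, you can inspect it directly with PyTorch. This is a minimal sketch; the exact contents of the saved object depend on how `train.py` writes checkpoints, so treat the printed keys as informational only:
```python
# Inspect a saved checkpoint (the path matches the example command above; adjust as needed).
import torch

ckpt = torch.load("ckpts/MultiGoal-v0/meow/base/1-seed0/best.pt", map_location="cpu")
if isinstance(ckpt, dict):
    print("Top-level keys:", list(ckpt.keys()))
else:
    print("Loaded object of type:", type(ckpt))
```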
This implementation is based on the following repositories:
- VincentStimper/normalizing-flows (at commit 848277e) is licensed under the MIT License.
- rail-berkeley/softlearning (at commit 13cf187) is licensed under the MIT License.
If you find this repository useful, please consider citing our paper:
```bibtex
@inproceedings{chao2024maximum,
  title={Maximum Entropy Reinforcement Learning via Energy-Based Normalizing Flow},
  author={Chao, Chen-Hao and Feng, Chien and Sun, Wei-Fang and Lee, Cheng-Kuang and See, Simon and Lee, Chun-Yi},
  booktitle={Proceedings of the International Conference on Neural Information Processing Systems (NeurIPS)},
  year={2024}
}
```