Zhi Wang, Li Zhang, Wenhao Wu, Yuanheng Zhu, Dongbin Zhao, Chunlin Chen*
A link to our paper can be found on arXiv
Official codebase for Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement
Experiments require MuJoCo and D4RL. Follow the installation instructions in the [MuJoCo] and [D4RL] repositories to install them.
Create a virtual environment using conda, and see the requirements.txt file for more information about how to install the dependencies.
conda create -n meta_dt python=3.8.18 -y
conda activate meta_dt
pip install -r requirements.txt
Note that we set done = False in all environments, so done = False must also be set manually for the walker and hopper environments in the rand_param_envs package.
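As an illustration of what forcing done = False means, here is a minimal sketch of a wrapper that discards the termination flag. It is not the patch applied inside rand_param_envs. It assumes a classic Gym-style step() that returns a 4-tuple, and both NoTerminationWrapper and the toy environment are hypothetical names used only for this example.

```python
class NoTerminationWrapper:
    """Wrap an environment so that step() always reports done = False.

    Assumption: the wrapped env follows the classic Gym API, where
    step(action) returns (obs, reward, done, info).
    """

    def __init__(self, env):
        self.env = env

    def reset(self, *args, **kwargs):
        return self.env.reset(*args, **kwargs)

    def step(self, action):
        obs, reward, _done, info = self.env.step(action)
        # Ignore the environment's own termination signal.
        return obs, reward, False, info


class ToyEnv:
    """Stand-in environment that terminates on every step (illustration only)."""

    def reset(self):
        return 0.0

    def step(self, action):
        return 1.0, 0.5, True, {}


env = NoTerminationWrapper(ToyEnv())
env.reset()
obs, reward, done, info = env.step(0)
print(done)
```

Editing the package directly, as described above, changes the behavior for every script that imports it. A wrapper only affects the places where it is applied.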
We also share our datasets below.
We use SAC to train agents on different environments and collect datasets.
Train agents on different tasks in AntDir:
python train_data_collection.py --env_type ant_dir --save_freq 4000 --task_id_start 0 --task_id_end 5
Here, task_id_start and task_id_end select the training tasks in the half-open range [task_id_start, task_id_end), so the command above trains tasks 0 through 4.
We use checkpoints from the training process to generate datasets.
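The half-open task range behaves like Python's range(). The sketch below lists the task ids one invocation covers and splits a larger set of tasks into consecutive chunks. The helper names and the total of 12 tasks are assumptions chosen for illustration; only the flags shown in the command above are used.

```python
def task_ids(task_id_start, task_id_end):
    """Task ids covered by the half-open interval [task_id_start, task_id_end)."""
    return list(range(task_id_start, task_id_end))


def chunk_ranges(num_tasks, chunk_size):
    """Split tasks 0..num_tasks-1 into consecutive (start, end) pairs."""
    return [
        (start, min(start + chunk_size, num_tasks))
        for start in range(0, num_tasks, chunk_size)
    ]


def build_command(start, end, env_type="ant_dir", save_freq=4000):
    """Format one training command using only the flags from the README."""
    return (
        f"python train_data_collection.py --env_type {env_type} "
        f"--save_freq {save_freq} --task_id_start {start} --task_id_end {end}"
    )


print(task_ids(0, 5))
# num_tasks=12 is a made-up value for this example.
for start, end in chunk_ranges(num_tasks=12, chunk_size=5):
    print(build_command(start, end))
```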
For medium and expert datasets, use:
python get_datassets.py --env_type ant_dir --data_type medium --task_id_start 0 --task_id_end 5 --capacity 20000
After obtaining the datasets for all tasks, manually merge all task_info_{task_id}.json files into a single file named task_info.json.
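The merge step could be scripted along the following lines. This is a sketch under one assumption: each task_info_{task_id}.json holds a JSON object whose top-level keys do not collide across tasks, so the files can be combined with a plain dictionary update. Check the real file layout before relying on it. The function name merge_task_info and the data_dir argument are hypothetical.

```python
import glob
import json
import os


def merge_task_info(data_dir, output_name="task_info.json"):
    """Merge every task_info_*.json in data_dir into a single JSON file.

    Assumption: each input file is a JSON object with distinct top-level keys.
    """
    merged = {}
    # The pattern requires the trailing underscore, so task_info.json
    # itself is never picked up as an input.
    pattern = os.path.join(data_dir, "task_info_*.json")
    for path in sorted(glob.glob(pattern)):
        with open(path) as f:
            merged.update(json.load(f))

    with open(os.path.join(data_dir, output_name), "w") as f:
        json.dump(merged, f, indent=2)
    return merged
```

Calling merge_task_info on the directory that contains the per-task files writes task_info.json next to them and returns the merged dictionary.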
For medium-expert datasets, we use a mix of 70% medium and 30% expert datasets.
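One way to build such a 70/30 mixture is sketched below. The README does not say whether mixing happens per trajectory or per transition, nor how samples are drawn, so this example assumes each dataset is a list of items and samples without replacement. The function name and its arguments are hypothetical.

```python
import random


def mix_medium_expert(medium, expert, medium_frac=0.7, size=None, seed=0):
    """Return a shuffled list drawn medium_frac from medium and the rest from expert.

    Assumption: both datasets are lists of items (e.g. trajectories).
    """
    rng = random.Random(seed)
    if size is None:
        # Default to a size both datasets can supply without replacement.
        size = min(len(medium), len(expert))
    n_medium = int(round(size * medium_frac))
    n_expert = size - n_medium
    mixed = rng.sample(medium, n_medium) + rng.sample(expert, n_expert)
    rng.shuffle(mixed)
    return mixed


# Toy data: tagged tuples stand in for real trajectories.
medium = [("medium", i) for i in range(100)]
expert = [("expert", i) for i in range(100)]
mixed = mix_medium_expert(medium, expert, size=100)
print(sum(1 for tag, _ in mixed if tag == "medium"), "medium items")
print(sum(1 for tag, _ in mixed if tag == "expert"), "expert items")
```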
- We share our datasets via this datasets
- We share our pretrained world model via this world_model
Train the context encoder using the world model
python train_context.py --env_name AntDir-v0
Train the Meta Decision Transformer for few_shot Meta-DT
python train_meta_dt.py --env_name AntDir-v0 --zero_shot False --data_quality medium
Train the Meta Decision Transformer for zero_shot Meta-DT
python train_meta_dt.py --env_name AntDir-v0 --zero_shot True --data_quality medium
Please cite our paper as:
@inproceedings{
wang2024metadt,
title={Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement},
author={Zhi Wang and Li Zhang and Wenhao Wu and Yuanheng Zhu and Dongbin Zhao and Chunlin Chen},
booktitle={Advances in Neural Information Processing Systems},
year={2024},
}