Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement

Zhi Wang, Li Zhang, Wenhao Wu, Yuanheng Zhu, Dongbin Zhao, Chunlin Chen*

A link to our paper can be found on arXiv

Overview

Official codebase for Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement

Installation

Experiments require MuJoCo and D4RL. Follow the installation instructions for [MuJoCo] and [D4RL]. Then create a virtual environment with conda and install the dependencies listed in requirements.txt:

conda create -n meta_dt python=3.8.18 -y
conda activate meta_dt
pip install -r requirements.txt

Data Collection

Note that we set done = False in all environments. For the walker and hopper environments, this must be set manually in the rand_param_envs package. We also share our datasets below.

Train SAC

We use SAC to train agents on different environments and collect datasets.
Train agents on different tasks in AntDir:

python train_data_collection.py --env_type ant_dir --save_freq 4000 --task_id_start 0 --task_id_end 5

Here, task_id_start and task_id_end specify the half-open range [task_id_start, task_id_end) of task IDs to train on.
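
As a small illustration of the interval notation, the following Python sketch lists the task IDs covered by the example command (--task_id_start 0 --task_id_end 5). It only demonstrates the half-open range convention and is not taken from the repository's code.

```python
# Half-open interval [task_id_start, task_id_end): the end ID is excluded.
task_id_start, task_id_end = 0, 5
task_ids = list(range(task_id_start, task_id_end))
print(task_ids)  # [0, 1, 2, 3, 4]
```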

Generate Datasets

We use checkpoints from the training process to generate datasets. For medium and expert datasets, use:

python get_datasets.py --env_type ant_dir --data_type medium --task_id_start 0 --task_id_end 5 --capacity 20000

After obtaining datasets of all tasks, we should manually merge all task_info_{task_id}.json files into one file named task_info.json.
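
The manual merge could be scripted along the following lines. This is a minimal sketch, not part of the repository: the helper name merge_task_info and its arguments are hypothetical, and it assumes each task_info_{task_id}.json file holds a JSON object whose keys are unique across tasks. If the files use a different layout, the merge logic needs to be adapted.

```python
import json
from pathlib import Path


def merge_task_info(data_dir, task_ids, out_name="task_info.json"):
    """Merge task_info_{task_id}.json files into one task_info.json.

    Assumption (not confirmed by the repository): every per-task file is a
    JSON object (dict) and keys do not repeat across tasks.
    """
    data_dir = Path(data_dir)
    merged = {}
    for task_id in task_ids:
        path = data_dir / f"task_info_{task_id}.json"
        with path.open() as f:
            info = json.load(f)
        # Refuse to silently overwrite entries from an earlier task file.
        overlap = merged.keys() & info.keys()
        if overlap:
            raise ValueError(f"duplicate keys in {path.name}: {sorted(overlap)}")
        merged.update(info)
    # Write the combined object next to the per-task files.
    with (data_dir / out_name).open("w") as f:
        json.dump(merged, f, indent=2)
    return merged
```

For the example above, this would be called as merge_task_info(path_to_datasets, range(0, 5)), where path_to_datasets is the directory containing the per-task files.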

For medium-expert datasets, we use a mix of 70% medium and 30% expert datasets.
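
One way to build such a mix is sketched below. The function mix_medium_expert is hypothetical and assumes the medium and expert data are available as Python lists of items (for example transitions or trajectories). The repository's actual storage format, and whether mixing happens per transition or per trajectory, may differ.

```python
import random


def mix_medium_expert(medium, expert, total, medium_frac=0.7, seed=0):
    """Sample `total` items: medium_frac from medium, the rest from expert.

    Assumption: `medium` and `expert` are lists; sampling is done without
    replacement, so each source must hold enough items.
    """
    n_medium = round(total * medium_frac)
    n_expert = total - n_medium
    if n_medium > len(medium) or n_expert > len(expert):
        raise ValueError("not enough data to sample without replacement")
    rng = random.Random(seed)  # fixed seed for a reproducible mix
    mixed = rng.sample(medium, n_medium) + rng.sample(expert, n_expert)
    rng.shuffle(mixed)  # interleave medium and expert items
    return mixed
```

With total=20000 and the default medium_frac=0.7, this draws 14000 items from the medium data and 6000 from the expert data.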

Download Datasets and Pretrained World Model

We share our datasets via this datasets
We share our pretrained world model via this world_model

Run Experiments

Train the context encoder using the world model

python train_context.py --env_name AntDir-v0

Train the Meta Decision Transformer for few_shot Meta-DT

python train_meta_dt.py --env_name AntDir-v0 --zero_shot False --data_quality medium

Train the Meta Decision Transformer for zero_shot Meta-DT

python train_meta_dt.py --env_name AntDir-v0 --zero_shot True --data_quality medium

Citation

Please cite our paper as:

@inproceedings{
wang2024metadt,
title={Meta-DT: Offline Meta-RL as Conditional Sequence Modeling with World Model Disentanglement},
author={Zhi Wang and Li Zhang and Wenhao Wu and Yuanheng Zhu and Dongbin Zhao and Chunlin Chen},
booktitle={Advances in Neural Information Processing Systems},
year={2024},
}
