# Hierarchical Diffusion Policy

This repo contains the PyTorch implementation of the CVPR 2024 paper:

**Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation**

Xiao Ma, Sumit Patidar, Iain Haughton, Stephen James

CVPR 2024, Dyson Robot Learning Lab
HDP factorises a manipulation policy into a hierarchical structure: a high-level task-planning agent that predicts a distant next-best end-effector pose (NBP), and a low-level goal-conditioned diffusion policy that generates optimal motion trajectories. This factorised policy representation allows HDP to tackle long-horizon task planning while still generating fine-grained low-level actions. To generate context-aware motion trajectories that satisfy robot kinematic constraints, we present a novel kinematics-aware goal-conditioned control agent, Robot Kinematics Diffuser (RK-Diffuser). Specifically, RK-Diffuser learns to generate both end-effector pose and joint position trajectories, and distills the accurate but kinematics-unaware end-effector pose diffuser into the kinematics-aware but less accurate joint position diffuser via differentiable kinematics.
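The distillation step is what couples the two trajectory representations. As a rough illustration (a toy sketch, not the repo's code: the 2-DoF planar forward kinematics and the random tensors below are stand-ins for the real robot model and diffuser outputs), gradients from an end-effector matching loss flow through differentiable forward kinematics into the joint trajectory:

```python
# Toy sketch of the distillation idea (not the repo's actual code): train the
# joint-position trajectory so that, after differentiable forward kinematics,
# it matches the end-effector pose diffuser's output.
import torch

def forward_kinematics(joints: torch.Tensor, link_lengths=(1.0, 1.0)) -> torch.Tensor:
    """Differentiable FK for a toy 2-DoF planar arm.

    joints: (..., 2) joint angles -> (..., 2) end-effector xy positions.
    """
    l1, l2 = link_lengths
    q1, q2 = joints[..., 0], joints[..., 1]
    x = l1 * torch.cos(q1) + l2 * torch.cos(q1 + q2)
    y = l1 * torch.sin(q1) + l2 * torch.sin(q1 + q2)
    return torch.stack([x, y], dim=-1)

T = 16                                             # trajectory horizon
joint_traj = torch.randn(T, 2, requires_grad=True) # stand-in for the joint diffuser output
ee_traj_target = torch.randn(T, 2)                 # stand-in for the (detached) EE pose diffuser output

# Distillation loss: push FK(joint trajectory) toward the teacher's EE trajectory.
distill_loss = torch.nn.functional.mse_loss(forward_kinematics(joint_traj), ee_traj_target)
distill_loss.backward()  # gradients flow through FK into the joint trajectory
print(distill_loss.item(), joint_traj.grad.shape)
```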
In this repository, we provide the code for training the low-level RK-Diffuser. We use PerAct as our high-level agent and refer to its official implementation for detailed training configurations. We also include the evaluation code for the full HDP architecture.
For more details, see our project page.
## Installation

```bash
conda create -n hdp python=3.10
conda activate hdp
bash ./extra_scripts/install_coppeliasim.sh
conda install pytorch torchvision pytorch-cuda=11.8 -c pytorch -c nvidia
pip install cffi==1.15
pip install -r requirements.txt
python setup.py develop
```
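After installation, a quick check (generic, not part of the repo) can confirm that a CUDA-enabled PyTorch build was picked up:

```python
# Generic post-install check: verify PyTorch sees the GPU.
import torch

print("torch", torch.__version__, "| CUDA available:", torch.cuda.is_available())
```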
For installing RLBench, please refer to its official guide.
## Dataset Generation

First, we need to generate the training dataset:

```bash
python extra_scripts/dataset_generator.py --save_path=<your dataset path> --tasks=<your task> --variations=1 --processes=1 --episodes_per_task=100
```

For example, to generate a dataset for the `reach_target` task:

```bash
python extra_scripts/dataset_generator.py --save_path=/data/${USER}/rlbench --tasks=reach_target --variations=1 --processes=1 --episodes_per_task=100
```

The script will generate both `train` and `eval` datasets at the same time.
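To sanity-check the output, you can count the generated episodes. A minimal sketch, assuming RLBench-style episode folders (each containing a `low_dim_obs.pkl`) under `<save_path>/train` and `<save_path>/eval`; adjust both assumptions to the actual layout on disk:

```python
# Count generated episodes per split/task. The <save_path>/<split>/<task>
# layout and the low_dim_obs.pkl marker file are assumptions; adjust them
# to whatever the generator actually writes.
from pathlib import Path

save_path = Path("/data/user/rlbench")  # hypothetical dataset root
for split in ("train", "eval"):
    split_dir = save_path / split
    if not split_dir.is_dir():
        continue
    for task_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
        n_episodes = len(list(task_dir.rglob("low_dim_obs.pkl")))
        print(f"{split}/{task_dir.name}: {n_episodes} episodes")
```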
## Training

To train the low-level sim agent, simply run:

```bash
python3 train_low_level.py env=sim env.data_path=<your dataset path> env.tasks="[<task1>, <task2>, ...]"
```

For example, to train a model for the `reach_target` and `take_lid_off_saucepan` tasks:

```bash
python3 train_low_level.py env=sim env.data_path=/data/${USER}/rlbench env.tasks="[reach_target, take_lid_off_saucepan]"
```
You can enable online logging or set the wandb run name by adding the following args:

```bash
log=True run_name=<your run name>
```
## Evaluation

We also provide the full evaluation pipeline for HDP. To run the evaluation:

```bash
python eval.py rlbench.tasks="[<your task>]" rlbench.headless=False method.model_path=<path to rk-diffuser ckpt> framework.logdir=<path to peract ckpt dir>
```
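At evaluation time the two levels alternate: PerAct proposes the next-best pose, and RK-Diffuser generates a trajectory to reach it. Schematically (the names `peract`, `rk_diffuser`, and `env` below are placeholders, not the repo's actual API):

```python
# Schematic HDP rollout (placeholder names, not the repo's actual API):
# the high-level agent proposes a next-best pose (NBP), the low-level
# RK-Diffuser generates a goal-conditioned joint trajectory, and the
# robot executes it step by step.
def rollout(peract, rk_diffuser, env, max_nbp_steps=10):
    obs = env.reset()
    for _ in range(max_nbp_steps):
        nbp = peract.predict_next_best_pose(obs)          # high level: distant NBP
        joint_traj = rk_diffuser.generate(obs, goal=nbp)  # low level: trajectory to NBP
        for joints in joint_traj:
            obs, reward, done = env.step(joints)
            if done:
                return reward
    return 0.0
```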
## Citation

```bibtex
@article{ma2024hierarchical,
  author  = {Ma, Xiao and Patidar, Sumit and Haughton, Iain and James, Stephen},
  title   = {Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation},
  journal = {CVPR},
  year    = {2024},
}
```
## Acknowledgements

This repository is adapted from PerAct and Decision Diffuser.