Guanxing Lu *, Tengbo Yu *, Haoyuan Deng, Season Si Chen, Yansong Tang †, Ziwei Wang
[Project Page] | [Paper] | [Real-World Codebase] | [YouTube] | [X]
AnyBimanual is a training framework that transfers any pretrained unimanual robotic manipulation policy to a multi-task bimanual manipulation policy with only a few bimanual demonstrations. We first introduce a skill manager that dynamically schedules the skill representations discovered from the pretrained unimanual policy for bimanual manipulation tasks, linearly combining skill primitives with task-oriented compensation to represent the bimanual manipulation instruction. To mitigate the observation discrepancy between unimanual and bimanual systems, we present a visual aligner that generates soft masks for the visual embedding of the workspace, aligning the visual input of each arm's unimanual policy with what the policy saw during pretraining. AnyBimanual shows superiority on 12 simulated tasks from RLBench2, with a sizable 12.67% improvement in success rate over previous methods. Experiments on 9 real-world tasks further verify its practicality, with an average success rate of 84.62%.
*(Video overview: `method.mp4`)*
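To make the pipeline concrete, here is a minimal illustrative sketch of the two core computations described above: the skill manager's linear combination of skill primitives plus a task-oriented compensation, and the visual aligner's soft masking of visual embeddings. All tensor names, shapes, and the choice of PyTorch are our assumptions for illustration, not the actual implementation:

```python
import torch

# Assumed shapes (illustrative only):
#   primitives: (K, D)    K skill primitives of dim D, discovered from the unimanual policy
#   weights:    (2, K)    per-arm combination weights predicted by the skill manager
#   comp:       (2, D)    task-oriented compensation terms
#   vis_embed:  (2, N, D) per-arm visual embeddings of the workspace
K, D, N = 8, 128, 64
primitives = torch.randn(K, D)
weights = torch.softmax(torch.randn(2, K), dim=-1)
comp = torch.randn(2, D)
vis_embed = torch.randn(2, N, D)

# Skill manager: linearly combine the primitives, then add the compensation (one row per arm).
skill_repr = weights @ primitives + comp             # (2, D)

# Visual aligner: soft masks re-weight each arm's visual embedding so that it
# resembles the single-arm observations seen during pretraining.
soft_mask = torch.sigmoid(torch.randn(2, N, 1))      # values in (0, 1)
aligned_embed = soft_mask * vis_embed                # (2, N, D)
```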
🎉 NEWS:
- Dec. 2024: Codebase for both simulated and real-world experiments is released!
- Coming soon: release of pretrained checkpoints.
NOTE: AnyBimanual is mainly built upon the Perceiver-Actor^2 repo by Markus Grotz et al.
See INSTALL.md for installation instructions.
See ERROR_CATCH.md for troubleshooting common errors.
The following steps are structured in order.
Please check out the website for pre-generated RLBench demonstrations. If you use these datasets directly, you don't need to run tools/bimanual_data_generator.py from RLBench. Using these datasets will also help reproducibility, since each scene is randomly sampled in data_generator_bimanual.py.
We use wandb to log curves and visualizations. Log in to wandb before running the scripts:
wandb login
To train our PerAct + AnyBimanual, run:
bash scripts/train.sh BIMANUAL_PERACT 0,1 12345 ${exp_name}
where `exp_name` can be specified as you like.
To train our PerAct-LF + AnyBimanual, run:
bash scripts/train.sh PERACT_BC 0,1 12345 ${exp_name}
To train our RVT-LF + AnyBimanual, run:
bash scripts/train.sh RVT 0,1 12345 ${exp_name}
Set `augmentation_type` in `scripts/train.sh` to choose between the augmentation methods proposed in our paper and the original SE(3) augmentation.
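As context for the SE(3) option: SE(3) augmentation perturbs the scene (and the corresponding action poses) with a random rigid transform. Below is a minimal sketch of the idea on a point cloud, assuming a random z-axis rotation plus translation; the actual augmentation code in this repo may differ:

```python
import numpy as np

def random_se3_perturbation(points, trans_range=0.05, rot_range_deg=10.0):
    """Apply one random rigid transform (z-axis rotation + translation)
    to an (N, 3) point cloud. Ranges are illustrative, not the paper's values."""
    theta = np.deg2rad(np.random.uniform(-rot_range_deg, rot_range_deg))
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c,  -s,  0.0],
                  [s,   c,  0.0],
                  [0.0, 0.0, 1.0]])
    t = np.random.uniform(-trans_range, trans_range, size=3)
    return points @ R.T + t
```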
To evaluate the checkpoint in the simulator, you can use:
bash scripts/eval.sh BIMANUAL_PERACT 0 ${exp_name}
You can refer to Demonstrations Collection by teleoperation to set up your device in the real world and collect raw data.
To convert the raw data into RLBench2 format, run:
python3 anybimanual_real_supply/data/preprocess_ntu_dualarm.py
To select keyframes, run:
python3 anybimanual_real_supply/data/auto_keyframe_mani.py
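For reference, PerAct-style pipelines usually mark a frame as a keyframe when the gripper state toggles or the arm comes (nearly) to rest; whether auto_keyframe_mani.py uses exactly this heuristic is an assumption on our part. A minimal sketch:

```python
import numpy as np

def select_keyframes(gripper_open, joint_vel, vel_eps=1e-2):
    """PerAct-style keyframe heuristic (illustrative).
    gripper_open: (T,) bool array; joint_vel: (T, J) joint velocities."""
    keyframes = []
    for t in range(1, len(gripper_open)):
        gripper_changed = gripper_open[t] != gripper_open[t - 1]
        arm_stopped = np.allclose(joint_vel[t], 0.0, atol=vel_eps)
        if gripper_changed or arm_stopped:
            keyframes.append(t)
    return keyframes
```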
To train on the collected real-world data, run:
bash scripts/train_real.sh BIMANUAL_PERACT 0,1 12345 ${exp_name}
Run the model inference script to receive real-world observations and generate actions; here we give an example of the Agent class:
python3 anybimanual_real_supply/eval_agent_on_robot.py
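For orientation, here is a skeletal version of such an inference loop. The names below (Agent, predict, get_observation, send_action) are placeholders for illustration; the real interface is the one defined in anybimanual_real_supply/eval_agent_on_robot.py:

```python
class Agent:
    """Illustrative wrapper: turns real-world observations into bimanual actions."""

    def __init__(self, policy):
        self.policy = policy  # a trained AnyBimanual policy (placeholder)

    def act(self, obs):
        # obs: real-world observation (e.g. RGB-D images, proprioception, language goal)
        # returns one action per arm, e.g. end-effector poses + gripper states
        return self.policy.predict(obs)

def control_loop(agent, robot):
    while True:
        obs = robot.get_observation()                  # placeholder robot API
        left_action, right_action = agent.act(obs)
        robot.send_action(left_action, right_action)   # placeholder robot API
```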
After receiving the action generated by the model, you can refer to Bimanual_ur5e_action_control_for_IL to drive the dual UR5e arms to perform the action.
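As one concrete way to execute this final step, UR5e arms can be driven from Python with the ur_rtde library; whether Bimanual_ur5e_action_control_for_IL uses this stack is an assumption on our part, and the IP addresses below are placeholders:

```python
# Illustrative only: driving two UR5e arms with the ur_rtde library.
from rtde_control import RTDEControlInterface

left = RTDEControlInterface("192.168.1.10")    # placeholder IPs for the two arms
right = RTDEControlInterface("192.168.1.11")

def execute(left_pose, right_pose, speed=0.25, accel=0.5):
    # poses are [x, y, z, rx, ry, rz] in each arm's base frame (axis-angle rotation)
    left.moveL(left_pose, speed, accel)
    right.moveL(right_pose, speed, accel)
```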
TODO: Release the pretrained checkpoints.
This repository is released under the MIT license.
Our code is built upon Perceiver-Actor^2, SkillDiffuser, PerAct, RLBench, and CLIP. We thank all these authors for their nicely open-sourced code and their great contributions to the community.
If you find this repository helpful, please consider citing:
@article{lu2024anybimanual,
title={AnyBimanual: Transferring Unimanual Policy for General Bimanual Manipulation},
author={Lu, Guanxing and Yu, Tengbo and Deng, Haoyuan and Chen, Season Si and Tang, Yansong and Wang, Ziwei},
journal={arXiv preprint arXiv:2412.06779},
year={2024}
}