
Poseiden

This is the official code release for our paper "". [Paper] [Project Page]

[Demo: demo_mads]

Poseiden (Pose In Dynamic Environment) is a stereo-based 3D human pose estimation model that provides absolute-scale 3D human poses from stereo image pairs. To overcome the limitations of dynamic environments such as underwater, where 3D ground truths are extremely challenging to acquire, the model requires only 2D ground truths for training.

This repository is the implementation of the stereo-based 3D human pose estimation model proposed in the paper. For more information about the auto-refinement pipeline proposed in the paper, please refer to DiverPose-AutoRefinement (Coming Soon). Note that the model in this repository has been re-trained; its performance is close to, but does not exactly match, the model reported in the paper.
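
The absolute scale is recoverable because a calibrated, rectified stereo pair gives metric depth for any keypoint matched across the two views via standard triangulation. The sketch below only illustrates that relation; the function name, parameters, and example numbers are illustrative and are not taken from this codebase.

    # Illustrative only: standard rectified-stereo triangulation, not this repo's API.
    def keypoint_depth(u_left: float, u_right: float,
                       focal_px: float, baseline_m: float) -> float:
        """Metric depth of a keypoint matched across a rectified stereo pair."""
        disparity = u_left - u_right              # horizontal pixel offset between views
        if disparity <= 0:
            raise ValueError("matched keypoint must have positive disparity")
        return focal_px * baseline_m / disparity  # Z = f * B / d

    # Example: f = 800 px, baseline = 0.12 m, disparity = 16 px  ->  Z = 6.0 m.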

Table of Contents

  • Environment Setup
  • Datasets
  • Train
  • Test
  • Acknowledgments
  • Citation

Environment Setup

  1. Build Docker Image

    docker build -t diverpose docker/
    

    All code in this repository should run inside this Docker environment.

  2. Run the Docker container:

    • Change the $WORKSPACE_DIR variable in run_container.sh to the path where this repository is stored before running the following command.
    bash run_container.sh
    

Datasets

COCO (For Model Pretraining)

  • Poseiden requires a pretraining stage to enhance the feature representations in its transformer layers.
  1. Download the 2017 train and val images and annotations from COCO Keypoints Dataset.
  2. Move the data folder into data/ and structure as follows:
    data/
    └── coco/
        ├── annotations/
        ├── train2017/
        └── val2017/
    

MADS

  1. Download MADS_depth and MADS_multiview from MADS: Martial Arts, Dancing, and Sports Dataset
  2. Run extract_mads_data.py to extract images from videos
    python extract_mads_data.py \
        --depth_data_path <PATH_TO_MADS_depth> \
        --multiview_data_path <PATH_TO_MADS_multiview> \
        --output_path data/MADS_extract \
        --rectify
    • Note: the root value in conf/dataset/mads.yaml should match the directory set for --output_path (data/MADS_extract by default), i.e. root: data/MADS_extract
  3. (Optional) Visualize the data to check that it loads correctly:
    python helpers/display_data_3d.py --config-name train_stereo dataset=mads
    

DiverPose

  • One of the key contributions of this paper is automatically refining human annotations from stereo keypoints. Please refer to DiverPose-AutoRefinement for more details and download guidelines.
  • Once the images and annotations are extracted, put the data folder under data/.
  • (Optional) Visualize the data to check that it loads correctly:
    python helpers/display_data_3d.py --config-name train_stereo dataset=diver
    

Train

We use the Hydra library to manage configurations. For more information, please refer to the Hydra documentation.
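
Each key=value pair in the commands below is a Hydra override: dataset=mads selects conf/dataset/mads.yaml, and dotted keys such as model.dmin=5 override nested fields. The snippet below is only an illustrative Hydra entry point showing how such overrides arrive in the config object; it is not the repository's actual train.py, although config_path="conf" and config_name="train_stereo" follow the paths mentioned in this README.

    # Illustrative Hydra entry point; not the actual train.py from this repository.
    import hydra
    from omegaconf import DictConfig, OmegaConf

    @hydra.main(version_base=None, config_path="conf", config_name="train_stereo")
    def main(cfg: DictConfig) -> None:
        # Overrides such as `dataset=mads model.dmin=5` are already merged into cfg here.
        print(OmegaConf.to_yaml(cfg))

    if __name__ == "__main__":
        main()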

Pretrain

python train.py --config-name train_mono name=<CUSTOM_NAME_FOR_MODEL> dataset=coco

MADS

python train.py \
    --config-name train_stereo \
    name=<CUSTOM_NAME_FOR_MODEL> \
    model.backbone=gelanS \
    dataset=mads \
    model.pretrained=<PATH_TO_PRETRAIN_MODEL_WEIGHT> \
    model.dmin=5 \
    model.dmax=30

DiverPose

python train.py \
    --config-name train_stereo \
    name=<CUSTOM_NAME_FOR_MODEL> \
    model.backbone=gelanS \
    dataset=diver \
    model.pretrained=<PATH_TO_PRETRAIN_MODEL_WEIGHT> \
    model.dmin=2 \
    model.dmax=15
  • Note: you can also set model.pretrained="" to avoid loading pretrained weights.

Test

Make sure the model configuration (dmin, dmax, backbone, etc.) used for testing is the same as the one used for training.

MADS

python test.py \
    --config-name test_stereo \
    dataset=mads \
    model.backbone=gelanS \
    model.dmin=5 \
    model.dmax=30 \
    model_weight=<PATH_TO_MODEL_WEIGHT> \
    visualize=False
  • Set visualize=True to visualize the estimations and ground truths.
  • Note: a model weight is also provided here for demo purposes.

DiverPose

  • Due to the difficulty of collecting 3D ground truths underwater, we collect pseudo ground truth data for validation instead. Please refer to the paper for more details.
  • Download test data (coming soon)
  • Run:
    python test_diver.py \
        --config-name test_diver \
        model.backbone=gelanS \
        model.dmin=2 \
        model.dmax=15 \
        data_path=<PATH_TO_TEST_DATA> \
        model_weight=<PATH_TO_MODEL_WEIGHT> \
        yolo_weight=<PATH_TO_ONNX_MODEL>
    
  • Note: a model weight is also provided here for demo purposes.
  • The YOLOv7 ONNX model is used to locate the diver and crop that region from the full image. Please refer to DiverPose-AutoRefinement for more details, or download the model weight here for convenience.
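
As a rough illustration, the crop step could look like the onnxruntime sketch below. The crop_diver function, the 640x640 input size, the output layout, and the score threshold are assumptions and may not match the released YOLOv7 model or the preprocessing used in this repository.

    # Hedged sketch: cropping a diver region with a YOLOv7 ONNX detector via onnxruntime.
    import cv2
    import numpy as np
    import onnxruntime as ort

    def crop_diver(image_bgr: np.ndarray, onnx_path: str, score_thr: float = 0.5) -> np.ndarray:
        sess = ort.InferenceSession(onnx_path, providers=["CPUExecutionProvider"])
        h, w = image_bgr.shape[:2]

        # Assumed preprocessing: 640x640 RGB, NCHW, values in [0, 1].
        rgb = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2RGB)
        blob = cv2.resize(rgb, (640, 640)).astype(np.float32) / 255.0
        blob = blob.transpose(2, 0, 1)[None]

        # Assumed output layout: one row per detection, [x1, y1, x2, y2, score, class].
        dets = sess.run(None, {sess.get_inputs()[0].name: blob})[0]
        dets = dets.reshape(-1, dets.shape[-1])
        dets = dets[dets[:, 4] > score_thr]
        if len(dets) == 0:
            return image_bgr  # no diver found; fall back to the full frame

        x1, y1, x2, y2 = dets[np.argmax(dets[:, 4]), :4]
        # Map the box from the 640x640 network input back to the original image.
        x1, x2 = int(x1 / 640 * w), int(x2 / 640 * w)
        y1, y2 = int(y1 / 640 * h), int(y2 / 640 * h)
        return image_bgr[max(y1, 0):y2, max(x1, 0):x2]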

Acknowledgments

Several functions in this repository are adapted and modified from TransPose, YOLOv7, and mmpose.

Citation

If you use this code or the DiverPose dataset for your research, please cite:
