This is the official code release for our paper "". [Paper] [Project Page]
Poseiden (Pose In Dynamic Environment) is a stereo-based 3D human pose estimation model that provides absolute-scale 3D human poses from stereo image pairs. To overcome the limitations of dynamic environments such as underwater scenes, where 3D ground truths are extremely challenging to acquire, the model only requires 2D ground truths for training.
This repository is the implementation of the stereo-based 3D human pose estimation model proposed in the paper. For more information about the auto-refinement pipeline proposed in the paper, please refer to DiverPose-AutoRefinement (Coming Soon). Note that the model in this repository has been re-trained; its performance is close to the model reported in the paper but does not match it exactly.
- Build the Docker image:
docker build -t diverpose docker/
All code in this repository should run inside this Docker environment.
- Run the Docker container:
  - Change the $WORKSPACE_DIR variable in run_container.sh to the path where you store this repository before running the following command.
bash run_container.sh
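For reference, a minimal run_container.sh might look like the sketch below; the mount point, image name, and GPU flag are assumptions, and the actual script in this repository may differ.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of run_container.sh (the real script may differ).
WORKSPACE_DIR=/path/to/DiverPose   # <-- edit this to where you cloned the repository

docker run -it --rm --gpus all \
    -v "${WORKSPACE_DIR}":/workspace \
    -w /workspace \
    diverpose bash
```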
- Poseiden requires a pretraining process to enhance the feature representations in its transformer layers.
- Download the 2017 train and val images and annotations from COCO Keypoints Dataset.
- Move the data folder into data/ and structure it as follows:
data/
└── coco/
    ├── annotations/
    ├── train2017/
    └── val2017/
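If you want to double-check the layout, a small sanity check like the following (not part of the repository) can confirm that the expected folders are in place:

```python
# Hypothetical sanity check for the expected COCO folder layout (not part of this repository).
from pathlib import Path

root = Path("data/coco")
for sub in ("annotations", "train2017", "val2017"):
    if not (root / sub).is_dir():
        raise FileNotFoundError(f"Missing expected folder: {root / sub}")
print("COCO folder layout looks correct.")
```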
- Download MADS_depth and MADS_multiview from MADS: Martial Arts, Dancing, and Sports Dataset
- Run extract_mads_data.py to extract images from the videos:
python extract_mads_data.py \
--depth_data_path <PATH_TO_MADS_depth> \
--multiview_data_path <PATH_TO_MADS_multiview> \
--output_path data/MADS_extract \
--rectify
- Note: the root value in conf/dataset/mads.yaml should be the same as the directory set for --output_path (data/MADS_extract by default); see the YAML sketch below.
- (Optional) Visualize the data to check that it is loaded correctly:
python helpers/display_data_3d.py --config-name train_stereo dataset=mads
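For the note above, a hypothetical excerpt of conf/dataset/mads.yaml could look like the following; the actual file may contain additional settings beyond root.

```yaml
# Hypothetical excerpt of conf/dataset/mads.yaml: keep `root` in sync with --output_path.
root: data/MADS_extract
```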
- One of the key contributions of this paper is automatically refining human annotations from stereo keypoints. Please refer to DiverPose-AutoRefinement for more details and guidelines for downloading the data.
- Once the images and annotations are extracted, put the data folder under data/.
- (Optional) Visualize the data to check that it is loaded correctly:
python helpers/display_data_3d.py --config-name train_stereo dataset=diver
We use the Hydra library to manage configurations. For more information, please refer to the Hydra documentation.
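As a rough illustration of how the overrides in the commands below are resolved (this is not the repository's train.py; the config path and name are assumptions based on the commands in this README):

```python
# Minimal Hydra sketch: command-line overrides such as `dataset=mads model.dmin=5`
# replace the corresponding fields of the composed configuration.
import hydra
from omegaconf import DictConfig, OmegaConf

@hydra.main(config_path="conf", config_name="train_stereo", version_base=None)
def main(cfg: DictConfig) -> None:
    print(OmegaConf.to_yaml(cfg))  # inspect the fully composed config

if __name__ == "__main__":
    main()
```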
python train.py --config-name train_mono name=<CUSTOM_NAME_FOR_MODEL> dataset=coco
python train.py \
--config-name train_stereo \
name=<CUSTOM_NAME_FOR_MODEL> \
model.backbone=gelanS \
dataset=mads \
model.pretrained=<PATH_TO_PRETRAIN_MODEL_WEIGHT> \
model.dmin=5 \
model.dmax=30
python train.py \
--config-name train_stereo \
name=<CUSTOM_NAME_FOR_MODEL> \
model.backbone=gelanS \
dataset=diver \
model.pretrained=<PATH_TO_PRETRAIN_MODEL_WEIGHT> \
model.dmin=2 \
model.dmax=15
- Note that you can also set model.pretrained="" to avoid loading weights from a pretrained model.
Make sure the model configuration (dmin, dmax, backbone, etc.) used for testing is the same as the one used for training.
python test.py \
--config-name test_stereo \
dataset=mads \
model.backbone=gelanS \
model.dmin=5 \
model.dmax=30 \
model_weight=<PATH_TO_MODEL_WEIGHT> \
visualize=False
- Set visualize=True to visualize the estimations and the ground truths.
- Note: a model weight is also provided here for demo purposes.
- Due to the difficulty of collecting 3D ground truths underwater, we collect pseudo ground-truth data for validation instead. Please refer to the paper for more details.
- Download test data (coming soon)
- Run:
python test_diver.py \
--config-name test_diver \
model.backbone=gelanS \
model.dmin=2 \
model.dmax=15 \
data_path=<PATH_TO_TEST_DATA> \
model_weight=<PATH_TO_MODEL_WEIGHT> \
yolo_weight=<PATH_TO_ONNX_MODEL>
- Note: a model weight is also provided here for demo purposes.
- The YOLOv7 ONNX model is used here to locate the diver and crop the corresponding region from the full image. Please refer to DiverPose-AutoRefinement for more details, or download the model weight here for convenience.
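For intuition, the sketch below shows one way to run a YOLOv7 ONNX detector with onnxruntime and crop the detected region; the input size, output layout, and file names are assumptions, and the actual pre/post-processing in test_diver.py may differ.

```python
# Hypothetical sketch of detecting the diver with a YOLOv7 ONNX model and cropping the box.
# The output layout (one detection per row as [x1, y1, x2, y2, score, ...]) is an assumption.
import cv2
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("yolov7_diver.onnx", providers=["CPUExecutionProvider"])

image = cv2.imread("left_frame.png")                      # one stereo frame (BGR)
h, w = image.shape[:2]
inp = cv2.resize(image, (640, 640)).astype(np.float32) / 255.0
inp = inp.transpose(2, 0, 1)[None]                        # HWC -> NCHW, add batch dimension

outputs = session.run(None, {session.get_inputs()[0].name: inp})
x1, y1, x2, y2 = outputs[0][0][:4]                        # first detection, box in 640x640 space

# Scale the box back to the original resolution and crop the diver region for Poseiden.
sx, sy = w / 640.0, h / 640.0
crop = image[int(y1 * sy):int(y2 * sy), int(x1 * sx):int(x2 * sx)]
```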
Several functions in this repository are adapted and modified from TransPose, YOLOv7, and mmpose.
If you use this code or the DiverPose dataset for your research, please cite: