orhir/PoseAnything


🆕 Please check out EdgeCape, our more recent effort in the same line of work.

A Graph-Based Approach for Category-Agnostic Pose Estimation [ECCV 2024]


By Or Hirschorn and Shai Avidan

This repo is the official implementation of "A Graph-Based Approach for Category-Agnostic Pose Estimation".

🔔 News

  • 11 July 2024 Our paper will be presented at ECCV 2024.
  • 10 July 2024 Uploaded new annotations, fixing a small bug in the DeepFashion skeletons.
  • 2 February 2024 Uploaded new weights - smaller models with stronger performance.
  • 20 December 2023 Demo is online on Huggingface and OpenXLab.
  • 7 December 2023 Official code release.

Introduction

We present a novel approach to CAPE that leverages the inherent geometrical relations between keypoints through a newly designed Graph Transformer Decoder. By capturing and incorporating this crucial structural information, our method enhances the accuracy of keypoint localization, marking a significant departure from conventional CAPE techniques that treat keypoints as isolated entities.
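To make the notion of structural information concrete, the sketch below encodes a skeleton as a row-normalized adjacency matrix, one common way to expose keypoint connectivity to a graph layer. It is an illustration only and not the repository's implementation; the helper name `skeleton_to_adjacency` and the four-keypoint toy skeleton are assumptions made for this example.

```python
from typing import List, Sequence, Tuple


def skeleton_to_adjacency(
    num_keypoints: int, edges: Sequence[Tuple[int, int]]
) -> List[List[float]]:
    """Build a symmetric, row-normalized adjacency matrix with self-loops.

    Hypothetical helper for illustration; not taken from the PoseAnything code.
    """
    # Start from the identity so every keypoint keeps its own features.
    adj = [
        [1.0 if i == j else 0.0 for j in range(num_keypoints)]
        for i in range(num_keypoints)
    ]
    for a, b in edges:
        # Skeleton links are undirected, so mark both directions.
        adj[a][b] = 1.0
        adj[b][a] = 1.0
    # Normalize each row so the weights over a keypoint's neighbours sum to 1.
    return [[value / sum(row) for value in row] for row in adj]


# Toy skeleton (assumed): a chain 0-1-2 with an extra limb 1-3.
toy_edges = [(0, 1), (1, 2), (1, 3)]
adjacency = skeleton_to_adjacency(4, toy_edges)
```

A layer that mixes keypoint features with such a matrix lets each keypoint draw on its skeleton neighbours, rather than treating keypoints as isolated entities.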

Citation

If you find this useful, please cite this work as follows:

@misc{hirschorn2023pose,
      title={A Graph-Based Approach for Category-Agnostic Pose Estimation}, 
      author={Or Hirschorn and Shai Avidan},
      year={2024},
      eprint={2311.17891},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2311.17891}, 
}

Getting Started

Docker [Recommended]

We provide a Docker image for easy use. You can pull the image, which contains all the required libraries and packages, from Docker Hub:

docker pull orhir/pose_anything
docker run --name pose_anything -v {DATA_DIR}:/workspace/PoseAnything/PoseAnything/data/mp100 -it orhir/pose_anything /bin/bash

Conda Environment

We train and evaluate our model on Python 3.8 and PyTorch 2.0.1 with CUDA 12.1.

Please first install PyTorch and torchvision following the official PyTorch documentation. Then follow the MMPose installation guide to install the following packages:

mmcv-full=1.6.2
mmpose=0.29.0

Having installed these packages, run:

python setup.py develop

Demo on Custom Images

TRY IT NOW ON: HuggingFace / OpenXLab

We provide demo code for testing our model on custom images.

Gradio Demo

First, install gradio:

pip install gradio==3.44.0

Then download the pretrained model and run:

python app.py --checkpoint [path_to_pretrained_ckpt]

Terminal Demo

Download the pretrained model and run:

python demo.py --support [path_to_support_image] --query [path_to_query_image] --config configs/demo_b.py --checkpoint [path_to_pretrained_ckpt]

Note: The demo code supports any config with a suitable checkpoint file. More pretrained models can be found in the Evaluation and Pretrained Models section.
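When running the terminal demo over many image pairs, it can help to assemble the command programmatically. The sketch below only builds the argument list from the flags shown above; the function name `build_demo_command` and the placeholder file names are assumptions, and nothing is executed.

```python
import shlex
from typing import List


def build_demo_command(
    support: str,
    query: str,
    checkpoint: str,
    config: str = "configs/demo_b.py",
) -> List[str]:
    """Assemble the argv for demo.py using the flags listed in this README."""
    return [
        "python", "demo.py",
        "--support", support,
        "--query", query,
        "--config", config,
        "--checkpoint", checkpoint,
    ]


# Placeholder paths (assumed); replace them with real files before running.
cmd = build_demo_command("support.jpg", "query.jpg", "checkpoint.pth")
print(shlex.join(cmd))
# To execute from the repository root: subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell-quoting problems when paths contain spaces.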

Updated MP-100 Dataset

Please follow the official guide to prepare the MP-100 dataset for training and evaluation, and organize the data structure properly.

We provide an updated annotation file, which includes skeleton definitions, in the following link.

Please note:

The current version of the MP-100 dataset includes some discrepancies and filename errors:

  1. The dataset referred to as DeepFashion is actually the DeepFashion2 dataset. The link in the official repo is wrong; use this repo instead.
  2. We provide a script to fix CarFusion filename errors, which can be run by:
python tools/fix_carfusion.py [path_to_CarFusion_dataset] [path_to_mp100_annotation]

Training

Backbone Options

To use the pre-trained Swin-Transformer backbone from our paper, we provide the weights, taken from this repo, in the following link. These should be placed in the ./pretrained folder.

We also support DINO and ResNet backbones. To use one of them, set the pretrained field in the config file to dinov2, dino or resnet (this will automatically load the pretrained weights from the official repo).
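As a rough sketch of the change described above, a config excerpt might look like the following. Only the `pretrained` field and its three values come from this README; the surrounding `model` dict and the `SUPPORTED_BACKBONES` tuple are assumptions, so check the actual config files for the real structure.

```python
# Hypothetical excerpt of a config file. Only the `pretrained` field and its
# three values come from the README; the surrounding `model` dict is assumed.
SUPPORTED_BACKBONES = ("dinov2", "dino", "resnet")

model = dict(
    pretrained="dinov2",  # switch to "dino" or "resnet" to change the backbone
)

# Guard against typos before launching a long training run.
assert model["pretrained"] in SUPPORTED_BACKBONES
```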

Training

To train the model, run:

python train.py --config [path_to_config_file]  --work-dir [path_to_work_dir]

Evaluation and Pretrained Models

You can download the pretrained checkpoints from the following link.

Here we provide the evaluation results of our pretrained models on the MP-100 dataset, along with the config files and checkpoints:

1-Shot Models

Setting         split 1         split 2         split 3         split 4         split 5
Tiny            91.19           87.81           85.68           85.87           85.61
                link / config   link / config   link / config   link / config   link / config
Small           94.73           89.79           90.69           88.09           90.11
                link / config   link / config   link / config   link / config   link / config

5-Shot Models

Setting         split 1         split 2         split 3         split 4         split 5
Tiny            94.24           91.32           90.15           90.37           89.73
                link / config   link / config   link / config   link / config   link / config
Small           96.67           91.48           92.62           90.95           92.41
                link / config   link / config   link / config   link / config   link / config
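The tables report one score per split. A single summary number per model can be obtained by averaging over the five splits, as sketched below. The scores are copied from the tables above; the averages are derived here and are not reported in this README.

```python
from statistics import mean

# Scores copied from the 1-shot and 5-shot tables above (splits 1 to 5).
results = {
    "1-shot": {
        "Tiny": [91.19, 87.81, 85.68, 85.87, 85.61],
        "Small": [94.73, 89.79, 90.69, 88.09, 90.11],
    },
    "5-shot": {
        "Tiny": [94.24, 91.32, 90.15, 90.37, 89.73],
        "Small": [96.67, 91.48, 92.62, 90.95, 92.41],
    },
}

# Average each model over the five splits, rounded to two decimals.
averages = {
    setting: {name: round(mean(scores), 2) for name, scores in models.items()}
    for setting, models in results.items()
}

for setting, models in averages.items():
    for name, value in models.items():
        print(f"{setting} {name}: {value:.2f}")
```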

Evaluation

Evaluation on a single GPU takes approximately 30 minutes.

To evaluate the pretrained model, run:

python test.py [path_to_config_file] [path_to_pretrained_ckpt]

Acknowledgement

Our code is based on code from:

License

This project is released under the Apache 2.0 license.
