Collaborative Learning for Weakly Supervised Object Detection

If you use this code in your research, please cite

@inproceedings{ijcai2018-135,
  title     = {Collaborative Learning for Weakly Supervised Object Detection},
  author    = {Jiajie Wang and Jiangchao Yao and Ya Zhang and Rui Zhang},
  booktitle = {Proceedings of the Twenty-Seventh International Joint Conference on
               Artificial Intelligence, {IJCAI-18}},
  publisher = {International Joint Conferences on Artificial Intelligence Organization},             
  pages     = {971--977},
  year      = {2018},
  month     = {7},
  doi       = {10.24963/ijcai.2018/135},
  url       = {https://doi.org/10.24963/ijcai.2018/135},
}

Prerequisites

A basic pytorch installation. The version is 0.2. If you are using the old version 0.1.12, you can checkout 0.1.12 branch.
Python packages you might not have: cffi, opencv-python, easydict (similar to py-faster-rcnn). For easydict make sure you have the right version. Xinlei uses 1.6.
tensorboard-pytorch to visualize the training and validation curve. Please build from source to use the latest tensorflow-tensorboard.

Installation

Clone the repository

git clone https://github.com/ruotianluo/pytorch-faster-rcnn.git

Choose your -arch option to match your GPU for step 3 and 4.

GPU model	Architecture
TitanX (Maxwell/Pascal)	sm_52
GTX 960M	sm_50
GTX 1080 (Ti)	sm_61
Grid K520 (AWS g2.2xlarge)	sm_30
Tesla K80 (AWS p2.xlarge)	sm_37

Note: You are welcome to contribute the settings on your end if you have made the code work properly on other GPUs.

Build RoiPooling module

cd pytorch-faster-rcnn/lib/layer_utils/roi_pooling/src/cuda
echo "Compiling roi_pooling kernels by nvcc..."
nvcc -c -o roi_pooling_kernel.cu.o roi_pooling_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
cd ../../
python build.py
cd ../../../

Build NMS

cd lib/nms/src/cuda
echo "Compiling nms kernels by nvcc..."
nvcc -c -o nms_kernel.cu.o nms_kernel.cu -x cu -Xcompiler -fPIC -arch=sm_52
cd ../../
python build.py
cd ../../

Setup data

Please follow the instructions of py-faster-rcnn here to setup VOC. The steps involve downloading data and optionally creating soft links in the data folder. Since faster RCNN does not rely on pre-computed proposals, it is safe to ignore the steps that setup proposals.

If you find it useful, the data/cache folder created on Xinlei's side is also shared here.

Train your own model

Download pre-trained models and weights. For the pretrained wsddn model, you can find the download link here. For other pre-trained models like VGG16 and Resnet V1 models, they are provided by pytorch-vgg and pytorch-resnet (the ones with caffe in the name). You can download them in the data/imagenet_weights folder. For example for VGG16 model, you can set up like:

mkdir -p data/imagenet_weights
cd data/imagenet_weights
python # open python in terminal and run the following Python code

import torch
from torch.utils.model_zoo import load_url
from torchvision import models

sd = load_url("https://s3-us-west-2.amazonaws.com/jcjohns-models/vgg16-00b39a1b.pth")
sd['classifier.0.weight'] = sd['classifier.1.weight']
sd['classifier.0.bias'] = sd['classifier.1.bias']
del sd['classifier.1.weight']
del sd['classifier.1.bias']

sd['classifier.3.weight'] = sd['classifier.4.weight']
sd['classifier.3.bias'] = sd['classifier.4.bias']
del sd['classifier.4.weight']
del sd['classifier.4.bias']

torch.save(sd, "vgg16.pth")

cd ../..

For Resnet101, you can set up like:

mkdir -p data/imagenet_weights
cd data/imagenet_weights
# download from my gdrive (link in pytorch-resnet)
mv resnet101-caffe.pth res101.pth
cd ../..

Train (and test, evaluation)

./experiments/scripts/train.sh [GPU_ID] [DATASET] [NET] [WSDDN_PRETRAINED]
# Examples:
./experiments/scripts/train.sh 0 pascal_voc vgg16 path_to_wsddn_pretrained_model

Visualization with Tensorboard

tensorboard --logdir=tensorboard/vgg16/voc_2007_trainval/ --port=7001 &

Test and evaluate

./experiments/scripts/test.sh [GPU_ID] [DATASET] [NET] [WSDDN_PRETRAINED]
# Examples:
./experiments/scripts/test.sh 0 pascal_voc vgg16 path_to_wsddn_pretrained_model

By default, trained networks are saved under:

output/[NET]/[DATASET]/default/

Test outputs are saved under:

output/[NET]/[DATASET]/default/[SNAPSHOT]/

Tensorboard information for train and validation is saved under:

tensorboard/[NET]/[DATASET]/default/
tensorboard/[NET]/[DATASET]/default_val/

Name		Name	Last commit message	Last commit date
Latest commit History 16 Commits
data/scripts		data/scripts
experiments		experiments
lib		lib
tools		tools
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Collaborative Learning for Weakly Supervised Object Detection

Prerequisites

Installation

Setup data

Train your own model

Our results can be found here

About

Releases

Packages

Languages

License

alexshaodong/Collaborative-Learning-for-Weakly-Supervised-Object-Detection

Folders and files

Latest commit

History

Repository files navigation

Collaborative Learning for Weakly Supervised Object Detection

Prerequisites

Installation

Setup data

Train your own model

Our results can be found here

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages