Image classification Networks Updating #979

Merged: 53 commits, Jun 15, 2018. Changes shown below are from 37 of the 53 commits.

Commits:
- 2d09862 small fix. (May 31, 2018)
- a6a3896 small fix. (BigFishMaster, May 31, 2018)
- 6385127 add models folder (BigFishMaster, May 31, 2018)
- f3bcc11 update readme (BigFishMaster, May 31, 2018)
- a394d72 Update README.md (BigFishMaster, May 31, 2018)
- 4693b4d Update README.md (BigFishMaster, May 31, 2018)
- eb89a6e Update README.md (BigFishMaster, May 31, 2018)
- 82bea3e Update README.md (BigFishMaster, May 31, 2018)
- 2dc67ba Update README.md (BigFishMaster, Jun 1, 2018)
- 7f61a69 Update README.md (BigFishMaster, Jun 1, 2018)
- 0f534bf Update README.md (BigFishMaster, Jun 1, 2018)
- 9dfcd60 Update README.md (BigFishMaster, Jun 1, 2018)
- 1b301e1 Update README.md (BigFishMaster, Jun 1, 2018)
- 4814877 Update README.md (BigFishMaster, Jun 1, 2018)
- b1c78e3 Update README.md (BigFishMaster, Jun 1, 2018)
- 604b72c Update README.md (BigFishMaster, Jun 1, 2018)
- f8131a9 add dpn (BigFishMaster, Jun 11, 2018)
- f176173 Merge branch 'develop' of https://github.com/PaddlePaddle/models into… (BigFishMaster, Jun 11, 2018)
- 7a6ad78 update train.py (BigFishMaster, Jun 11, 2018)
- 177919f update train.py (BigFishMaster, Jun 12, 2018)
- 49af8fd update train and eval (BigFishMaster, Jun 12, 2018)
- dfcf841 Update README.md (BigFishMaster, Jun 12, 2018)
- 90712cb Update README.md (BigFishMaster, Jun 12, 2018)
- 4f82ea4 format (BigFishMaster, Jun 13, 2018)
- d21d3c3 Merge branch 'image_classification' of https://github.com/BigFishMast… (BigFishMaster, Jun 13, 2018)
- b5feaeb Update README.md (BigFishMaster, Jun 13, 2018)
- fc71934 Update README.md (BigFishMaster, Jun 13, 2018)
- 339af6f update (BigFishMaster, Jun 13, 2018)
- 01a7b89 Merge branch 'image_classification' of https://github.com/BigFishMast… (BigFishMaster, Jun 13, 2018)
- 6908903 Update README.md (BigFishMaster, Jun 13, 2018)
- 1835b1f update shell (BigFishMaster, Jun 13, 2018)
- 299fd41 Merge branch 'image_classification' of https://github.com/BigFishMast… (BigFishMaster, Jun 13, 2018)
- 609b87f Update README.md (BigFishMaster, Jun 13, 2018)
- 96661c5 add yapf disable to args (BigFishMaster, Jun 13, 2018)
- 020f94b Create README_cn.md (BigFishMaster, Jun 13, 2018)
- 6bd9871 Update README_cn.md (BigFishMaster, Jun 13, 2018)
- a744622 update googlenet (BigFishMaster, Jun 14, 2018)
- bb11061 add images folder (BigFishMaster, Jun 14, 2018)
- 5d5a1df Update README.md (BigFishMaster, Jun 14, 2018)
- df07ee4 Update README.md (BigFishMaster, Jun 14, 2018)
- 0e44ef4 add curve.jpg (BigFishMaster, Jun 14, 2018)
- de50be2 Merge branch 'image_classification' of https://github.com/BigFishMast… (BigFishMaster, Jun 14, 2018)
- 4b611d0 Update README.md (BigFishMaster, Jun 14, 2018)
- b9c62e2 update download_imagenet2012.sh (BigFishMaster, Jun 14, 2018)
- 8db82cb Update README.md (BigFishMaster, Jun 14, 2018)
- 1f6005f Merge branch 'image_classification' of https://github.com/BigFishMast… (BigFishMaster, Jun 14, 2018)
- a6e9f0f Update README.md (BigFishMaster, Jun 14, 2018)
- 8113cbb Update README.md (BigFishMaster, Jun 14, 2018)
- 94aaea6 Update README.md (BigFishMaster, Jun 14, 2018)
- e8d2db6 Update README.md (BigFishMaster, Jun 14, 2018)
- 3803d34 Update README_cn.md (BigFishMaster, Jun 14, 2018)
- 2c59d9e Update README.md (BigFishMaster, Jun 14, 2018)
- 20080a7 Update README_cn.md (BigFishMaster, Jun 14, 2018)
164 changes: 117 additions & 47 deletions fluid/image_classification/README.md
@@ -1,38 +1,35 @@
The minimum PaddlePaddle version needed for the code sample in this directory is the latest develop branch. If you are on a version of PaddlePaddle earlier than this, [please update your installation](http://www.paddlepaddle.org/docs/develop/documentation/en/build_and_install/pip_install_en.html).
# Image Classification and Model Zoo
Image classification, an important field of computer vision, aims to assign an image to one of a set of pre-defined labels. Recently, many researchers have developed different kinds of neural networks and greatly improved classification performance. This page introduces how to do image classification with PaddlePaddle, including [data preparation](#data-preparation), [training](#training-a-model), [finetuning](#finetuning), [evaluation](#evaluation) and [inference](#inference).

---
## Table of Contents
- [Installation](#installation)
- [Data preparation](#data-preparation)
- [Training a model with flexible parameters](#training-a-model)
- [Finetuning](#finetuning)
- [Evaluation](#evaluation)
- [Inference](#inference)
- [Supported models and performances](#supported-models)

# SE-ResNeXt for image classification
## Installation

This model built with paddle fluid is still under active development and is not
the final version. We welcome feedback.
Running the sample code in this directory requires PaddlePaddle v0.10.0 or later. If the PaddlePaddle version on your device is lower than this, please follow the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) to update it.
[Review comment, Collaborator]: PaddelPaddle v0.10.0 -> PaddelPaddle Fluid v0.13.0


## Introduction
## Data preparation

The current code supports the training of [SE-ResNeXt](https://arxiv.org/abs/1709.01507) (50/152 layers).

## Data Preparation

1. Download ImageNet-2012 dataset
An example for ImageNet classification is as follows. First, the ImageNet-2012 data can be prepared as:
```
cd data/
mkdir -p ILSVRC2012/
cd ILSVRC2012/
# get training set
wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_train.tar
# get validation set
wget http://www.image-net.org/challenges/LSVRC/2012/nnoupb/ILSVRC2012_img_val.tar
# prepare directory
tar xf ILSVRC2012_img_train.tar
tar xf ILSVRC2012_img_val.tar

# unzip all classes data using unzip.sh
sh unzip.sh
cd data/ILSVRC2012/
sh download_imagenet2012.sh
```

2. Download the training and validation label files from [ImageNet2012 url](https://pan.baidu.com/s/1Y6BCo0nmxsm_FsEqmx2hKQ) (password: ```wx99```). Untar it into the workspace ```ILSVRC2012/```. The files include
In the shell script ```download_imagenet2012.sh```, there are two steps to prepare data:

**step-1:** Download the ImageNet-2012 dataset from the official website. The training and validation data will be downloaded into the "train" and "val" folders respectively.

**train_list.txt**: training list of the ImageNet-2012 classification task, with each line separated by SPACE.
**step-2:** Download the training and validation label files. There are two label files which contain the training and validation image labels respectively (a short parsing sketch follows the examples below):

* *train_list.txt*: label file of the ImageNet-2012 training set, with each line separated by ```SPACE```, for example:
```
train/n02483708/n02483708_2436.jpeg 369
train/n03998194/n03998194_7015.jpeg 741
Expand All @@ -41,7 +38,7 @@ train/n04596742/n04596742_3032.jpeg 909
train/n03208938/n03208938_7065.jpeg 535
...
```
**val_list.txt**: validation list of the ImageNet-2012 classification task, with each line separated by SPACE.
* *val_list.txt*: label file of the ImageNet-2012 validation set, with each line separated by ```SPACE```, for example:
```
val/ILSVRC2012_val_00000001.jpeg 65
val/ILSVRC2012_val_00000002.jpeg 970
Expand All @@ -50,38 +47,111 @@ val/ILSVRC2012_val_00000004.jpeg 809
val/ILSVRC2012_val_00000005.jpeg 516
...
```
**synset_words.txt**: the semantic label of each class.
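
For clarity, each list file is plain text with one ```SPACE```-separated "path label" pair per line. A minimal parsing sketch in Python follows; the helper name ```read_label_list``` is made up for this illustration and is not part of the repository:
```
def read_label_list(list_path):
    # Parse a SPACE-separated label list such as train_list.txt or val_list.txt.
    # Each line looks like: "train/n02483708/n02483708_2436.jpeg 369".
    # Returns a list of (image_path, label) tuples.
    samples = []
    with open(list_path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            path, label = line.split()
            samples.append((path, int(label)))
    return samples

# Example: count images per split (paths are relative to data/ILSVRC2012/).
train_samples = read_label_list("train_list.txt")
val_samples = read_label_list("val_list.txt")
print(len(train_samples), len(val_samples))
```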

## Training a model
## Training a model with flexible parameters

To start a training task, one can use command line as:
After data preparation, one can start the training step by:

```
python train.py --num_layers=50 --batch_size=8 --with_mem_opt=True --parallel_exe=False
python train.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--total_images=1281167 \
       --class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=False \
--lr_strategy=piecewise_decay \
--lr=0.1
```
## Finetune a model
**parameter introduction:**
* **model**: name of the model to use. Default: "SE_ResNeXt50_32x4d".
* **num_epochs**: the number of epochs. Default: 120.
* **batch_size**: the size of each mini-batch. Default: 256.
* **use_gpu**: whether to use GPU or not. Default: True.
* **total_images**: total number of images in the training set. Default: 1281167.
* **class_dim**: the class number of the classification task. Default: 1000.
* **image_shape**: input size of the network. Default: "3,224,224".
* **model_save_dir**: the directory to save trained model. Default: "output".
* **with_mem_opt**: whether to use memory optimization or not. Default: False.
* **lr_strategy**: learning rate decay strategy (a short sketch of ```piecewise_decay``` follows this list). Default: "piecewise_decay".
* **lr**: initial learning rate. Default: 0.1.
* **pretrained_model**: path of the pretrained model to load before training. Default: None.
* **checkpoint**: the checkpoint path to resume. Default: None.
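
To make the learning-rate settings above concrete, ```piecewise_decay``` keeps the learning rate constant for a fixed number of epochs and multiplies it by a constant factor at each boundary. The sketch below only illustrates the schedule implied by the defaults (start at ```lr=0.1```, decay by 0.1 every 30 epochs); it is not the actual implementation in ```train.py```:
```
def piecewise_decay_lr(epoch, base_lr=0.1, decay_rate=0.1, step_epochs=30):
    # Learning rate at a given epoch under piecewise decay:
    # base_lr is multiplied by decay_rate every step_epochs epochs,
    # e.g. 0.1 -> 0.01 -> 0.001 for step_epochs=30.
    num_decays = epoch // step_epochs
    return base_lr * (decay_rate ** num_decays)

# Example: learning rates at a few epochs with the default settings.
for e in [0, 29, 30, 60, 90]:
    print(e, piecewise_decay_lr(e))
```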

**data reader introduction:**

The data reader is defined in ```reader.py```. In the [training stage](#training-a-model), random crop and flipping are used, while center crop is used in the [evaluation](#evaluation) and [inference](#inference) stages; a minimal sketch of both pipelines follows the list below. Supported data augmentation includes:
* rotation
* color jitter
* random crop
* center crop
* resize
* flipping
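
The sketch below illustrates the two pipelines just described (random crop plus horizontal flip for training, center crop for evaluation and inference) using Pillow and NumPy. It is an illustration only, not the actual ```reader.py``` implementation, and the resize-shorter-side-to-256 step is an assumption of this sketch:
```
import random

import numpy as np
from PIL import Image


def _resize_short(img, target=256):
    # Resize so the shorter side equals `target`, keeping the aspect ratio.
    w, h = img.size
    scale = float(target) / min(w, h)
    return img.resize((int(round(w * scale)), int(round(h * scale))), Image.BILINEAR)


def train_transform(img_path, crop_size=224):
    # Training: resize, random crop, random horizontal flip.
    img = _resize_short(Image.open(img_path).convert("RGB"))
    w, h = img.size
    x = random.randint(0, w - crop_size)
    y = random.randint(0, h - crop_size)
    img = img.crop((x, y, x + crop_size, y + crop_size))
    if random.random() < 0.5:
        img = img.transpose(Image.FLIP_LEFT_RIGHT)
    # HWC uint8 -> CHW float32, matching image_shape "3,224,224".
    return np.asarray(img, dtype="float32").transpose((2, 0, 1))


def eval_transform(img_path, crop_size=224):
    # Evaluation / inference: resize, then center crop.
    img = _resize_short(Image.open(img_path).convert("RGB"))
    w, h = img.size
    x, y = (w - crop_size) // 2, (h - crop_size) // 2
    img = img.crop((x, y, x + crop_size, y + crop_size))
    return np.asarray(img, dtype="float32").transpose((2, 0, 1))
```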

## Finetuning

Finetuning is to finetune model weights for a specific task by loading pretrained weights. After setting ```path_to_pretrain_model```, one can finetune a model as:

[Review comment, Collaborator]: Need to give the training results here.

```
python train.py --num_layers=50 --batch_size=8 --with_mem_opt=True --parallel_exe=False --pretrained_model="pretrain/96/"
python train.py \
--model=SE_ResNeXt50_32x4d \
--pretrained_model=${path_to_pretrain_model} \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=True \
--lr_strategy=piecewise_decay \
--lr=0.1
```
TBD
## Inference

## Evaluation
Evaluation measures the performance of a trained model. One can get top-1/top-5 accuracy by running the following command; a short sketch of the accuracy computation follows the command:
```
python infer.py --num_layers=50 --batch_size=8 --model='model/90' --test_list=''
python eval.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
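
For clarity, top-1/top-5 accuracy checks whether the true label is the highest-scored class, or among the five highest-scored classes. The NumPy sketch below illustrates the computation on a batch of scores; it is not the code used by ```eval.py```:
```
import numpy as np

def topk_accuracy(scores, labels, k=1):
    # scores: (batch, num_classes) array; labels: (batch,) integer array.
    topk = np.argsort(-scores, axis=1)[:, :k]       # indices of the k best classes
    hits = np.any(topk == labels[:, None], axis=1)  # true where the label is among them
    return float(np.mean(hits))

# Tiny example with 3 samples and 4 classes.
scores = np.array([[0.1, 0.7, 0.1, 0.1],
                   [0.3, 0.2, 0.4, 0.1],
                   [0.25, 0.25, 0.3, 0.2]])
labels = np.array([1, 0, 3])
print(topk_accuracy(scores, labels, k=1))  # 1 of 3 correct
print(topk_accuracy(scores, labels, k=2))  # 2 of 3 correct
```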
TBD

## Results

The SE-ResNeXt-50 model is trained by starting with learning rate ```0.1``` and decaying it by ```0.1``` after each ```10``` epochs. Top-1/Top-5 validation accuracy on ImageNet 2012 is listed in the table below.

|model | [original paper(Fig.5)](https://arxiv.org/abs/1709.01507) | Pytorch | Paddle fluid
|- | :-: |:-: | -:
|SE-ResNeXt-50 | 77.6%/- | 77.71%/93.63% | 77.42%/93.50%
## Inference
Inference is used to get prediction scores or image features from a trained model; a short sketch of turning scores into labels follows the command.
```
python infer.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
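
Once per-class scores are obtained, turning them into readable predictions is a matter of sorting. The sketch below is an illustration only: it assumes a ```synset_words.txt``` file with one class description per line (as mentioned in the data preparation section), and the helper name ```top5_predictions``` is made up for this example:
```
import numpy as np

def top5_predictions(scores, synset_path="synset_words.txt"):
    # Return the five highest-scored (class_index, description) pairs.
    with open(synset_path) as f:
        class_names = [line.strip() for line in f]
    top5 = np.argsort(-scores)[:5]
    return [(int(i), class_names[int(i)]) for i in top5]

# Example with random scores for a 1000-class model (class_dim=1000).
scores = np.random.rand(1000)
for idx, name in top5_predictions(scores):
    print(idx, name)
```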

## Supported models and performances

Models are trained by starting with learning rate ```0.1``` and decaying it by ```0.1``` after every ```30``` epochs, unless otherwise noted. Available top-1/top-5 validation accuracy on ImageNet 2012 is listed in the table below. Pretrained models can be downloaded by clicking the related model names.
[Review comment, Contributor]: URLs are all pointing to the image_classification directory. Clicking them doesn't download the models.

[Reply, Contributor Author]: I removed the URLs for the networks without trained models and added correct URLs for the networks with trained models.

[Reply, Contributor]: ok


## Released models
|model | Baidu Cloud
|model | top-1/top-5 accuracy
|- | -:
|SE-ResNeXt-50 | [url]()
TBD
|[AlexNet](http://paddle-imagenet-models.bj.bcebos.com/alexnet_model.tar) | 57.21%/79.72%
|VGG11 | -
|VGG13 | -
|VGG16 | -
|VGG19 | -
|GoogleNet | -
|InceptionV4 | -
|MobileNet | -
|[ResNet50](http://paddle-imagenet-models.bj.bcebos.com/resnet_50_model.tar) | 76.63%/93.10%
|ResNet101 | -
|ResNet152 | -
|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.33%/93.96%
|SE_ResNeXt101_32x4d | -
|SE_ResNeXt152_32x4d | -
|DPN68 | -
|DPN92 | -
|DPN98 | -
|DPN107 | -
|DPN131 | -
156 changes: 156 additions & 0 deletions fluid/image_classification/README_cn.md
@@ -0,0 +1,156 @@

# Image Classification and Model Zoo
Image classification is an important field of computer vision; its goal is to assign an image to one of a set of pre-defined labels. Recently, many researchers have proposed various kinds of neural networks and greatly improved classification performance. This page introduces how to do image classification with PaddlePaddle, including [data preparation](#data-preparation), [training](#training-a-model), [finetuning](#finetuning), [evaluation](#evaluation) and [inference](#inference).

---
## Table of Contents
- [Installation](#installation)
- [Data preparation](#data-preparation)
- [Training a model](#training-a-model)
- [Finetuning](#finetuning)
- [Evaluation](#evaluation)
- [Inference](#inference)
- [Supported models and performances](#supported-models)

## Installation

Running the sample code in this directory requires PaddlePaddle v0.10.0 or later. If the PaddlePaddle version in your environment is lower than this, please follow the instructions in the [installation document](http://www.paddlepaddle.org/docs/develop/documentation/zh/build_and_install/pip_install_cn.html) to update PaddlePaddle.
[Review comment, Collaborator]: PaddlePaddle v0.10.0 -> PaddlePaddle Fluid v0.13.0


## Data preparation

An example for the ImageNet classification task is given below. First, the data can be prepared as follows:
```
cd data/ILSVRC2012/
sh download_imagenet2012.sh
```
In the shell script ```download_imagenet2012.sh```, the data is prepared in two steps:

**Step 1:** Download the ImageNet-2012 image data from the official ImageNet website. The training and validation data will be downloaded into the "train" and "val" folders respectively.

**Step 2:** Download the label files for the training and validation sets. The following two files contain the labels of the training and validation images respectively:
[Review comment, Collaborator]: Tell users that they need to register on the ImageNet website and obtain a download key. Also remind them that downloading ImageNet takes a long time, and that if they already have the data they can organize it as described below. Update the English README accordingly.


* *train_list.txt*: label file of the ImageNet-2012 training set; each line contains an image path and a label separated by a ```SPACE```, for example:
```
train/n02483708/n02483708_2436.jpeg 369
train/n03998194/n03998194_7015.jpeg 741
train/n04523525/n04523525_38118.jpeg 884
train/n04596742/n04596742_3032.jpeg 909
train/n03208938/n03208938_7065.jpeg 535
...
```
* *val_list.txt*: label file of the ImageNet-2012 validation set; each line contains an image path and a label separated by a ```SPACE```, for example:
```
val/ILSVRC2012_val_00000001.jpeg 65
val/ILSVRC2012_val_00000002.jpeg 970
val/ILSVRC2012_val_00000003.jpeg 230
val/ILSVRC2012_val_00000004.jpeg 809
val/ILSVRC2012_val_00000005.jpeg 516
...
```

## Training a model

After data preparation, training can be started as follows:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--total_images=1281167 \
       --class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=False \
--lr_strategy=piecewise_decay \
--lr=0.1
```
**Parameter introduction:**
* **model**: name of the model to use. Default: "SE_ResNeXt50_32x4d".
* **num_epochs**: the number of epochs. Default: 120.
* **batch_size**: the size of each mini-batch. Default: 256.
* **use_gpu**: whether to use GPU or not. Default: True.
* **total_images**: total number of images in the training set. Default: 1281167.
* **class_dim**: the class number of the classification task. Default: 1000.
* **image_shape**: input size of the network. Default: "3,224,224".
* **model_save_dir**: the directory to save trained model. Default: "output".
* **with_mem_opt**: whether to use memory optimization or not. Default: False.
* **lr_strategy**: learning rate decay strategy. Default: "piecewise_decay".
* **lr**: initial learning rate. Default: 0.1.
* **pretrained_model**: path of the pretrained model to load before training. Default: None.
* **checkpoint**: the checkpoint path to resume. Default: None.

**Data reader introduction:**

The data reader is defined in ```reader.py```. In the [training stage](#training-a-model), the default augmentation is random crop and horizontal flip, while center crop is used by default in the [evaluation](#evaluation) and [inference](#inference) stages. Supported data augmentation includes:
* rotation
* color jitter
* random crop
* center crop
* resize
* horizontal flip

## Finetuning

Finetuning is to finetune model weights for a specific task by loading pretrained weights. After setting ```path_to_pretrain_model```, one can finetune a model as follows:
```
python train.py \
--model=SE_ResNeXt50_32x4d \
--pretrained_model=${path_to_pretrain_model} \
--batch_size=32 \
--total_images=1281167 \
--class_dim=1000 \
--image_shape=3,224,224 \
--model_save_dir=output/ \
--with_mem_opt=True \
--lr_strategy=piecewise_decay \
--lr=0.1
```

## Evaluation
Evaluation measures the performance of a trained model. Top-1/top-5 accuracy can be obtained by running the following command:
```
python eval.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
[Review comment, Collaborator]: Please give an example here: download a trained model, run the evaluation, and show what the output looks like.


## Inference
Inference is used to get prediction scores or image features from a trained model:
```
python infer.py \
--model=SE_ResNeXt50_32x4d \
--batch_size=32 \
--class_dim=1000 \
--image_shape=3,224,224 \
--with_mem_opt=True \
--pretrained_model=${path_to_pretrain_model}
```
[Review comment, Collaborator]: Please give an example of what the prediction output looks like.


## Supported models and performances

The table below lists the networks supported in the "models" directory, together with the top-1/top-5 accuracy of the trained models on the ImageNet-2012 validation set. Pretrained models can be downloaded by clicking the corresponding model names.

|model | top-1/top-5 accuracy
|- | -:
|[AlexNet](http://paddle-imagenet-models.bj.bcebos.com/alexnet_model.tar) | 57.21%/79.72%
|VGG11 | -
|VGG13 | -
|VGG16 | -
|VGG19 | -
|GoogleNet | -
|InceptionV4 | -
|MobileNet | -
|[ResNet50](http://paddle-imagenet-models.bj.bcebos.com/resnet_50_model.tar) | 76.63%/93.10%
|ResNet101 | -
|ResNet152 | -
|[SE_ResNeXt50_32x4d](http://paddle-imagenet-models.bj.bcebos.com/se_resnext_50_model.tar) | 78.33%/93.96%
|SE_ResNeXt101_32x4d | -
|SE_ResNeXt152_32x4d | -
|DPN68 | -
|DPN92 | -
|DPN98 | -
|DPN107 | -
|DPN131 | -