Trains a family of ResNet-style models (He et al., 2015; Zagoruyko and Komodakis, 2017; Han et al., 2017) for the CIFAR10 classification task (Krizhevsky, 2009).
This example implements several architectures, regularization methods (Gastaldi, 2017; Yamada et al., 2018) and learning rate schedules that can be used in various combinations.
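As an illustration of the regularization side, here is a minimal JAX sketch of the Shake-Shake branch combination (Gastaldi, 2017); the `shake_shake` helper and its per-example coefficient shape are assumptions for exposition, not the code in `train.py`:

```python
import jax
import jax.numpy as jnp

def shake_shake(key, branch1, branch2, train=True):
    # Mixes two residual branches with a random weight alpha in the
    # forward pass and an independent weight beta in the backward pass.
    if not train:
        # At eval time, use the expected mixing coefficient of 0.5.
        return 0.5 * (branch1 + branch2)
    key_a, key_b = jax.random.split(key)
    # One coefficient per example ("image-level" shaking).
    shape = (branch1.shape[0],) + (1,) * (branch1.ndim - 1)
    alpha = jax.random.uniform(key_a, shape)
    beta = jax.random.uniform(key_b, shape)
    mixed_fwd = alpha * branch1 + (1 - alpha) * branch2
    mixed_bwd = beta * branch1 + (1 - beta) * branch2
    # stop_gradient trick: the value equals the alpha-mix, while
    # gradients flow through the beta-mix.
    return mixed_bwd + jax.lax.stop_gradient(mixed_fwd - mixed_bwd)
```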
- TensorFlow dataset: `cifar10`
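The dataset is pulled in via TensorFlow Datasets and downloads automatically on first use; for example, independent of `train.py`:

```python
import tensorflow_datasets as tfds

# Downloads CIFAR-10 on first use and returns tf.data.Dataset objects.
train_ds, test_ds = tfds.load('cifar10', split=['train', 'test'])
```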
The model should run with other configurations and hardware, but it has only been explicitly tested on the following.
WRN26-10 (`--arch=wrn26_10`):

| Hardware | Epochs | Learning rate schedule | Training time | Error rate | TensorBoard.dev |
| --- | --- | --- | --- | --- | --- |
| 1 x Nvidia V100 (16GB) | 200 | Piecewise constant | 4h 36m | 4.45% | 2020-03-22 |
| 8 x Nvidia V100 (16GB) | 200 | Piecewise constant | 57m | 3.93% | 2020-03-22 |
WRN26-6 with Shake-Shake (`--arch=wrn26_6_ss`):

| Hardware | Epochs | Learning rate schedule | Training time | Error rate | TensorBoard.dev |
| --- | --- | --- | --- | --- | --- |
| 1 x Nvidia V100 (16GB) | 200 | Piecewise constant | 3h 38m | 3.43% | 2020-03-22 |
| 8 x Nvidia V100 (16GB) | 200 | Piecewise constant | 54m | 3.39% | 2020-03-26 |
| 1 x Nvidia V100 (16GB) | 1800 | Cosine | 1d 9h 25m | 2.97% | 2020-03-22 |
| 8 x Nvidia V100 (16GB) | 1800 | Cosine | 8h 5m | 2.82% | 2020-03-26 |
PyramidNet (`--arch=pyramid`):

| Hardware | Epochs | Learning rate schedule | Training time | Error rate | TensorBoard.dev |
| --- | --- | --- | --- | --- | --- |
| 8 x Nvidia V100 (16GB) | 300 | Piecewise constant | 6h 41m | 3.25% | 2020-03-24 |
| 8 x Nvidia V100 (16GB) | 1800 | Cosine | 1d 16h 27m | 2.75% | 2020-03-24 |
All models were trained with a global batch size of 256.
Train WRN26-10:

```shell
python train.py --arch=wrn26_10 --model_dir=./cifar10_wrn26_10_bs=256_lr=0.1
```
Train WRN26-6 with Shake-Shake:

```shell
python train.py --arch=wrn26_6_ss --model_dir=./cifar10_wrn26_6_ss_bs=256_lr=0.1
```

or, with a cosine schedule over 1800 epochs:

```shell
python train.py --arch=wrn26_6_ss --lr_schedule=cosine --num_epochs=1800 --model_dir=./cifar10_wrn26_6_ss_bs=256_lr=cosine_epochs=1800
```
Train PyramidNet:

```shell
python train.py --arch=pyramid --lr_sched_steps="[[150,0.1],[225,0.01]]" --num_epochs=300 --l2_reg=0.0001 --model_dir=./cifar10_pyramid_bs=256_lr=0.1_l2=0.0001_epoch=300
```

or, with a cosine schedule over 1800 epochs:

```shell
python train.py --arch=pyramid --lr_schedule=cosine --num_epochs=1800 --l2_reg=0.0001 --model_dir=./cifar10_pyramid_bs=256_lr=cosine_l2=0.0001_epochs=1800
```
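The two schedule types can be summarized with a small Python sketch, assuming each `lr_sched_steps` pair is `[epoch, factor]` and scales the base learning rate from that epoch on (the helper names are illustrative, not `train.py`'s API):

```python
import math

def piecewise_constant_lr(base_lr, sched_steps, epoch):
    # E.g. sched_steps=[[150, 0.1], [225, 0.01]] with base_lr=0.1:
    # 0.1 until epoch 150, then 0.01, then 0.001 from epoch 225 on.
    lr = base_lr
    for boundary, factor in sched_steps:
        if epoch >= boundary:
            lr = base_lr * factor
    return lr

def cosine_lr(base_lr, num_epochs, epoch):
    # Anneals smoothly from base_lr down to 0 over the full run.
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * epoch / num_epochs))
```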
Note that L2 regularization is applied to both model kernels and biases, rather than to kernels only.
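In JAX this amounts to summing squared values over every parameter leaf rather than filtering for kernels; a minimal sketch (the `l2_penalty` helper is illustrative, not `train.py`'s API):

```python
import jax
import jax.numpy as jnp

def l2_penalty(params, l2_reg):
    # Penalizes *all* parameter leaves, so biases are regularized
    # alongside kernels; kernel-only variants filter leaves by name.
    return l2_reg * sum(
        jnp.sum(jnp.square(p)) for p in jax.tree_util.tree_leaves(params))
```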
This example consulted the following open-source repositories for implementation details and hyper-parameters: