- Unofficial reimplementation of MoCo: Momentum Contrast for Unsupervised Visual Representation Learning
- Found many helpful implementation details in moco.tensorflow
- Augmentation code is adapted from SimCLR - A Simple Framework for Contrastive Learning of Visual Representations
- Implemented as much as possible in TensorFlow 2.x
- Used `tf.distribute.MirroredStrategy` with a custom training loop (following the tensorflow-tutorial); see the sketch below
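A minimal sketch of the distributed custom-training-loop pattern from the TensorFlow tutorial, assuming a toy model and loss (the repo's actual encoder is a ResNet-50; names like `train_step` and `GLOBAL_BATCH_SIZE` here are placeholders, not this repo's code):

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
GLOBAL_BATCH_SIZE = 256  # hypothetical: the batch split across all replicas

with strategy.scope():
    # toy stand-in for the real encoder
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10),
    ])
    optimizer = tf.keras.optimizers.SGD(learning_rate=0.03, momentum=0.9)

def train_step(features, labels):
    with tf.GradientTape() as tape:
        logits = model(features, training=True)
        per_example_loss = tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True)
        # average over the GLOBAL batch so the summed per-replica
        # gradients match a single-worker run
        loss = tf.nn.compute_average_loss(
            per_example_loss, global_batch_size=GLOBAL_BATCH_SIZE)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

@tf.function
def distributed_train_step(features, labels):
    per_replica_loss = strategy.run(train_step, args=(features, labels))
    return strategy.reduce(
        tf.distribute.ReduceOp.SUM, per_replica_loss, axis=None)
```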
- Differences from the official implementation
- 8 GPUs (official) vs 4 GPUs (this repo)
- 53 hours vs 147 hours of unsupervised pre-training time; this repo is much slower than the official one
- Batch normalization - tf
- If a batch normalization layer is set as non-trainable, tf normalizes the input with its moving mean & variance even when the layer is called with `training=True` (see the demo below)
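A small standalone demo of this documented Keras behavior (not code from this repo): once `layer.trainable = False`, `BatchNormalization` runs in inference mode and uses the moving statistics even when called with `training=True`:

```python
import numpy as np
import tensorflow as tf

bn = tf.keras.layers.BatchNormalization()
x = tf.constant(np.random.normal(5.0, 2.0, size=(8, 4)), dtype=tf.float32)
bn(x, training=True)  # builds the layer and updates the moving statistics

bn.trainable = False
# A frozen BN layer normalizes with its moving mean & variance
# (inference mode) even though training=True is passed.
frozen_out = bn(x, training=True)
inference_out = bn(x, training=False)
print(np.allclose(frozen_out.numpy(), inference_out.numpy()))  # True
```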
- Lack of documentation on how to properly apply weight regularization in a distributed environment; a sketch of one approach follows below
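One approach, sketched under the assumption that the regularization loss must be scaled per replica (this is the pattern from the TensorFlow distributed-training docs, not necessarily what this repo ended up doing): `tf.nn.scale_regularization_loss` divides by the number of replicas in sync, so the summed cross-replica gradients match a single-worker run.

```python
import tensorflow as tf

strategy = tf.distribute.MirroredStrategy()
GLOBAL_BATCH_SIZE = 256  # hypothetical global batch size

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(
            10, kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    ])

def compute_loss(labels, logits):
    # call model(features) before this so model.losses is populated
    per_example_loss = tf.keras.losses.sparse_categorical_crossentropy(
        labels, logits, from_logits=True)
    # average the data loss over the GLOBAL batch, not the per-replica one
    loss = tf.nn.compute_average_loss(
        per_example_loss, global_batch_size=GLOBAL_BATCH_SIZE)
    # model.losses holds the L2 penalties; scale_regularization_loss
    # divides by the replica count so regularization is not over-counted
    # when per-replica gradients are summed
    loss += tf.nn.scale_regularization_loss(tf.add_n(model.losses))
    return loss
```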
Results (lower than the official results)
- MoCo v1
- Could not reproduce the official accuracy under the linear classification protocol on ImageNet
[Figure: MoCo V1 pre-training curves: InfoNCE loss and (K+1) accuracy]

[Figure: lincls training curves: train accuracy and validation accuracy]
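For reference, the InfoNCE loss tracked above is the (K+1)-way softmax cross-entropy from the MoCo paper, with the positive key at index 0, and the (K+1) accuracy measures how often the positive key outscores all K queue negatives. A minimal sketch (the temperature value is MoCo v1's default; tensor names are illustrative):

```python
import tensorflow as tf

def info_nce(query, pos_key, queue, temperature=0.07):
    """query, pos_key: [N, C] L2-normalized; queue: [K, C] L2-normalized."""
    l_pos = tf.reduce_sum(query * pos_key, axis=1, keepdims=True)  # [N, 1]
    l_neg = tf.matmul(query, queue, transpose_b=True)              # [N, K]
    logits = tf.concat([l_pos, l_neg], axis=1) / temperature       # [N, K+1]
    labels = tf.zeros(tf.shape(query)[0], dtype=tf.int32)  # positive at index 0
    loss = tf.reduce_mean(
        tf.keras.losses.sparse_categorical_crossentropy(
            labels, logits, from_logits=True))
    # (K+1) accuracy: fraction of queries whose positive key ranks first
    accuracy = tf.reduce_mean(
        tf.cast(
            tf.equal(tf.argmax(logits, axis=1, output_type=tf.int32), labels),
            tf.float32))
    return loss, accuracy
```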
- Comparison with the official result

| ResNet-50 | pre-train epochs | pre-train time | MoCo v1 top-1 acc. |
|---|---|---|---|
| Official result | 200 | 53 hours | 60.6 |
| This repo's result | 200 | 147 hours | 50.8 |