Improved training speed of GHNs, extra results for CIFAR-10 #2
Training times
The implementation of some steps in the GHN decoder is improved to speed up training without altering the overall behavior of GHNs. The improvements mainly affect speed when a meta-batch size (bm) > 1 is used (see the tables below).
Speed is measured on an NVIDIA Quadro RTX 6000 in seconds per training iteration (averaged over the first 100 iterations).
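For readers who want to reproduce this kind of measurement, below is a minimal sketch of averaging wall-clock time over the first 100 iterations. It is not the code used in this PR: `seconds_per_iteration`, `step_fn` and `sync_fn` are hypothetical names introduced only for illustration, and `step_fn` is assumed to run exactly one training iteration.

```python
import time


def seconds_per_iteration(step_fn, n_iters=100, sync_fn=None):
    """Average wall-clock seconds per call of step_fn over n_iters calls.

    step_fn : hypothetical callable that runs one training iteration.
    sync_fn : optional callable that waits for pending device work, so that
              asynchronous GPU kernels are included in the measured time.
    """
    if sync_fn is not None:
        sync_fn()  # make sure earlier work does not leak into the timing
    start = time.perf_counter()
    for _ in range(n_iters):
        step_fn()
    if sync_fn is not None:
        sync_fn()  # wait for the last iteration to actually finish
    return (time.perf_counter() - start) / n_iters
```

With PyTorch on a GPU, one would typically pass `torch.cuda.synchronize` as `sync_fn`; without it, the timer may stop before the queued kernels have completed.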
CIFAR-10
ImageNet
*The total training time is estimated assuming 300 epochs for bm=1 and 150 epochs for bm=8 (as in the paper).
When 4 GPUs and bm=8 are used, the speedup is not significant because each GPU receives only two architectures.
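The total-time estimate in the footnote can be sketched as seconds per iteration × iterations per epoch × number of epochs. The helper below is an illustration under stated assumptions, not the script used for the tables: `estimated_training_hours` is a hypothetical name, and the number of iterations per epoch is assumed to be the training-set size divided by the image batch size, rounded up (the batch size is not stated in this description).

```python
import math


def estimated_training_hours(sec_per_iter, num_train_samples, batch_size, epochs):
    """Rough total training time in hours.

    Assumes every epoch runs ceil(num_train_samples / batch_size) iterations
    and that sec_per_iter stays constant over the whole run.
    """
    iters_per_epoch = math.ceil(num_train_samples / batch_size)
    return sec_per_iter * iters_per_epoch * epochs / 3600.0
```

For example, the bm=1 and bm=8 settings would be compared by calling the function with `epochs=300` and `epochs=150` respectively, together with the measured seconds per iteration for each setting.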
Evaluation of GHNs
To verify that this PR reproduces the evaluation results reported in the paper (classification accuracies of predicted parameters), the GHNs were evaluated on selected architectures, and the same results were obtained (see the table below).
Extra results on CIFAR-10
The image below is obtained using this notebook.
Other minor updates
A --amp flag was added that can be used to decrease GPU memory consumption and, in some cases, improve speed (this flag was used to measure speed on ImageNet with 4 GPUs).
md5sum values of the DeepNets-1M files were added to make it easier to verify the dataset.
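As a sketch of how a downloaded file could be checked against a published md5sum value, the snippet below uses only Python's standard `hashlib`. The function names are hypothetical, and the file path and expected checksum must be supplied by the user; none are taken from this PR.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_md5(path, expected_hex):
    """True if the file's MD5 digest equals the expected value (case-insensitive)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

On the command line, running `md5sum` on the file and comparing the output by eye gives the same check.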