EfficientNet implementation #796

Closed
leonid-pishchulin opened this issue Jun 5, 2019 · 10 comments

@leonid-pishchulin

leonid-pishchulin commented Jun 5, 2019

Hey, are there any EfficientNet (https://arxiv.org/abs/1905.11946) implementations available? If not, is somebody working on it?

@KellenSunderland

Just for some background: we're mostly asking so that no one duplicates effort.

@zhreshold
Member

Here's one candidate: @sufeidechabei
To summarize, writing the model definitions should be easy (see https://github.com/mnikitin/EfficientNet/blob/master/efficientnet_model.py), but the training part is still unpredictable because the paper does not give the training hyperparameters.

Should we split the risk by training the network simultaneously?

@bermanmaxim

bermanmaxim commented Jun 10, 2019

Note that https://github.com/tensorflow/tpu/blob/master/models/official/efficientnet/main.py has the training script that seems to have been used and provides hyperparameters.

@bermanmaxim

Also, there might be some differences from the paper; the author says "source code is correct". See tensorflow/tpu#383 and tensorflow/tpu#390.

@ryanjay0

ryanjay0 commented Jun 11, 2019

They never said "source code is correct" about tpu issue tensorflow/tpu#390. Did they? Seems like a much larger discrepancy than the padding issues in tensorflow/tpu#383

@bermanmaxim

bermanmaxim commented Jun 11, 2019

True, what I meant is that since the author said the source code is correct on tensorflow/tpu#383, I was assuming the code is what they actually used, including with respect to tensorflow/tpu#390. But you are right that this resolution discrepancy is a big difference 🤔

@bermanmaxim

Regarding the training of EfficientNet, see the remarks of Ross Wightman here: https://forums.fast.ai/t/efficientnet/46978/67. It might be that keeping an exponential moving average of the weights during training, and using the averaged weights at test time, helps this family of models a lot.
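
For anyone unfamiliar with the technique, here is a minimal sketch of weight averaging in plain Python. It is only an illustration: the class name `WeightEMA`, the decay value, and the toy update loop are assumptions made for this example and are not taken from the EfficientNet code or from the linked thread. A real training script would apply the same update to every parameter tensor after each optimizer step.

```python
class WeightEMA:
    """Keeps an exponential moving average (a "shadow" copy) of named weights.

    Hypothetical helper for illustration; weights are plain floats here.
    """

    def __init__(self, weights, decay=0.999):
        if not 0.0 <= decay < 1.0:
            raise ValueError("decay must be in [0, 1)")
        self.decay = decay
        # Start the shadow copy from the current weights.
        self.shadow = dict(weights)

    def update(self, weights):
        # shadow <- decay * shadow + (1 - decay) * current
        d = self.decay
        for name, value in weights.items():
            self.shadow[name] = d * self.shadow[name] + (1.0 - d) * value


# Toy "training loop": the raw weight jumps each step,
# while the averaged copy follows it smoothly.
weights = {"w": 0.0}
ema = WeightEMA(weights, decay=0.9)  # decay chosen only for the example
for step in range(1, 4):
    weights["w"] = float(step)  # stand-in for an optimizer step
    ema.update(weights)

# At evaluation time, the averaged weights would be used instead of the raw ones.
eval_weights = ema.shadow
```

The raw weights keep being trained as usual; only evaluation reads from `ema.shadow`.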

@bermanmaxim

Similar discussion over here: pytorch/vision#980

@hetong007
Member

@sufeidechabei please check your PR against the resources above.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
