Addition of NASNet models #8711
Comments
I think this would be a great addition. If/when that happens, we should remove your implementation in keras-contrib/applications/nasnet.py.
Yes, I think that would be nice to have. I don't think we should include the auxiliary branch though, since it's a regularization mechanism (same spirit as dropout, etc.), and we don't include those in applications.
@fchollet Great, I'll work on porting the implementation without the auxiliary branch. It seems it is only used during training, and I agree that the applications don't have regularization parameters. On that topic, should I include the CIFAR-type model as well then? Since it doesn't have weights, I was thinking of excluding it. @ahundt I feel the contrib version should remain, since it will probably fully support the CIFAR models as well (for small image size problems).
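For context, an auxiliary branch is just a second classifier head trained with a down-weighted loss and dropped at inference time. The sketch below is a generic illustration of that pattern, not the actual NASNet code; the tiny conv stack, the head names, and the 0.4 loss weight are placeholders:

```python
from keras.layers import Conv2D, Dense, GlobalAveragePooling2D, Input
from keras.models import Model

inputs = Input(shape=(224, 224, 3))
x = Conv2D(32, 3, activation='relu')(inputs)   # stands in for the early NASNet cells
aux = Dense(1000, activation='softmax', name='aux_head')(GlobalAveragePooling2D()(x))
x = Conv2D(64, 3, activation='relu')(x)        # stands in for the remaining cells
main = Dense(1000, activation='softmax', name='main_head')(GlobalAveragePooling2D()(x))

# Training-time model: two outputs, with the auxiliary loss down-weighted.
train_model = Model(inputs, [main, aux])
train_model.compile('sgd', loss='categorical_crossentropy', loss_weights=[1.0, 0.4])

# Applications-style inference model: the auxiliary head is simply not exported.
inference_model = Model(inputs, main)
```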
Interesting, so the models in applications aren't really designed around training best practices. Perhaps that information should be in the docs, with a reference to the contrib version for cases where training is needed? In that case I agree with you @titu1994, the contrib version should definitely remain.
@titu1994 Have these models been tested with checkpointing? I get an error raised from:
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/engine/training.py", line 1657, in fit
Turning off the checkpoint callback means no error occurs and execution proceeds (but of course you don't get to save the weights, so...)
See https://stackoverflow.com/questions/44198201/callbackfunction-modelcheckpoint-causes-error-in-keras
This occurs because NASNet is an incredibly large model with a massive number of layers, so much so that it exceeds the HDF5 limit of 64 KB for the layer_names attribute in the HDF5 file. There was a proposal to chunk the HDF5 data when it grows that large, though I don't know if it was merged into master. One possible solution you can try is to save only the weights in the ModelCheckpoint callback. Renaming every single layer in NASNet to short acronyms is another way to save weights right now, though that would have to be done at the Keras master level.
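A minimal sketch of that first workaround; the tiny stand-in model and random data below are placeholders just to keep the snippet self-contained (in practice the model would be NASNet):

```python
import numpy as np
from keras.callbacks import ModelCheckpoint
from keras.layers import Dense
from keras.models import Sequential

# Stand-in model and random data, placeholders for NASNet and a real dataset.
model = Sequential([Dense(10, activation='softmax', input_shape=(32,))])
model.compile('sgd', loss='categorical_crossentropy')

x = np.random.rand(64, 32)
y = np.eye(10)[np.random.randint(0, 10, 64)]

# save_weights_only=True is the suggested workaround: the callback writes only the
# weights instead of serializing the full model each time it fires.
checkpoint = ModelCheckpoint('weights.h5',
                             monitor='val_loss',
                             save_best_only=True,
                             save_weights_only=True)

model.fit(x, y, validation_split=0.25, epochs=2, callbacks=[checkpoint])
```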
Hi @titu1994, I'm just wondering if there was some rationale behind this that I am missing, or is this an error? |
That's because of the
@fchollet If you wouldn't mind, could you patch this into master directly? It's too minor a change to warrant an additional PR.
@fchollet Is there any interest in the addition of NASNet models to Keras Applications?
Based on the paper Learning Transferable Architectures for Scalable Image Recognition and the codebase at the TensorFlow research site, I was able to replicate the NASNet models (with weights) in Keras.
Weights for NASNet Large and NASNet Mobile are provided (with and without auxiliary branches). Since CIFAR weights are not provided at that site, and my laptop is not powerful enough to train NASNet CIFAR, weights for the CIFAR models are not included.
Repository with Code + Weights: https://github.com/titu1994/Keras-NASNet
Edit: Link to the model builder - https://github.com/titu1994/Keras-NASNet/blob/master/nasnet.py
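As a quick reference, here is a hedged sketch of how the builder in the linked repository might be used; the import path and the NASNetMobile / preprocess_input names are assumptions based on the usual keras.applications conventions, not a confirmed API:

```python
import numpy as np
from keras.preprocessing import image

# Assumes nasnet.py from the linked repository is on the Python path and exposes
# these names (an assumption, not verified here).
from nasnet import NASNetMobile, preprocess_input

model = NASNetMobile(input_shape=(224, 224, 3), weights='imagenet')

img = image.load_img('example.jpg', target_size=(224, 224))  # any test image
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
preds = model.predict(x)
print(preds.shape)  # expected: (1, 1000) ImageNet class probabilities
```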