
Addition of NASNet models #8711

Closed
titu1994 opened this issue Dec 6, 2017 · 8 comments

@titu1994
Contributor

titu1994 commented Dec 6, 2017

@fchollet Is there any interest in the addition of NASNet models to Keras Applications?

Based on the paper Learning Transferable Architectures for Scalable Image Recognition and the codebase at the TensorFlow research site, I was able to replicate the NASNet models (with weights) in Keras.

Weights for NASNet Large and NASNet Mobile are provided (with and without auxiliary branches). Since CIFAR weights are not available at the site, and my laptop is not powerful enough to train NASNet CIFAR, weights for CIFAR models are not available.

Repository with Code + Weights: https://github.com/titu1994/Keras-NASNet
Edit : Link to the model builder - https://github.com/titu1994/Keras-NASNet/blob/master/nasnet.py

@ahundt
Contributor

ahundt commented Dec 7, 2017

I think this would be a great addition. If/when that happens we should remove your implementation in keras-contrib/applications/nasnet.py.

@fchollet
Collaborator

fchollet commented Dec 7, 2017

Yes, I think that would be nice to have. I don't think we should include the auxiliary branch though, since it's a regularization mechanism (same spirit as dropout, etc), and we don't include those in applications.

@titu1994
Contributor Author

titu1994 commented Dec 7, 2017

@fchollet Great, I'll work on porting the implementation without the auxiliary branch. It seems it is only used during training, and I agree that the applications don't have regularization parameters.

On that topic, should I include the CIFAR type model as well then? Since it doesn't have weights, I was thinking of excluding it.

@ahundt I feel the contrib version should remain, since it will probably fully support the CIFAR models as well (for small image size problems).

@ahundt
Contributor

ahundt commented Dec 7, 2017

Interesting. So the models in applications aren't really designed for best practices w.r.t. training? Perhaps that information should be in the docs, with a reference to the contrib version for cases where training is needed.

In that case I agree with you @titu1994, the contrib version should definitely remain.

@drscotthawley

drscotthawley commented Dec 31, 2017

@titu1994 Have these models been tested with checkpointing?
I tried taking my working code and swapping out MobileNet for NASNetMobile, and everything works fine until I train for an epoch and try to save a checkpoint, at which point I get a bunch of HDF5 errors (@ahundt, I didn't see any mention of the contrib version in the docs):

File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/engine/training.py", line 1657, in fit
validation_steps=validation_steps)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/engine/training.py", line 1233, in _fit_loop
callbacks.on_epoch_end(epoch, epoch_logs)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/callbacks.py", line 73, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/callbacks.py", line 415, in on_epoch_end
self.model.save(filepath, overwrite=True)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/engine/topology.py", line 2565, in save
save_model(self, filepath, overwrite, include_optimizer)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/models.py", line 116, in save_model
topology.save_weights_to_hdf5_group(model_weights_group, model_layers)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/keras/engine/topology.py", line 2875, in save_weights_to_hdf5_group
g.attrs['weight_names'] = weight_names
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490027952255/work/h5py/_objects.c:2846)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490027952255/work/h5py/_objects.c:2804)
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/h5py/_hl/attrs.py", line 93, in __setitem__
self.create(name, data=value, dtype=base.guess_dtype(value))
File "/opt/anaconda/envs/py35/lib/python3.5/site-packages/h5py/_hl/attrs.py", line 188, in create
attr = h5a.create(self._id, self._e(tempname), htype, space)
File "h5py/_objects.pyx", line 54, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490027952255/work/h5py/_objects.c:2846)
File "h5py/_objects.pyx", line 55, in h5py._objects.with_phil.wrapper (/home/ilan/minonda/conda-bld/h5py_1490027952255/work/h5py/_objects.c:2804)
File "h5py/h5a.pyx", line 47, in h5py.h5a.create (/home/ilan/minonda/conda-bld/h5py_1490027952255/work/h5py/h5a.c:2075)
RuntimeError: Unable to create attribute (Object header message is too large)

Turning off the checkpoint callback means no error occurs and execution proceeds (but of course then the weights never get saved, so...)

@titu1994
Contributor Author

titu1994 commented Dec 31, 2017

See https://stackoverflow.com/questions/44198201/callbackfunction-modelcheckpoint-causes-error-in-keras

#7508

This occurs because NASNet is an incredibly large model with a massive number of layers, so many that the combined weight names exceed the HDF5 limit of 64 KB for a single object header (the weight_names attribute in the saved HDF5 file).

There was a proposal to chunk these attributes when the files grow this big, though I don't know if it was merged into master.

One possible solution you can try is to save only the weights in the ModelCheckpoint callback (save_weights_only=True).

Renaming every single layer in NASNet to short acronyms is another way to make saving work right now, though that would have to be done at the Keras master level.
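The size blow-up can be sketched without Keras or h5py: all weight names are serialized into a single HDF5 header attribute, and with long auto-generated scope names the encoded total easily passes 64 KB. The name template and counts below are hypothetical, purely for illustration, not NASNet's exact names.

```python
# Rough, self-contained illustration of why a very deep model can blow
# the HDF5 64 KB object-header limit: every weight name is stored in
# one `weight_names` attribute of the saved file.
HDF5_HEADER_LIMIT = 64 * 1024  # 64 KB per object header

# Hypothetical: ~1000 weights with verbose nested scope names.
weight_names = [
    f"normal_cell_{i}/separable_conv_block_{j}/separable_conv2d_{j}/depthwise_kernel:0"
    for i in range(18)
    for j in range(60)
]

total = sum(len(name.encode("utf-8")) for name in weight_names)
print(len(weight_names))              # 1080 names
print(total > HDF5_HEADER_LIMIT)      # True: the combined attribute exceeds 64 KB
```

With save_weights_only=True, the model topology (and its oversized name metadata) is not serialized the same way, which is why the weights-only workaround sidesteps the error.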

@urche0n-82

Hi @titu1994,
I was trying out the NASNet version pulled into Keras today, and found that specifying a custom shape with include_top = False would still produce an error saying the shape had to be 331x331x3, as if include_top were still set to True.
I traced this back to the line require_flatten=include_top or weights in the call to the _obtain_input_shape helper function in imagenet_utils. Is there a reason you are passing the weights value into the require_flatten parameter when the user sets include_top to False? Once I removed this, the model compiled as expected with my custom input_shape.

I'm just wondering if there was some rationale behind this that I am missing, or is this an error?
Thanks, -J.

@titu1994
Contributor Author

titu1994 commented Jan 15, 2018

That's because of the "or weights" part of that line. It needs to be removed, since the next argument now passes the weights value properly. When I was submitting the PR, the weights argument hadn't come around yet, so I had to use that trick.
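A minimal, self-contained illustration of the truthiness bug described above (the variable values are illustrative stand-ins for the real arguments):

```python
# The offending pattern: require_flatten=include_top or weights.
# With include_top=False and weights="imagenet", the `or` expression
# evaluates to the weights string, which is truthy, so the input-shape
# check still insists on the fixed default shape (331x331x3 here).
include_top = False
weights = "imagenet"

buggy_require_flatten = include_top or weights  # evaluates to "imagenet" (truthy)
fixed_require_flatten = include_top             # False: custom input_shape allowed

print(bool(buggy_require_flatten))  # True  -> flatten/shape check wrongly enforced
print(bool(fixed_require_flatten))  # False -> custom shapes accepted
```

Dropping "or weights" makes require_flatten track include_top alone, which is the behavior the reporter expected.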

@fchollet If you wouldn't mind, could you patch this into master directly? It's too minor a change to warrant an additional PR.
