Addition of model building scripts to applications #10

titu1994 · 2017-02-03T06:46:17Z

I have a few scripts can be used to build keras models of recent papers. Some of them are :

Finished models with weights

Inception v4, Inception-Resnet v1, Inception-Resnet v2 (weights available for Inception v4 - via https://github.com/kentsommer/keras-inceptionV4, and therefore he should add a PR)

Wide Residual Networks (weights for WRN-16-8 and WRN-28-8 for theano backend are available)

Residual of Residual Networks (weights for RoR-WRN-40-2 is available for theano backend)

DenseNets (weights for DenseNet-40-12 available for theano backend)

Custom Callbacks

Snapshot Ensembles (ensemble weights for CIFAR 10 (Wide ResNet - 16 - 4) and CIFAR 100 (Wide ResNet - 16 - 4) are available for theano backend)

My question is : are they worth the addition to the applications folder? And if so, should the weights be hosted in the original repositories or over here?

Moreover, for snapshot ensemble, the example is more important than the custom callback (snapshot ensemble can be used for any model, it simply requires 3 different callbacks to work). The main difficulty is how to use the ensemble model weights to predict (and or simply calculate accuracy).

So should an example be added as well in the examples directory of keras_contrib?

farizrahman4u · 2017-02-05T05:40:58Z

are they worth the addition to the applications folder?

Yes, please.

should the weights be hosted in the original repositories or over here?

@patyork What do you think?

So should an example be added as well in the examples directory of keras_contrib?

Yes.

patyork · 2017-02-05T16:07:25Z

Weights (and any large files, such as datasets) should be hosted elsewhere - not within this repo or in Keras. Another Github repo is okay, although I'd recommend AWS or google drive or something to that effect.

On that note, there should be a test to check that external files still exist (not necessarily download it, just check for a non-404/503 http response). The maintainers should also probably host a copy of the files, and the applications/dataset code should allow fallback to the copies. Ex:

applications = {
     'vgg16' : [
             'http://original_file.com',
             'http://copy1.com' ]
     ...
     ...
}

titu1994 · 2017-02-05T16:46:57Z

@farizrahman4u @patyork Thanks for the input.

I'll start with Snapshot Ensemble, since it is just 1 callback builder. On that note, Snapshot ensembles is basically a collection of 3 callbacks :

A custom model saving callback (SnapshotModelCheckpoint)
A custom learning rate scheduler (use LearningRateSchedule callback with cosine annealing schedule)
(Optional) A ModelCheckpoint callback which saves the weights of the single best model. I say it is optional, but in fact this one alone increases the ensemble score by approximately 0.3 - 0.5 % on average.

It can be seen here : https://github.com/titu1994/Snapshot-Ensembles/blob/master/snapshot.py

I can see that having a builder class build the 3 callbacks is not in the standards of keras or keras_contrib, but it makes it simple to perform snapshot ensemble on any given model.

So, should I split the class into separate callback and schedule, in which case I don't know where to place the custom schedule, or do I add the entire snapshot.py to the callbacks directory?

junwei-pan · 2017-02-12T19:05:28Z

I think it's fine to have a builder class that returns a list of 3 callbacks, there are no such complicated callbacks previously in the keras repo.

titu1994 · 2017-02-12T23:55:33Z

@kemaswill Thanks for the input.

On another important note, should I add an example of how to use the callback? Its pretty straightforward, but it may be helpful for beginners.

A more important example in my opinion is how to use a weighted ensemble prediction to improve the accuracy of the model. I have a script here which shows how to optimize the score using the ensemble weights of a Wide ResNet model, and how weighted classification can improve the score : https://github.com/titu1994/Snapshot-Ensembles/blob/master/predict_cifar_100.py

However, I fear that may be beyond the scope of what we are trying to do with the contrib repo, so for now I am only considering adding a script to train a classifier with snapshot ensemble callback.

junwei-pan · 2017-02-14T01:24:08Z

I'm reviewing @titu1994 's PR now, there're several questions: How should one commit a new example? Should this be pushed to 'keras_contrib/applications' or 'examples'? I think we should only choose one directory, just as the original keras repo does. If so, which models should be pushed to 'keras_contrib/applications' and which should be pushed to 'examples'?

junwei-pan · 2017-02-14T01:28:00Z

@titu1994 I think a Snapshot Ensemble example is fine for how to use such callback, so you can create one PR consists of an Snapshot Ensemble example and a new callback function.

titu1994 · 2017-02-14T01:53:09Z

@kemaswill An example is useful for showing how Snapshot Ensemble callback builder can be used, because there are no other callback builder patterns in either Keras or Keras Contrib.

However, there isn't much need of adding the other examples, for the DenseNet, Wide ResNet and RoR Networks. Shall I remove those examples?

On the note of where these classes should go, definitely DenseNet, Wide ResNet and RoR Networks should go into applications. I will remove the examples if its not needed, since they are general training examples.

the-moliver · 2017-02-14T04:45:23Z

I actually like having both the examples and applications since they serve somewhat different purposes. What do others think?

the-moliver · 2017-02-14T05:08:03Z

@titu1994 What do you think about modifying one of your current PR examples to incorporate the Snapshot Ensemble?

titu1994 · 2017-02-14T05:23:57Z

@the-moliver Ok if it is valuable to have the training script as well, then I'll keep them in the DenseNet, RoR and WRN PRs.

On the note of modifying one of my PR's to add a Snapshot Ensemble example, if it's OK, I'd like to add that as a separate PR next week. The main reason is I'm a bit tied up with assignment work, and the weights for the ensemble model need to be translated to the different backends first.

Also, the snapshot ensemble needs two examples :

A training example, which shows how to use the SnapshotCallbackBuilder to generate the list of callbacks and train.
An ensemble prediction example, with 2 different flavors.
a) One if straight up even weight scale for all snapshot models, which leads to a pretty high score.
b) Other is using score maximization approach (random weight initialization, and then subsequent application of optimization algorithm to find optimal weights for the snapshot ensemble).

2-b delivers much higher scores than 2-a, but the approach is quite technical, and not really suited as an example. For examples, see Performance table at https://github.com/titu1994/Snapshot-Ensembles

In your oppinion, which of the two examples, 2-a or 2-b should I add as a PR along side the 1st training example?

the-moliver · 2017-02-14T05:59:44Z

A separate PR demonstrating both training and prediction is fine. Regarding 2-b, it looks like you are optimizing the weights on each model based on the test set prediction. That is bad practice and not to be encouraged! (This is just fitting to your test data, so of course the performance on the test set will improve) However, you could find these weights on the training data, and I would be curious to see if they result in test set performance improvements. If this does result in significant improvement, I'd love to include it, otherwise we'll go with 2-a.

titu1994 · 2017-02-14T06:18:18Z

I can alter the test to look at the training data instead. I dunno what I was thinking making it look at the testing data.

Now that I remember if I did write that code just before my shift to the US so I was quite time strapped. Will edit it once I have some time.

junwei-pan · 2017-02-14T06:33:23Z

@titu1994 OK, so just keep both the applications and the examples.

titu1994 · 2017-02-15T00:34:00Z

@kemaswill So I was wondering, Snapshot Ensemble was originally used on WRN. So I can't really add the example of Snapshot Ensemble training or predictions until the WRN PR has been merged.

I have the pretrained weights of WRN-16-4 models on CIFAR100, so should I just give an example to train on WRN and another script to predict using ensemble weights?

@the-moliver So I fixed the prediction script, and it works well by obtaining the ensemble weights from looking only at the training data and then using those ensemble weights to weigh the test predictions. It offers just about the same improvement as before (slightly less by 0.03%), but it takes much longer (because it now has to minimize loss for 60k samples instead of 10k) and several hundred iterations of the algorithm to reach such a high value (previously, 20 iterations would give a fairly great score).

I'm thinking of simply using even scaling of the predictions (each model has the same weight) since that easily provides a pretty good score (much higher than the single best model).

Should I add those scripts then as another PR?

junwei-pan mentioned this issue Feb 12, 2017

Call for Contribution #17

Closed

titu1994 closed this as completed Feb 17, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Addition of model building scripts to applications #10

Addition of model building scripts to applications #10

titu1994 commented Feb 3, 2017

farizrahman4u commented Feb 5, 2017 •

edited

Loading

patyork commented Feb 5, 2017

titu1994 commented Feb 5, 2017

junwei-pan commented Feb 12, 2017

titu1994 commented Feb 12, 2017 •

edited

Loading

junwei-pan commented Feb 14, 2017

junwei-pan commented Feb 14, 2017

titu1994 commented Feb 14, 2017 •

edited

Loading

the-moliver commented Feb 14, 2017

the-moliver commented Feb 14, 2017

titu1994 commented Feb 14, 2017

the-moliver commented Feb 14, 2017

titu1994 commented Feb 14, 2017

junwei-pan commented Feb 14, 2017

titu1994 commented Feb 15, 2017 •

edited

Loading

Addition of model building scripts to applications #10

Addition of model building scripts to applications #10

Comments

titu1994 commented Feb 3, 2017

Finished models with weights

Custom Callbacks

farizrahman4u commented Feb 5, 2017 • edited Loading

patyork commented Feb 5, 2017

titu1994 commented Feb 5, 2017

junwei-pan commented Feb 12, 2017

titu1994 commented Feb 12, 2017 • edited Loading

junwei-pan commented Feb 14, 2017

junwei-pan commented Feb 14, 2017

titu1994 commented Feb 14, 2017 • edited Loading

the-moliver commented Feb 14, 2017

the-moliver commented Feb 14, 2017

titu1994 commented Feb 14, 2017

the-moliver commented Feb 14, 2017

titu1994 commented Feb 14, 2017

junwei-pan commented Feb 14, 2017

titu1994 commented Feb 15, 2017 • edited Loading

farizrahman4u commented Feb 5, 2017 •

edited

Loading

titu1994 commented Feb 12, 2017 •

edited

Loading

titu1994 commented Feb 14, 2017 •

edited

Loading

titu1994 commented Feb 15, 2017 •

edited

Loading