
Dropout2d and residual #42

Closed
AliVard opened this issue Jun 15, 2022 · 2 comments

AliVard commented Jun 15, 2022

Dear authors and contributors,

There is an observation that I would be happy to get your confirmation on :-)
Throughout the model hierarchy (SequenceModel, SequenceResidualBlock, and S4) you are using Dropout2d, which zeros along the batch dimension, i.e. drops entire samples. Without a residual link, with multiple layers, the probability that a sample survives every layer becomes negligible. Consequently, the model effectively never sees its inputs and will not train!
In SequenceResidualBlock, the dropout is applied only if a residual link is present, and that residual link also compensates for the dropout applied inside S4.
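
To make this concrete, here is a minimal sketch (the shapes and dropout rate are illustrative, not taken from the repo) of Dropout2d zeroing whole samples of a 3-D input; under PyTorch 1.11 a 3-D tensor is treated as an unbatched (C, H, W) input, so the mask is drawn along the first dimension:

```python
import torch
import torch.nn as nn

# Illustrative shapes only: (batch, length, channels)
B, L, D = 8, 16, 4
x = torch.ones(B, L, D)

drop = nn.Dropout2d(p=0.25)
drop.train()  # dropout is only active in training mode
y = drop(x)

# Count how many samples were zeroed out in their entirety.
n_dropped = int((y.abs().sum(dim=(1, 2)) == 0).sum())
print(f"{n_dropped}/{B} samples were dropped entirely")
```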
So my issue is two-fold:

  • When using dropout > 0, we should never set residual = None in the parameters of SequenceResidualBlock, right? Could a check be added to the initialization to catch this misconfiguration?
  • The dropinp argument of SequenceModel should not be used, since there is no residual link at that point. All of the configs I've seen set dropinp: 0.0, so why is it there at all?

Thanks and regards,

albertfgu (Contributor) commented

There is a bug in PyTorch 1.11 that causes the Dropout2d behavior you've observed: pytorch/pytorch#77081

We will add a warning and a fix for this.

dropinp is a hyperparameter that people sometimes use; we also used it in earlier experiments on WikiText-103.

albertfgu (Contributor) commented

The READMEs have been updated to mention this issue, and we have implemented a custom dropout function to avoid problems with the PyTorch implementation. Perhaps in the far future, when everyone is using torch 1.12 or later, we can switch back to the official functions.
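
For reference, a rough sketch of what such a version-independent replacement can look like (the class name, interface, and shape convention below are illustrative, not the exact code in this repository): one Bernoulli mask is drawn per (batch, channel) pair and broadcast over the remaining dimensions, so no whole sample is ever dropped.

```python
import torch
import torch.nn as nn

class TiedDropout(nn.Module):
    """Dropout with one mask per (batch, channel) entry, broadcast over the
    remaining (e.g. sequence) dimensions. Hypothetical sketch, not the repo code."""

    def __init__(self, p: float = 0.5):
        super().__init__()
        if not 0.0 <= p < 1.0:
            raise ValueError(f"dropout probability must be in [0, 1), got {p}")
        self.p = p

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, channels, *dims), e.g. (B, D, L) for sequence models
        if not self.training or self.p == 0.0:
            return x
        mask_shape = x.shape[:2] + (1,) * (x.ndim - 2)
        mask = torch.rand(mask_shape, device=x.device) < 1.0 - self.p
        return x * mask / (1.0 - self.p)

# Usage: drops ~25% of the (batch, channel) slices, never an entire sample.
y = TiedDropout(p=0.25)(torch.randn(8, 4, 16))
```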
