Originally requested by @rweed in #114.
Batch normalization is possibly the next most widely used layer after dense, convolutional, and maxpooling layers, and is an important optimization tool (it accelerates training).
For neural-fortran, this means that we will need to allow passing a batch of data to each layer's `forward` and `backward` methods. While for `dense` and `conv2d` layers this may also be an opportunity to numerically optimize the operations (e.g. running the same operation on a batch of data instead of over one sample at a time), for a batchnorm layer it is required, because this layer evaluates moments (e.g. means and standard deviations) over a batch of inputs to normalize the input data.

Implementing batchnorm will require another non-trivial refactor like the one we did to enable generic optimizers, though it will probably be easier. The first step will be to allow passing a batch of data to the `forward` and `backward` methods, as mentioned above. In other words, this snippet (neural-fortran/src/nf/nf_network_submodule.f90, lines 587 to 590 in b119194):
After the refactor, we should be able to write like this:
where the first dim corresponds to inputs and outputs in input and output layers, respectively, and the second dim corresponds to multiple samples in a batch. I will open a separate issue for this.
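To sketch what the batched-call refactor buys us, here is a toy example in Python/NumPy (purely for illustration; the class and method names are hypothetical and not the neural-fortran API, since the actual Fortran snippet is not reproduced above). It follows the shape convention just described: first dim indexes features, second dim indexes samples in the batch.

```python
import numpy as np

class DenseLayer:
    """Toy dense layer used only to illustrate the batched-call refactor."""

    def __init__(self, weights, biases):
        self.weights = weights  # shape (n_outputs, n_inputs)
        self.biases = biases    # shape (n_outputs,)

    def forward(self, x):
        """Accepts x of shape (n_inputs,) for one sample, or
        (n_inputs, batch_size) for a whole batch; the same matmul
        handles both, which is the point of the refactor."""
        return self.weights @ x + (
            self.biases if x.ndim == 1 else self.biases[:, None]
        )

layer = DenseLayer(weights=np.ones((2, 3)), biases=np.zeros(2))
batch = np.arange(12.0).reshape(3, 4)  # 3 inputs, 4 samples in the batch

# Before the refactor: one forward call per sample, in a loop
per_sample = np.stack([layer.forward(batch[:, j]) for j in range(4)], axis=1)

# After the refactor: one forward call for the whole batch
batched = layer.forward(batch)

print(np.allclose(per_sample, batched))  # True
```

For a dense layer the two forms are numerically equivalent, so batching is merely an optimization; the next point is why batchnorm is different.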
@Spnetic-5, given the limited time remaining in the GSoC program, we may be unable to complete the batchnorm implementation, but we can certainly make significant headway on it.
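Since the batchnorm layer's hard requirement for batched `forward`/`backward` comes from computing moments over a batch, here is a minimal NumPy sketch of that computation (Python rather than Fortran, purely for illustration; the function signature, shapes, and `eps` value are assumptions, not the eventual neural-fortran interface). It uses the same convention as above: shape `(n_features, batch_size)`, with moments taken over the second (batch) dim.

```python
import numpy as np

def batchnorm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a batch of activations x of shape (n_features, batch_size).

    The mean and variance are moments computed over the batch dim, which
    is why batchnorm cannot operate on one sample at a time."""
    mean = x.mean(axis=1, keepdims=True)  # per-feature mean over the batch
    var = x.var(axis=1, keepdims=True)    # per-feature variance over the batch
    x_hat = (x - mean) / np.sqrt(var + eps)        # normalized activations
    return gamma[:, None] * x_hat + beta[:, None]  # learnable scale and shift

# A batch of 8 samples with 3 features each
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=2.0, size=(3, 8))
y = batchnorm_forward(x, gamma=np.ones(3), beta=np.zeros(3))

# Each feature (row) now has ~zero mean and ~unit variance over the batch
print(np.allclose(y.mean(axis=1), 0.0, atol=1e-6))  # True
print(np.allclose(y.std(axis=1), 1.0, atol=1e-2))   # True
```

A full implementation would also track running moments for use at inference time, when no batch statistics are available, as the Keras and PyTorch layers below do.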
References

- `BatchNormalization` in Keras
- `BatchNorm1d` in PyTorch