Merge "develop" into "master" for release 1.4.0 #69
Conversation
This Householder activation (n=2, order 1) is a generalization of the GroupSort2 activation. It is gradient-norm preserving and continuous (see the sketch below). See https://openreview.net/pdf?id=tD7eCtaSkR
Householder activation
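As an illustration, here is a minimal sketch of an order-1 Householder activation on channel pairs; the function name, the pairing convention, and the fixed angle `theta` parametrizing the reflection vector are assumptions for illustration, not the deel-lip API:

```python
import numpy as np
import tensorflow as tf

def householder_pairs(x, theta=np.pi / 4):
    """Order-1 Householder activation on channel pairs (illustrative sketch).

    Each pair z = (z1, z2) is kept when v^T z > 0 and reflected by
    (I - 2 v v^T) otherwise, where v = (cos(theta), -sin(theta)) is a unit
    vector (learnable in general). The reflection is an isometry, so the
    activation is 1-Lipschitz and gradient-norm preserving.
    theta = pi/4 recovers GroupSort2 (each pair gets sorted).
    """
    z1, z2 = x[..., 0::2], x[..., 1::2]
    v1, v2 = tf.cos(theta), -tf.sin(theta)
    proj = v1 * z1 + v2 * z2                  # v^T z for every pair
    r1 = z1 - 2.0 * proj * v1                 # reflected pair: z - 2 (v^T z) v
    r2 = z2 - 2.0 * proj * v2
    keep = tf.cast(proj > 0, x.dtype)
    out1 = keep * z1 + (1.0 - keep) * r1
    out2 = keep * z2 + (1.0 - keep) * r2
    # Re-interleave the pairs back into the original channel layout
    return tf.reshape(tf.stack([out1, out2], axis=-1), tf.shape(x))
```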
Two new classes, Lorth and Lorth2D, are introduced: respectively the abstract base class and the 2D class for convolution Lorth.
This new Lorth regularizer can be attached to any convolutional layer as a kernel regularizer.
A new regularizer is introduced to penalize a Dense kernel for deviating from orthogonality (see the sketch below).
Lorth regularizer
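A minimal sketch of such an orthogonality penalty for a Dense kernel, assuming a Frobenius-norm penalty on W^T W - I; the class name and the `lambda_orth` parameter are hypothetical, not the deel-lip API:

```python
import tensorflow as tf

class OrthDenseRegularizer(tf.keras.regularizers.Regularizer):
    """Penalize a Dense kernel W with lambda * ||W^T W - I||_F^2
    (hypothetical class, for illustration only)."""

    def __init__(self, lambda_orth=1.0):
        self.lambda_orth = lambda_orth

    def __call__(self, w):
        gram = tf.matmul(w, w, transpose_a=True)          # W^T W
        eye = tf.eye(tf.shape(w)[-1], dtype=w.dtype)
        return self.lambda_orth * tf.reduce_sum(tf.square(gram - eye))

    def get_config(self):
        return {"lambda_orth": self.lambda_orth}

# Usage sketch:
# tf.keras.layers.Dense(64, kernel_regularizer=OrthDenseRegularizer(0.1))
```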
These checks are performed in multiple layers. To simplify the code and avoid errors, they are now factored into a single separate function.
A new layer is introduced, derived from Conv2DTranspose, whose weights are constrained with spectral normalization and Björck orthogonalization (see the power-iteration sketch below).
The computation of the Lipschitz correction factor was duplicated in three different layers. This operation is externalized into a single function called by the three classes.
Setting the seed of the random values fixes the failure.
SpectralConv2DTranspose
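For context, spectral normalization divides the kernel by an estimate of its largest singular value, usually obtained by power iteration; a minimal sketch on a kernel reshaped to 2-D (not the deel-lip implementation):

```python
import tensorflow as tf

def power_iteration(w, u, n_iters=3):
    """Estimate the largest singular value of a 2-D matrix w (e.g. a conv
    kernel reshaped to 2-D), starting from the running iterate u kept
    between training steps. Illustrative sketch, not the deel-lip code."""
    for _ in range(n_iters):
        v = tf.math.l2_normalize(tf.linalg.matvec(w, u, transpose_a=True))
        u = tf.math.l2_normalize(tf.linalg.matvec(w, v))
    sigma = tf.tensordot(u, tf.linalg.matvec(w, v), axes=1)  # u^T W v
    return sigma, u

# Spectral normalization then rescales the kernel, w_bar = w / sigma,
# before Björck orthogonalization is applied to w_bar.
```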
Because of bugs in TensorFlow 2.0 and 2.1 when saving/loading custom losses or metrics, these versions are no longer supported: deel-lip now requires TensorFlow 2.2 or higher. Moreover, we tell pip that Python 3.9 and 3.10 are supported. Since Python 3.6 is no longer maintained, it is also removed from the classifiers.
The tox configuration file is updated to add environments with different Python and TensorFlow versions. This makes it possible to run the unit tests in these environments and ensure that the library is stable across Python and TF versions. Note that TensorFlow 2.2 requires an older version of the protobuf package.
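A hypothetical excerpt of such a tox configuration (environment names, version pins, and the test command are illustrative, not the actual file):

```ini
# Illustrative excerpt; TF 2.2-era environments need an older protobuf.
[tox]
envlist = py37-tf22, py39-tf27, py310-tflatest

[testenv]
deps =
    tf22: tensorflow~=2.2.0
    tf22: protobuf<=3.20
    tf27: tensorflow~=2.7.0
    tflatest: tensorflow
commands = python -m pytest tests
```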
Since a unit test uses the Keras gelu activation, which was introduced in TensorFlow 2.4, a version check is added.
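One way such a guard can look in a unittest suite (illustrative, not the actual test code):

```python
import unittest
import tensorflow as tf

# Illustrative version guard: skip the test when the running TensorFlow
# predates the Keras gelu activation (added in 2.4).
TF_VERSION = tuple(int(v) for v in tf.__version__.split(".")[:2])

class GeluActivationTest(unittest.TestCase):
    @unittest.skipIf(TF_VERSION < (2, 4), "gelu requires TensorFlow >= 2.4")
    def test_gelu_is_available(self):
        self.assertIsNotNone(tf.keras.activations.gelu)
```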
Since the tox configuration file now contains a large number of environments for multiple Python and TensorFlow versions, the targets `make test` and `make test-disable-gpu` now run only a small subset of them. This subset remains representative of the different TF versions.
The previously used versions were deprecated following an update of GitHub workflows. The latest versions of actions/checkout and actions/setup-python are now used.
The GitHub workflows for unit testing are extended to test different Python and TensorFlow versions. Three jobs run on GitHub CI with the following combinations:
- Python 3.7 and TensorFlow 2.3
- Python 3.9 and TensorFlow 2.7
- Python 3.10 and the latest TensorFlow (2.10 as of October 2022)
The GitHub workflows for linting and testing run when:
- a pull request is opened (and updated)
- commits are pushed on master or develop
Moreover, the unit tests also run every Sunday at 2 am on master (i.e. the base branch).
The aim of this commit is to keep the package version in a single place instead of across multiple files. The VERSION file contains the version number and is now the only place where it is set; setup.py and the Sphinx conf.py read this file. It is now possible to get the version number directly from the package: `import deel.lip; print(deel.lip.__version__)`. Note that to add non-Python files to the sdist package, it is required to add `include_package_data=True` in setup.py and a MANIFEST.in file listing the non-Python files to include in the package.
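A minimal sketch of this single-source pattern in setup.py, assuming VERSION sits next to it and is listed in MANIFEST.in (details may differ from the actual repository):

```python
# setup.py (sketch): read the version from the VERSION file next to it.
from pathlib import Path
from setuptools import find_packages, setup

version = Path(__file__).parent.joinpath("VERSION").read_text().strip()

setup(
    name="deel-lip",
    version=version,
    packages=find_packages(),
    # Together with a MANIFEST.in listing VERSION, this ships the
    # non-Python file with the package.
    include_package_data=True,
)
```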
CI updates (tox + GitHub workflows)
- TauCategoricalCrossentropy is the cross-entropy loss with a temperature tau, which can be set to tune the trade-off between accuracy and robustness.
- CategoricalHinge is the standard CategoricalHinge loss from TensorFlow, i.e. the one-vs-all uncentered version (max(y_others)). The margin is settable and is a tf.Variable object so it can be tuned during training.
The idea is to enforce a margin m between the true logit and the other logits. Since the hinge losses are centered around zero, we want the true logit above m/2 and the other logits below -m/2, yielding a margin m between the true logit and the others (see the sketch below). Unit tests have been modified to reflect this change: the min_margin value must be doubled.
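A minimal sketch of both ideas, assuming y_true is one-hot for the cross-entropy and in {-1, 1} for the binary hinge; function names and exact formulations are illustrative, not necessarily the deel-lip code:

```python
import tensorflow as tf

def tau_categorical_crossentropy(y_true, y_pred, tau):
    """Temperature-scaled cross-entropy: logits are multiplied by tau before
    the softmax and the loss is rescaled by 1/tau. Larger tau leans toward
    accuracy, smaller tau toward robustness."""
    return tf.keras.losses.categorical_crossentropy(
        y_true, tau * y_pred, from_logits=True) / tau

def hinge_margin(y_true, y_pred, min_margin):
    """Centered binary hinge with margin m = min_margin, y_true in {-1, 1}:
    the true-class logit is pushed above m/2, the other below -m/2."""
    return tf.reduce_mean(tf.nn.relu(min_margin / 2.0 - y_true * y_pred))
```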
Some tests fail because of float64 data or rank-1 data. To ensure that all tests use the same type of data and succeed, we use `binary_tf_data` when possible.
The call() functions of HingeMargin and MulticlassHinge are externalized into standalone functions. This prepares the use of tf.Variable objects for the min_margin and alpha hyper-parameters (next commit). HKR and MulticlassHKR now use the functions (hinge_margin() and multiclass_hinge()) directly, rather than the classes (HingeMargin and MulticlassHinge).
The hyper-parameters in losses (margin, temperature, alpha for KR) are now tf.Variable objects, so they can be changed during training using a parameter scheduler (see next commit).
Previously, an if condition depending on alpha was implemented in the call() function of the HKR and MulticlassHKR losses. Since alpha is now a tf.Variable that can be modified during training, this condition could be misinterpreted when tracing the graph. To avoid hidden problems and to support TPU training, the condition is moved to __init__() (see the pattern sketched below). A consequence of this commit is that self.fct is fixed at loss initialization: alpha can no longer switch between np.inf and a finite float during training, or vice versa.
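A minimal sketch of this pattern (class and method names are illustrative, and the exact KR term may differ from deel-lip's):

```python
import numpy as np
import tensorflow as tf

class HKRSketch(tf.keras.losses.Loss):
    """The alpha branch is resolved once in __init__ instead of inside
    call(), keeping the traced graph free of Python conditionals."""

    def __init__(self, alpha, min_margin=1.0, name="HKRSketch"):
        super().__init__(name=name)
        self.alpha = tf.Variable(alpha, dtype=tf.float32, trainable=False)
        self.min_margin = tf.Variable(min_margin, dtype=tf.float32,
                                      trainable=False)
        # Decided at build time: pure hinge when alpha is infinite.
        self.fct = self.hinge if alpha == np.inf else self.hkr

    def kr(self, y_true, y_pred):
        # Kantorovich-Rubinstein term (sign convention may differ)
        return -tf.reduce_mean(y_true * y_pred)

    def hinge(self, y_true, y_pred):
        return tf.reduce_mean(
            tf.nn.relu(self.min_margin / 2.0 - y_true * y_pred))

    def hkr(self, y_true, y_pred):
        return self.kr(y_true, y_pred) + self.alpha * self.hinge(y_true, y_pred)

    def call(self, y_true, y_pred):
        return self.fct(y_true, y_pred)
```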
This new callback allows the loss hyper-parameters to be tuned during training. The scheduler is defined by (step, value) points, and linear interpolation is used to compute the parameter value at each step.
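A minimal sketch of such a scheduler, assuming the tuned hyper-parameter is exposed as a tf.Variable on the loss; the class name and milestone format are hypothetical:

```python
import numpy as np
import tensorflow as tf

class LossParamSchedulerSketch(tf.keras.callbacks.Callback):
    """Linearly interpolate a tf.Variable loss hyper-parameter between
    (step, value) milestones."""

    def __init__(self, param, milestones):
        super().__init__()
        self.param = param                 # e.g. loss.tau or loss.min_margin
        self.steps, self.values = zip(*sorted(milestones))
        self.step = 0

    def on_train_batch_begin(self, batch, logs=None):
        self.param.assign(float(np.interp(self.step, self.steps, self.values)))
        self.step += 1

# Usage sketch: ramp a temperature from 1.0 to 100.0 over the first 10k steps
# scheduler = LossParamSchedulerSketch(loss.tau, [(0, 1.0), (10_000, 100.0)])
# model.fit(..., callbacks=[scheduler])
```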
This new callback prints, at each epoch, the value of a parameter of the loss, e.g. the temperature tau or the margin.
To avoid problems at graph construction, the if...else condition is removed from the multiclass_hinge() function. This function now only accepts rank-2 tensors of shape (batch_size, n_classes) with two or more classes; the binary case must now use the hinge_margin() function. Unit tests that fed binary data to multiclass_hinge are removed accordingly.
Introduce new losses (part 1)
Changed mutable default parameter of SpectralInitializer
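For context, a generic illustration of the mutable-default pitfall being fixed here (not the actual SpectralInitializer code):

```python
# The classic pitfall: a mutable default is created once at function
# definition and shared by every call, so state leaks across instances.
class Bad:
    def __init__(self, options={}):
        self.options = options            # all instances share one dict

# The usual fix, presumably what this commit applies: default to None and
# build a fresh object inside the constructor.
class Good:
    def __init__(self, options=None):
        self.options = {} if options is None else options
```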
Add comments on reference papers
…d _v concurrently (at different steps)
Feat/orthogonalization improvements. Added the following optimizations in normalizers.py:
- in bjorck: choose between (W Wᵀ) W and W (Wᵀ W) to save time and memory on non-square matrices (see the sketch below)
- added maxiter_bjorck and maxiter_spectral parameters
- added globals SWAP_MEMORY and STOP_GRAD_SPECTRAL and functions to set them (default values fall back to the original behavior)
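A minimal sketch of the Björck loop with this shape trick (parameter names are illustrative; assumes W is pre-scaled, e.g. by spectral normalization, so its spectral norm is at most 1):

```python
import tensorflow as tf

def bjorck_sketch(w, beta=0.5, maxiter=15):
    """Björck orthogonalization W <- (1 + beta) W - beta (W W^T) W, forming
    the Gram matrix on the smaller side of a non-square W."""
    m, n = w.shape
    for _ in range(maxiter):
        if m <= n:
            # the m x m Gram matrix is the cheaper one: (W W^T) W
            w = (1.0 + beta) * w - beta * tf.matmul(
                tf.matmul(w, w, transpose_b=True), w)
        else:
            # the n x n Gram matrix is the cheaper one: W (W^T W)
            w = (1.0 + beta) * w - beta * tf.matmul(
                w, tf.matmul(w, w, transpose_a=True))
    return w
```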
Bump version to 1.4.0
franckma31 approved these changes on Jan 10, 2023
Fine