
Merge "develop" into "master" for release 1.4.0 #69

Merged
merged 54 commits into from
Jan 10, 2023
Merged

Merge "develop" into "master" for release 1.4.0 #69

merged 54 commits into from
Jan 10, 2023

Conversation

@cofri (Collaborator) commented Jan 10, 2023

Merge "develop" into "master" for release 1.4.0

thib-s and others added 30 commits September 27, 2022 10:02
This Householder activation (n=2 and order 1) is a generalization of the
GroupSort2 activation. It is gradient-norm preserving and continuous.
See https://openreview.net/pdf?id=tD7eCtaSkR
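
As a rough illustration (a minimal sketch only, not the library's Householder layer, which learns the reflection vector), the order-1, n=2 Householder activation reflects each channel pair across the hyperplane defined by a unit vector v:
  import tensorflow as tf

  def householder_pairs(x, v):
      # Minimal sketch: x has shape (..., 2*k) with channels grouped in pairs,
      # v is a unit vector of shape (2,) defining the reflection hyperplane.
      d = x.shape[-1]                                    # assumes a static channel dim
      z = tf.reshape(x, (-1, d // 2, 2))                 # group channels into pairs
      vz = tf.reduce_sum(z * v, axis=-1, keepdims=True)  # <v, z> for each pair
      reflected = z - 2.0 * vz * v                       # (I - 2 v v^T) z
      keep = tf.cast(vz >= 0.0, x.dtype)
      out = keep * z + (1.0 - keep) * reflected          # keep if <v, z> >= 0, else reflect
      return tf.reshape(out, tf.shape(x))

  # v = (1, -1)/sqrt(2) recovers a GroupSort2-like sorting of each pair
  v = tf.constant([1.0, -1.0]) / tf.sqrt(2.0)
  y = householder_pairs(tf.random.normal((8, 6)), v)
Each pair is either kept or reflected, and both branches are isometries that coincide on the hyperplane, hence the gradient-norm preservation and continuity.
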
Two new classes, Lorth and Lorth2D, are introduced. They are respectively the
abstract base class and the 2D convolution class for the Lorth regularization.
This new Lorth regularizer can be attached to any convolutional layer as a
kernel regularizer.
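
For intuition only, here is a hedged sketch of such a penalty (not the library's Lorth/Lorth2D implementation, and restricted to stride 1): an orthogonal convolution has a self-correlation equal to a Dirac, i.e. the identity across output channels at the zero shift and zero at every other shift.
  import tensorflow as tf

  def lorth_penalty_sketch(kernel, lambda_lorth=1.0):
      # kernel has shape (k, k, c_in, c_out).
      k, _, _, c_out = kernel.shape
      # Treat each output filter as an image: (c_out, k, k, c_in), padded so the
      # correlation covers every relative shift (output size 2k-1 x 2k-1).
      filters_as_images = tf.transpose(kernel, perm=[3, 0, 1, 2])
      filters_as_images = tf.pad(
          filters_as_images, [[0, 0], [k - 1, k - 1], [k - 1, k - 1], [0, 0]]
      )
      corr = tf.nn.conv2d(filters_as_images, kernel, strides=1, padding="VALID")
      # Target: identity across output channels at the center shift, zero elsewhere.
      center = tf.one_hot(k - 1, 2 * k - 1)
      target = (
          tf.reshape(tf.eye(c_out, dtype=kernel.dtype), (c_out, 1, 1, c_out))
          * tf.reshape(center, (1, 2 * k - 1, 1, 1))
          * tf.reshape(center, (1, 1, 2 * k - 1, 1))
      )
      return lambda_lorth * tf.reduce_sum(tf.square(corr - target))

  penalty = lorth_penalty_sketch(tf.random.normal((3, 3, 16, 32)))
Such a callable could be passed as a kernel_regularizer of a Conv2D layer, in the spirit of what the commit describes.
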
A new regularizer is introduced to penalize a Dense kernel that deviates from orthogonality.
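
A minimal sketch of the idea, assuming the usual Frobenius penalty on W^T W - I (the function name is illustrative, not the library's class):
  import tensorflow as tf

  def orth_dense_penalty(w):
      # Penalize the deviation of the Gram matrix W^T W from the identity,
      # i.e. push the columns of the Dense kernel towards orthonormality.
      gram = tf.matmul(w, w, transpose_a=True)  # (units, units)
      return tf.reduce_sum(tf.square(gram - tf.eye(w.shape[-1], dtype=w.dtype)))

  # Used as any Keras kernel regularizer:
  layer = tf.keras.layers.Dense(64, kernel_regularizer=orth_dense_penalty)
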
These checks are performed in multiple layers. To simplify and avoid errors,
the checks are now done in a separate function.
A new layer is introduced, derived from Conv2DTranspose where weights are
constrained with spectral normalization and Björck orthogonalization.
The computation of the Lipschitz correction factor is done in three
different layers. This operation is externalized to be called by the three
classes.
Setting the seed of the random values fixes the failure
Because of bugs in TensorFlow 2.0 and 2.1 when saving/loading
custom losses or metrics, these versions are not supported anymore.
deel-lip now supports TensorFlow 2.2 and higher.

Moreover, we tell pip that Python versions 3.9 and 3.10 are supported.
Since Python 3.6 is not maintained anymore, we also removed it from
the classifiers.
The tox configuration file is updated to add environments with different
Python and TensorFlow versions. This allows running the unit tests in these
environments and ensures that the library is stable across Python and TF
versions.

Note that TensorFlow 2.2 requires an older version of the protobuf package.
Since a unit test uses the Keras gelu activation, which was introduced in
TensorFlow 2.4, a version check is added.
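
For example, a hedged sketch of such a guard (the exact test name in the suite may differ):
  import unittest
  import tensorflow as tf

  class GeluActivationTest(unittest.TestCase):
      @unittest.skipIf(
          tuple(int(v) for v in tf.__version__.split(".")[:2]) < (2, 4),
          "tf.keras.activations.gelu requires TensorFlow >= 2.4",
      )
      def test_gelu(self):
          y = tf.keras.activations.gelu(tf.constant([-1.0, 0.0, 1.0]))
          self.assertEqual(y.shape, (3,))
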
Since the tox configuration file now contains a large number of environments
for multiple Python and TensorFlow versions, the targets "make test" and
"make test-disable-gpu" now run only a small subset of the tests. These tests
remain representative of the different TF versions.
The current versions were deprecated due to an update of GitHub workflows.
The latest versions of actions/checkout and actions/setup-python are now used.
The GitHub workflows for unit testing are extended to test different Python
and TensorFlow versions. Three tests are performed on GitHub CI with the
following combinations:
- Python 3.7 and TensorFlow 2.3
- Python 3.9 and TensorFlow 2.7
- Python 3.10 and TensorFlow latest (2.10 as of October 2022)
The GitHub workflows for linting and testing are run when:
- a pull request is opened (or updated)
- commits are pushed to master and develop

Moreover, the unit tests are also performed:
- every Sunday at 2am on master (i.e. the base branch)
The aim of this commit is to define the package version in a single place
instead of across multiple files.

The VERSION file contains the version number. It is now the only place where
it is set. The setup.py and the Sphinx conf.py read this file.
It is now possible to get the version number directly from the package:
  import deel.lip
  print(deel.lip.__version__)

Note that to add non-Python files to the sdist package, it is required to add
`include_package_data=True` in setup.py and a MANIFEST.in file listing the
non-Python files to include in the package.
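
As an illustration (a sketch only; the actual path of the VERSION file and the packaging options may differ), setup.py can read the file like this:
  from pathlib import Path
  from setuptools import setup, find_namespace_packages

  # Read the single source of truth for the version number.
  version = (Path(__file__).parent / "deel" / "lip" / "VERSION").read_text().strip()

  setup(
      name="deel-lip",
      version=version,
      packages=find_namespace_packages(include=["deel.*"]),
      include_package_data=True,  # needed so VERSION (a non-Python file) is packaged
  )
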
- The TauCategoricalCrossentropy is the cross-entropy loss with the temperature
  tau that can be set to tune the trade-off between accuracy and robustness
- The CategoricalHinge loss is the standard CategoricalHinge loss from
  TensorFlow, i.e. the 1-vs-all uncentered version (max(y_others)). The margin
  is settable and is a tf.Variable object to be tuned during training.
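
For reference, a hedged sketch of the temperature-scaled cross-entropy (assuming logits as predictions; the library's implementation may differ in details):
  import tensorflow as tf

  def tau_categorical_crossentropy(y_true, y_pred, tau):
      # Cross-entropy on temperature-scaled logits: a large tau behaves like the
      # usual cross-entropy (accuracy), a small tau favors larger margins (robustness).
      return tf.keras.losses.categorical_crossentropy(
          y_true, tau * y_pred, from_logits=True
      ) / tau
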
The idea is to enforce a margin m between the true logit and the other
logits. Since the hinge losses are centered around zero, we want the true
logit above m/2 and the other logits below -m/2, which yields a margin m
between the true logit and the others.

Unit tests have been modified to reflect this change: the min_margin value
must be doubled.
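
In the binary case, a hedged sketch of this convention (labels in {-1, 1}, min_margin being the full margin m):
  import tensorflow as tf

  def hinge_margin_sketch(y_true, y_pred, min_margin=1.0):
      # Push the signed logit y_true * y_pred above min_margin / 2, so the gap
      # between the two classes' logits is at least min_margin.
      return tf.reduce_mean(tf.nn.relu(min_margin / 2.0 - y_true * y_pred))
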
Some tests fail because of float64 data or 1-rank data. To ensure that all
tests use the same type of data and succeed, we use "binary_tf_data" when
possible.
thib-s and others added 24 commits November 22, 2022 13:28
The call() functions of HingeMargin and MulticlassHinge are externalized.
This is done to prepare the use of tf.Variable objects for min_margin and alpha
hyper-parameters (next commit).

The HKR and MulticlassHKR losses now use the functions (hinge_margin() and
multiclass_hinge()) directly instead of the classes (HingeMargin and MulticlassHinge).
The hyper-parameters in losses (margin, temperature, alpha for KR) are now
tf.Variable objects. These hyper-parameters can now be changed during training
using a parameter scheduler (see next commit).
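
A minimal sketch of the mechanism (illustrative only, not the library's loss classes): the loss reads the variable at every call, so assigning a new value between batches or epochs takes effect immediately.
  import tensorflow as tf

  min_margin = tf.Variable(1.0, trainable=False, name="min_margin")

  def hinge(y_true, y_pred):
      # The current value of the variable is read at each call.
      return tf.reduce_mean(tf.nn.relu(min_margin - y_true * y_pred))

  min_margin.assign(2.0)  # e.g. from a callback, in the middle of training
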
Until now, an if condition depending on alpha was implemented in the `call()`
function of the HKR and MulticlassHKR losses. Moreover, the hyper-parameter
alpha is now a tf.Variable which can be modified during training (so the
if condition could be misinterpreted at graph construction).

To avoid hidden problems and to support TPU training, the if condition is moved
to `__init__()`.
Note that, as a consequence of this commit, `self.fct` is defined at
initialization of the loss, so it is not possible to switch during training
from alpha = np.inf to a finite alpha, or vice versa.
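
A hedged sketch of the pattern (simplified binary terms, not the library's exact KR and hinge formulas): the Python if on alpha runs once at construction, never inside the traced call().
  import tensorflow as tf

  class HKRSketch(tf.keras.losses.Loss):
      def __init__(self, alpha, min_margin=1.0, name="hkr_sketch"):
          super().__init__(name=name)
          self.min_margin = tf.Variable(min_margin, dtype=tf.float32, trainable=False)
          if alpha == float("inf"):
              self.fct = self._hinge  # alpha = inf: hinge term only
          else:
              self.alpha = tf.Variable(alpha, dtype=tf.float32, trainable=False)
              self.fct = self._hkr    # KR term + alpha * hinge

      def _kr(self, y_true, y_pred):
          return -tf.reduce_mean(y_true * y_pred)  # simplified KR surrogate

      def _hinge(self, y_true, y_pred):
          return tf.reduce_mean(tf.nn.relu(self.min_margin - y_true * y_pred))

      def _hkr(self, y_true, y_pred):
          return self._kr(y_true, y_pred) + self.alpha * self._hinge(y_true, y_pred)

      def call(self, y_true, y_pred):
          return self.fct(y_true, y_pred)
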
This new callback allows tuning the hyper-parameters of the losses during
training. The scheduler is defined by points (step, value), and linear
interpolation is used to compute the parameter value at each step.
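
A hedged sketch of such a scheduler (illustrative names; the library's callback API may differ):
  import numpy as np
  import tensorflow as tf

  class LossParamSchedulerSketch(tf.keras.callbacks.Callback):
      def __init__(self, variable, points):
          # points: dict {step: value}; the value at the current step is obtained
          # by linear interpolation and assigned to the tf.Variable.
          super().__init__()
          self.variable = variable
          self.steps = sorted(points)
          self.values = [points[s] for s in self.steps]
          self.step = 0

      def on_train_batch_begin(self, batch, logs=None):
          self.variable.assign(np.interp(self.step, self.steps, self.values))
          self.step += 1

  # e.g. ramp a margin variable from 0.1 to 1.0 over the first 1000 steps
  min_margin = tf.Variable(0.1, trainable=False)
  scheduler = LossParamSchedulerSketch(min_margin, {0: 0.1, 1000: 1.0})
  # model.fit(..., callbacks=[scheduler])
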
This new callback allows printing the value of a loss parameter at each epoch,
e.g. the temperature tau or the margin.
To avoid problems at graph construction, the if...else condition is removed
from the `multiclass_hinge()` function. Note that this function now only
accepts rank-2 tensors of shape (batch_size, num_classes) with 2 or more classes.
The binary case must now use the `hinge_margin()` function.

The unit tests combining binary data with multiclass_hinge are removed accordingly.
changed mutable default parameter of SpectralInitializer
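
For context, the classic Python pitfall behind this change (illustrative code, not the library's actual signature): a mutable default argument is created once at definition time and then shared by every instance; the usual fix is to default to None and build the object inside __init__.
  import tensorflow as tf

  class InitializerSketch:
      # Bad: the same initializer object would be shared across all instances.
      # def __init__(self, base_initializer=tf.keras.initializers.Orthogonal()):

      # Good: default to None and create a fresh object per instance.
      def __init__(self, base_initializer=None):
          if base_initializer is None:
              base_initializer = tf.keras.initializers.Orthogonal()
          self.base_initializer = base_initializer
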
Feat/orthogonalization improvements

added the following optimizations in normalizers.py:
- in bjorck: choose between (WWt)W and W(WtW) to save time and memory on non-square matrices (see the sketch below)
- added maxiter_bjorck and maxiter_spectral parameters
- added globals SWAP_MEMORY and STOP_GRAD_SPECTRAL and functions to set them (default values fall back to the original behavior)
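
A hedged sketch of the multiplication-order trick (not the library's normalizers.py, which also handles swap_memory and gradient stopping): the Björck update w <- (1+beta)*w - beta*w*w^T*w is computed through the smaller of the two Gram matrices.
  import tensorflow as tf

  def bjorck_sketch(w, beta=0.5, maxiter=15):
      # Bjorck orthogonalization of a 2-D matrix w; assumes the spectral norm
      # of w is already <= 1 (e.g. after spectral normalization).
      for _ in range(maxiter):
          if w.shape[0] > w.shape[1]:
              # tall matrix: w (w^T w) keeps a small (cols x cols) intermediate
              w_wt_w = tf.matmul(w, tf.matmul(w, w, transpose_a=True))
          else:
              # wide matrix: (w w^T) w keeps a small (rows x rows) intermediate
              w_wt_w = tf.matmul(tf.matmul(w, w, transpose_b=True), w)
          w = (1.0 + beta) * w - beta * w_wt_w
      return w
The maxiter argument here plays the role of the new maxiter_bjorck parameter; the spectral normalization step has its own maxiter_spectral.
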
@cofri requested a review from franckma31 on January 10, 2023 09:33

@franckma31 (Collaborator) left a comment

Fine

@cofri merged commit 9c2f1a7 into master on Jan 10, 2023