
Merge "develop" into "master" for release 1.4.0 #69

Merged
merged 54 commits into from
Jan 10, 2023
Merged

Merge "develop" into "master" for release 1.4.0 #69

merged 54 commits into from
Jan 10, 2023

Conversation

@cofri (Collaborator) commented Jan 10, 2023

Merge "develop" into "master" for release 1.4.0

thib-s and others added 30 commits September 27, 2022 10:02
This Householder activation (n=2 and order 1) is a generalization of the
GroupSort2 activation. It is gradient-norm preserving and continuous.
See https://openreview.net/pdf?id=tD7eCtaSkR
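
As a rough illustration (a minimal sketch only, not the library's Householder layer, which learns the reflection vector), the order-1, n=2 Householder activation reflects each channel pair across the hyperplane defined by a unit vector v:
  import tensorflow as tf

  def householder_pairs(x, v):
      # Minimal sketch: x has shape (..., 2*k) with channels grouped in pairs,
      # v is a unit vector of shape (2,) defining the reflection hyperplane.
      d = x.shape[-1]                                    # assumes a static channel dim
      z = tf.reshape(x, (-1, d // 2, 2))                 # group channels into pairs
      vz = tf.reduce_sum(z * v, axis=-1, keepdims=True)  # <v, z> for each pair
      reflected = z - 2.0 * vz * v                       # (I - 2 v v^T) z
      keep = tf.cast(vz >= 0.0, x.dtype)
      out = keep * z + (1.0 - keep) * reflected          # keep if <v, z> >= 0, else reflect
      return tf.reshape(out, tf.shape(x))

  # v = (1, -1)/sqrt(2) recovers a GroupSort2-like sorting of each pair
  v = tf.constant([1.0, -1.0]) / tf.sqrt(2.0)
  y = householder_pairs(tf.random.normal((8, 6)), v)
Each pair is either kept or reflected, and both branches are isometries that coincide on the hyperplane, hence the gradient-norm preservation and continuity.
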
Two new classes, Lorth and Lorth2D, are introduced. They are respectively the
abstract base class and the 2D convolution class for the Lorth regularization.
This new Lorth regularizer can be attached to any convolutional layer as a
kernel regularizer.
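
For intuition only, here is a hedged sketch of such a penalty (not the library's Lorth/Lorth2D implementation, and restricted to stride 1): an orthogonal convolution has a self-correlation equal to a Dirac, i.e. the identity across output channels at the zero shift and zero at every other shift.
  import tensorflow as tf

  def lorth_penalty_sketch(kernel, lambda_lorth=1.0):
      # kernel has shape (k, k, c_in, c_out).
      k, _, _, c_out = kernel.shape
      # Treat each output filter as an image: (c_out, k, k, c_in), padded so the
      # correlation covers every relative shift (output size 2k-1 x 2k-1).
      filters_as_images = tf.transpose(kernel, perm=[3, 0, 1, 2])
      filters_as_images = tf.pad(
          filters_as_images, [[0, 0], [k - 1, k - 1], [k - 1, k - 1], [0, 0]]
      )
      corr = tf.nn.conv2d(filters_as_images, kernel, strides=1, padding="VALID")
      # Target: identity across output channels at the center shift, zero elsewhere.
      center = tf.one_hot(k - 1, 2 * k - 1)
      target = (
          tf.reshape(tf.eye(c_out, dtype=kernel.dtype), (c_out, 1, 1, c_out))
          * tf.reshape(center, (1, 2 * k - 1, 1, 1))
          * tf.reshape(center, (1, 1, 2 * k - 1, 1))
      )
      return lambda_lorth * tf.reduce_sum(tf.square(corr - target))

  penalty = lorth_penalty_sketch(tf.random.normal((3, 3, 16, 32)))
Such a callable could be passed as a kernel_regularizer of a Conv2D layer, in the spirit of what the commit describes.
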
A new regularizer is introduced to penalize a Dense kernel that deviates from orthogonality.
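
A minimal sketch of the idea, assuming the usual Frobenius penalty on W^T W - I (the function name is illustrative, not the library's class):
  import tensorflow as tf

  def orth_dense_penalty(w):
      # Penalize the deviation of the Gram matrix W^T W from the identity,
      # i.e. push the columns of the Dense kernel towards orthonormality.
      gram = tf.matmul(w, w, transpose_a=True)  # (units, units)
      return tf.reduce_sum(tf.square(gram - tf.eye(w.shape[-1], dtype=w.dtype)))

  # Used as any Keras kernel regularizer:
  layer = tf.keras.layers.Dense(64, kernel_regularizer=orth_dense_penalty)
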
These checks are performed in multiple layers. To simplify and avoid errors,
the checks are now done in a separate function.
A new layer is introduced, derived from Conv2DTranspose where weights are
constrained with spectral normalization and Björck orthogonalization.
The computation of the Lipschitz correction factor is done in three
different layers. This operation is externalized to be called by the three
classes.
Setting the seed of the random values fixes the failure
Because of bugs in TensorFlow 2.0 and 2.1 when saving/loading
custom losses or metrics, these versions are not supported anymore.
deel-lip now supports TensorFlow 2.2 and higher.

Moreover, we tell pip that Python versions 3.9 and 3.10 are supported.
Since Python 3.6 is not maintained anymore, we also removed it from
the classifiers.
The tox configuration file is updated to add environments with different
Python and TensorFlow versions. This allows running the unit tests in these
environments and ensures that the library is stable across Python and TF
versions.

Note that TensorFlow 2.2 requires an older version of the protobuf package.
Since a unit test uses the Keras gelu activation, which was introduced in
TensorFlow 2.4, a version check is added.
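
For example, a hedged sketch of such a guard (the exact test name in the suite may differ):
  import unittest
  import tensorflow as tf

  class GeluActivationTest(unittest.TestCase):
      @unittest.skipIf(
          tuple(int(v) for v in tf.__version__.split(".")[:2]) < (2, 4),
          "tf.keras.activations.gelu requires TensorFlow >= 2.4",
      )
      def test_gelu(self):
          y = tf.keras.activations.gelu(tf.constant([-1.0, 0.0, 1.0]))
          self.assertEqual(y.shape, (3,))
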
Since the tox configuration file now contains a large number of environments
for multiple Python and TensorFlow versions, the targets "make test" and
"make test-disable-gpu" now run only a small subset of the tests. These tests
remain representative of the different TF versions.
The current versions were deprecated due to an update of GitHub workflows.
The latest versions of actions/checkout and actions/setup-python are now used.
The GitHub workflows for unit testing are extended to test different Python
and TensorFlow versions. Three tests are performed on GitHub CI with the
following combinations:
- Python 3.7 and TensorFlow 2.3
- Python 3.9 and TensorFlow 2.7
- Python 3.10 and TensorFlow latest (2.10 as of October 2022)
The GitHub workflows for linting and testing are run when:
- a pull request is opened (or updated)
- commits are pushed to master and develop

Moreover, the unit tests are also performed:
- every Sunday at 2am on master (i.e. the base branch)
The aim of this commit is to define the package version in a single place
instead of across multiple files.

The VERSION file contains the version number. It is now the only place where
it is set. The setup.py and the Sphinx conf.py read this file.
It is now possible to get the version number directly from the package:
  import deel.lip
  print(deel.lip.__version__)

Note that to add non-Python files to the sdist package, it is required to add
`include_package_data=True` in setup.py and a MANIFEST.in file listing the
non-Python files to include in the package.
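
As an illustration (a sketch only; the actual path of the VERSION file and the packaging options may differ), setup.py can read the file like this:
  from pathlib import Path
  from setuptools import setup, find_namespace_packages

  # Read the single source of truth for the version number.
  version = (Path(__file__).parent / "deel" / "lip" / "VERSION").read_text().strip()

  setup(
      name="deel-lip",
      version=version,
      packages=find_namespace_packages(include=["deel.*"]),
      include_package_data=True,  # needed so VERSION (a non-Python file) is packaged
  )
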
- The TauCategoricalCrossentropy is the cross-entropy loss with the temperature
  tau that can be set to tune the trade-off between accuracy and robustness
- The CategoricalHinge loss is the standard CategoricalHinge loss from
  TensorFlow, i.e. the 1-vs-all uncentered version (max(y_others)). The margin
  is settable and is a tf.Variable object to be tuned during training.
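
For reference, a hedged sketch of the temperature-scaled cross-entropy (assuming logits as predictions; the library's implementation may differ in details):
  import tensorflow as tf

  def tau_categorical_crossentropy(y_true, y_pred, tau):
      # Cross-entropy on temperature-scaled logits: a large tau behaves like the
      # usual cross-entropy (accuracy), a small tau favors larger margins (robustness).
      return tf.keras.losses.categorical_crossentropy(
          y_true, tau * y_pred, from_logits=True
      ) / tau
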
The idea is to enforce a margin m between the true logit and the other
logits. Since the hinge losses are centered around zero, we want the true
logit above m/2 and the other logits below -m/2, which yields a margin m
between the true logit and the others.

Unit tests have been modified to reflect this change: the min_margin value
must be doubled.
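
In the binary case, a hedged sketch of this convention (labels in {-1, 1}, min_margin being the full margin m):
  import tensorflow as tf

  def hinge_margin_sketch(y_true, y_pred, min_margin=1.0):
      # Push the signed logit y_true * y_pred above min_margin / 2, so the gap
      # between the two classes' logits is at least min_margin.
      return tf.reduce_mean(tf.nn.relu(min_margin / 2.0 - y_true * y_pred))
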
Some tests fail because of float64 data or 1-rank data. To ensure that all
tests use the same type of data and succeed, we use "binary_tf_data" when
possible.
thib-s and others added 24 commits November 22, 2022 13:28
The call() functions of HingeMargin and MulticlassHinge are externalized.
This is done to prepare the use of tf.Variable objects for min_margin and alpha
hyper-parameters (next commit).

The HKR and MulticlassHKR losses now use the functions (hinge_margin() and
multiclass_hinge()) directly instead of the classes (HingeMargin and MulticlassHinge).
The hyper-parameters in losses (margin, temperature, alpha for KR) are now
tf.Variable objects. These hyper-parameters can now be changed during training
using a parameter scheduler (see next commit).
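
A minimal sketch of the mechanism (illustrative only, not the library's loss classes): the loss reads the variable at every call, so assigning a new value between batches or epochs takes effect immediately.
  import tensorflow as tf

  min_margin = tf.Variable(1.0, trainable=False, name="min_margin")

  def hinge(y_true, y_pred):
      # The current value of the variable is read at each call.
      return tf.reduce_mean(tf.nn.relu(min_margin - y_true * y_pred))

  min_margin.assign(2.0)  # e.g. from a callback, in the middle of training
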
Until now, an if condition depending on alpha was implemented in the `call()`
function of the HKR and MulticlassHKR losses. Moreover, the hyper-parameter
alpha is now a tf.Variable which can be modified during training (so the
if condition could be misinterpreted at graph construction).

To avoid hidden problems and to support TPU training, the if condition is moved
to `__init__()`.
Note that, as a consequence of this commit, `self.fct` is defined at
initialization of the loss, so it is not possible to switch during training
from alpha = np.inf to a finite alpha, or vice versa.
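
A hedged sketch of the pattern (simplified binary terms, not the library's exact KR and hinge formulas): the Python if on alpha runs once at construction, never inside the traced call().
  import tensorflow as tf

  class HKRSketch(tf.keras.losses.Loss):
      def __init__(self, alpha, min_margin=1.0, name="hkr_sketch"):
          super().__init__(name=name)
          self.min_margin = tf.Variable(min_margin, dtype=tf.float32, trainable=False)
          if alpha == float("inf"):
              self.fct = self._hinge  # alpha = inf: hinge term only
          else:
              self.alpha = tf.Variable(alpha, dtype=tf.float32, trainable=False)
              self.fct = self._hkr    # KR term + alpha * hinge

      def _kr(self, y_true, y_pred):
          return -tf.reduce_mean(y_true * y_pred)  # simplified KR surrogate

      def _hinge(self, y_true, y_pred):
          return tf.reduce_mean(tf.nn.relu(self.min_margin - y_true * y_pred))

      def _hkr(self, y_true, y_pred):
          return self._kr(y_true, y_pred) + self.alpha * self._hinge(y_true, y_pred)

      def call(self, y_true, y_pred):
          return self.fct(y_true, y_pred)
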
This new callback allows tuning the hyper-parameters of the losses during
training. The scheduler is defined by points (step, value), and linear
interpolation is used to compute the parameter value at each step.
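
A hedged sketch of such a scheduler (illustrative names; the library's callback API may differ):
  import numpy as np
  import tensorflow as tf

  class LossParamSchedulerSketch(tf.keras.callbacks.Callback):
      def __init__(self, variable, points):
          # points: dict {step: value}; the value at the current step is obtained
          # by linear interpolation and assigned to the tf.Variable.
          super().__init__()
          self.variable = variable
          self.steps = sorted(points)
          self.values = [points[s] for s in self.steps]
          self.step = 0

      def on_train_batch_begin(self, batch, logs=None):
          self.variable.assign(np.interp(self.step, self.steps, self.values))
          self.step += 1

  # e.g. ramp a margin variable from 0.1 to 1.0 over the first 1000 steps
  min_margin = tf.Variable(0.1, trainable=False)
  scheduler = LossParamSchedulerSketch(min_margin, {0: 0.1, 1000: 1.0})
  # model.fit(..., callbacks=[scheduler])
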
This new callback allows printing the value of a loss parameter at each epoch,
e.g. the temperature tau or the margin.
To avoid problems at graph construction, the if...else condition is removed
from the `multiclass_hinge()` function. Note that this function now only
accepts rank-2 tensors of shape (batch_size, num_classes) with 2 or more classes.
The binary case must now use the `hinge_margin()` function.

The unit tests combining binary data with multiclass_hinge are removed accordingly.
changed mutable default parameter of SpectralInitializer
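
For context, the classic Python pitfall behind this change (illustrative code, not the library's actual signature): a mutable default argument is created once at definition time and then shared by every instance; the usual fix is to default to None and build the object inside __init__.
  import tensorflow as tf

  class InitializerSketch:
      # Bad: the same initializer object would be shared across all instances.
      # def __init__(self, base_initializer=tf.keras.initializers.Orthogonal()):

      # Good: default to None and create a fresh object per instance.
      def __init__(self, base_initializer=None):
          if base_initializer is None:
              base_initializer = tf.keras.initializers.Orthogonal()
          self.base_initializer = base_initializer
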
Feat/orthogonalization improvements

added the following optimizations in normalizers.py:
- in bjorck: choose between (WWt)W and W(WtW) to save time and memory on non-square matrices (see the sketch below)
- added maxiter_bjorck and maxiter_spectral parameters
- added globals SWAP_MEMORY and STOP_GRAD_SPECTRAL and functions to set them (default values fall back to the original behavior)
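
A hedged sketch of the multiplication-order trick (not the library's normalizers.py, which also handles swap_memory and gradient stopping): the Björck update w <- (1+beta)*w - beta*w*w^T*w is computed through the smaller of the two Gram matrices.
  import tensorflow as tf

  def bjorck_sketch(w, beta=0.5, maxiter=15):
      # Bjorck orthogonalization of a 2-D matrix w; assumes the spectral norm
      # of w is already <= 1 (e.g. after spectral normalization).
      for _ in range(maxiter):
          if w.shape[0] > w.shape[1]:
              # tall matrix: w (w^T w) keeps a small (cols x cols) intermediate
              w_wt_w = tf.matmul(w, tf.matmul(w, w, transpose_a=True))
          else:
              # wide matrix: (w w^T) w keeps a small (rows x rows) intermediate
              w_wt_w = tf.matmul(tf.matmul(w, w, transpose_b=True), w)
          w = (1.0 + beta) * w - beta * w_wt_w
      return w
The maxiter argument here plays the role of the new maxiter_bjorck parameter; the spectral normalization step has its own maxiter_spectral.
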
@cofri requested a review from franckma31 on January 10, 2023 09:33

@franckma31 (Collaborator) left a comment

Fine

@cofri merged commit 9c2f1a7 into master on Jan 10, 2023