
Update SequenceLoss for TensorFlow 2.2 compatibility #1371

Merged
gabrieldemarmiesse merged 2 commits into tensorflow:master from guillaumekln:sequence-loss-tf2.2-compat on Mar 27, 2020

Conversation

guillaumekln
Contributor

For previous versions, we removed the `reduction` attribute to force tf.keras to call the main `__call__` method. Otherwise, it calls the `call` method where we have no way of controlling the reduction logic.

In 2.2, tf.keras has a new way of dealing with loss objects (see the `LossContainer` class). It now always calls `__call__` and requires the `reduction` attribute to exist.

This new logic is only enabled for V2 execution mode, so this commit also disables the deprecated V1 graph mode for this test.

Related to #1320.

Note: this change is only compatible with TensorFlow 2.2 so I'm opening a draft PR for now.
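
For illustration, here is a minimal sketch of the mechanism described above (the class body and names are assumptions, not the actual tfa.seq2seq code):

```python
import tensorflow as tf


class SequenceLoss(tf.keras.losses.Loss):
    """Sketch of the pre-2.2 workaround; names are assumptions,
    not the actual tfa.seq2seq implementation."""

    def __init__(self):
        super().__init__(reduction=tf.keras.losses.Reduction.AUTO)
        # Without a `reduction` attribute, tf.keras fell back to the
        # public `__call__`, the only place our custom reduction logic
        # can run. TF 2.2's LossContainer always calls `__call__` and
        # requires the attribute to exist, so this deletion is what breaks.
        delattr(self, "reduction")

    def call(self, y_true, y_pred):
        # Per-timestep crossentropy; any custom reduction would be
        # applied in `__call__`, not here.
        return tf.nn.sparse_softmax_cross_entropy_with_logits(
            labels=y_true, logits=y_pred)
```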

@gabrieldemarmiesse
Member

I guess the easiest solution is to do a version check before deleting the reduction.

But having a custom reduction prevents us from using a distribution strategy (from what I remember). Would it be possible to use Keras' built-in reductions for some use cases (not all)?
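
For what it's worth, the version check mentioned above could look like this sketch (the helper name and the attribute deletion are assumptions, not the merged fix):

```python
import tensorflow as tf


def _keras_requires_reduction_attr():
    # TF >= 2.2 (the new LossContainer) requires losses to keep their
    # `reduction` attribute; earlier versions tolerated deleting it.
    major, minor = (int(v) for v in tf.__version__.split(".")[:2])
    return (major, minor) >= (2, 2)


# Hypothetical use inside SequenceLoss.__init__:
#     if not _keras_requires_reduction_attr():
#         delattr(self, "reduction")  # pre-2.2 workaround only
```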

@guillaumekln
Contributor Author

I guess the easiest solution is to do a version check before deleting the reduction.

I suppose the project will shortly move to require TensorFlow 2.2, right? Then we could merge this PR as is (after a rebase).

But having a custom reduction prevents us from using a distribution strategy (from what I remember). Would it be possible to use Keras' built-in reductions for some use cases (not all)?

I'm not sure it prevents the use of distribution strategies, but it requires the user to carefully scale the loss (or gradients) based on the global batch size.

We could indeed rely on the built-in reduction for some combinations. But these combinations are actually the ones that can be implemented with tf.keras.losses.{Sparse}CategoricalCrossentropy.
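
As a reference point, this is the kind of scaling tf.distribute expects when the reduction is custom (a generic sketch, not addons code; the global batch size is assumed):

```python
import tensorflow as tf

GLOBAL_BATCH_SIZE = 64  # assumed value for illustration

# Reduction.NONE returns per-example losses; under tf.distribute the
# user must then average over the *global* batch size rather than the
# per-replica one.
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.keras.losses.Reduction.NONE)


def compute_loss(labels, logits):
    per_example = loss_fn(labels, logits)
    return tf.nn.compute_average_loss(
        per_example, global_batch_size=GLOBAL_BATCH_SIZE)
```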

@gabrieldemarmiesse
Member

gabrieldemarmiesse commented Mar 25, 2020

I suppose the project will shortly move to require TensorFlow 2.2, right?

To improve the UX, we'd like the Python code to be compatible with as many versions of TF as possible. The custom ops will only be compatible with a specific Python wheel. See #1317 (comment)

We could indeed rely on the built-in reduction for some combinations. But these combinations are actually the ones that can be implemented with tf.keras.losses.{Sparse}CategoricalCrossentropy.

I'll let you do what you think is best here.

@gabrieldemarmiesse gabrieldemarmiesse self-assigned this Mar 25, 2020
@seanpmorgan
Member

Thanks @guillaumekln!

Just wanted to note that this is the last component of #1320 and will block a release built with TF2.2 once it is released.

@qlzh727 Could you review when time allows and advise how best to handle this change in 2.2?

@seanpmorgan seanpmorgan requested a review from qlzh727 March 26, 2020 01:58
@qlzh727
Member

qlzh727 commented Mar 26, 2020

Adding @pavithrasv, who works on the Keras losses, for more input.

@guillaumekln
Contributor Author

guillaumekln commented Mar 27, 2020

I'll let you do what you think is best here.

@gabrieldemarmiesse I propose to just make this a bug fix PR and try not to change the behavior. Maybe @pavithrasv has more recommendations on how to deal with custom loss reduction.

@guillaumekln guillaumekln marked this pull request as ready for review March 27, 2020 08:42
@gabrieldemarmiesse
Member

Let's merge it as it's a good step forward. We can open an issue to see how to replace the `__call__` usage, which relies on a private Keras API.

@gabrieldemarmiesse (Member) left a comment

Thanks again for the fix!

@gabrieldemarmiesse gabrieldemarmiesse merged commit b8cd9bf into tensorflow:master Mar 27, 2020
@guillaumekln guillaumekln deleted the sequence-loss-tf2.2-compat branch March 27, 2020 10:34
jrruijli pushed a commit to jrruijli/addons that referenced this pull request Dec 23, 2020
* Update SequenceLoss for TensorFlow 2.2 compatibility
* Fix format