introduce new norm operator that can differentiate through lambda x: norm(x-x) + lighter tests across the board (#411)
Conversation
also of interest to @theouscidda6 as potentially useful in Monge gap.
# Test div of x to itself close to 0.
# Check differentiability of Sinkhorn divergence works, without NaN's.
grad = jax.grad(lambda x: div(x).divergence)(x)
assert jnp.all(jnp.logical_not(jnp.isnan(grad)))
Use np.testing...
not sure I can find an adequate test in there, https://numpy.org/doc/stable/reference/routines.testing.html
You can use np.testing.assert_array_equal(jnp.isnan(grad), False).
thanks Michal!
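For context, here is a minimal self-contained sketch of the two assertion styles on a toy gradient; the actual test checks the gradient of the Sinkhorn divergence `div` from the fixture above, and the toy function here is only an illustration:

```python
import jax
import jax.numpy as jnp
import numpy as np

# Toy stand-in for the divergence gradient computed in the test above.
grad = jax.grad(lambda x: jnp.sum(x ** 2))(jnp.ones(3))

# Original style: a bare assert, with no diagnostic output on failure.
assert jnp.all(jnp.logical_not(jnp.isnan(grad)))

# Suggested style: np.testing reports which entries are NaN if it fails.
np.testing.assert_array_equal(jnp.isnan(grad), False)
```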
Codecov Report
Additional details and impacted files

@@            Coverage Diff             @@
##             main     #411      +/-   ##
==========================================
- Coverage   91.51%   91.44%    -0.07%
==========================================
  Files          54       54
  Lines        5891     5902      +11
  Branches      856      860       +4
==========================================
+ Hits         5391     5397       +6
- Misses        364      370       +6
+ Partials      136      135       -1
…x: norm(x-x)` + lighter tests across the board (#411)

* introduce new norm operator
* add
* resolve name conflict in (math).utils_test.py
* docs
* vjp -> jvp
* fix
* add axis in test
* type error
* speed up some tests
* fix
* fix
* fix apply_jacobian test
* fix mem size
* batch size
* yet another mem fix
* minor fixes + add in docs
* minor docs fixes
* docs
Computing the gradient of Sinkhorn divergences using the Euclidean cost results in NaN values, because of the reliance on jnp.linalg.norm and, in particular, differentiation of the distance of a point against itself, e.g. norm(x-x) in the diagonal of a symmetric cost matrix. The fact that such a gradient is (and should be, in general) NaN is well documented, see e.g. jax-ml/jax#6484. However, in the context of OT this poses problems, since it is safe to ignore these contributions and therefore treat them as having 0 gradient.

This PR introduces a new norm function that does not blow up with a NaN at 0. As a result, it should now be possible to differentiate through a sinkhorn_divergence with a more elaborate cost without producing NaN's.

Also, in order to speed things up in CI, prune out some tests.
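To make the failure mode concrete: the first gradient below is NaN because the backward pass of the Euclidean norm divides by norm(x - x) == 0, while the second uses the common double-where workaround to obtain a 0 gradient at 0. This is only a sketch of the idea; `safe_norm` is a hypothetical name and not necessarily the implementation added in this PR.

```python
import jax
import jax.numpy as jnp

# Plain Euclidean norm: its gradient at 0 is NaN (0 * inf in the backward pass).
nan_grad = jax.grad(lambda x: jnp.linalg.norm(x - x))(jnp.ones(3))
print(nan_grad)  # [nan nan nan]


def safe_norm(x, axis=None):
  """Euclidean norm whose gradient is 0 (rather than NaN) at x == 0."""
  sq = jnp.sum(x ** 2, axis=axis)
  # Inner where: keep sqrt away from 0 so its derivative stays finite.
  # Outer where: still return exactly 0 for a zero input.
  return jnp.where(sq == 0.0, 0.0, jnp.sqrt(jnp.where(sq == 0.0, 1.0, sq)))


zero_grad = jax.grad(lambda x: safe_norm(x - x))(jnp.ones(3))
print(zero_grad)  # [0. 0. 0.]
```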