Normalisation for RGB imshow #1819

Zac-HD · 2018-01-11T11:09:12Z

Follow-up to #1796, where normalisation and clipping of RGB[A] values were deferred so that we could match any upstream API. matplotlib/matplotlib#10220 implements clipping to the valid range, but a strong consensus against RGB normalisation in matplotlib has emerged.

This pull therefore implements normalisation, and clips values only where our normalisation has pushed them out of range.

Zac-HD · 2018-01-11T11:10:22Z

CC @shoyer - hopefully I'm in time for Xarray 0.10.1 😉

jklymak · 2018-01-11T17:10:47Z

xarray/plot/plot.py

+                    q=[ROBUST_PERCENTILE, 1 - ROBUST_PERCENTILE])
+                vmin = vmin if vmin is not None else vmin_tmp
+                vmax = vmax if vmax is not None else vmax_tmp
+                del vmin_tmp, vmax__tmp


typo. vmax_tmp

jklymak · 2018-01-11T17:15:11Z

xarray/plot/plot.py

+            if vmin is not None or vmax is not None:
+                vmin = vmin if vmin is not None else darray.min()
+                vmax = vmax if vmax is not None else darray.max()
+                darray = (darray.astype(np.float32) - vmin) / (vmax - vmin)


We ran into dynamic range issues w/ uint32 being converted to float32, particularly as some people use an extreme high value as a bad-data flag. matplotlib/matplotlib#10133

jklymak · 2018-01-11T17:17:00Z

xarray/plot/plot.py

@@ -625,6 +641,10 @@ def imshow(x, y, z, ax, **kwargs):
    dimension can be interpreted as RGB or RGBA color channels and
    allows this dimension to be specified via the kwarg ``rgb=``.

+    Unlike matplotlib, Xarray can apply ``vmin`` and ``vmax`` to RGB or RGBA
+    data, by applying a single scaling factor and offset to all bands.
+    Passing  ``robust=True`` infers ``vmin`` and ``vmax`` in the usual way.


... infers vmin and vmax from the 96-th percentile of the data distribution. (Or reference the “usual way” somehow).

jhamman

Looks good, thanks for this. Can you add an additional note to the whats new page in the docs?

jhamman · 2018-01-11T17:24:30Z

xarray/plot/plot.py

+                vmin = vmin if vmin is not None else vmin_tmp
+                vmax = vmax if vmax is not None else vmax_tmp
+                del vmin_tmp, vmax__tmp
+                robust=False


2 nits: 1) let garbage collection handle deleting your temp variables, 2) space in robust = False

I think we use locals() somewhere below, which is why this is necessary (but yes it's pretty ugly)

jhamman · 2018-01-11T17:25:11Z

xarray/tests/test_plot.py

+        # Test for crash when normalising RGB data.  (further tests require a
+        # reference image, as the args used to be silently ignored instead)
+        da = easy_array((5, 5, 3), start=-0.6, stop=1.4)
+        da.plot.imshow(**kwds)


I think you can test that the range of the values in this plot are correct.

jhamman · 2018-01-11T17:27:22Z

xarray/plot/plot.py

+                vmin_tmp, vmax_tmp = darray.quantile(
+                    q=[ROBUST_PERCENTILE, 1 - ROBUST_PERCENTILE])
+                vmin = vmin if vmin is not None else vmin_tmp
+                vmax = vmax if vmax is not None else vmax_tmp


perhaps we should skip the quantile calculation when both vmin/vmax are set. Then again, if robust is True, the behavior is poorly defined and maybe an error would be better.

jhamman · 2018-01-13T22:27:03Z

xarray/plot/plot.py

+                vmin, vmax = None, None
+                # There's a cyclic dependency via DataArray, so we can't
+                # import xarray.ufuncs in global or outer scope.
+                import xarray.ufuncs as xu


we should be able to fix this.

jhamman · 2018-01-13T22:30:46Z

xarray/plot/plot.py

+            if vmin is not None or vmax is not None:
+                vmin = vmin if vmin is not None else darray.min()
+                vmax = vmax if vmax is not None else darray.max()
+                darray = darray.astype('f4' if vmax - vmin < 2 ** 32 else 'f8')


I think you can use np.finfo('f4').max here instead of 2**32.

I think it was uint32 that gave us trouble on Matplotlib. i.e. way less than f4.max (I think?). See matplotlib/matplotlib#10133. I was too lazy to figure out the actual limit (I guess its 2^N, where N is how many bits a float has dynamic range over. I think its 32-1-8 = 2**23?

So you can use np.iinfo instead:

Out[12]: iinfo(min=0, max=4294967295, dtype=uint32)

Yeah, but thats not the problem. The problem is that if you convert (a-vmin)/(1e9) then you can't tell the difference between a=1 and a=2 anymore if you only use float32.

This came up in Matplotlib because the conversion to float happened with an automatic vmin and vmax, and the user had a large value as a bad-data flag. Then the user subsequently set vmin - vmax = 10 or so, but the convert to float32 had erased all the dynamic between 0 and 10.

Maybe xarray doesn't have this problem because you trust the user to set vmin and vmax during the imshow call. But if someone grabs the image handle and tries to set the limits after the call they could have this problem.

The more deeply I analyse matplotlib/matplotlib#10133, the more suspicious I get - there are actually two places you could get loss of precision: subtracting vmin (if <<0, loss of precision for high values), and dividing by (vmin - vmax) (loss if vmax is large).

I think for Xarray, the best option is actually just to use np.float64 in all cases. We already have a habit of upcasting all the time anyway*, and if that's lossy you're in trouble anyway.

* (including uint16 to float64 when decoding variables from a file, which is silly - issue incoming for that)

vmin and A are still ints when vmin is removed from A.

In which case you are vulnerable to underflow, and hence wraparound of values - that's why I cast to float64 here first!

Not sure I follow.

The case we were running into was

A = np.array( [[1, 2, 1e9], [3, 4, 5]], dtype=np.uint32) im = plt.imshow(A) im.set_clim([0., 5.])

When the data in the image is first input, vmin = 1 and vmax = 1e9, and the underflow occurred when we cast to np.float32, which we did to help accommodate very large images. Hence the check on vmax - vmin to see if it is larger than 1e8 and to use np.float64 for the case instead.

Casting to float64 first is great but apparently people sometimes have huge images, hence the extra rigamarole w/ using float32 if possible.

vmin and A are still ints when vmin is removed from A.

Going and reading the code again, it's not - a_min is subtracted from A_scaled, which has a floating dtype. In which case you're also safe from overflow/underflow, and have just added handling for loss of precision 😄

I remain happy with this solution - for Xarray, including passing non-RGB images directly to matplotlib - and suggest we take any further discussion to the upstream pull.

Zac-HD · 2018-01-14T23:08:16Z

@jhamman - all done 😄

jhamman · 2018-01-15T00:07:54Z

xarray/plot/plot.py

+                # There's a cyclic dependency via DataArray, so we can't
+                # import xarray.ufuncs in global or outer scope.
+                import xarray.ufuncs as xu
+                darray = xu.minimum(xu.maximum(darray.astype('f4'), 0), 1)


do we need to type cast this back to a 4byte float?

@shoyer - do you have a solution off had for the cyclic import problem here?

We don't need to downcast, in that it will work even if we don't, but downcasting ~halves memory use further inside matplotlib which can be a 5-10x multiple of the input data size. Happy to remove if you don't think the performance hit is worth it.

The best alternative solution I can see is refactoring xarray.ufuncs to do all the imports inside a function or class, and pay the performance cost when creating ufuncs and slightly increased overhead in dispatch. Personally, I prefer a slight slowdown when creating scaled RGB images - much less common, and it's probably negligible given the array operations involved anyway.

I'm happy with downcasting, let's just add a comment to explain why.

I don't think we can avoid the cycle import issue. This is a downside of our method heavy design. However, it would be nice if we moved the import statement to the top of the method.

Zac-HD · 2018-01-17T01:44:45Z

Ping @shoyer, @jhamman - I think we're out of blockers and ready to merge 😄

shoyer · 2018-01-17T21:38:27Z

xarray/plot/plot.py

+            # Scale interval [vmin .. vmax] to [0 .. 1] and clip to bounds
+            if vmin is not None or vmax is not None:
+                vmin = vmin if vmin is not None else darray.min()
+                vmax = vmax if vmax is not None else darray.max()


Shouldn't the default here for vmin/vmax be 0/1 rather than the min/max values of the array? That would seem a little less surprising to me. Otherwise setting one of vmin/vmax would cause the other to be automatically set which seems strange.

shoyer · 2018-01-17T21:40:37Z

xarray/plot/plot.py

+                # There's a cyclic dependency via DataArray, so we can't
+                # import xarray.ufuncs in global or outer scope.
+                import xarray.ufuncs as xu
+                darray = xu.minimum(xu.maximum(darray.astype('f4'), 0), 1)


I'm happy with downcasting, let's just add a comment to explain why.

I don't think we can avoid the cycle import issue. This is a downside of our method heavy design. However, it would be nice if we moved the import statement to the top of the method.

shoyer · 2018-01-17T21:42:11Z

xarray/plot/plot.py

@@ -449,10 +449,28 @@ def newplotfunc(darray, x=None, y=None, figsize=None, size=None,
        if imshow_rgb:
            # Don't add a colorbar when showing an image with explicit colors
            add_colorbar = False
+            # Calculate vmin and vmax automatically for `robust=True`


Can we move this logic to a separate helper function, e.g., _rescale_imshow_rgb(darray, vmin, vmax, robust)? This method is already getting pretty long.

Zac-HD · 2018-01-18T01:54:35Z

Method extracted, default for missing bounds fixed, comment added, import moved. Also added a check that you haven't accidentally reversed the bounds while supplying only one.

shoyer · 2018-01-18T02:04:14Z

xarray/plot/plot.py

+    elif vmax is None:
+        vmax = 255 if np.issubdtype(darray.dtype, np.integer) else 1
+        if vmax < vmin:
+            raise ValueError(


These error checks look great, thanks! Can you add a test that covers them?

shoyer

Looks good to me. @jhamman any further concerns?

Zac-HD · 2018-01-19T00:48:21Z

Ping @jhamman?

jhamman · 2018-01-19T05:01:19Z

In it goes. Thanks @Zac-HD.

jhamman mentioned this pull request Jan 11, 2018

v0.10.1 Release #1821

Closed

5 tasks

jklymak reviewed Jan 11, 2018

View reviewed changes

jhamman requested changes Jan 11, 2018

View reviewed changes

Zac-HD force-pushed the rgb-maps branch 5 times, most recently from 007ebff to 4d3a1b7 Compare January 12, 2018 13:51

jhamman reviewed Jan 13, 2018

View reviewed changes

Zac-HD force-pushed the rgb-maps branch from 4d3a1b7 to 83f567c Compare January 14, 2018 13:45

jhamman reviewed Jan 15, 2018

View reviewed changes

shoyer reviewed Jan 17, 2018

View reviewed changes

Zac-HD force-pushed the rgb-maps branch from 83f567c to a1f475d Compare January 17, 2018 23:09

Normalisation for RGB imshow

ac540b1

Zac-HD force-pushed the rgb-maps branch from a1f475d to ac540b1 Compare January 18, 2018 01:52

shoyer reviewed Jan 18, 2018

View reviewed changes

Add test for error checking

8a2bdfc

shoyer approved these changes Jan 18, 2018

View reviewed changes

jhamman approved these changes Jan 19, 2018

View reviewed changes

jhamman merged commit 6aa225f into pydata:master Jan 19, 2018

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Normalisation for RGB imshow #1819

Normalisation for RGB imshow #1819

Zac-HD commented Jan 11, 2018

Zac-HD commented Jan 11, 2018

jklymak Jan 11, 2018

jklymak Jan 11, 2018

jklymak Jan 11, 2018

jhamman left a comment

jhamman Jan 11, 2018

shoyer Jan 11, 2018

jhamman Jan 11, 2018

jhamman Jan 11, 2018

jhamman Jan 13, 2018

jhamman Jan 13, 2018

jklymak Jan 13, 2018

jhamman Jan 13, 2018

jklymak Jan 13, 2018

Zac-HD Jan 14, 2018 •

edited

Loading

jklymak Jan 14, 2018

Zac-HD Jan 14, 2018

jklymak Jan 14, 2018

Zac-HD Jan 14, 2018

Zac-HD commented Jan 14, 2018

jhamman Jan 15, 2018

jhamman Jan 15, 2018

Zac-HD Jan 15, 2018 •

edited

Loading

shoyer Jan 17, 2018

Zac-HD commented Jan 17, 2018

shoyer Jan 17, 2018

shoyer Jan 17, 2018

shoyer Jan 17, 2018

Zac-HD commented Jan 18, 2018

shoyer Jan 18, 2018

Zac-HD Jan 18, 2018

shoyer left a comment

Zac-HD commented Jan 19, 2018

jhamman commented Jan 19, 2018

Normalisation for RGB imshow #1819

Normalisation for RGB imshow #1819

Conversation

Zac-HD commented Jan 11, 2018

Zac-HD commented Jan 11, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jhamman left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD Jan 14, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD commented Jan 14, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD Jan 15, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD commented Jan 17, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD commented Jan 18, 2018

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shoyer left a comment

Choose a reason for hiding this comment

Zac-HD commented Jan 19, 2018

jhamman commented Jan 19, 2018

Zac-HD Jan 14, 2018 •

edited

Loading

Zac-HD Jan 15, 2018 •

edited

Loading