Support RGB[A] arrays in plot.imshow() #1796

Zac-HD · 2017-12-20T13:43:16Z

Tests added (for all bug fixes or enhancements)
Tests passed (for all non-documentation changes)
Passes git diff upstream/master **/*py | flake8 --diff
Fully documented, including whats-new.rst for all changes

This patch brings xarray.plot.imshow up to parity with matplotlib.pyplot.imshow:

As well as 2D images (greyscale / luminance, using a colormap), it now supports a third dimension for RGB or RGBA channels. For consistency with 2D arrays, missing data is plotted as transparent pixels
Being Xarray, users need not care about the order of their dimensions - we infer the right one for color, and warn if it's ambiguous.
~~Using robust=True for easy saturation is really nice. Having it adjust each channel and facet in the same way is essential for this to work, which it does.~~
~~Matplotlib wraps out-of-range colors, leading to crazy maps and serious interpretation problems if it's only a small region. Xarray clips (ie saturates) to the valid range instead.~~

I'm going to implement clip-to-range and color normalization upstream in matplotlib, then open a second PR here so that Xarray can use the same interface.

And that's the commit log! It's not really a big feature, but each of the parts can be fiddly so I've broken
the commits up logically 😄

Finally, a motivating example: visible-light Landsat data before, during (top-right), and after a fire at Sampson's Flat, Australia:

arr = ds['red green blue'.split()].to_array(dim='band') / (2 ** 12)
arr.plot.imshow(col='time', col_wrap=5, robust=True)

rabernat · 2017-12-20T14:17:34Z

xarray/tests/test_plot.py

+        DataArray(
+            easy_array((10, 15, 3), start=0),
+            dims=['y', 'x', 'band'],
+        ).plot.imshow()


Is it possible to check that the colors have actually rendered correctly by introspecting deeper into the figures / axes that are generated by imshow?

Well, it's almost certainly possible - but I have absolutely no idea where to start!

I'm okay relying on matplotlib to plot colors properly. One sane option would be to mock plt.imshow to ensure it gets called properly, but we don't bother with that for our current tests. But that shouldn't get in the way of integration tests that verify that we can actually call matplotlib without any errors being raised.

shoyer · 2017-12-20T15:27:39Z

I like this. My main concern is that I'd like a fully explicit way to opt into this behavior. Perhaps rgb='band' and/or rgba='band'?

Zac-HD · 2017-12-21T00:51:35Z

I like this. My main concern is that I'd like a fully explicit way to opt into this behavior. Perhaps rgb='band' and/or rgba='band'?

I have a pretty strong preference for RGB images 'just working' - requiring a new argument would mean imshow has a different signature to all other plot methods without adding any information. If it's an optional argument, it barely does anything - in the rare case where the heuristic fails, you can supply x and y dims instead.

In many ways showing an RGB image is a special case, but this is the least-special-case I could make work - and it's consistent with the obvious meaning and upstream behavior of imshow.

shoyer · 2017-12-21T01:32:10Z

I should clarify: I don't think we should require the explicit name for the RGB dimension, but I think we should have an argument around so that's an option if desired. This is nice because it gives better error messages when things go wrong and makes it easier to write self-documented code. This would be consistent with the rest of our visualization functions, which provide an option for being fully explicit but don't require it.

Zac-HD · 2017-12-21T05:21:41Z

Optional argument coming up then 😄

Anything else I need to do?

Zac-HD · 2017-12-21T13:31:28Z

Done - now with optional argument and nicer commit history, Anything else?

shoyer · 2017-12-24T02:48:20Z

xarray/plot/plot.py

+                robust = False
+                del flat
+            # Clip range to [0, 1] to avoid visual artefacts
+            darray.values[:] = np.clip(darray.values, 0, 1)


Can you verify that this doesn't modify the DataArray object on which the plot method is being called?

Yep, I've tossed in a .copy(deep=True) to guarantee that there's no mutation.

shoyer · 2017-12-24T03:01:57Z

xarray/plot/plot.py

+                    # All data will be masked, so skip percentile calculation
+                    vmin, vmax = 0, 1
+                if vmin is None:
+                    vmin = np.percentile(flat, ROBUST_PERCENTILE)


I'm not quite sure that this handling of the robust argument makes sense. Does this resemble what vmin/vmax do by default on imshow?

Parts of this that seem strange to me:

robust=True does more than merely clipping the data to the robust range, unlike the behavior with colormaps. The data gets extra normalization, even if it all already fell within the central 95%.

all channels are treated together, rather than independently, including the alpha channel!

I'm sure this behavior is useful sometimes, but maybe it's better to leave it out for now. We could make robust=True an error for now, and potentially save color normalization for another keyword argument.

This logic makes robust=True work exactly the same way on single-band and RGB greyscale images. The handling is only distinct because the 'normalise to [0, 1]' step is implicit for single bands but must be explicit for RGB. Scaling each band independently would completely change the colour of the image, which is almost never what you'd want to do.

~~It does make sense to normalise RGB without considering or changing the alpha channel though; I'll play with this a little more.~~ I can't think of a way to describe this that isn't "here's a weird edge case...", and the implementation would be even worse.

As to usefulness, it's basically essential to sensibly visualise surface reflectance data (ie high-quality satellite images) - they're represented for analysis as albedo, ie fraction of light reflected, and if you display that directly it's a very dark and muddy image (example below; same scene as above). So I'd strongly prefer to keep this in!

This still feels strange to me. With colormaps, all robust=True does is explicitly set vmin and vmax based on the data. But vmin and vmax are entirely ignored here.

What if we interpreted vmin/vmax with rgb plots as setting the bounds for normalization? The defaults would be (0, 1) for floats and (smallest int, largest int) for int dtypes. Then, robust=True could again simply set vmin/vmax based on the data.

I actually agree that the current behavior is strange and that your suggestion is an improvement (though I'd use [0, 255] for all ints) - but it's consistent with matplotlib! My proposed plan has some extra steps though:

Merge this pull. We will need the stretching machinery regardless, unless Xarray drops support for matplotlib<2.2

Patch imshow doesn't normalize the color range in RGB images matplotlib/matplotlib#9391 upstream. (I'm already working on a patch for this)

Implement the new matplotlib interface in Xarray. I could do that in this pull, but then if the matplotlib implementation doesn't match we'd have to break API compatibility - I'd really prefer to have a smaller API now and get it right the first time, later.

I could do that in this pull, but then if the matplotlib implementation doesn't match we'd have to break API compatibility - I'd really prefer to have a smaller API now and get it right the first time, later.

I agree. But I'm also concerned about the behavior for robust=True potentially needing to change to match upstream.

My preference would be to leave out support for robust with RGB plots until the matplotlib behavior lands. You could write a small helper function for RGB normalization (maybe even it include it in the docs as an example), e.g.,

import xarray.ufuncs as xu def robust_stretch(data): vmin, vmax = data.quantile(q=[0.02, 0.98]) bounded = xu.minimum(xu.maximum(data, vmin), vmax) returned (bounded - vmin) / (vmax - vmin)

Then you could plot things like arr.pipe(robust_stretch).plot.imshow(col='time', col_wrap=5).

It's a few more characters but not a terrible work around.

shoyer · 2017-12-24T03:03:40Z

xarray/plot/plot.py

@@ -446,12 +468,13 @@ def newplotfunc(darray, x=None, y=None, figsize=None, size=None,
                          "Use colors keyword instead.",
                          DeprecationWarning, stacklevel=3)



Can you add rgb as an actual argument inline, after x and y?

I could, but this would be a breaking change for anyone using positional arguments. I'd therefore prefer not to, as it's just as easy to do the breaking bit later.

Please confirm if you'd like the breaking change; otherwise I'll leave it as-is.

shoyer · 2017-12-24T03:05:53Z

xarray/plot/utils.py

    """
+    if imshow and darray.ndim == 3:
+        return _infer_xy_labels_3d(darray, x, y, rgb)


Please make sure this raises an informative error message if RGB is provided and the input has the wrong number of dimensions or isn't being plotted with imshow.

shoyer · 2017-12-24T03:11:49Z

xarray/plot/utils.py

+    """
+    Determine x and y labels for showing RGB images.
+
+    Attempts to infer which dimension is RGB/RGBA by size and order of dims.


Rather than all this complex logic with warnings in ambiguous cases, why not always treat the last and/or remaining (after explicit x/y labels) dimension as RGB? I think that solves the convenience use cases, without hard to understand/predict inference logic.

shoyer · 2017-12-24T03:13:20Z

xarray/tests/test_plot.py

+        # but doesn't warn if dimensions specified
+        arr.plot.imshow(rgb='band')
+        arr.plot.imshow(x='x', y='y')
+


Please add tests for errors related to rgb.

shoyer · 2017-12-24T03:16:34Z

xarray/tests/test_plot.py

+        DataArray(
+            easy_array((10, 15, 3), start=0),
+            dims=['y', 'x', 'band'],
+        ).plot.imshow()


I'm okay relying on matplotlib to plot colors properly. One sane option would be to mock plt.imshow to ensure it gets called properly, but we don't bother with that for our current tests. But that shouldn't get in the way of integration tests that verify that we can actually call matplotlib without any errors being raised.

shoyer · 2017-12-24T03:19:57Z

xarray/plot/plot.py

+    In this case, ``robust=True`` will saturate the image in the
+    usual way, consistenly between all bands and facets.
+
+    This method will clip oversaturated pixels to the valid range,


Is this actually strictly better? If so, why didn't it get changed during matplotlib's recent style overhaup? Was it simply overlooked or a more careful decision?

IMO it is strictly better, because it avoids the ambiguity a colour could actually represent any value at all modulo the range.

I believe it was overlooked because relatively few people start with crudely-scaled RGB images, though I wasn't involved closely enough to know for sure (or fix it then!). Unfortunately such "scaling" is pretty common to go from albedo to useful dynamic range in imagery, as I mentioned about robust above. Reference: matplotlib/matplotlib#9391 (and links from that issue).

Zac-HD · 2017-12-24T09:08:10Z

Thanks for the review @shoyer! I'll do what I can in the next few days, but that might not be much at all before I get back from a no-internet camping trip around Jan 8th. Items:

add validation and tests for rgb argument
don't mutate the input data! (I think it doesn't, but obviously this needs to be checked)
simplify order heuristic, to "use last dimension of size 3 or 4 as rgb; warn if there are multiple possible"
update rasterio example in gallery

fmaussion · 2018-01-03T14:51:46Z

I'll add another todo (can be done in a separate PR if you don't have time ;-): modify the gallery example to plot the RGB image instead of the BW with our rasterio example here.

Zac-HD · 2018-01-06T12:18:56Z

OK, @shoyer & @fmaussion - I think I'm done again!

The only thing I haven't done is change the handling of robust=True, as anything beyond "scale and clip 2nd-to-98th percentiles to [0, 1]" seems fraught with complications. If this is not what the user intended, it's at least the obvious default behaviour and should be simple to understand.

Hopefully this covers the substantial changes; but let me know if there's anything else as I'd really like this merged before the 0.10.1 release - and ideally released in time for demos at my summer school in January 😄

shoyer · 2018-01-09T01:50:45Z

xarray/plot/plot.py

@@ -445,10 +445,38 @@ def newplotfunc(darray, x=None, y=None, figsize=None, size=None,
        # Decide on a default for the colorbar before facetgrids
        if add_colorbar is None:
            add_colorbar = plotfunc.__name__ != 'contour'
+        imshow_rgb = plotfunc.__name__ == 'imshow' and \


nit: per PEP8, please use parentheses to group this expression instead of an explicit line continuation.

shoyer · 2018-01-09T01:55:28Z

xarray/plot/utils.py

+    assert darray.ndim == 3
+    not_none = [a for a in (x, y, rgb) if a is not None]
+    if len(set(not_none)) < len(not_none):
+        raise ValueError('Dimensions passed as x, y, and rgb must be unique.')


can you include the values for these dimensions in the error message?

shoyer · 2018-01-09T02:12:05Z

xarray/plot/plot.py

+        # there isn't one, and set it to transparent where data is masked.
+        if z.shape[-1] == 3:
+            z = np.ma.concatenate((z, np.ma.ones(z.shape[:2] + (1,))), 2)
+        z[np.sum(z.mask, axis=-1, dtype=bool),-1] = 0


We're sure this doesn't mutate the inputs, right?

style note: add a space after the comma.

I think this would also be clearer as with np.any(z.mask, axis=-1) rather than np.sum.

shoyer · 2018-01-09T06:03:30Z

xarray/plot/plot.py

+                    # All data will be masked, so skip percentile calculation
+                    vmin, vmax = 0, 1
+                if vmin is None:
+                    vmin = np.percentile(flat, ROBUST_PERCENTILE)


This still feels strange to me. With colormaps, all robust=True does is explicitly set vmin and vmax based on the data. But vmin and vmax are entirely ignored here.

What if we interpreted vmin/vmax with rgb plots as setting the bounds for normalization? The defaults would be (0, 1) for floats and (smallest int, largest int) for int dtypes. Then, robust=True could again simply set vmin/vmax based on the data.

Zac-HD · 2018-01-09T09:07:58Z

More review fixes 😄

I'm pretty sure the build failure is just Travis being flaky - it passes on my machine. Could someone restart it?

fmaussion · 2018-01-09T09:51:17Z

Just restarted them, let's see how it goes

Zac-HD · 2018-01-09T12:38:28Z

Okay... I've now seen three Travis runs. In every one there's been a pass, a fail, and three errors... but different jobs each time. At this point I'm ready to give up and wait for the Travis team to fix it 😕

The code is ready to go though! 🎉

Zac-HD · 2018-01-10T04:42:44Z

@shoyer makes some good points about things that should be fixed upstream in Matplotlib - namely normalization of RGB images, but I'm also going to move my color fixes (clipping instead of modulo) upstream instead. Timeline:

I edit this PR, it gets merged, Xarray 10.1 (ETA soonish?) can display RGB images just like Matplotlib
I finish my upstream PR, Matplotlib 2.2 (ETA later this month (!)) supports normalization of RGB images and fixes the modulo/clip bug
Once the matplotlib API is merged, I make a new PR against Xarray to support the same interface (including on older matplotlib versions)

shoyer · 2018-01-10T04:45:51Z

@Zac-HD Sounds good. At this point I don't think there are any blockers left for the 0.10.1 release (I wanted to get the pydap fix in, which we recently merged).

Zac-HD · 2018-01-10T07:14:16Z

Looks like TestDatetimePlot.test_datetime_line_plot is failing, but I'm pretty sure that's nothing to do with me (and it passed twice on Travis!).

shoyer · 2018-01-10T07:14:39Z

Should be fixed by #1814 which I just merged

shoyer

This looks good to me now. If there are no other comments I will merge tomorrow.

Includes new `rgb` keyword to tell imshow about that dimension, and much error handling in inference.

shoyer · 2018-01-11T03:14:41Z

Thanks @Zac-HD !

rabernat reviewed Dec 20, 2017

View reviewed changes

Zac-HD force-pushed the rgb-maps branch 4 times, most recently from b378a48 to 540300d Compare December 21, 2017 12:45

shoyer reviewed Dec 24, 2017

View reviewed changes

Zac-HD force-pushed the rgb-maps branch from 540300d to c12612d Compare January 6, 2018 12:05

shoyer reviewed Jan 9, 2018

View reviewed changes

Zac-HD force-pushed the rgb-maps branch from e2b5580 to cfa59fb Compare January 9, 2018 12:13

Zac-HD force-pushed the rgb-maps branch from cfa59fb to ca7d1ed Compare January 10, 2018 06:34

shoyer approved these changes Jan 10, 2018

View reviewed changes

Zac-HD added 4 commits January 10, 2018 22:43

Allow RGB plots from DataArray.plot.imshow

32a896c

Allow RGB[A] dim for imshow to be in any order

7cabea9

Includes new `rgb` keyword to tell imshow about that dimension, and much error handling in inference.

Use true RGB color for Rasterio gallery page

90ca3b1

Add whats-new entry

868f9ea

Zac-HD force-pushed the rgb-maps branch from ca7d1ed to 868f9ea Compare January 10, 2018 11:43

Zac-HD mentioned this pull request Jan 10, 2018

Clip RGB data to valid range for imshow matplotlib/matplotlib#10220

Merged

5 tasks

shoyer merged commit 289f95a into pydata:master Jan 11, 2018

Zac-HD deleted the rgb-maps branch January 11, 2018 03:20

Zac-HD mentioned this pull request Jan 11, 2018

Normalisation for RGB imshow #1819

Merged

This was referenced Feb 2, 2018

Should imshow() recognise 0-255 images? #1880

Closed

Fix rasterio example in docs #1881

Merged

Zac-HD mentioned this pull request Feb 25, 2018

v0.10.1 Release #1821

Closed

5 tasks

		@@ -446,12 +468,13 @@ def newplotfunc(darray, x=None, y=None, figsize=None, size=None,
		"Use colors keyword instead.",
		DeprecationWarning, stacklevel=3)

Support RGB[A] arrays in plot.imshow() #1796

Support RGB[A] arrays in plot.imshow() #1796

Conversation

Zac-HD commented Dec 20, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

shoyer commented Dec 20, 2017

Zac-HD commented Dec 21, 2017 • edited Loading

shoyer commented Dec 21, 2017

Zac-HD commented Dec 21, 2017

Zac-HD commented Dec 21, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD Dec 24, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD Dec 24, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD Dec 24, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD Dec 24, 2017 • edited Loading

Choose a reason for hiding this comment

Zac-HD commented Dec 24, 2017 • edited Loading

fmaussion commented Jan 3, 2018

Zac-HD commented Jan 6, 2018 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Zac-HD commented Jan 9, 2018

fmaussion commented Jan 9, 2018

Zac-HD commented Jan 9, 2018

Zac-HD commented Jan 10, 2018 • edited Loading

shoyer commented Jan 10, 2018

Zac-HD commented Jan 10, 2018

shoyer commented Jan 10, 2018

shoyer left a comment

Choose a reason for hiding this comment

shoyer commented Jan 11, 2018

Zac-HD commented Dec 20, 2017 •

edited

Loading

Zac-HD commented Dec 21, 2017 •

edited

Loading

Zac-HD Dec 24, 2017 •

edited

Loading

Zac-HD Dec 24, 2017 •

edited

Loading

Zac-HD Dec 24, 2017 •

edited

Loading

Zac-HD Dec 24, 2017 •

edited

Loading

Zac-HD commented Dec 24, 2017 •

edited

Loading

Zac-HD commented Jan 6, 2018 •

edited

Loading

Zac-HD commented Jan 10, 2018 •

edited

Loading