ENH: Scatter plots of one variable vs another (#2277)
* initial commit

* formatting

* fix bug

* refactor map_scatter

* colorbar

* formatting

* refactor

* refactor _infer_data

* added tests

* minor formatting

* fixed tests

* Refactor out to dataset_plot.py + move utilities to utils.py

* Fix tests.

* Fixes.

* discrete_legend → add_colorbar

* Revert "discrete_legend → add_colorbar"

This reverts commit d3e1308.

* Only use scatter instead of alternating between scatter and plot.

* Create and use plot.utils._add_colorbar

* fix tests.

* More fixes to hue, cmap_kwargs.

* doc fixes.

* Dataset plotting docs.

* group existing docs under "DataArrays."

* bugfix.

* Fix.

* Add whats-new

* Add api.rst.

* Add hue_style.

* Update tests.

* cleanup imports.

* facetgrid: Refactor out cmap_params, cbar_kwargs processing

* Dataset.plot.scatter obeys cmap_params, cbar_kwargs.

* _determine_cmap_params supports datetime64

* dataset.plot.scatter supports hue=datetime64, timedelta64

* pep8

* Update docs.

* bugfix: facetgrid now uses hue_style

* minor fixes.

* Scatter docs

* Refactor out more code to utils.py

* map_scatter → map_dataset

* Use some wrapping magic to generalize code.

* Add hist as test of generalization.

* Get facetgrid working again

* Refactor out utility functions.

* facetgrid refactor

1. refactor out _easy_facetgrid
2. Combine map_dataarray_line with map_dataarray

* flake8

* Refactor out colorbar making to plot.utils._add_colorbar

* Refactor out cmap_params, cbar_kwargs processing

* Back to map_dataarray_line

* lint

* small rename

* review comment.

* Bugfix merge

* hue, hue_style aren't needed for all functions.

* lint

* Use _process_cmap_cbar_kwargs.

* Update whats-new

* Some doc fixes.

* Fix tests?

* another attempt to fix tests.

* small

* remove py2 line

* remove extra _infer_line_data

* Use _is_facetgrid flag.

* Revert "_determine_cmap_params supports datetime64"

This reverts commit 0a01e7c.

* Remove datetime/timedelta hue support

* _meta_data → meta_data.

* isort

* Add doc line

* Switch to add_guide.

* Save hist for a future PR.

* rename _numeric to _is_numeric.

* Raise error if add_colorbar or add_legend are passed to scatter.

* Add scatter_example_dataset to tutorial.py

* Support scattering against coordinates, dimensions or data vars

* Support 'scatter_size' kwarg

* color → hue and other changes.

* Facetgrid support for scatter_size.

* add_guide in docs.

* Avoid top-level matplotlib import

* Fix lint errors.

* Follow shoyer's suggestions.

* scatter_size → markersize.

* Update more error messages.

* lint errors

* lint errors again

* some more lints

* docstrings

* fix legend bug in line plots

* unittest for legend in lineplot

* bug fix

* add figlegend to __init__

* remove import from facetgrid.py

* Remove xr.plot.scatter.

* facetgrid._hue_var is always a DataArray.

* scatter_size bugfix.

* Update for latest _process_cmap_params_cbar_kwargs

* Fix whats-new

* Fix tests.

* Make add_guide=False work.
yohai authored and dcherian committed Aug 8, 2019
1 parent 8d46bf0 commit f172c67
Showing 10 changed files with 743 additions and 55 deletions.
1 change: 1 addition & 0 deletions doc/api.rst
@@ -602,6 +602,7 @@ Plotting
.. autosummary::
:toctree: generated/

Dataset.plot
DataArray.plot
plot.plot
plot.contourf
159 changes: 119 additions & 40 deletions doc/plotting.rst
@@ -13,6 +13,7 @@ xarray's plotting capabilities are centered around
:py:class:`xarray.DataArray` objects.
To plot :py:class:`xarray.Dataset` objects
simply access the relevant DataArrays, i.e. ``dset['var1']``.
Dataset-specific plotting routines are also available (see :ref:`plot-dataset`).
Here we focus mostly on arrays 2d or larger. If your data fits
nicely into a pandas DataFrame then you're better off using one of the more
developed tools there.
@@ -83,11 +84,15 @@ For these examples we'll use the North American air temperature dataset.
Until :issue:`1614` is solved, you might need to copy over the metadata in ``attrs`` to get informative figure labels (as was done above).


DataArrays
----------

One Dimension
-------------
~~~~~~~~~~~~~

Simple Example
~~~~~~~~~~~~~~
================
Simple Example
================

The simplest way to make a plot is to call the :py:func:`xarray.DataArray.plot()` method.

@@ -104,8 +109,9 @@ xarray uses the coordinate name along with metadata ``attrs.long_name``, ``attr
air1d.attrs
Additional Arguments
~~~~~~~~~~~~~~~~~~~~~
======================
Additional Arguments
======================

Additional arguments are passed directly to the matplotlib function which
does the work.
@@ -133,8 +139,9 @@ Keyword arguments work the same way, and are more explicit.
@savefig plotting_example_sin3.png width=4in
air1d[:200].plot.line(color='purple', marker='o')
Adding to Existing Axis
~~~~~~~~~~~~~~~~~~~~~~~
=========================
Adding to Existing Axis
=========================

To add the plot to an existing axis pass in the axis as a keyword argument
``ax``. This works for all xarray plotting methods.
@@ -159,8 +166,9 @@ On the right is a histogram created by :py:func:`xarray.plot.hist`.

.. _plotting.figsize:

Controlling the figure size
~~~~~~~~~~~~~~~~~~~~~~~~~~~
=============================
Controlling the figure size
=============================

You can pass a ``figsize`` argument to all xarray's plotting methods to
control the figure size. For convenience, xarray's plotting methods also
@@ -199,8 +207,9 @@ entire figure (as for matplotlib's ``figsize`` argument).

.. _plotting.multiplelines:

Multiple lines showing variation along a dimension
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
====================================================
Multiple lines showing variation along a dimension
====================================================

It is possible to make line plots of two-dimensional data by calling :py:func:`xarray.plot.line`
with appropriate arguments. Consider the 3D variable ``air`` defined above. We can use line
@@ -221,8 +230,9 @@ If required, the automatic legend can be turned off using ``add_legend=False``.
``hue`` can be passed directly to :py:func:`xarray.plot` as ``air.isel(lon=10, lat=[19,21,22]).plot(hue='lat')``.


Dimension along y-axis
~~~~~~~~~~~~~~~~~~~~~~
========================
Dimension along y-axis
========================

It is also possible to make line plots such that the data are on the x-axis and a dimension is on the y-axis. This can be done by specifying the appropriate ``y`` keyword argument.

@@ -231,8 +241,9 @@ It is also possible to make line plots such that the data are on the x-axis and
@savefig plotting_example_xy_kwarg.png
air.isel(time=10, lon=[10, 11]).plot(y='lat', hue='lon')
Step plots
~~~~~~~~~~
============
Step plots
============

Alternatively, a step plot similar to matplotlib's ``plt.step`` can be
made using 1D data.
@@ -263,7 +274,7 @@ is ignored.


Other axes kwargs
-----------------
~~~~~~~~~~~~~~~~~


The keyword arguments ``xincrease`` and ``yincrease`` let you control the axes direction.
@@ -277,11 +288,12 @@ In addition, one can use ``xscale, yscale`` to set axes scaling; ``xticks, ytick


Two Dimensions
--------------

Simple Example
~~~~~~~~~~~~~~

================
Simple Example
================

The default method :py:meth:`xarray.DataArray.plot` calls :py:func:`xarray.plot.pcolormesh` when the data is two-dimensional.

.. ipython:: python
@@ -307,8 +319,9 @@ and ``xincrease``.
If speed is important to you and you are plotting a regular mesh, consider
using ``imshow``.

Missing Values
~~~~~~~~~~~~~~
================
Missing Values
================

xarray plots data with :ref:`missing_values`.

@@ -321,8 +334,9 @@ xarray plots data with :ref:`missing_values`.
@savefig plotting_missing_values.png width=4in
bad_air2d.plot()
Nonuniform Coordinates
~~~~~~~~~~~~~~~~~~~~~~
========================
Nonuniform Coordinates
========================

It's not necessary for the coordinates to be evenly spaced. Both
:py:func:`xarray.plot.pcolormesh` (default) and :py:func:`xarray.plot.contourf` can
@@ -337,8 +351,9 @@ produce plots with nonuniform coordinates.
@savefig plotting_nonuniform_coords.png width=4in
b.plot()
Calling Matplotlib
~~~~~~~~~~~~~~~~~~
====================
Calling Matplotlib
====================

Since this is a thin wrapper around matplotlib, all the functionality of
matplotlib is available.
@@ -370,8 +385,9 @@ matplotlib is available.
@savefig plotting_2d_call_matplotlib2.png width=4in
plt.draw()
Colormaps
~~~~~~~~~
===========
Colormaps
===========

xarray borrows logic from Seaborn to infer what kind of color map to use. For
example, consider the original data in Kelvins rather than Celsius:
@@ -386,8 +402,9 @@ Kelvins do not have 0, so the default color map was used.

.. _robust-plotting:

Robust
~~~~~~
========
Robust
========

Outliers often have an extreme effect on the output of the plot.
Here we add two bad data points. This affects the color scale,
@@ -417,8 +434,9 @@ Observe that the ranges of the color bar have changed. The arrows on the
color bar indicate
that the colors include data points outside the bounds.

Discrete Colormaps
~~~~~~~~~~~~~~~~~~
====================
Discrete Colormaps
====================

It is often useful, when visualizing 2d data, to use a discrete colormap,
rather than the default continuous colormaps that matplotlib uses. The
@@ -462,7 +480,7 @@ since levels are chosen automatically).
.. _plotting.faceting:

Faceting
--------
~~~~~~~~

Faceting here refers to splitting an array along one or two dimensions and
plotting each group.
@@ -488,8 +506,9 @@ So let's use a slice to pick 6 times throughout the first year.
t = air.isel(time=slice(0, 365 * 4, 250))
t.coords
Simple Example
~~~~~~~~~~~~~~
================
Simple Example
================

The easiest way to create faceted plots is to pass in ``row`` or ``col``
arguments to the xarray plotting methods/functions. This returns a
@@ -507,8 +526,9 @@ Faceting also works for line plots.
@savefig plot_facet_dataarray_line.png
g_simple_line = t.isel(lat=slice(0,None,4)).plot(x='lon', hue='lat', col='time', col_wrap=3)
4 dimensional
~~~~~~~~~~~~~
===============
4 dimensional
===============

For 4 dimensional arrays we can use the rows and columns of the grids.
Here we create a 4 dimensional array by taking the original data and adding
@@ -525,8 +545,9 @@ one were much hotter.
@savefig plot_facet_4d.png
t4d.plot(x='lon', y='lat', col='time', row='fourth_dim')
Other features
~~~~~~~~~~~~~~
================
Other features
================

Faceted plotting supports other arguments common to xarray 2d plots.

@@ -546,8 +567,9 @@ Faceted plotting supports other arguments common to xarray 2d plots.
robust=True, cmap='viridis',
cbar_kwargs={'label': 'this has outliers'})
FacetGrid Objects
~~~~~~~~~~~~~~~~~
===================
FacetGrid Objects
===================

:py:class:`xarray.plot.FacetGrid` is used to control the behavior of the
multiple plots.
@@ -589,6 +611,63 @@ they have been plotted.
TODO: add an example of using the ``map`` method to plot dataset variables
(e.g., with ``plt.quiver``).

.. _plot-dataset:

Datasets
--------

``xarray`` has limited support for plotting Dataset variables against each other.
Consider this dataset:

.. ipython:: python

    ds = xr.tutorial.scatter_example_dataset()
    ds

Suppose we want to scatter ``A`` against ``B``:

.. ipython:: python

    @savefig ds_simple_scatter.png
    ds.plot.scatter(x='A', y='B')

The ``hue`` kwarg lets you vary the color by variable value:

.. ipython:: python

    @savefig ds_hue_scatter.png
    ds.plot.scatter(x='A', y='B', hue='w')

When ``hue`` is specified, a colorbar is added for numeric ``hue`` DataArrays by
default and a legend is added for non-numeric ``hue`` DataArrays (as above).
You can force a legend instead of a colorbar by setting ``hue_style='discrete'``.
Additionally, the boolean kwarg ``add_guide`` can be used to prevent the display of a legend or colorbar (as appropriate).

.. ipython:: python

    ds.w.values = [1, 2, 3, 5]
    @savefig ds_discrete_legend_hue_scatter.png
    ds.plot.scatter(x='A', y='B', hue='w', hue_style='discrete')

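The guide rules described above can be summarised in a small pure-Python sketch. This is only an illustration of the documented behaviour, not xarray's implementation: ``choose_guide`` is a hypothetical helper, and the ``'continuous'`` style name and the error raised for non-numeric values are assumptions.

```python
from numbers import Number


def choose_guide(hue_values, hue_style=None, add_guide=None):
    """Return 'colorbar', 'legend', or None for a hue variable.

    Hypothetical helper that mirrors the documented rules; it is not
    part of xarray.
    """
    is_numeric = all(
        isinstance(v, Number) and not isinstance(v, bool) for v in hue_values
    )
    if hue_style is None:
        # Numeric hue defaults to a continuous style, anything else to discrete.
        hue_style = "continuous" if is_numeric else "discrete"
    if hue_style not in ("continuous", "discrete"):
        raise ValueError("hue_style must be 'continuous' or 'discrete'")
    if hue_style == "continuous" and not is_numeric:
        # Assumption: a colorbar makes no sense for non-numeric values.
        raise ValueError("cannot use a continuous hue_style for non-numeric hue")
    if add_guide is False:
        # The caller asked for no legend or colorbar at all.
        return None
    return "colorbar" if hue_style == "continuous" else "legend"


print(choose_guide(["one", "two", "three", "five"]))
print(choose_guide([1, 2, 3, 5]))
print(choose_guide([1, 2, 3, 5], hue_style="discrete"))
```

Passing ``add_guide=False`` to this sketch returns ``None`` whatever the style, which corresponds to suppressing the guide entirely.
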
The ``markersize`` kwarg lets you vary the point size by variable value. You can additionally pass ``size_norm`` to control how the variable's values are mapped to point sizes.

.. ipython:: python

    @savefig ds_hue_size_scatter.png
    ds.plot.scatter(x='A', y='B', hue='z', hue_style='discrete', markersize='z')

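To make the idea of mapping values to point sizes concrete, here is a minimal sketch of a linear mapping. It is an assumption-laden illustration, not the mapping xarray uses: ``scale_sizes`` is a hypothetical function, the ``vmin``/``vmax`` arguments stand in for a normalization such as ``size_norm``, and the default size range is arbitrary.

```python
def scale_sizes(values, vmin=None, vmax=None, size_range=(18.0, 72.0)):
    """Linearly map data values onto marker sizes.

    Hypothetical helper: the size range and the vmin/vmax arguments are
    illustrative assumptions, not xarray defaults.
    """
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    smallest, largest = size_range
    if hi == lo:
        # All values are identical, so use one mid-range size.
        return [(smallest + largest) / 2 for _ in values]
    sizes = []
    for v in values:
        # Normalise to [0, 1], clipping values outside [lo, hi].
        frac = min(max((v - lo) / (hi - lo), 0.0), 1.0)
        sizes.append(smallest + frac * (largest - smallest))
    return sizes


default_sizes = scale_sizes([0, 5, 10])
wide_norm_sizes = scale_sizes([0, 5, 10], vmax=20)
print(default_sizes, wide_norm_sizes)
```

Widening the normalization range (here ``vmax=20``) compresses the resulting sizes, which is the kind of control a normalization argument provides.
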
Faceting is also possible:

.. ipython:: python

    @savefig ds_facet_scatter.png
    ds.plot.scatter(x='A', y='B', col='x', row='z', hue='w', hue_style='discrete')

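Conceptually, faceting splits the points into one panel per (row, column) value pair. The sketch below shows that grouping step in plain Python, assuming the data has already been flattened into one record per point; ``facet_points`` and the record layout are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def facet_points(points, row, col, x="A", y="B"):
    """Group (x, y) pairs by the values of the row and col keys.

    Hypothetical helper: each returned key is one panel of the grid.
    """
    panels = defaultdict(list)
    for p in points:
        panels[(p[row], p[col])].append((p[x], p[y]))
    return dict(panels)


# Made-up records, one per point, with facet keys "x" and "z".
points = [
    {"x": 0, "z": 0, "A": 1.0, "B": 2.0},
    {"x": 1, "z": 0, "A": 1.5, "B": 2.5},
    {"x": 0, "z": 1, "A": 3.0, "B": 4.0},
    {"x": 0, "z": 0, "A": 5.0, "B": 6.0},
]
panels = facet_points(points, row="z", col="x")
for key in sorted(panels):
    print(key, panels[key])
```

Each key of ``panels`` corresponds to one subplot; a plotting layer would then draw a scatter of the grouped pairs in the matching grid cell.
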
For more advanced scatter plots, we recommend converting the relevant data variables to a pandas DataFrame and using the extensive plotting capabilities of ``seaborn``.


.. _plot-maps:

Maps
4 changes: 4 additions & 0 deletions doc/whats-new.rst
@@ -37,6 +37,10 @@ New functions/methods

By `Guido Imperiale <https://github.com/crusaderky>`_

- Dataset plotting API for visualizing dependencies between two DataArrays!
Currently only :py:meth:`Dataset.plot.scatter` is implemented.
By `Yohai Bar Sinai <https://github.com/yohai>`_ and `Deepak Cherian <https://github.com/dcherian>`_

Enhancements
~~~~~~~~~~~~

12 changes: 12 additions & 0 deletions xarray/core/dataset.py
@@ -37,6 +37,7 @@
    decode_numpy_dict_values, either_dict_or_kwargs, hashable,
    maybe_wrap_array)
from .variable import IndexVariable, Variable, as_variable, broadcast_variables
from ..plot.dataset_plot import _Dataset_PlotMethods

if TYPE_CHECKING:
    from ..backends import AbstractDataStore, ZarrStore
@@ -4769,6 +4770,17 @@ def imag(self):
        return self._unary_op(lambda x: x.imag,
                              keep_attrs=True)(self)

    @property
    def plot(self):
        """
        Access plotting functions. Use it as a namespace to use
        xarray.plot functions as Dataset methods
        >>> ds.plot.scatter(...) # equivalent to xarray.plot.scatter(ds,...)
        """
        return _Dataset_PlotMethods(self)

    def filter_by_attrs(self, **kwargs):
        """Returns a ``Dataset`` with variables that match specific conditions.