ENH: Scatter plots of one variable vs another #2277

Merged Aug 8, 2019 (119 commits)

Commits
31019d8
initial commit
yohai Jul 11, 2018
5b3714c
formatting
yohai Jul 11, 2018
e89c27f
fix bug
yohai Jul 12, 2018
9ef73cb
refactor map_scatter
yohai Jul 13, 2018
e6b286a
colorbar
yohai Jul 15, 2018
4c92a62
formatting
yohai Jul 16, 2018
355870a
refactor
yohai Jul 16, 2018
16a5d18
refactor _infer_data
yohai Jul 16, 2018
ff27ef5
added tests
yohai Jul 16, 2018
3cee41d
minor formatting
yohai Jul 16, 2018
fe7f16f
fixed tests
yohai Jul 16, 2018
7d19ae3
Merge remote-tracking branch 'upstream/master' into yohai-ds_scatter
Nov 19, 2018
b80ff5d
Refactor out to dataset_plot.py + move utilities to utils.py
Nov 22, 2018
b839295
Fix tests.
Nov 22, 2018
746930b
Fixes.
Nov 22, 2018
d3e1308
discrete_legend → add_colorbar
Nov 22, 2018
ef3b9d1
Revert "discrete_legend → add_colorbar"
Nov 22, 2018
be9d09a
Only use scatter instead of alternating between scatter and plot.
Dec 15, 2018
0d2b126
Create and use plot.utils._add_colorbar
Dec 15, 2018
a938d24
fix tests.
Dec 15, 2018
6440365
More fixes to hue, cmap_kwargs.
Dec 16, 2018
6975b9e
Merge remote-tracking branch 'upstream/master' into yohai-ds_scatter
Dec 16, 2018
15d8066
doc fixes.
Dec 16, 2018
e98fc7e
Dataset plotting docs.
Dec 16, 2018
2f91c3d
group existing docs under "DataArrays."
Dec 16, 2018
269518c
bugfix.
Dec 16, 2018
14379ea
Fix.
Dec 16, 2018
ca1d44b
Add whats-new
Dec 16, 2018
396f148
Add api.rst.
Dec 16, 2018
ab48350
Add hue_style.
Dec 18, 2018
8f41aee
Update tests.
Dec 18, 2018
5bb2ef6
cleanup imports.
Dec 19, 2018
08a3481
facetgrid: Refactor out cmap_params, cbar_kwargs processing
Dec 19, 2018
c2923b2
Dataset.plot.scatter obeys cmap_params, cbar_kwargs.
Dec 19, 2018
0a01e7c
_determine_cmap_params supports datetime64
Dec 18, 2018
c3bd7c8
dataset.plot.scatter supports hue=datetime64, timedelta64
Dec 19, 2018
84d4cbc
Merge branch 'master' into yohai-ds_scatter
Dec 19, 2018
80fc91a
pep8
Dec 19, 2018
f2704f8
Update docs.
Dec 19, 2018
caef62a
bugfix: facetgrid now uses hue_style
Dec 21, 2018
9b9478b
minor fixes.
Dec 21, 2018
3d40dab
Scatter docs
Dec 21, 2018
faf4302
Merge branch 'master' into yohai-ds_scatter
Jan 2, 2019
b5653a0
Merge branch 'master' into yohai-ds_scatter
Jan 8, 2019
1f0b1b1
Refactor out more code to utils.py
Jan 14, 2019
07bdf54
map_scatter → map_dataset
Jan 14, 2019
6df10c1
Use some wrapping magic to generalize code.
Jan 14, 2019
a12378c
Add hist as test of generalization.
Jan 14, 2019
1d939af
Get facetgrid working again
Jan 14, 2019
361f7a8
Refactor out utility functions.
Jan 14, 2019
f0f1480
facetgrid refactor
Jan 14, 2019
a998cfc
flake8
Jan 14, 2019
ce9e2ae
Refactor out colorbar making to plot.utils._add_colorbar
Dec 15, 2018
159bb25
Refactor out cmap_params, cbar_kwargs processing
Dec 19, 2018
29d276a
Merge remote-tracking branch 'upstream/master' into refactor-plot-utils
Jan 14, 2019
3b4e4a0
Back to map_dataarray_line
Jan 15, 2019
1217ab1
lint
Jan 15, 2019
792291c
small rename
Jan 24, 2019
43057ef
Merge branch 'master' into refactor-plot-utils
Jan 24, 2019
351a466
review comment.
Jan 24, 2019
57a6c64
Merge branch 'refactor-plot-utils' into yohai-ds_scatter
Jan 24, 2019
62679d9
Bugfix merge
Jan 24, 2019
afa92a3
hue, hue_style aren't needed for all functions.
Jan 24, 2019
18199cf
lint
Jan 24, 2019
8e47189
Use _process_cmap_cbar_kwargs.
Jan 24, 2019
b25ad6b
Update whats-new
Jan 24, 2019
072d83d
Some doc fixes.
Jan 24, 2019
3fe8557
Fix tests?
Jan 24, 2019
fab84a9
another attempt to fix tests.
Jan 25, 2019
3309d2a
small
Jan 28, 2019
bce0152
Merge remote-tracking branch 'upstream/master' into yohai-ds_scatter
Jan 30, 2019
ecc8b3c
remove py2 line
Jan 30, 2019
09d067f
remove extra _infer_line_data
Jan 30, 2019
c64fbba
Use _is_facetgrid flag.
Feb 4, 2019
7a65d28
Revert "_determine_cmap_params supports datetime64"
Feb 4, 2019
6e8c92c
Remove datetime/timedelta hue support
Feb 4, 2019
4c82009
_meta_data → meta_data.
Feb 4, 2019
7392c81
isort
Feb 4, 2019
4e41fc3
Merge branch 'master' into yohai-ds_scatter
Feb 4, 2019
f755cb8
Add doc line
Feb 5, 2019
13a411b
Switch to add_guide.
Feb 14, 2019
d7e9a0f
Save hist for a future PR.
Feb 14, 2019
0c20fc8
Merge branch 'master' into yohai-ds_scatter
Feb 14, 2019
ce41d4e
rename _numeric to _is_numeric.
Feb 14, 2019
4b59672
Raise error if add_colorbar or add_legend are passed to scatter.
Feb 14, 2019
50468da
Add scatter_example_dataset to tutorial.py
Feb 15, 2019
68906e2
Support scattering against coordinates, dimensions or data vars
Feb 15, 2019
4b6a4ef
Support 'scatter_size' kwarg
Feb 15, 2019
ccd9c42
color → hue and other changes.
Feb 15, 2019
4006531
Facetgrid support for scatter_size.
Feb 15, 2019
7f46f03
add_guide in docs.
Feb 17, 2019
194ff85
Avoid top-level matplotlib import
Mar 3, 2019
1e66a3e
Fix lint errors.
Mar 4, 2019
d5151df
Follow shoyer's suggestions.
Mar 6, 2019
cffaf44
scatter_size → markersize.
Mar 6, 2019
8cd8722
Update more error messages.
Mar 6, 2019
ee662b4
Merge remote-tracking branch 'upstream/master' into yohai-ds_scatter
dcherian Mar 18, 2019
41cca04
Merge branch 'master' into yohai-ds_scatter
dcherian Apr 19, 2019
42ea5a7
Merge branch 'master' into ds_scatter
yohai Jun 20, 2019
6af0263
lint errors
yohai Jun 20, 2019
9abca60
lint errors again
yohai Jun 20, 2019
f3de227
some more lints
yohai Jun 21, 2019
5b453a7
docstrings
yohai Jun 21, 2019
7116020
fix legend bug in line plots
yohai Jun 21, 2019
fa37607
unittest for legend in lineplot
yohai Jun 21, 2019
2bc6107
bug fix
yohai Jun 21, 2019
00df847
Merge branch 'master' into ds_scatter
yohai Jun 22, 2019
3793166
Merge branch 'master' into ds_scatter
yohai Jun 26, 2019
436f7af
add figlegend to __init__
yohai Jun 26, 2019
2135388
remove import from facetgrid.py
yohai Jun 28, 2019
318abaa
Merge branch 'master' into yohai-ds_scatter
dcherian Aug 3, 2019
3db1610
Remove xr.plot.scatter.
dcherian Aug 3, 2019
fc7fc96
facetgrid._hue_var is always a DataArray.
dcherian Aug 3, 2019
2825c35
scatter_size bugfix.
dcherian Aug 3, 2019
d4844bc
Update for latest _process_cmap_params_cbar_kwargs
dcherian Aug 3, 2019
dbe09b7
Fix whats-new
dcherian Aug 3, 2019
a805c59
Fix tests.
dcherian Aug 5, 2019
63a3bc7
Merge branch 'master' into yohai-ds_scatter
dcherian Aug 7, 2019
d56f7d1
Make add_guide=False work.
dcherian Aug 7, 2019
1 change: 1 addition & 0 deletions doc/api.rst
@@ -602,6 +602,7 @@ Plotting
.. autosummary::
:toctree: generated/

Dataset.plot
DataArray.plot
plot.plot
plot.contourf
159 changes: 119 additions & 40 deletions doc/plotting.rst
@@ -13,6 +13,7 @@ xarray's plotting capabilities are centered around
:py:class:`xarray.DataArray` objects.
To plot :py:class:`xarray.Dataset` objects
simply access the relevant DataArrays, ie ``dset['var1']``.
Dataset specific plotting routines are also available (see :ref:`plot-dataset`).
Here we focus mostly on arrays 2d or larger. If your data fits
nicely into a pandas DataFrame then you're better off using one of the more
developed tools there.
@@ -83,11 +84,15 @@ For these examples we'll use the North American air temperature dataset.
Until :issue:`1614` is solved, you might need to copy over the metadata in ``attrs`` to get informative figure labels (as was done above).


DataArrays
----------

One Dimension
-------------
~~~~~~~~~~~~~

Simple Example
~~~~~~~~~~~~~~
================
Simple Example
================

The simplest way to make a plot is to call the :py:func:`xarray.DataArray.plot()` method.

@@ -104,8 +109,9 @@ xarray uses the coordinate name along with metadata ``attrs.long_name``, ``attr
air1d.attrs
Additional Arguments
~~~~~~~~~~~~~~~~~~~~~
======================
Additional Arguments
======================

Additional arguments are passed directly to the matplotlib function which
does the work.
@@ -133,8 +139,9 @@ Keyword arguments work the same way, and are more explicit.
@savefig plotting_example_sin3.png width=4in
air1d[:200].plot.line(color='purple', marker='o')
Adding to Existing Axis
~~~~~~~~~~~~~~~~~~~~~~~
=========================
Adding to Existing Axis
=========================

To add the plot to an existing axis pass in the axis as a keyword argument
``ax``. This works for all xarray plotting methods.
@@ -159,8 +166,9 @@ On the right is a histogram created by :py:func:`xarray.plot.hist`.

.. _plotting.figsize:

Controlling the figure size
~~~~~~~~~~~~~~~~~~~~~~~~~~~
=============================
Controlling the figure size
=============================

You can pass a ``figsize`` argument to all xarray's plotting methods to
control the figure size. For convenience, xarray's plotting methods also
@@ -199,8 +207,9 @@ entire figure (as for matplotlib's ``figsize`` argument).

.. _plotting.multiplelines:

Multiple lines showing variation along a dimension
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
====================================================
Multiple lines showing variation along a dimension
====================================================

It is possible to make line plots of two-dimensional data by calling :py:func:`xarray.plot.line`
with appropriate arguments. Consider the 3D variable ``air`` defined above. We can use line
@@ -221,8 +230,9 @@ If required, the automatic legend can be turned off using ``add_legend=False``.
``hue`` can be passed directly to :py:func:`xarray.plot` as `air.isel(lon=10, lat=[19,21,22]).plot(hue='lat')`.


Dimension along y-axis
~~~~~~~~~~~~~~~~~~~~~~
========================
Dimension along y-axis
========================

It is also possible to make line plots such that the data are on the x-axis and a dimension is on the y-axis. This can be done by specifying the appropriate ``y`` keyword argument.

@@ -231,8 +241,9 @@ It is also possible to make line plots such that the data are on the x-axis and
@savefig plotting_example_xy_kwarg.png
air.isel(time=10, lon=[10, 11]).plot(y='lat', hue='lon')
Step plots
~~~~~~~~~~
============
Step plots
============

As an alternative, also a step plot similar to matplotlib's ``plt.step`` can be
made using 1D data.
@@ -263,7 +274,7 @@ is ignored.


Other axes kwargs
-----------------
~~~~~~~~~~~~~~~~~


The keyword arguments ``xincrease`` and ``yincrease`` let you control the axes direction.
@@ -277,11 +288,12 @@ In addition, one can use ``xscale, yscale`` to set axes scaling; ``xticks, ytick


Two Dimensions
--------------

Simple Example
~~~~~~~~~~~~~~

================
Simple Example
================

The default method :py:meth:`xarray.DataArray.plot` calls :py:func:`xarray.plot.pcolormesh` by default when the data is two-dimensional.

.. ipython:: python
@@ -307,8 +319,9 @@ and ``xincrease``.
If speed is important to you and you are plotting a regular mesh, consider
using ``imshow``.

Missing Values
~~~~~~~~~~~~~~
================
Missing Values
================

xarray plots data with :ref:`missing_values`.

@@ -321,8 +334,9 @@ xarray plots data with :ref:`missing_values`.
@savefig plotting_missing_values.png width=4in
bad_air2d.plot()
Nonuniform Coordinates
~~~~~~~~~~~~~~~~~~~~~~
========================
Nonuniform Coordinates
========================

It's not necessary for the coordinates to be evenly spaced. Both
:py:func:`xarray.plot.pcolormesh` (default) and :py:func:`xarray.plot.contourf` can
@@ -337,8 +351,9 @@ produce plots with nonuniform coordinates.
@savefig plotting_nonuniform_coords.png width=4in
b.plot()
Calling Matplotlib
~~~~~~~~~~~~~~~~~~
====================
Calling Matplotlib
====================

Since this is a thin wrapper around matplotlib, all the functionality of
matplotlib is available.
@@ -370,8 +385,9 @@ matplotlib is available.
@savefig plotting_2d_call_matplotlib2.png width=4in
plt.draw()
Colormaps
~~~~~~~~~
===========
Colormaps
===========

xarray borrows logic from Seaborn to infer what kind of color map to use. For
example, consider the original data in Kelvins rather than Celsius:
@@ -386,8 +402,9 @@ Kelvins do not have 0, so the default color map was used.

.. _robust-plotting:

Robust
~~~~~~
========
Robust
========

Outliers often have an extreme effect on the output of the plot.
Here we add two bad data points. This affects the color scale,
@@ -417,8 +434,9 @@ Observe that the ranges of the color bar have changed. The arrows on the
color bar indicate
that the colors include data points outside the bounds.

Discrete Colormaps
~~~~~~~~~~~~~~~~~~
====================
Discrete Colormaps
====================

It is often useful, when visualizing 2d data, to use a discrete colormap,
rather than the default continuous colormaps that matplotlib uses. The
@@ -462,7 +480,7 @@ since levels are chosen automatically).
.. _plotting.faceting:

Faceting
--------
~~~~~~~~

Faceting here refers to splitting an array along one or two dimensions and
plotting each group.
@@ -488,8 +506,9 @@ So let's use a slice to pick 6 times throughout the first year.
t = air.isel(time=slice(0, 365 * 4, 250))
t.coords
Simple Example
~~~~~~~~~~~~~~
================
Simple Example
================

The easiest way to create faceted plots is to pass in ``row`` or ``col``
arguments to the xarray plotting methods/functions. This returns a
@@ -507,8 +526,9 @@ Faceting also works for line plots.
@savefig plot_facet_dataarray_line.png
g_simple_line = t.isel(lat=slice(0,None,4)).plot(x='lon', hue='lat', col='time', col_wrap=3)
4 dimensional
~~~~~~~~~~~~~
===============
4 dimensional
===============

For 4 dimensional arrays we can use the rows and columns of the grids.
Here we create a 4 dimensional array by taking the original data and adding
@@ -525,8 +545,9 @@ one were much hotter.
@savefig plot_facet_4d.png
t4d.plot(x='lon', y='lat', col='time', row='fourth_dim')
Other features
~~~~~~~~~~~~~~
================
Other features
================

Faceted plotting supports other arguments common to xarray 2d plots.

@@ -546,8 +567,9 @@ Faceted plotting supports other arguments common to xarray 2d plots.
robust=True, cmap='viridis',
cbar_kwargs={'label': 'this has outliers'})
FacetGrid Objects
~~~~~~~~~~~~~~~~~
===================
FacetGrid Objects
===================

:py:class:`xarray.plot.FacetGrid` is used to control the behavior of the
multiple plots.
@@ -589,6 +611,63 @@ they have been plotted.
TODO: add an example of using the ``map`` method to plot dataset variables
(e.g., with ``plt.quiver``).

.. _plot-dataset:

Datasets
--------

``xarray`` has limited support for plotting Dataset variables against each other.
Consider this dataset

.. ipython:: python
ds = xr.tutorial.scatter_example_dataset()
ds
Suppose we want to scatter ``A`` against ``B``

.. ipython:: python
@savefig ds_simple_scatter.png
ds.plot.scatter(x='A', y='B')
The ``hue`` kwarg lets you vary the color by variable value

.. ipython:: python
@savefig ds_hue_scatter.png
ds.plot.scatter(x='A', y='B', hue='w')
When ``hue`` is specified, a colorbar is added for numeric ``hue`` DataArrays by
default and a legend is added for non-numeric ``hue`` DataArrays (as above).
You can force a legend instead of a colorbar by setting ``hue_style='discrete'``.
Additionally, the boolean kwarg ``add_guide`` can be used to prevent the display of a legend or colorbar (as appropriate).

.. ipython:: python
ds.w.values = [1, 2, 3, 5]
@savefig ds_discrete_legend_hue_scatter.png
ds.plot.scatter(x='A', y='B', hue='w', hue_style='discrete')
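The guide-selection rule described above (colorbar for numeric ``hue``, legend otherwise, ``hue_style='discrete'`` to force a legend, ``add_guide=False`` to suppress both) can be sketched in plain Python. This is only an illustration of the documented behaviour, not xarray's implementation: the function name ``choose_guide``, the ``'continuous'`` style name, and the error raised for a continuous non-numeric hue are assumptions made for the sketch.

```python
from numbers import Number


def choose_guide(hue_values, hue_style=None, add_guide=None):
    """Return 'colorbar', 'legend', or None for a set of hue values.

    Hypothetical helper: it mirrors the rule stated in the docs and
    nothing more.
    """
    is_numeric = all(isinstance(v, Number) for v in hue_values)

    # With no explicit style, numeric hues are treated as continuous.
    if hue_style is None:
        hue_style = "continuous" if is_numeric else "discrete"
    if hue_style not in ("continuous", "discrete"):
        raise ValueError("hue_style must be 'continuous' or 'discrete'")

    # Assumption of this sketch: a colorbar makes no sense for labels.
    if hue_style == "continuous" and not is_numeric:
        raise ValueError("a continuous hue needs numeric values")

    # add_guide=False suppresses whichever guide would have been drawn.
    if add_guide is False:
        return None
    return "colorbar" if hue_style == "continuous" else "legend"


numeric_default = choose_guide([0.1, 0.5, 0.9])
labels_default = choose_guide(["one", "two", "three"])
forced_legend = choose_guide([1, 2, 3, 5], hue_style="discrete")
suppressed = choose_guide([1, 2, 3, 5], add_guide=False)
```

The third call corresponds to the documentation example above, where numeric values ``[1, 2, 3, 5]`` are plotted with ``hue_style='discrete'`` and therefore get a legend instead of a colorbar.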
The ``markersize`` kwarg lets you vary the point's size by variable value. You can additionally pass ``size_norm`` to control how the variable's values are mapped to point sizes.

.. ipython:: python
@savefig ds_hue_size_scatter.png
ds.plot.scatter(x='A', y='B', hue='z', hue_style='discrete', markersize='z')
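To make the idea of mapping a variable's values to point sizes concrete, here is a minimal sketch of a linear normalization. It is not the code used by ``Dataset.plot.scatter``: the function name ``sizes_from_values``, the ``vmin``/``vmax`` parameters, and the default size range of 18 to 72 are all assumptions chosen for illustration.

```python
def sizes_from_values(values, vmin=None, vmax=None, size_range=(18.0, 72.0)):
    """Linearly map values onto marker sizes (hypothetical helper).

    vmin/vmax play the role a normalization would play: they fix which
    data values correspond to the smallest and largest marker.
    """
    lo = min(values) if vmin is None else vmin
    hi = max(values) if vmax is None else vmax
    smallest, largest = size_range

    if hi == lo:
        # Degenerate range: give every point the midpoint size.
        return [(smallest + largest) / 2.0 for _ in values]

    sizes = []
    for v in values:
        # Fraction of the way from lo to hi, clipped to [0, 1].
        frac = min(1.0, max(0.0, (v - lo) / (hi - lo)))
        sizes.append(smallest + frac * (largest - smallest))
    return sizes


auto_scaled = sizes_from_values([0, 5, 10])
fixed_scale = sizes_from_values([0, 10], vmin=0, vmax=20)
```

Fixing the limits, as in the second call, keeps sizes comparable across several plots, which is the kind of control a ``size_norm`` argument is meant to provide.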
Faceting is also possible

.. ipython:: python
@savefig ds_facet_scatter.png
ds.plot.scatter(x='A', y='B', col='x', row='z', hue='w', hue_style='discrete')
For more advanced scatter plots, we recommend converting the relevant data variables to a pandas DataFrame and using the extensive plotting capabilities of ``seaborn``.
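The conversion recommended above can be sketched as follows. The small dataset here is made up for the example (it is not the tutorial dataset), and the seaborn call is shown only as a comment so that the snippet needs nothing beyond numpy, pandas, and xarray.

```python
import numpy as np
import xarray as xr

# Made-up dataset with two data variables on a shared (x, y) grid.
values = np.arange(12.0).reshape(3, 4)
ds = xr.Dataset(
    {
        "A": (("x", "y"), values),
        "B": (("x", "y"), values ** 2),
    },
    coords={"x": np.arange(3), "y": np.arange(4)},
)

# Select the variables of interest and flatten them into a tidy
# DataFrame; reset_index turns the x/y index levels into columns.
df = ds[["A", "B"]].to_dataframe().reset_index()

# With seaborn installed, the frame can then be plotted, for example:
#   import seaborn as sns
#   sns.scatterplot(data=df, x="A", y="B", hue="x")
```

Each row of ``df`` is one (x, y) grid point, so any dimension or variable can be used for color, size, or faceting on the seaborn side.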


.. _plot-maps:

Maps
4 changes: 4 additions & 0 deletions doc/whats-new.rst
@@ -37,6 +37,10 @@ New functions/methods

By `Guido Imperiale <https://github.com/crusaderky>`_

- Dataset plotting API for visualizing dependences between two `DataArray`s!
Currently only :py:meth:`Dataset.plot.scatter` is implemented.
By `Yohai Bar Sinai <https://github.com/yohai>`_ and `Deepak Cherian <https://github.com/dcherian>`_

Enhancements
~~~~~~~~~~~~

12 changes: 12 additions & 0 deletions xarray/core/dataset.py
@@ -37,6 +37,7 @@
decode_numpy_dict_values, either_dict_or_kwargs, hashable,
maybe_wrap_array)
from .variable import IndexVariable, Variable, as_variable, broadcast_variables
from ..plot.dataset_plot import _Dataset_PlotMethods

if TYPE_CHECKING:
from ..backends import AbstractDataStore, ZarrStore
@@ -4769,6 +4770,17 @@ def imag(self):
return self._unary_op(lambda x: x.imag,
keep_attrs=True)(self)

@property
def plot(self):
"""
Access plotting functions. Use it as a namespace to use
xarray.plot functions as Dataset methods

>>> ds.plot.scatter(...) # equivalent to xarray.plot.scatter(ds,...)

"""
return _Dataset_PlotMethods(self)

def filter_by_attrs(self, **kwargs):
"""Returns a ``Dataset`` with variables that match specific conditions.

Expand Down
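The ``plot`` property added in this hunk follows a common accessor pattern: the property returns a small helper object bound to the instance, so plotting functions can be called as ``ds.plot.scatter(...)``. The sketch below shows the pattern in isolation. ``Container`` and ``PlotMethods`` are hypothetical stand-ins, not xarray's ``Dataset`` or ``_Dataset_PlotMethods``, and ``scatter`` only pairs values instead of drawing anything.

```python
class PlotMethods:
    """Hypothetical accessor object returned by ``Container.plot``."""

    def __init__(self, owner):
        # Keep a reference to the object whose data will be plotted.
        self._owner = owner

    def scatter(self, x, y):
        # A real implementation would draw; this one only pairs the values.
        return list(zip(self._owner.data[x], self._owner.data[y]))


class Container:
    """Hypothetical stand-in for a Dataset-like object."""

    def __init__(self, data):
        self.data = data

    @property
    def plot(self):
        # A new accessor bound to this instance on every attribute access.
        return PlotMethods(self)


ds = Container({"A": [1, 2, 3], "B": [4, 5, 6]})
pairs = ds.plot.scatter(x="A", y="B")
```

Keeping the plotting functions on a separate accessor class avoids crowding the main class with one method per plot type, and new plot types can be added to the accessor without touching the container.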