add a plotting example #61

Merged (12 commits) on Feb 20, 2021
5 changes: 5 additions & 0 deletions docs/conf.py
@@ -46,6 +46,7 @@
"sphinx_autosummary_accessors",
"IPython.sphinxext.ipython_directive",
"IPython.sphinxext.ipython_console_highlighting",
"nbsphinx",
]

# Add any paths that contain templates here, relative to this directory.
@@ -100,6 +101,10 @@
"unit-like": ":term:`unit-like`",
}

# nbsphinx
nbsphinx_timeout = 600
nbsphinx_execute = "always"

# -- Options for intersphinx extension ---------------------------------------

intersphinx_mapping = {
6 changes: 5 additions & 1 deletion docs/examples.rst
@@ -1,3 +1,7 @@
 Examples
 ========
-There are no examples yet.
+
+.. toctree::
+   :maxdepth: 1
+
+   examples/plotting
150 changes: 150 additions & 0 deletions docs/examples/plotting.ipynb
@@ -0,0 +1,150 @@
{
"cells": [
{
"cell_type": "markdown",
"id": "round-optimization",
"metadata": {},
"source": [
"# plotting quantified data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "greatest-smart",
"metadata": {},
"outputs": [],
"source": [
"import xarray as xr\n",
"import pint_xarray"
]
},
{
"cell_type": "markdown",
"id": "fuzzy-maintenance",
"metadata": {},
"source": [
"## load the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "proved-racing",
"metadata": {},
"outputs": [],
"source": [
"ds = xr.tutorial.open_dataset(\"air_temperature\")\n",
"data = ds.air\n",
"data"
]
},
{
"cell_type": "markdown",
"id": "medium-backup",
"metadata": {},
"source": [
"## convert units into a format understood by pint\n",
"\n",
"<div class=\"alert alert-info\">\n",
"<strong>Note:</strong> this example uses the data provided by the <code>xarray.tutorial</code> functions. As such, the <code>units</code> attributes follow the CF conventions, which <code>pint</code> does not understand by default. To work around that, we are modifying the <code>units</code> attributes here, but in general it is better to use a library that adds support for the units used by the CF conventions to <code>pint</code>.\n",
"</div>"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "published-powell",
"metadata": {},
"outputs": [],
"source": [
"data.lat.attrs[\"units\"] = \"degree\"\n",
"data.lon.attrs[\"units\"] = \"degree\""
]
},
{
"cell_type": "markdown",
"id": "banned-tolerance",
"metadata": {},
"source": [
"## quantify the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "divine-boost",
"metadata": {},
"outputs": [],
"source": [
"quantified = data.pint.quantify()\n",
"quantified"
]
},
{
"cell_type": "markdown",
"id": "whole-momentum",
"metadata": {},
"source": [
"## work with the data"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "dried-friday",
"metadata": {},
"outputs": [],
"source": [
"monthly_means = (\n",
" quantified\n",
" .pint.to(\"degC\")\n",
" .sel(time=\"2013\")\n",
" .groupby(\"time.month\").mean()\n",
")\n",
"monthly_means"
]
},
{
"cell_type": "markdown",
"id": "still-ebony",
"metadata": {},
"source": [
"## plot\n",
"\n",
"`xarray`'s plotting functions will cast the data to `numpy.ndarray`, so we need to \"dequantify\" first."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "united-machine",
"metadata": {},
"outputs": [],
"source": [
"monthly_means.pint.dequantify(format=\"~P\").plot.imshow(col=\"month\", col_wrap=4)"

[inline review thread on this line]
I think it would be totally fine to modify xarray.plot.utils.label_from_attrs to do this for pint arrays with some reasonable-default choice of format.
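
A rough sketch of what that special case could look like (hypothetical code, not the actual xarray implementation; the real label_from_attrs also handles standard_name and label wrapping):

    import pint

    def label_from_attrs(da):
        # hypothetical special case: read the units from the pint quantity
        # instead of the (now missing) "units" attribute
        name = da.attrs.get("long_name", da.name or "")
        if isinstance(da.data, pint.Quantity):
            units = f"{da.data.units:~P}"  # short pretty format as a default
        else:
            units = da.attrs.get("units", "")
        return f"{name} [{units}]" if units else name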

@keewis (Collaborator, Author) commented on Feb 10, 2021:

Indeed. However, I would like to keep the tight coupling to a minimum. Maybe hooks or entrypoints would help?

For now, I think asking people to dequantify before plotting / saving the data is fine.

@TomNicholas (Member) commented on Jul 1, 2021:

@keewis I feel like this is an example of "you ain't gonna need it". I would be in favour of special-casing xarray.plot.utils.label_from_attrs until such time as we actually need an entrypoint for other array libraries. After all, xarray's plotting methods already special-case dask arrays by calling .compute(); it's not that big of a deal if they call .dequantify() too, surely? A rough sketch follows.
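
The suggested special-casing might look roughly like this (a sketch under hypothetical names, not the actual xarray plotting code; it assumes pint-xarray provides the .pint accessor):

    import pint
    import pint_xarray  # noqa: F401  (registers the .pint accessor)

    def prepare_plot_data(darray):
        # by analogy with the existing dask special case: materialize lazy
        # data first, then strip the units into attrs before plotting
        if hasattr(darray.data, "compute"):
            darray = darray.compute()
        if isinstance(darray.data, pint.Quantity):
            darray = darray.pint.dequantify(format="~P")
        return darray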

@keewis (Collaborator, Author) replied:

The issue with hard-coding is that we do expect more duck arrays in the future (e.g. pytorch and maybe uncertainties), and adding special cases for all of them would quickly become impossible.

I was hoping to make use of matplotlib.units for this. For example:

    import matplotlib.pyplot as plt
    import numpy as np
    import pint

    ureg = pint.UnitRegistry()
    ureg.setup_matplotlib()

    t = ureg.Quantity(np.arange(10), "s")
    v = ureg.Quantity(5, "m / s")
    x = v * t

    fig, ax = plt.subplots(1, 1)
    ax.plot(t, x)

    plt.show()

This would be pretty nice. However, while investigating #91 I found that right now the xarray plotting code (sometimes? always?) converts to np.ma.masked_array, so we can't use this yet. Additionally, this would be matplotlib-specific, so once we add the plotting entrypoints we would need to find something new.
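
For reference, a minimal illustration of the stripping problem, using only numpy and pint:

    import numpy as np
    import pint

    ureg = pint.UnitRegistry()
    q = ureg.Quantity(np.arange(5, dtype=float), "m")

    # coercing to a plain ndarray (which the masked-array conversion does
    # internally) emits a UnitStrippedWarning and keeps only the magnitude
    arr = np.asarray(q)
    print(type(arr))  # <class 'numpy.ndarray'>; the units are gone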

Which makes me wonder: should we introduce hooks based on the type of .data to prepare xarray objects for plotting? I think these hooks could be used to call cupy.Array.get, sparse.COO.todense, or da.Array.compute before plotting, and because the argument to the hook would be an xarray object, we could also have the hook for pint call .pint.dequantify. I'll open a new issue on the xarray issue tracker to see if I'm missing anything.
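
A minimal sketch of what such hooks could look like (every name here is hypothetical; nothing like this exists in xarray yet, and the pint hook assumes pint-xarray for the .pint accessor):

    import pint
    import pint_xarray  # noqa: F401  (registers the .pint accessor)

    # hypothetical registry mapping duck-array types to preparation hooks
    _plot_prep_hooks = {}

    def register_plot_prep_hook(array_type, hook):
        _plot_prep_hooks[array_type] = hook

    def prepare_for_plotting(obj):
        # dispatch on the type of the wrapped duck array (.data)
        for array_type, hook in _plot_prep_hooks.items():
            if isinstance(obj.data, array_type):
                return hook(obj)
        return obj

    # the pint hook dequantifies; dask / cupy / sparse hooks would call
    # .compute(), .get() or .todense() on the wrapped data instead
    register_plot_prep_hook(pint.Quantity, lambda obj: obj.pint.dequantify())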

@TomNicholas (Member) replied:

> The issue with hard-coding is that we do expect more duck arrays in the future (e.g. pytorch and maybe uncertainties), and adding special cases for all of them would quickly become impossible.

I know; I'm not suggesting we special-case every combination forever. What I'm saying is that we don't always need to make something totally general right at the start. If we can make this work for pint and dask quickly I think we should, and worry about entrypoints when other duck-array libraries get to a similar point in their integration. We're not introducing much technical debt, because this would be a small change for us and no external API change for the user (we will always want .plot() to work bare, however it is implemented behind the scenes).

> I was hoping to make use of matplotlib.units for this.

This would be pretty nice. But again, I think we should make what we have work, then improve it later.

> Which makes me wonder: should we introduce hooks based on the type of .data to prepare xarray objects for plotting?

I think this is a good idea for a long-term solution, which deserves its own issue.

Another project member replied:

> Which makes me wonder: should we introduce hooks based on the type of .data to prepare xarray objects for plotting? I think these hooks could be used to call cupy.Array.get, sparse.COO.todense or da.Array.compute before plotting,

This pretty much already happens in .values, here:

    data = data.get() if isinstance(data, cupy_array_type) else np.asarray(data)

It's just hardcoded instead of an entrypoint.

A project member followed up:
I've implemented this in pydata/xarray#5561

]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.8.6"
}
},
"nbformat": 4,
"nbformat_minor": 5
}
3 changes: 3 additions & 0 deletions docs/requirements.txt
@@ -4,5 +4,8 @@ netCDF4
 sphinx>=3.2
 sphinx_rtd_theme
 ipython
+ipykernel
+jupyter_client
+nbsphinx
 matplotlib
 sphinx-autosummary-accessors
3 changes: 3 additions & 0 deletions readthedocs.yml
@@ -8,3 +8,6 @@ python:
   - requirements: docs/requirements.txt
   - method: pip
     path: .
+
+sphinx:
+  fail_on_warning: true