Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve discussion of array types in "Working With Units" tutorial #2185

Open
jthielen opened this issue Nov 2, 2021 · 6 comments
Open

Improve discussion of array types in "Working With Units" tutorial #2185

jthielen opened this issue Nov 2, 2021 · 6 comments
Labels
Area: Docs Affects documentation Area: Units Pertains to unit information Type: Enhancement Enhancement to existing functionality
Milestone

Comments

@jthielen
Copy link
Collaborator

jthielen commented Nov 2, 2021

What can be better?

The apparent/user-facing failure to attach units to an array by typically-shown "multiplication on the right" has been a long-standing issue (almost always with masked arrays) which frequently arises in support questions (e.g., https://mobile.twitter.com/Nolan_Meister/status/1455629079110209553) and off-handed gripes I've heard about MetPy. While the masked array issue in particular continues to await resolution of numpy/numpy#15200, I think the crux of the issue is teaching users to know what kind of array they are working with, and then providing very clear direction on how to treat units in combination with that array type.

Here's what I'd have in mind (but please do suggest improvements/alternatives!):

  1. In "Getting Started," change "In general, using units is only a small step on top of using the numpy.ndarray object." to something like "It's important to know what kind of array you are using units with. Luckily, for each kind of array, using units is just a small step on top of using that object!"

  2. In the next section "Adding Units to Data," replace the first two snippets "The easiest way..." and "It is also possible..." with a table like the following:

How should I add units to my data? That depends on your data type!

Data Type Recommended Approach Example
Scalars (like floats and integers), numpy.ndarray, Sparse.COO, Dask.Array Construct Quantity object
or multiply by unit
units.Quantity(array, 'hPa') or array * units.hPa
xarray.DataArray with a 'units' attribute 1) If using only as function input, just use as-is
2) If doing operations outside of MetPy calculations, call .metpy.quantify() after subsetting
1) mpcalc.geostrophic_wind(ds['geopotential_height_isobaric'])
2) temperature = ds['temperature'].metpy.quantify()
xarray.DataArray without a 'units' attribute Multiply by unit temperature = ds['temperature'] * units.K
numpy.MaskedArray (like those from netcdf4-python) Construct Quantity object units.Quantity(array, 'hPa')

Curious about these differences? Take a look at "{INSERT TITLE HERE}" below to learn more.

  1. Change the "When using masked arrays..." in "Common Mistakes" to recommend the Quantity constructor and point to the table in (2).

  2. Add a comment in the "Common Mistakes" section to the effect of: "Do you still have a units problem that isn't addressed here? Reach out to support {...}. Unit arrays involve lots of different libraries interacting with each other, so we want to be able to help sort out any issues when those libraries aren't cooperating as expected."

  3. Add a brief explainer after "Common Mistakes" about why there are different ways to add units for those that are curious (e.g., a much less technical/more applied version of https://pint.readthedocs.io/en/stable/numpy.html#Technical-Commentary)

@jthielen jthielen added Area: Docs Affects documentation Area: Units Pertains to unit information labels Nov 2, 2021
@dopplershift dopplershift added the Type: Enhancement Enhancement to existing functionality label Nov 2, 2021
@sgdecker
Copy link
Contributor

sgdecker commented Nov 8, 2021

Should there be a row for xarray.Dataset in point 2?

@jthielen
Copy link
Collaborator Author

jthielen commented Nov 8, 2021

Should there be a row for xarray.Dataset in point 2?

No, as an xarray.Dataset is a collection of arrays, rather than an array type itself. However, perhaps a comment to that effect (something along the lines of "an xarray.DataArray can be treated independently or as part of an xarray.Dataset") may help reduce confusion?

@sgdecker
Copy link
Contributor

sgdecker commented Nov 8, 2021

Yes, a comment works. Anything that alerts the reader that metpy.quantify can be applied to an entire Dataset and not necessarily individual DataArrays one at a time (which is what the examples given might imply if interpreted a certain way).

@jthielen
Copy link
Collaborator Author

jthielen commented Feb 4, 2022

Not sure if this falls here or should be it's own issue, but #2333 made me notice that the following error message in check_units is incorrect:

MetPy/src/metpy/units.py

Lines 262 to 264 in 94ef0e0

msg += ('\nAny variable `x` can be assigned a unit as follows:\n'
' from metpy.units import units\n'
' x = units.Quantity(x, "m/s")')

This approach to assigning units will not (and is never intended to) work for xarray data types. Should we fix this error message to point to an updated docs page with the above information, or provide the conditional information itself in the error message?

@jthielen jthielen added this to the 1.3.0 milestone Feb 4, 2022
@dcamron
Copy link
Member

dcamron commented Feb 4, 2022

I think that is relevant here and probably worth its own issue and fix unless we only want to fix it by tying it to this docs update. IMO providing helpful (and valid) errors conditionally is a good direction for users here and in spirit of the feedback intended by this error. Still supported by the docs enhancements here though, of course.

@sgdecker
Copy link
Contributor

sgdecker commented Dec 7, 2022

Perhaps a discussion of adding units to coordinate arrays a la:
#2118 (comment)
would be a useful addition as well.

@dopplershift dopplershift modified the milestones: 1.4.1, April 2023 Mar 16, 2023
@dopplershift dopplershift modified the milestones: 1.7.0, 1.8.0 Jan 2, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: Docs Affects documentation Area: Units Pertains to unit information Type: Enhancement Enhancement to existing functionality
Projects
Status: No status
Development

No branches or pull requests

4 participants