Insuring interoperability of pint with other (non-unit) array-like types #845

jthielen · 2019-08-21T03:46:45Z

When working with the new __array_function__ implementation in #764, seeing what it takes to get pint Quantities working inside xarray (pydata/xarray#525), and trying to wrap my head around NEP 18, I realized that (as far as I could find) pint's expected behavior with other ndarray-like types is not well documented or tested. I hope this issue can serve as a discussion for clarifying those expected behaviors and determining how to document and test them.

To start that discussion, I think it would make sense to split other array-like types into two categories: those that should wrap pint, and those that should be wrapped by pint. In other words, we need to determine where pint should fall in the "type casting hierarchy". Based on pydata/xarray#525 (comment) and other comments I've seen, a split like the following seems to be desired:

Wrap pint Quantity

xarray DataArray

Wrapped by pint Quantity

dask array
cupy array
sparse array
numpy masked array (and of course ndarray)

(Pandas seems to have special handling through pint-pandas, and hopefully others that I'm missing can be added to either list as appropriate)

Does this make sense as a division, and should a list like this be documented anywhere?
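To make the proposed division concrete, here is a minimal sketch of it as a lookup table. It only restates the two lists above and is not pint's implementation: the names `WRAPS_PINT`, `WRAPPED_BY_PINT`, and `classify` are hypothetical, and types are identified by plain descriptive strings so the snippet runs without any of these libraries installed. The labels "upcast" and "downcast" follow the terminology used later in this thread.

```python
# Hypothetical illustration of the proposed split; not pint's real API.
# Types are named by descriptive strings so nothing has to be imported.
WRAPS_PINT = {"xarray DataArray"}
WRAPPED_BY_PINT = {
    "dask array",
    "cupy array",
    "sparse array",
    "numpy masked array",
    "numpy ndarray",
}


def classify(type_name):
    """Return where a type sits relative to pint in the casting hierarchy."""
    if type_name in WRAPS_PINT:
        return "upcast"      # this type should wrap a pint Quantity
    if type_name in WRAPPED_BY_PINT:
        return "downcast"    # a pint Quantity should wrap this type
    return "unknown"         # the open question raised in this issue
```

Anything that falls into the "unknown" bucket is exactly the case for which behavior still needs to be decided and documented.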

Also, once #764 is in place, it seems there will be a need to explicitly identify the types in each category. Types that should wrap pint are used in pint/pint/quantity.py (line 70 in 4dcbe78):

if other.__class__.__name__ in ["PintArray", "Series"]:

Types that should be wrapped by pint would be used when implementing the functionality mentioned in #764. However, what should be done about array-like types that are unknown to pint, and how can consistent behavior between different Quantity-related operations be assured?
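As a rough illustration of what "deferring" to a wrapping type means for binary operations, the sketch below uses Python's standard `NotImplemented` protocol. Everything in it is a toy stand-in: `ToyQuantity`, `ToyContainer`, and `UPCAST_TYPES` are hypothetical names, not pint or xarray classes, and unit arithmetic is deliberately left out.

```python
class ToyContainer:
    """Stand-in for a type that should wrap quantities (hypothetical)."""

    def __init__(self, data):
        self.data = data

    def __mul__(self, other):
        return ToyContainer(self.data * other)

    def __rmul__(self, other):
        return ToyContainer(other * self.data)


# Types the toy quantity must defer to instead of wrapping (assumption).
UPCAST_TYPES = (ToyContainer,)


class ToyQuantity:
    """Stand-in for a unit-aware value; unit arithmetic is omitted."""

    def __init__(self, magnitude, units):
        self.magnitude = magnitude
        self.units = units

    def __mul__(self, other):
        if isinstance(other, UPCAST_TYPES):
            # Let Python fall back to the other operand's __rmul__.
            return NotImplemented
        return ToyQuantity(self.magnitude * other, self.units)

    def __rmul__(self, other):
        if isinstance(other, UPCAST_TYPES):
            return NotImplemented
        return ToyQuantity(other * self.magnitude, self.units)


q = ToyQuantity(2.0, "m")
c = ToyContainer(3.0)
a = q * c  # ToyQuantity defers, so ToyContainer.__rmul__ handles it
b = c * q  # ToyContainer handles it directly
```

In both orders the container ends up on the outside and the quantity on the inside, which is the kind of consistency between operations the question above is asking about.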

Finally, should pint implement tests for all the types it expects to be able to wrap? I got the impression from pydata/xarray#525 and pydata/xarray#2956 that tests fit best with the type higher in the type casting hierarchy (the one doing the wrapping).

Please do correct me on anything I am mistaken about with the above rambling 😄.

xref pydata/xarray#525, pydata/xarray#2956, #764, #633, Unidata/MetPy#996

EDIT: Discussion about arbitrarily nested metadata with NEP 18 implementers is ongoing at dask/dask#5329, which will be good to keep in mind here too.


shoyer · 2019-08-27T21:53:24Z

From a coordination perspective, I think the main task is to ensure that there's a well-defined type casting hierarchy among all NEP-18 duck arrays, so it's clear which project is responsible for handling which interaction. This will help avoid incompatible nestings, e.g., dask wrapping pint and pint wrapping dask.

This might be something that makes sense as a continuously updated "Informational NEP" defining a de facto community standard.

Here's a partial attempt, with most of the duck arrays I can think of offhand. The arrow X -> Y means that X handles interactions between X and Y arrays:

The general principle here is that if library X can wrap Y arrays, then it should also handle their interaction. This keeps the logic about how different array libraries interact in one place, as much as possible.

Black lines indicate existing relationships; gray lines indicate proposed relationships.
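The "no incompatible nestings" requirement amounts to saying that the graph of "X wraps Y" relationships must have no cycles. The sketch below checks that with the standard-library `graphlib` module (Python 3.9+). It is not a reproduction of the graph posted in this comment; the edges are only the relationships named in the text of this thread, and the names `WRAPS`, `nesting_order`, and `is_consistent` are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+

# Maps each library to the set of libraries whose arrays it wraps.
# Only relationships mentioned in this thread; an assumption, not a standard.
WRAPS = {
    "xarray": {"pint"},
    "pint": {"dask", "cupy", "sparse"},
}


def nesting_order(wraps):
    """Return libraries ordered from innermost (wrapped) to outermost."""
    return list(TopologicalSorter(wraps).static_order())


def is_consistent(wraps):
    """True if no library ends up both wrapping and wrapped by another."""
    try:
        nesting_order(wraps)
    except CycleError:
        return False
    return True


# The incompatible nesting from the comment: dask wrapping pint as well.
conflicting = {**WRAPS, "dask": {"pint"}}
```

With `WRAPS` the check passes and `nesting_order` lists the wrapped libraries before their wrappers; with `conflicting` it fails, because dask and pint would each claim to wrap the other.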

hgrecco · 2019-08-28T10:13:29Z

looks good to me

shoyer · 2019-08-28T16:21:58Z

Now generated with graphviz, which should be a little more maintainable:

crusaderky · 2019-08-28T17:26:26Z

@shoyer I changed the style of the arrows to dotted where there is only planned or untested support, and added arrows pint -> sparse and pint -> cupy.

https://gist.github.com/crusaderky/1b00fd68ae1ce79109b6af546c88bd4a

pentschev · 2019-09-11T22:25:54Z

Those are nice graphs @shoyer, thanks for doing that.

> The general principle here is that if library X can wrap Y arrays, then it should also handle their interaction. This keeps the logic about how different array libraries interact in one place, as much as possible.

I'm wondering if this guideline would be worth adding to NEP 18? I feel it's quite a good idea to make it more visible, even if it ends up elsewhere.

905: NEP-18 Compatibility r=hgrecco a=jthielen

Building off of the implementation of `__array_function__` in #764, this PR adds compatibility with NEP-18 in Pint (mostly in the sense of Quantity having `__array_function__` and being wrappable as a duck array; for Quantity wrapping other duck arrays, see #845). Many tests are added of NumPy functions being used with Pint Quantities by way of `__array_function__`.

Accompanying changes that were needed as a part of this implementation include:

- a complete refactor of `__array_ufunc__` and ufunc attribute fallbacks to work in parallel with `__array_function__`
- promoting `_eq` in `quantity` to `eq` in `compat`
- preliminary handling of array-like compatibility by defining upcast types and attempting to wrap and defer to all others (a follow-up PR, or set of PRs, will be needed to completely address #845 / #878)

Closes #126 Closes #396 Closes #424 Closes #547 Closes #553 Closes #617 Closes #619 Closes #682 Closes #700 Closes #764 Closes #790 Closes #821

Co-authored-by: Jon Thielen <[email protected]>

jthielen · 2019-12-11T15:32:00Z

hgrecco · 2019-12-11T16:16:05Z

Thanks @jthielen for summarizing this. I think it is important to have input from people who use those packages heavily, so that the result is not only correct but also feels right. We can start by collecting some tests.

Another question that I have: should we have related packages (like pint-pandas) as a way to speed up development?

> Some progress towards this goal (at least within Pint) has been made with #905,

Great progress has been made in #905, don't underestimate your work! 😄

jthielen · 2019-12-26T23:28:11Z

In light of #955, I hope to be working on most of the areas listed above in the next couple of days. However, the tests with downcast types are unlikely to all be ready for 0.10 because:

- Dask is held up by dask/dask#4583 (Type checking or duck typing inside dask.array.Array.__array_function__)
- I don't want to worry about GPU tests to test CuPy

However, I think testing Sparse and masked arrays as downcast types should be sufficient for now.

Also, for simplicity, I think I'll just do the upcast type tests with xarray, and not worry about Pandas for now.

@keewis

959: Add tests and improve upcast type compatibility (part of #845) r=hgrecco a=jthielen

As a part of #845, this PR adds tests for upcast type compatibility with xarray (just tests of deferral/commutativity; for full integration tests, see @keewis's work in xarray's test suite). Along the way came a check for upcast types on Quantity creation (closing #479), changing to checking actual types rather than names (otherwise xarray's and uncertainties' `Variable` conflict), exposing the upcast type collection, and adding the `@check_implemented` decorator to several arithmetic operations where it was missing.

Also, I hope that the xarray tests only being run on the latest available python and xarray versions is sufficient. If that should be changed, please let me know (if a more complete matrix is desired, it may be worth looking at reconfiguring the Travis configuration to use a [build matrix](https://docs.travis-ci.com/user/build-matrix/)).

- [x] Closes #479; Progress towards #845
- [x] Executed ``black -t py36 . && isort -rc . && flake8`` with no errors
- [x] The change is fully covered by automated unit tests
- [x] Documented in docs/ as appropriate
- [x] Added an entry to the CHANGES file

Co-authored-by: Jon Thielen <[email protected]>
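The switch from name-based to type-based checks mentioned in the PR description above can be illustrated with a small sketch. It is an assumption-laden toy, not the code from #959: the two `Variable` classes are built on the fly to stand in for two unrelated libraries that happen to share a class name, and `make_variable_class`, `is_upcast_by_name`, and `is_upcast_by_type` are hypothetical helpers.

```python
def make_variable_class(module_name):
    """Create a distinct class named 'Variable' for a pretend library."""
    return type("Variable", (), {"__module__": module_name})


# Two unrelated classes that share the name "Variable" (toy stand-ins).
ContainerVariable = make_variable_class("toy_container_lib")
UncertainVariable = make_variable_class("toy_uncertainty_lib")


def is_upcast_by_name(obj):
    # Name-based check, in the style of the quantity.py line quoted earlier.
    return obj.__class__.__name__ in ["Variable"]


def is_upcast_by_type(obj):
    # Type-based check: only the intended class matches.
    return isinstance(obj, (ContainerVariable,))
```

The name-based check treats both classes as upcast types, while the type-based check accepts only the container's class, which is the conflict the PR description refers to.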

This was referenced Aug 21, 2019

Split Quantity into scalar and sequence classes #764

Closed

Creating a pint-xarray package/module #849

Closed

keewis mentioned this issue Aug 27, 2019

Design: nested _meta dask/dask#5329

Open

jthielen mentioned this issue Sep 2, 2019

tests for arrays with units pydata/xarray#3238

Merged

16 tasks

This was referenced Sep 12, 2019

NEP18 trouble when pint is being wrapped #878

Open

Integration with Dask (add tests; implement the Dask collection interface on Quantity) #883

Closed

jthielen mentioned this issue Nov 11, 2019

NEP 18, physical units, uncertainties, and the scipp library? pydata/xarray#3509

Closed

This was referenced Nov 29, 2019

NEP-18 Compatibility #905

Merged

Preserve ndarray-like magnitudes or reject them from the start #479

Closed

jthielen mentioned this issue Dec 10, 2019

Proper handling of __array_*__ attributes/methods #924

Closed

hgrecco added the numpy Numpy related bug/enhancement label Dec 11, 2019

This was referenced Dec 21, 2019

Switching to pytest for unit testing framework; Use doctest more extensively #947

Closed

Type checking or duck typing inside dask.array.Array.__array_function__ dask/dask#4583

Closed

NEP-18 and python scalars #950

Closed

keewis mentioned this issue Dec 23, 2019

ufunc calls with mixed args and upcast types #951

Merged

2 tasks

jthielen mentioned this issue Dec 26, 2019

What shall we include for 0.10? #955

Closed

hgrecco added this to the 0.10 milestone Dec 26, 2019

jthielen mentioned this issue Dec 27, 2019

Add tests and improve upcast type compatibility (part of #845) #959

Merged

5 tasks

This was referenced Dec 30, 2019

Inconsistency when multiplying NumPy masked arrays #633

Open

Add tests and documentation with improvement of downcast type compatibility (part of #845) #963

Merged

bors bot closed this as completed in 0d09f54 Dec 30, 2019

bors bot closed this as completed in #963 Dec 30, 2019

jthielen mentioned this issue Dec 30, 2019

Add tests with CuPy #964

Open

keewis mentioned this issue Apr 2, 2020

Pint support for DataArray pydata/xarray#3643

Merged

3 tasks

jthielen mentioned this issue Apr 7, 2020

Consistent Handling of Type Casting Hierarchy pydata/xarray#3950

Open

jthielen mentioned this issue Jul 9, 2020

What should define a "duck Dask Array"? dask/dask#6385

Open

keewis mentioned this issue Sep 14, 2020

construction of and interactions between different types of nested duck arrays dask/dask#6635

Open

greglucas mentioned this issue Jan 18, 2023

BUG: Subclass behavior of __array_function__ seem subtly flawed? numpy/numpy#23014

Open
