-
Notifications
You must be signed in to change notification settings - Fork 478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: wrap any array API standard compatible array #2101
Comments
yes, there is interest. there have been a few issues relating to this in the past but it requires someone to start it! I suggest we create a seperate repo for this as it allows the array code to be iterated and released faster and independently to pint. maybe pint-array or pint-array-api? |
alright! Extremely rough POC at https://github.com/lucascolley/pint-array: In [1]: from pint import UnitRegistry
In [2]: from pint_array import get_namespace
In [3]: import array_api_strict as xp
In [4]: pxp = get_namespace(xp)
In [5]: ureg = UnitRegistry()
In [6]: x = pxp.asarray(xp.arange(6), units=ureg.meter)
In [7]: pxp.sum(x)
Out[7]: <Quantity(15, 'meter')>
In [9]: x = pxp.asarray(x, dtype=xp.float64)
In [10]: pxp.var(x)
Out[10]: <Quantity(2.9166666666666665, 'meter ** 2')>
In [11]: x = pxp.asarray(x, units=ureg.degC)
In [12]: pxp.sum(x)
---------------------------------------------------------------------------
OffsetUnitCalculusError Traceback (most recent call last)
Cell In[12], line 1
----> 1 pxp.sum(x)
File ~/programming/pint-array/pint_array/__init__.py:42, in get_namespace.<locals>.sum(x, axis, dtype, keepdims)
40 units = x.units
41 magnitude = xp.sum(x, axis=axis, dtype=dtype, keepdims=keepdims)
---> 42 units = (1 * units + 1 * units).units
43 return UnitRegistry.Quantity(magnitude, units)
File ~/programming/pint-array/.pixi/envs/default/lib/python3.13/site-packages/pint/facets/plain/quantity.py:849, in PlainQuantity.__add__(self, other)
846 if isinstance(other, datetime.datetime):
847 return self.to_timedelta() + other
--> 849 return self._add_sub(other, operator.add)
File ~/programming/pint-array/.pixi/envs/default/lib/python3.13/site-packages/pint/facets/plain/quantity.py:101, in check_implemented.<locals>.wrapped(self, *args, **kwargs)
99 elif isinstance(other, list) and other and isinstance(other[0], type(self)):
100 return NotImplemented
--> 101 return f(self, *args, **kwargs)
File ~/programming/pint-array/.pixi/envs/default/lib/python3.13/site-packages/pint/facets/plain/quantity.py:825, in PlainQuantity._add_sub(self, other, op)
823 units = other._units
824 else:
--> 825 raise OffsetUnitCalculusError(self._units, other._units)
827 return self.__class__(magnitude, units)
OffsetUnitCalculusError: Ambiguous operation with offset unit (degree_Celsius, degree_Celsius). See https://pint.readthedocs.io/en/stable/user/nonmult.html for guidance. With a bit of work it should be relatively straightforward based on the work already done to wrap NumPy (I hope). |
Thanks, that's very helpful to understand how it works. Assuming the function signatures are the same for numpy asn the array API, it may be possible to change the implements decorator to also create functions for pint_namespace |
Yes, it seems like we should be able to reuse a lot of that code. Given that NumpyWrapper supports functions (and ufuncs) which aren't in the array API standard, I think it will be easier to keep them separate for now. I just pushed a change to the repo such that we can define an |
This is probably as good a moment as any to mention that @nstarman and I have been working on a new As our hope in part is to replace astropy's Anyway, the part that is perhaps most relevant here is the unit-agnostic part - this would need some standardization of how units interact with arrays, perhaps new dunder methods of some agreement on operators (e.g., astropy can attach a unit via |
A shared units & Quantity API would be fantastic! |
@mhvk , when you say unit system, do you mean the unit module, pint or astropy.units etc? the linked pdf refers to a Quantity API. If I am understanding it correctly, if pint, astropy.units and any other unit library followed it, then there could be a quantity-array library that works for both pint and astropy.units. could then make pint-pandas and pint-xarray work for astropy and others too |
Yep, that sounds perfect. The namespaces I'm building in pint-array just need an array library, a quantity class from which to inherit, and some units compatible with that quantity. Currently, that's That's probably the hard part; in the meantime, I'll work on getting these array-quantity namespaces up to standard-compliance, while attempting to do the right thing with units. I could probably use some help writing tests for correct unit-handling at some point! |
@lucascolley - spurred on by this conversation, I've pushed my implementation of ufucs/element-wise operations to astropy/quantity-2.0#22 - |
Yes, there aren't all that many classes and functions needed for the array API, based on pint's numpy array_function implementation. I think a Quantity API defining the classes and functions needed to write the pint-array library would be a good first version. Is https://github.com/astropy/quantity-2.0 the best place for such a discussion on a Quantity API? There are a load of numpy tests that have been added as people implement numpy functions. |
What do we think about creation-like functions such as |
that was being changed to have the same units as the input array, but has got stuck #882 |
Up until this point everything has been working with pint/pint/facets/plain/registry.py Lines 1087 to 1097 in 74b7086
and so the multiplication here can throw an exception for integer arrays. Do you think we could try to address this problem in the plain registry @andrewgsavage ? It will be a bit hacky without adding proper array API awareness to Pint, but I think something like if hasattr(value, __array_namespace__) and value.__array_namespace__().isdtype(value.dtype, "integral"):
factor = int(factor) might fix the integer array case. I suppose the alternative is for me to subclass |
The conversion factor should not be converted to an int - that would lead to things like 4 inches converting to 8cm instead of 10cm. Can this be handled within the concat implementation by special casing the integer arrays to check consistent units? |
ah okay, so if consolidating the units would require a non-integral conversion factor and we have integer arrays, we can raise. Could you point me to a helper in Pint which checks for such a factor, if it exists already? Or otherwise, how would be best to go about writing it? |
I don't think we have that. I'd either raise or return a float array. Do you want to use your repo for these discussions to keep this cleaner? |
I was just made aware of @SimonHeybrock's https://pydims.github.io/pydims/index.html, which adds named dimensions and units to array API standard compatible arrays. It supports units from both Since pydims adds named dimensions, its arrays can't comply with the standard for the same reasons as xarray. So the scopes of pydims and (what is currently called) pint-array are orthogonal in that regard. pint-array should definitely look towards supporting both forms of units like pydims does, but perhaps that will be easier or superseded if we wait for the work that @nstarman and @mhvk have in mind. |
alright, https://github.com/lucascolley/pint-array is now passing array-api-tests 🎉 ! The behaviour with units is currently untested and known to be quite buggy, but all of the building blocks are there. I probably won't have time to work on proper tests, documentation etc. until this summer following my university exams, but contributions are very welcome! |
Example of unit-aware integration now working with SciPy 1.15.0 (released yesterday)!1
EDIT 3: fixed now! In [1]: import pint_array; import numpy as np; pnp = pint_array.pint_namespace(np); from pint import UnitRegistry; ureg = UnitRegistry()
In [2]: from scipy.integrate import cubature
In [3]: def f(x, n):
...: """f(x) = n * x"""
...: return n*x
...:
In [4]: a = pnp.asarray([0], units=ureg.pint)
In [5]: b = pnp.asarray([1], units=ureg.pint)
In [6]: n = pnp.arange(10)
In [7]: res = cubature(
...: f,
...: a=a,
...: b=b,
...: args=(n,),
...: )
In [8]: res.estimate
Out[8]:
<Quantity(
array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5]),
'pint ** 2'
)>
In [9]: n.magnitude * (1**2) / 2 # closed form
Out[9]: array([0. , 0.5, 1. , 1.5, 2. , 2.5, 3. , 3.5, 4. , 4.5])
In [10]: res.error
Out[10]:
<Quantity(
array([0.00000000e+00, 9.02056208e-17, 1.80411242e-16, 3.46944695e-16,
3.60822483e-16, 1.11022302e-16, 6.93889390e-16, 2.49800181e-16,
7.21644966e-16, 1.66533454e-15]),
'pint ** 2'
)>
In [11]: def f(x, n):
...: """f(x) = n * (x * x)"""
...: return n * (x * x)
...:
In [12]: res = cubature(
...: f,
...: a=a,
...: b=b,
...: args=(n,),
...: )
In [13]: res.estimate
Out[13]:
<Quantity(
array([0. , 0.33333333, 0.66666667, 1. , 1.33333333,
1.66666667, 2. , 2.33333333, 2.66666667, 3. ]),
'pint ** 3'
)>
In [14]: n.magnitude * (1**3) / 3 # closed form
Out[14]:
array([0. , 0.33333333, 0.66666667, 1. , 1.33333333,
1.66666667, 2. , 2.33333333, 2.66666667, 3. ])
In [15]: res.error
Out[15]:
<Quantity(
array([0.00000000e+00, 2.77555756e-17, 5.55111512e-17, 3.33066907e-16,
1.11022302e-16, 4.44089210e-16, 6.66133815e-16, 2.49800181e-16,
2.22044605e-16, 3.33066907e-16]),
'pint ** 3'
)> Still need to fix the unit behaviour of quite a few functions/methods, but this is promising :) Footnotes
|
Thanks for the shout-out @lucascolley! I would like to note that I am not actively working on PyDims. It was (for now) meant to gauge the communities interest in the approach, including in particular a "Python Units API Standard". As an aside, there is now yet another Python units library, since @phlptp added Python bindings to https://github.com/LLNL/units. |
@SimonHeybrock I have a made a topic proposing collabration on a Quantity API astropy/quantity-2.0#23 |
Preface: I'm not familiar with the history behind https://pint.readthedocs.io/en/stable/user/numpy.html#Array-Type-Support, but I have experience adding support for alternative array types to SciPy. For context, see https://numpy.org/neps/nep-0056-array-api-main-namespace.html#abstract, which superseded NEP 30.
The array API standard is being utilised in libraries like SciPy and scikit-learn to add support for array libraries like CuPy, PyTorch, and JAX. It seems feasible that Pint could wrap any library which is compatible with the standard.
The model for how this can be done is demonstrated in https://github.com/mdhaber/marray. Marray adds masked array support to standard-compatible arrays. Similar to Pint's
Quantity
, anMArray
is somedata
together with amask
:Marray exposes one function,
get_namespace
, which takes as input a standard-compatible namespace, and returns a new standard-compatible namespace with the additional capabilities of creating and correctly processing masked arrays. Under the hood, this namespace wraps each function to handle anything specific to masks, and otherwise dispatches to the underlying array namespace. So for pint,pint.get_namespace(torch)
would return a namespace for pint-wrapped PyTorch tensors.MArray
s also expose this namespace via the__array_namespace__
method, which allows libraries to operate on them like any other standard array:Marray currently passes almost the entirety of https://github.com/data-apis/array-api-tests. It would be cool to see if Pint could do the same!
This would bring a few benefits:
Quantity
s to any library which supports standard-compatible arrays, like SciPy (eventually!)@nabobalis I know you had expressed interest in this.
The text was updated successfully, but these errors were encountered: