Please consider checking ndarray types using the array interface #44617
Comments
Maybe numpy should actually add an ABC
Also related issue: #44616 (but limited to just the DataFrame constructor)
@jorisvandenbossche Yes, that looks like the same issue. Probably, pandas isn't interpreting the torch tensor as a numpy array, so they would benefit from the ABC as well.
Checking for an
How would you feel about adding an AbstractArray ABC like numpy/numpy#20459 and using it in the constructor-like functions? Maybe numpy will eventually upstream it, but it would solve the immediate problems in a somewhat elegant way?
First thing that comes to mind is perf: for our internal ABCFoo classes Second is whether the idea would be to put your ndarray-like object directly in a Series/DataFrame or to first cast it to a for-realsies-ndarray. The former option seems liable to produce headaches.
Good points. I'm not sure what's best since I haven't done any experiments. It may be that the
might be able to implement an ExtensionArray backed by one of these objects?
I'm not sure what you mean?
Given further discussion on numpy/numpy#20459, it appears that I was mistaken about using an ABC. Do you think it would be better to block the construction of Pandas arrays with jax arrays and pytorch tensors, etc.? In other words, if you detect something that's array-like, but not an I'm not sure, but maybe when the array API standard is complete and adopted, there may be a more elegant solution. https://data-apis.org/array-api/latest/
ExtensionArrays (EAs) exist largely so that downstream libraries can implement subclasses and store non-np.ndarray objects directly in a Series/DataFrame. So if you don't want your arraylike object to be cast to an ndarray, implementing an EA is encouraged.
Well, no. Lots of other things fall into that category that currently work and we wouldn't want to raise on.
Okay, thanks! I'll close this then. I guess I'll have to be extra careful when I pass arrays that may be Jax arrays into Pandas.
@jbrockmendel Yes, conversion to array is efficient. I commented on that issue though because one of the fixes of the two might work.
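One way to be "extra careful" in the sense of the comment above might be to convert explicitly with np.asarray before handing data to pandas, so that pandas only ever sees a real ndarray. The sketch below is an illustration only: ArrayLike is a hypothetical stand-in for a Jax array or torch tensor, and to_series is a made-up helper name, not a pandas API.

```python
import numpy as np
import pandas as pd


class ArrayLike:
    """Hypothetical stand-in for a Jax array or torch tensor.

    It is not an np.ndarray subclass, but it implements __array__.
    """

    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.float64)

    def __array__(self, dtype=None, copy=None):
        # Return a plain ndarray, honouring a requested dtype if given.
        if dtype is not None:
            return self._data.astype(dtype)
        return self._data


def to_series(values, **kwargs):
    """Hypothetical helper: cast to a real ndarray, then build the Series."""
    return pd.Series(np.asarray(values), **kwargs)


series = to_series(ArrayLike([1.0, 2.0, 3.0]), name="x")
```

The explicit np.asarray call makes the conversion visible at the call site instead of relying on whichever code path pandas takes for non-ndarray inputs.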
I have checked that this issue has not already been reported.
I have confirmed this bug exists on the latest version of pandas.
I have confirmed this bug exists on the master branch of pandas.
Reproducible Example
Issue Description
Please see here for a description: jax-ml/jax#8701
In short, there are various places in Pandas where isinstance(x, np.ndarray) is used. This only checks subclasses. With Numpy arrays, Pandas would ideally check for the array interface (using __array__ and __array_interface__) so that Jax and other Numpy array types work.
Expected Behavior
.
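The difference between the two checks named in the issue description can be sketched as follows. This is a minimal illustration, not pandas code: ArrayLike is a hypothetical class standing in for a Jax array, and supports_array_protocol is a made-up helper name.

```python
import numpy as np


class ArrayLike:
    """Hypothetical array type that is not an np.ndarray subclass."""

    def __init__(self, data):
        self._data = np.asarray(data, dtype=np.float64)

    def __array__(self, dtype=None, copy=None):
        # NumPy calls this when asked to convert the object to an ndarray.
        if dtype is not None:
            return self._data.astype(dtype)
        return self._data


def supports_array_protocol(obj):
    """Hypothetical check: does obj expose __array__ or __array_interface__?"""
    return hasattr(obj, "__array__") or hasattr(obj, "__array_interface__")


x = ArrayLike([1.0, 2.0, 3.0])

passes_isinstance = isinstance(x, np.ndarray)    # nominal type check
passes_protocol = supports_array_protocol(x)     # protocol-based check
converted = np.asarray(x)                        # conversion via __array__
```

Here the isinstance check rejects the object while the protocol check accepts it, and np.asarray can still produce a real ndarray from it.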
Installed Versions
INSTALLED VERSIONS
commit : 945c9ed
python : 3.9.8.final.0
python-bits : 64
OS : Linux
OS-release : 5.11.0-40-generic
Version : #44~20.04.2-Ubuntu SMP Tue Oct 26 18:07:44 UTC 2021
machine : x86_64
processor : x86_64
byteorder : little
LC_ALL : None
LANG : en_CA.UTF-8
LOCALE : en_CA.UTF-8
pandas : 1.3.4
numpy : 1.21.4
pytz : 2021.3
dateutil : 2.8.2
pip : 21.3.1
setuptools : 59.1.1
Cython : None
pytest : 6.2.5
hypothesis : None
sphinx : None
blosc : None
feather : None
xlsxwriter : None
lxml.etree : None
html5lib : None
pymysql : None
psycopg2 : None
jinja2 : 3.0.3
IPython : 7.29.0
pandas_datareader: None
bs4 : None
bottleneck : None
fsspec : None
fastparquet : None
gcsfs : None
matplotlib : 3.5.0
numexpr : None
odfpy : None
openpyxl : None
pandas_gbq : None
pyarrow : None
pyxlsb : None
s3fs : None
scipy : 1.7.2
sqlalchemy : None
tables : None
tabulate : 0.8.9
xarray : None
xlrd : None
xlwt : None
numba : None