modin 'Series' object has no attribute 'asi8' #4646

Open
litlep-nibbyt opened this issue Jul 5, 2022 · 17 comments
Labels
Blocked ❌ A pull request that is blocked bug 🦗 Something isn't working External Pull requests and issues from people who do not regularly contribute to modin P1 Important tasks that we should complete soon pandas concordance 🐼 Functionality that does not match pandas

Comments

@litlep-nibbyt

System information

OS X 11.6.4
Modin version '0.15.2'
Python 3.9.12

Describe the problem

Subtracting a modin Series from a DatetimeIndex fails because the modin Series has no asi8 attribute.

Source code / logs

>>> import modin.pandas as pd
>>> df = pd.DataFrame({'date':[1,2,3,4]})
UserWarning: Distributing <class 'dict'> object. This may take some time.
>>> df['dob'] = pd.date_range(start='1/1/1930', periods=len(df), freq='D')
>>> df['date'] = pd.date_range(start='1/1/1982', periods=len(df), freq='D')
>>> df.set_index('date', inplace=True)
>>> print(type(df.index))
<class 'pandas.core.indexes.datetimes.DatetimeIndex'>
>>> print(type(df.dob))
<class 'modin.pandas.series.Series'>
>>> df.index - df.dob
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/ops/common.py", line 70, in new_method
    return method(self, other)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/arraylike.py", line 108, in __sub__
    return self._arith_method(other, operator.sub)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 6717, in _arith_method
    return super()._arith_method(other, op)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/base.py", line 1295, in _arith_method
    result = ops.arithmetic_op(lvalues, rvalues, op)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/ops/array_ops.py", line 216, in arithmetic_op
    res_values = op(left, right)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/ops/common.py", line 70, in new_method
    return method(self, other)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/arrays/datetimelike.py", line 1340, in __sub__
    result = self._sub_datetime_arraylike(other)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/pandas/core/arrays/datetimes.py", line 738, in _sub_datetime_arraylike
    other_i8 = other.asi8
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/modin/logging/logger_metaclass.py", line 68, in log_wrap
    return method(*args, **kwargs)
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/modin/pandas/series.py", line 335, in __getattr__
    raise e
  File "/usr/local/Caskroom/miniforge/base/envs/enobase3/lib/python3.9/site-packages/modin/pandas/series.py", line 331, in __getattr__
    return object.__getattribute__(self, key)
AttributeError: 'Series' object has no attribute 'asi8'
@pyrito
Collaborator

pyrito commented Jul 5, 2022

Thanks for opening the issue @meijiu ! I was able to reproduce the bug you've just opened. It looks like df.index is being treated as a Series instead of an Index (since this is the class that actually contains the asi8 property). We'll take a look at the issue and merge in a fix.

@pyrito pyrito added the bug 🦗 Something isn't working label Jul 5, 2022
@jbrockmendel
Collaborator

FWIW Index.asi8 is deprecated in pandas

@pyrito
Collaborator

pyrito commented Jul 6, 2022

> FWIW Index.asi8 is deprecated in pandas

@jbrockmendel Correct me if I'm wrong here, but it looks like it's still used in DatetimeIndex and a few others in that vein? That's what the changelogs say, at least.

@jbrockmendel
Collaborator

DatetimeIndex, TimedeltaIndex, and PeriodIndex will still have .asi8, but pd.Index and the other subclasses will not.

I'm just starting to dig into the modin code base; I'll try to get a handle on what's going on with the arithmetic.

@pyrito
Collaborator

pyrito commented Jul 6, 2022

@jbrockmendel is this an issue you're interested in taking?

@jbrockmendel
Collaborator

Looks like the fundamental problem here is that pd.Index.__sub__(ModinSeries) is not returning NotImplemented. The reversed op df["dob"].__rsub__(df.index) works as expected.
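
To illustrate the dispatch rule this diagnosis relies on, here is a minimal pure-Python sketch with no pandas or modin involved. The class names `DeferringLeft`, `RaisingLeft`, and `Right` are hypothetical stand-ins for illustration only. When the left operand's `__sub__` returns `NotImplemented`, Python falls back to the right operand's `__rsub__`; when it raises instead, the reflected method is never tried.

```python
class DeferringLeft:
    """Left operand that declines the operation by returning NotImplemented."""

    def __sub__(self, other):
        return NotImplemented


class RaisingLeft:
    """Left operand that raises instead of deferring (the failing pattern)."""

    def __sub__(self, other):
        raise AttributeError("other has no attribute 'asi8'")


class Right:
    """Right operand that knows how to handle the reflected subtraction."""

    def __rsub__(self, other):
        return "handled by Right.__rsub__"


# Python tries DeferringLeft.__sub__, sees NotImplemented, then calls Right.__rsub__.
deferred_result = DeferringLeft() - Right()

# RaisingLeft.__sub__ raises, so Right.__rsub__ is never consulted.
try:
    RaisingLeft() - Right()
    raising_outcome = "no error"
except AttributeError as exc:
    raising_outcome = f"AttributeError: {exc}"

print(deferred_result)
print(raising_outcome)
```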

A hacky-but-feasible solution would be to implement Series.asi8 to return ser.to_numpy().view("i8").
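
As a rough check of what such a property would return, the sketch below uses a plain pandas Series rather than a modin one (an assumption made so the snippet runs without modin) and assumes tz-naive datetimes. It compares `ser.to_numpy().view("i8")` against `DatetimeIndex.asi8` for the same values.

```python
import pandas as pd

# tz-naive daily timestamps, mirroring the 'dob' column from the report
idx = pd.date_range(start="1930-01-01", periods=4, freq="D")
ser = pd.Series(idx)

# Reinterpret the datetime64 values as 64-bit integers
as_i8 = ser.to_numpy().view("i8")

# The integers should match what DatetimeIndex.asi8 exposes for the same data
matches = bool((as_i8 == idx.asi8).all())
print(as_i8.dtype, matches)
```

Whether this would behave sensibly for tz-aware or non-datetime data in a modin Series is not covered here.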

The correct solution is going to have to live upstream; it is the same underlying issue as pandas-dev/pandas#38946.

@pyrito
Collaborator

pyrito commented Jul 6, 2022

Does anyone from @modin-project/modin-contributors @modin-project/modin-core have any thoughts?

@jbrockmendel
Collaborator

Another option to solve it entirely on modin's end would be for modin's Series to subclass pd.Series. I'm assuming this was considered and rejected, though.

@vnlitvinov vnlitvinov added the pandas concordance 🐼 Functionality that does not match pandas label Jul 6, 2022
@vnlitvinov
Collaborator

I think we intentionally do not subclass pandas objects, as it would lead to passing isinstance() checks but failing the interop (without some serious efforts).

As for this case - we can try to subclass pd.Index instead...

@jbrockmendel
Collaborator

> we can try to subclass pd.Index instead

This is ostensibly supported but really not advisable; see pandas-dev/pandas#45289.

@pyrito
Collaborator

pyrito commented Jul 12, 2022

@vnlitvinov @jbrockmendel do either of you want to take on this issue?

@jbrockmendel
Collaborator

> but failing the interop (without some serious efforts).

Can you expand on this?

@jbrockmendel
Collaborator

Another option would be to define Series._typ = "series". That would trick some of the isinstance checks within pandas into treating this like a pandas Series.
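
A small sketch of the mechanism behind this suggestion: pandas' internal abstract type `ABCSeries` (in `pandas.core.dtypes.generic`, a private module that may change between releases) decides membership by looking at an object's `_typ` attribute. The classes `SeriesLike` and `Plain` below are hypothetical and exist only for illustration; this is not modin code.

```python
from pandas.core.dtypes.generic import ABCSeries  # internal pandas module


class SeriesLike:
    """Hypothetical class that advertises itself as a Series via _typ."""

    _typ = "series"


class Plain:
    """Hypothetical class with no _typ attribute."""


# ABCSeries checks the _typ attribute rather than real inheritance
looks_like_series = isinstance(SeriesLike(), ABCSeries)
plain_is_series = isinstance(Plain(), ABCSeries)
print(looks_like_series, plain_is_series)
```

Passing this check only affects the code paths in pandas that consult `ABCSeries`; it does not by itself make the object behave like a pandas Series.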

@pyrito
Collaborator

pyrito commented Aug 31, 2022

@jbrockmendel @vnlitvinov do either of you have the bandwidth to make a PR for this?

@pyrito pyrito added the P1 Important tasks that we should complete soon label Aug 31, 2022
@jbrockmendel
Collaborator

Will be addressable following pandas-dev/pandas#48347

@jbrockmendel
Collaborator

jbrockmendel commented Mar 14, 2023

This should now be addressable by implementing __pandas_priority__ (in pandas 2.1+)

@anmyachev anmyachev added the External Pull requests and issues from people who do not regularly contribute to modin label Apr 19, 2023
@vnlitvinov vnlitvinov added the Blocked ❌ A pull request that is blocked label Jun 21, 2023
@vnlitvinov
Collaborator

Adding blocked as we're waiting for pandas 2.1 release here.

5 participants