Setting a column of a Modin DataFrame to Categorical twice does not work #4275

Open
naren-ponder opened this issue Feb 28, 2022 · 1 comment
Labels
bug 🦗 Something isn't working P3 Very minor bugs, or features we can hopefully add some day. pandas concordance 🐼 Functionality that does not match pandas

Comments

@naren-ponder
Collaborator

naren-ponder commented Feb 28, 2022

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): macOS Version 11.6
  • Modin version (modin.__version__): 0.13.2
  • Python version: Python 3.8.11
  • Code we can use to reproduce:
import modin.pandas as pd
from modin.pandas import Categorical

d = {'a': [1, 2, 3], 'b': [4, 5, 6]}
df = pd.DataFrame(data=d)
df['a'] = Categorical(df['a'], ordered=True)  # first conversion succeeds
df['a'] = Categorical(df['a'], ordered=True)  # second conversion raises AttributeError

Describe the problem

The second assignment to df['a'] raises an AttributeError:

AttributeError                            Traceback (most recent call last)
<ipython-input-2-a6c6c5c96a53> in <module>
      5 df = pd.DataFrame(data=d)
      6 df['a'] = Categorical(df['a'], ordered=True)
----> 7 df['a'] = Categorical(df['a'], ordered=True)

~/opt/anaconda3/envs/ponder/lib/python3.8/site-packages/pandas/core/arrays/categorical.py in __init__(self, values, categories, ordered, dtype, fastpath, copy)
    454             # error: Item "ExtensionArray" of "Union[Any, ExtensionArray]" has no
    455             # attribute "_codes"
--> 456             old_codes = extract_array(values)._codes  # type: ignore[union-attr]
    457             codes = recode_for_categories(
    458                 old_codes, values.dtype.categories, dtype.categories, copy=copy

~/Desktop/modin/modin/pandas/series.py in __getattr__(self, key)
    372             if key not in _ATTRS_NO_LOOKUP and key in self.index:
    373                 return self[key]
--> 374             raise e
    375 
    376     def __int__(self):

~/Desktop/modin/modin/pandas/series.py in __getattr__(self, key)
    368         """
    369         try:
--> 370             return object.__getattribute__(self, key)
    371         except AttributeError as e:
    372             if key not in _ATTRS_NO_LOOKUP and key in self.index:

AttributeError: 'Series' object has no attribute '_codes'

Source code / logs

@jbrockmendel
Collaborator

In the extract_array call inside Categorical.__init__, a pd.Series object is unwrapped to its underlying Categorical, while a modin.pandas.Series object is returned unchanged, so the ._codes lookup that follows fails. Similar to #4646.
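
To make the mechanism concrete, here is a toy model in plain Python. It is only a sketch: ToyCategorical, NativeSeries, ForeignSeries and toy_extract_array are hypothetical stand-ins invented for illustration, not the real pandas or Modin classes, and the real extract_array has more cases than this. The point is just that an extractor which only unwraps containers it recognizes hands any other wrapper back untouched, so the private-attribute access right after it fails.

```python
# Toy model only: every name below is a hypothetical stand-in,
# not the real pandas or Modin implementation.


class ToyCategorical:
    """Stand-in for a categorical array: integer codes plus categories."""

    is_categorical = True

    def __init__(self, values):
        if getattr(values, "is_categorical", False):
            # Input is already categorical: reuse its codes, mirroring the
            # `extract_array(values)._codes` line shown in the traceback.
            source = toy_extract_array(values)
            self._codes = list(source._codes)
            self.categories = list(source.categories)
        else:
            # Plain values: build categories and codes from scratch.
            self.categories = sorted(set(values))
            self._codes = [self.categories.index(v) for v in values]


class _ToySeries:
    """Minimal series-like wrapper around an array."""

    def __init__(self, array):
        self._array = array

    @property
    def is_categorical(self):
        return getattr(self._array, "is_categorical", False)


class NativeSeries(_ToySeries):
    """Plays the role of pd.Series: the extractor knows how to unwrap it."""


class ForeignSeries(_ToySeries):
    """Plays the role of modin.pandas.Series: unknown to the extractor."""


def toy_extract_array(obj):
    """Unwrap only the recognized container; return anything else unchanged."""
    if isinstance(obj, NativeSeries):
        return obj._array
    return obj


first = ToyCategorical([1, 2, 3])

# Recognized wrapper: unwrapped to the categorical, so `_codes` is found.
again = ToyCategorical(NativeSeries(first))

# Unrecognized wrapper: returned as-is, so `_codes` is looked up on the wrapper.
try:
    ToyCategorical(ForeignSeries(first))
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(again._codes)
print(error_message)
```

In this model the failure goes away either if the extractor learns to unwrap the foreign wrapper or if the caller passes the underlying array itself. Which of those is appropriate for Modin is not settled in this thread.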

@vnlitvinov vnlitvinov added pandas concordance 🐼 Functionality that does not match pandas P3 Very minor bugs, or features we can hopefully add some day. labels Aug 29, 2022