Should ExtensionArray.take accept scalar inputs? #22215

Closed
TomAugspurger opened this issue Aug 6, 2018 · 3 comments
Labels
API Design ExtensionArray Extending pandas with custom dtypes or arrays.
@TomAugspurger
Contributor

TomAugspurger commented Aug 6, 2018

ndarray.take accepts scalars, and returns a scalar. We should probably make that part of the interface, or document that we don't support it.

In [18]: np.array([1, 2]).take(0)
Out[18]: 1
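
To make the difference concrete, here is a minimal sketch of the NumPy behaviour shown above, contrasting a scalar index with a one-element list. It uses only `numpy.ndarray.take` and `numpy.ndim`; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2])

# A scalar index gives back a zero-dimensional NumPy scalar.
scalar_result = arr.take(0)

# A one-element list gives back a one-dimensional array of length 1.
array_result = arr.take([0])

print(type(scalar_result), np.ndim(scalar_result))
print(type(array_result), array_result.shape)
```

The return type depends on the shape of the indexer, which is exactly the property an ExtensionArray interface would have to either adopt or rule out.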

Categorical currently returns an invalid categorical:

In [19]: res = pd.Categorical([0, 1]).take(0)

In [20]: type(res)
Out[20]: pandas.core.arrays.categorical.Categorical

In [21]: res
Out[21]: ---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~/Envs/pandas-dev/lib/python3.6/site-packages/IPython/core/formatters.py in __call__(self, obj)
    700                 type_pprinters=self.type_printers,
    701                 deferred_pprinters=self.deferred_printers)
--> 702             printer.pretty(obj)
    703             printer.flush()
    704             return stream.getvalue()

~/Envs/pandas-dev/lib/python3.6/site-packages/IPython/lib/pretty.py in pretty(self, obj)
    398                         if cls is not object \
    399                                 and callable(cls.__dict__.get('__repr__')):
--> 400                             return _repr_pprint(obj, self, cycle)
    401
    402             return _default_pprint(obj, self, cycle)

~/Envs/pandas-dev/lib/python3.6/site-packages/IPython/lib/pretty.py in _repr_pprint(obj, p, cycle)
    693     """A pprint that just redirects to the normal repr function."""
    694     # Find newlines and replace them with p.break_()
--> 695     output = repr(obj)
    696     for idx,output_line in enumerate(output.splitlines()):
    697         if idx:

~/sandbox/pandas/pandas/core/base.py in __repr__(self)
     80         Yields Bytestring in Py2, Unicode String in py3.
     81         """
---> 82         return str(self)
     83
     84

~/sandbox/pandas/pandas/core/base.py in __str__(self)
     59
     60         if compat.PY3:
---> 61             return self.__unicode__()
     62         return self.__bytes__()
     63

~/sandbox/pandas/pandas/core/arrays/categorical.py in __unicode__(self)
   1942         """ Unicode representation. """
   1943         _maxlen = 10
-> 1944         if len(self._codes) > _maxlen:
   1945             result = self._tidy_repr(_maxlen)
   1946         elif len(self._codes) > 0:

TypeError: len() of unsized object

  • IntervalArray.take fails on a scalar input.
  • SparseArray allows it.

@TomAugspurger TomAugspurger added API Design ExtensionArray Extending pandas with custom dtypes or arrays. labels Aug 6, 2018
@TomAugspurger TomAugspurger added this to the 0.24.0 milestone Aug 6, 2018
@TomAugspurger
Contributor Author

To be clear, I'm not sure that we should support it.

Currently EA.take always returns an ExtensionArray. That's a nice piece of knowledge to rely on. I'd rather not have to litter all call-sites of EA.take with checks for whether indices is a scalar or not.

@TomAugspurger
Contributor Author

I'm OK with ExtensionArray.take requiring a sequence of indices.
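
One way an array implementation could enforce "sequence of indices only" is to validate the indexer up front, so call sites never need to check. The sketch below is an assumption for illustration, not the pandas implementation: `require_sequence_indices` is a hypothetical helper name, and it relies only on `numpy.ndim` and `numpy.asarray`.

```python
import numpy as np


def require_sequence_indices(indices):
    """Hypothetical helper: reject scalar indexers for a take-like method.

    Returns the indices as an integer ndarray so the caller can always
    assume array-in, array-out semantics.
    """
    # np.ndim is 0 for Python and NumPy scalars, 1 for flat lists and arrays.
    if np.ndim(indices) == 0:
        raise TypeError(
            "take requires a sequence of indices, got a scalar: {!r}".format(indices)
        )
    return np.asarray(indices, dtype=np.intp)


# Usage: a list passes through, a scalar is rejected early.
values = np.array([10, 20, 30])
taken = values.take(require_sequence_indices([2, 0]))
print(taken)

try:
    require_sequence_indices(0)
except TypeError as err:
    print("rejected:", err)
```

Raising early keeps the "take always returns an array" guarantee discussed above, at the cost of diverging from `ndarray.take`, which accepts scalars.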

@jorisvandenbossche
Member

Small note here: we seem to have explicitly added support for a scalar input in 0.18, see: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#whatsnew-0181-sparse (a section in the docs that is now failing).

(re-opening as we at least have to fix the docs)

TomAugspurger added a commit to TomAugspurger/pandas that referenced this issue Nov 12, 2018
Closes pandas-dev#22215

SparseArray.take not accepting scalars is already in 0.24.0.txt
jreback pushed a commit that referenced this issue Nov 12, 2018
Closes #22215

SparseArray.take not accepting scalars is already in 0.24.0.txt