EA: require size instead of __len__ #28389

Closed
wants to merge 5 commits into from
1 change: 1 addition & 0 deletions doc/source/whatsnew/v1.0.0.rst
@@ -123,6 +123,7 @@ source, you should no longer need to install Cython into your build environment
 Backwards incompatible API changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

+- The :class:`ExtensionArray` interface now requires the author to implement ``size`` instead of ``__len__``. The ``size`` property must *not* depend on either ``__len__`` or ``shape`` (:issue:`28389`)
 - :class:`pandas.core.groupby.GroupBy.transform` now raises on invalid operation names (:issue:`27489`).
 - :class:`pandas.core.arrays.IntervalArray` adopts a new ``__repr__`` in accordance with other array classes (:issue:`25022`)
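
To make the contract in the first bullet concrete, here is a minimal sketch of how an extension-array author might satisfy it. `SketchArrayBase` and `ListBackedArray` are hypothetical stand-ins written for illustration, not pandas classes. The base class only mirrors the defaults this PR proposes, where `shape`, `__len__`, and `ndim` are all derived from `size`.

```python
from typing import Tuple


class SketchArrayBase:
    """Hypothetical stand-in for the base-class defaults proposed in this PR."""

    @property
    def size(self) -> int:
        # Subclasses must provide this without consulting shape or __len__.
        raise NotImplementedError

    @property
    def shape(self) -> Tuple[int, ...]:
        return (self.size,)

    def __len__(self) -> int:
        return self.shape[0]

    @property
    def ndim(self) -> int:
        return len(self.shape)


class ListBackedArray(SketchArrayBase):
    """Hypothetical author-side array that stores its values in a list."""

    def __init__(self, values):
        self._data = list(values)

    @property
    def size(self) -> int:
        # Ask the backing storage directly, never len(self) or self.shape.
        return len(self._data)


arr = ListBackedArray([1, 2, 3])
print(arr.size, arr.shape, len(arr), arr.ndim)  # 3 (3,) 3 1
```

Under this arrangement the author writes one property and gets the other three for free, which appears to be the motivation for the change.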


17 changes: 13 additions & 4 deletions pandas/core/arrays/base.py
@@ -83,7 +83,7 @@ class ExtensionArray:
     * _from_sequence
     * _from_factorized
     * __getitem__
-    * __len__
+    * size
     * dtype
     * nbytes
     * isna
@@ -319,7 +319,7 @@ def __len__(self) -> int:
         -------
         length : int
         """
-        raise AbstractMethodError(self)
+        return self.shape[0]

     def __iter__(self):
         """
@@ -342,19 +342,28 @@ def dtype(self) -> ExtensionDtype:
         """
         raise AbstractMethodError(self)

+    @property
+    def size(self) -> int:
+        """
+        The number of elements in the array.
+
+        Must *not* depend on self.shape or self.__len__
+        """
+        raise AbstractMethodError(self)
+
     @property
     def shape(self) -> Tuple[int, ...]:
         """
         Return a tuple of the array dimensions.
         """
-        return (len(self),)
+        return (self.size,)

     @property
     def ndim(self) -> int:
         """
         Extension Arrays are only allowed to be 1-dimensional.
         """
-        return 1
+        return len(self.shape)

     @property
     def nbytes(self) -> int:
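
The "must not depend on self.shape or self.__len__" rule follows from the new defaults in this file: `__len__` reads `shape`, and `shape` reads `size`, so a `size` that calls `len(self)` closes the loop. Below is a small sketch of that failure mode. `SketchArrayBase` is a hypothetical class that copies only those three defaults, not the real pandas base class, and `CircularArray` and `probe` are likewise names made up for illustration.

```python
class SketchArrayBase:
    """Hypothetical copy of the size/shape/__len__ defaults from this diff."""

    @property
    def size(self) -> int:
        raise NotImplementedError

    @property
    def shape(self):
        return (self.size,)

    def __len__(self) -> int:
        return self.shape[0]


class CircularArray(SketchArrayBase):
    """Hypothetical subclass that breaks the rule."""

    @property
    def size(self) -> int:
        # size -> __len__ -> shape -> size -> ... never terminates.
        return len(self)


def probe(arr) -> str:
    """Return the length as a string, or the name of the error it triggers."""
    try:
        return str(len(arr))
    except RecursionError:
        return "RecursionError"


result = probe(CircularArray())
print(result)  # RecursionError
```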

2 changes: 1 addition & 1 deletion pandas/core/arrays/datetimelike.py
@@ -394,7 +394,7 @@ def __array__(self, dtype=None):
     @property
     def size(self) -> int:
         """The number of elements in this array."""
-        return np.prod(self.shape)
+        return self._data.size

     def __len__(self):
         return len(self._data)

5 changes: 3 additions & 2 deletions pandas/core/arrays/integer.py
@@ -469,8 +469,9 @@ def __setitem__(self, key, value):
         self._data[key] = value
         self._mask[key] = mask

-    def __len__(self):
-        return len(self._data)
+    @property
+    def size(self) -> int:
+        return self._data.size

     @property
     def nbytes(self):

5 changes: 3 additions & 2 deletions pandas/core/arrays/numpy_.py
@@ -240,8 +240,9 @@ def __setitem__(self, key, value):

         self._ndarray[key] = value

-    def __len__(self) -> int:
-        return len(self._ndarray)
+    @property
+    def size(self) -> int:
+        return self._ndarray.size

     @property
     def nbytes(self) -> int:

3 changes: 2 additions & 1 deletion pandas/core/arrays/sparse/array.py
@@ -526,7 +526,8 @@ def _valid_sp_values(self):
         mask = notna(sp_vals)
         return sp_vals[mask]

-    def __len__(self) -> int:
+    @property
+    def size(self) -> int:
         return self.sp_index.length

     @property
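
For a sparse array, the logical length and the number of stored values differ, which is why it matters what `size` reports: the hunk above returns `self.sp_index.length`, the logical element count, not the count of explicitly stored values. Here is a toy sketch of that distinction. `ToySparse` and its attributes are hypothetical and say nothing about the real `SparseArray` internals.

```python
class ToySparse:
    """Hypothetical sparse container that stores only non-fill values."""

    def __init__(self, values, fill_value=0):
        self.fill_value = fill_value
        self.length = len(values)  # logical length, fill values included
        # Map position -> value for the explicitly stored entries only.
        self.stored = {i: v for i, v in enumerate(values) if v != fill_value}

    @property
    def size(self) -> int:
        # Logical element count, independent of how many values are stored.
        return self.length

    @property
    def npoints(self) -> int:
        # Number of explicitly stored (non-fill) values.
        return len(self.stored)


sp = ToySparse([0, 0, 5, 0, 7, 0])
print(sp.size, sp.npoints)  # 6 2
```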

3 changes: 2 additions & 1 deletion pandas/tests/extension/arrow/arrays.py
@@ -88,7 +88,8 @@ def __getitem__(self, item):
             vals = self._data.to_pandas()[item]
             return type(self).from_scalars(vals)

-    def __len__(self):
+    @property
+    def size(self) -> int:
         return len(self._data)

     def astype(self, dtype, copy=True):

3 changes: 2 additions & 1 deletion pandas/tests/extension/decimal/array.py
@@ -137,7 +137,8 @@ def __setitem__(self, key, value):
             value = decimal.Decimal(value)
         self._data[key] = value

-    def __len__(self) -> int:
+    @property
+    def size(self) -> int:
         return len(self._data)

     @property

3 changes: 2 additions & 1 deletion pandas/tests/extension/json/array.py
@@ -106,7 +106,8 @@ def __setitem__(self, key, value):
                     assert isinstance(v, self.dtype.type)
                     self.data[k] = v

-    def __len__(self) -> int:
+    @property
+    def size(self) -> int:
         return len(self.data)

     @property