API: ExtensionArrays and conversion to "native" types (eg in tolist, to_dict, iteration, ..) #29738
Comments
Actually, for Series, the
Let's consider this a bug in
--- a/pandas/core/arrays/integer.py
+++ b/pandas/core/arrays/integer.py
@@ -456,7 +456,7 @@ class IntegerArray(ExtensionArray, ExtensionOpsMixin):
             if self._mask[i]:
                 yield self.dtype.na_value
             else:
-                yield self._data[i]
+                yield self._data[i].item()
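To make the effect of the one-line change above concrete, here is a minimal sketch of the same iteration pattern on a toy masked array. It assumes only numpy; the class `MaskedInts` and its `na_value` default of `None` are hypothetical stand-ins for illustration, not the pandas `IntegerArray` implementation.

```python
import numpy as np


class MaskedInts:
    """Toy stand-in for a masked integer array (hypothetical, not pandas code)."""

    def __init__(self, data, mask, na_value=None):
        self._data = np.asarray(data, dtype="int64")
        self._mask = np.asarray(mask, dtype=bool)
        # Assumption for this sketch: None marks a missing entry.
        self.na_value = na_value

    def __len__(self):
        return len(self._data)

    def __iter__(self):
        for i in range(len(self)):
            if self._mask[i]:
                yield self.na_value
            else:
                # .item() converts the numpy scalar into a built-in Python int.
                yield self._data[i].item()


arr = MaskedInts([1, 2, 3], [False, True, False])
values = list(arr)
print(values)                               # [1, None, 3]
print([type(v).__name__ for v in values])   # ['int', 'NoneType', 'int']
```

Without the `.item()` call, the unmasked elements would come back as `numpy.int64` scalars rather than built-in `int` objects.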
So
If we mimic what Series with plain numpy dtype does, then getitem should keep returning the numpy scalar.
Hi @marco-neumann-jdas and @jorisvandenbossche, are you working on this issue? Could I help somehow and make a PR for it? (It was marked as a good first issue.)
…ze the types on iteratively going through the collection to cast to the appropriate type. In addition added a basic test case test_native_calls_types to assert that the change works
I am not working on it.
Similarly, should pd.NA (
Stumbled upon this issue when trying to serialize a DataFrame resulting from
Of course, this can be reduced to an example like those presented above by @jorisvandenbossche:
Just leaving this comment in case someone looks for "isocalendar" or "not JSON serializable".
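The snippets from the comment above are not preserved here, so the following is only a sketch of the general failure mode, using numpy scalars directly rather than a DataFrame. The `record` dictionary and the `to_native` hook are hypothetical names chosen for illustration; `json.dumps(..., default=...)` and `numpy.generic.item()` are the only library features relied on.

```python
import json

import numpy as np

# Hypothetical record holding numpy scalars instead of built-in ints.
record = {"year": np.uint32(2023), "week": np.uint32(5)}

error = None
try:
    json.dumps(record)
except TypeError as exc:
    # The standard encoder does not know how to handle numpy integer scalars.
    error = str(exc)
print(error)


def to_native(obj):
    """Fallback for json.dumps: convert numpy scalars to built-in objects."""
    if isinstance(obj, np.generic):
        return obj.item()
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")


encoded = json.dumps(record, default=to_native)
print(encoded)  # {"year": 2023, "week": 5}
```

A `default=` hook is one workaround on the caller's side; returning native Python objects from the conversion methods, as discussed in this issue, would make it unnecessary.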
@Peque - your example cases have been fixed on the main branch and will be included in 2.0. On main, your first example works without raising and your second example returns
@mroeschke - was there ever an answer to this?
@lukemanley Great to know and thanks for sharing! 😊
Personally I think it makes sense to convert pd.NA to None as the Python native type. @jorisvandenbossche might have thoughts on this as well
#50796 changed I'll note that changing the value for
I'm curious if that is too big of a change? Since Any suggestions for moving this forward from here? It's probably not ideal that cc @phofl @mroeschke
One more comment here. I think there may be a case for making If we were to take this approach, the currently tested Example with pyarrow:
My initial feeling is that both
Sorry for all the questions. Just want to confirm that you're talking about both
Note I think
Thanks. One concern with replacing
or simply:
e.g. test_array_interface
The test suite alone has hundreds of failures when replacing
If you still think
We try to consistently return Python objects (instead of numpy scalars) in certain functions like tolist, to_dict, itertuples/items, .. (we have had quite some issues fixing this in several cases).
However, currently we don't do that for extension dtypes (and don't have any mechanism to ask for this):
Should we add some API to ExtensionArray to provide this? Eg a method to iterate through the elements that returns "native" objects?
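As a rough illustration of the distinction being drawn, the sketch below uses a plain numpy array: `tolist` returns built-in Python objects, while iteration and getitem return numpy scalars. The `iter_native` helper is a hypothetical example of what a "native" iteration method might do; it is not an existing pandas or ExtensionArray API.

```python
import numpy as np

arr = np.array([1, 2, 3], dtype="int64")

from_tolist = arr.tolist()   # built-in Python ints
from_iter = list(arr)        # numpy scalars (numpy.int64)
from_getitem = arr[0]        # numpy scalar as well

print(type(from_tolist[0]).__name__)
print(type(from_iter[0]).__name__)
print(type(from_getitem).__name__)


def iter_native(values):
    """Hypothetical helper: yield built-in Python objects for numpy scalars."""
    for value in values:
        # Anything that is not a numpy scalar is passed through unchanged.
        yield value.item() if isinstance(value, np.generic) else value


native = list(iter_native(arr))
print(type(native[0]).__name__)
```

An extension array could, under this assumption, expose similar logic so that tolist, to_dict and friends can ask for native objects without knowing the array's internals.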