Ensure conversion to "native" types for integer EA #31328

Closed
rushabh-v wants to merge 7 commits

Contributor

@rushabh-v rushabh-v commented Jan 26, 2020

Comment on lines 776 to 777
result = type(pd.Series([1, 2], dtype="int64").tolist()[0])
assert expected == result
Member

I think it's generally preferred to use isinstance, see https://www.python.org/dev/peps/pep-0008/

Object type comparisons should always use isinstance() instead of comparing types directly.

Yes: if isinstance(obj, int):

No: if type(obj) is type(1):
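
As a small illustration of why PEP 8 prefers isinstance: it accepts subclasses, whereas an exact type comparison rejects them. The sketch below uses only the standard library; the Level enum is made up for the example and is not part of pandas or this PR.

```python
from enum import IntEnum


class Level(IntEnum):
    """Hypothetical int subclass, used only for this illustration."""

    LOW = 1


value = Level.LOW

# Exact type comparison: fails, because the concrete class is Level, not int.
exact_match = type(value) is int

# isinstance: succeeds, because Level inherits from int.
subclass_aware = isinstance(value, int)

print(exact_match, subclass_aware)  # prints: False True
```

The same holds for bool, which is also a subclass of int.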

Contributor Author

@rushabh-v rushabh-v Jan 26, 2020

But that would become isinstance(int, int) in this case, which returns False.
Are you asking to do it some other way?

Member

What I meant was

isinstance(pd.Series([1, 2], dtype="int64").tolist()[0], int)
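
To spell out the two readings in this exchange: wrapping the old type(...) result in isinstance asks whether the class itself is an int, while passing the value asks whether the value is one. A minimal sketch, assuming a plain Python list as a stand-in for the Series so that it runs without pandas:

```python
# A plain list stands in for pd.Series(...).tolist() here (assumption for the sketch).
first = [1, 2][0]

# Reading 1: wrap the old `type(...)` result in isinstance, i.e. isinstance(int, int).
class_is_int = isinstance(type(first), int)

# Reading 2: pass the value itself, as suggested above.
value_is_int = isinstance(first, int)

print(class_is_int, value_is_int)  # prints: False True
```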

Contributor Author

Oh, okay. I will commit that soon.

Contributor Author

I have done that. Can you review it, please?

@rushabh-v
Contributor Author

One test is failing; see https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=26857&view=logs&j=bef1c175-2c1b-51ae-044a-2437c76fc339&t=770e7bb1-09f5-5ebf-b63b-578d2906aac9&l=169

I think it is because the series is being converted into int8 dtype somehow. Any thoughts?

@MarcoGorelli
Member

MarcoGorelli commented Feb 9, 2020

Hey @rushabh-v - sorry for only getting round to this now.

One test is failing; see https://dev.azure.com/pandas-dev/pandas/_build/results?buildId=26857&view=logs&j=bef1c175-2c1b-51ae-044a-2437c76fc339&t=770e7bb1-09f5-5ebf-b63b-578d2906aac9&l=169

The build no longer exists. I'll make a couple more suggestions; ping me once you've addressed them and I'll take a look at the failing tests.

Comment on lines 774 to 782
def test_integer_Series_iter_return_native():
    assert isinstance(pd.Series([1, 2], dtype="int64").tolist()[0], int)
    assert isinstance(pd.Series([1, 2], dtype="Int64").tolist()[0], int)
    assert isinstance(pd.Series([1, 2], dtype="int64").to_dict()[0], int)
    assert isinstance(pd.Series([1, 2], dtype="Int64").to_dict()[0], int)
    assert isinstance(list(pd.Series([1, 2], dtype="int64").iteritems())[0][1], int)
    assert isinstance(list(pd.Series([1, 2], dtype="Int64").iteritems())[0][1], int)
    assert isinstance(list(iter(pd.Series([1, 2], dtype="int64")))[0], int)
    assert isinstance(list(iter(pd.Series([1, 2], dtype="Int64")))[0], int)
Member

  1. Please leave a comment with the issue number, so
def test_integer_Series_iter_return_native():
    # GH <issue number goes here>
  2. Can this test be parametrised somewhat? See here, as well as several cases in the pandas tests (e.g. the one below this one), for examples. There are some examples in the contributing guide too.
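
One possible shape for a table-driven version, sketched with the standard library only. Note the swap: this uses unittest's subTest rather than pytest.mark.parametrize (which the pandas suite uses), and a plain list stands in for the Series. The names CASES, TestReturnsNative and run_sketch are made up for the example.

```python
import unittest

# (label, extractor) pairs: each extractor pulls the first value out of a container.
# A plain list stands in for pd.Series so the sketch has no third-party imports.
CASES = [
    ("tolist", lambda s: list(s)[0]),
    ("to_dict", lambda s: dict(enumerate(s))[0]),
    ("items", lambda s: list(enumerate(s))[0][1]),
    ("iter", lambda s: next(iter(s))),
]


class TestReturnsNative(unittest.TestCase):
    def test_first_value_is_int(self):
        for label, extract in CASES:
            # subTest reports each case separately, much like a parametrised test id.
            with self.subTest(case=label):
                self.assertIsInstance(extract([1, 2]), int)


def run_sketch() -> bool:
    """Run the test case in-process and report whether every sub-case passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestReturnsNative)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

With pytest, a similar table (extended with the dtype strings "int64" and "Int64") could be passed to pytest.mark.parametrize instead of being looped over by hand.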

@MarcoGorelli
Member

Also, a whatsnew entry will be required (v1.1.0, I believe)

@rushabh-v
Contributor Author

@MarcoGorelli
Member

Sure, I'll look at this later this week.

(note to self: to reproduce the error:

pytest pandas/tests/extension/test_integer.py

)

@MarcoGorelli
Member

I think there might be another underlying problem here, which I've raised in #31899.

@rushabh-v
Contributor Author

So, can you review and merge this PR now, or should we wait for #31899 to be resolved?

@MarcoGorelli
Member

Yes, the tests should be passing before we can merge. I'll look into #31899 today.

@MarcoGorelli
Member

MarcoGorelli commented Feb 15, 2020

Current tests fail because of this:

>>> import pandas as pd
>>> s1 = pd.Series([100], dtype='Int8')                                                                     
>>> s2 = pd.Series([100], dtype='Int8')

So, without the current fix, we have

>>> [a+b for (a, b) in zip(s1.values, s2.values)]
[-56]
>>> s1.combine(s2, lambda x, y: x+y)                                                                        
0    -56
dtype: Int64

With the current fix:

>>> [a+b for (a, b) in zip(s1.values, s2.values)]
[200]
>>> s1.combine(s2, lambda x, y: x+y)                                                                        
0    -56
dtype: Int64
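
For anyone puzzled by the -56: 100 + 100 = 200 does not fit in a signed 8-bit integer (range -128 to 127), so fixed-width arithmetic wraps it around modulo 256, while native Python ints simply give 200. A standard-library sketch of that arithmetic follows; wrap_int8 is a helper made up for the illustration, not a pandas or NumPy function.

```python
import ctypes


def wrap_int8(value: int) -> int:
    """Reduce a Python int to the signed 8-bit range [-128, 127] by wrap-around."""
    return (value + 128) % 256 - 128


# What an 8-bit signed addition yields for 100 + 100.
fixed_width_sum = wrap_int8(100 + 100)

# Python ints have arbitrary precision, so nothing wraps.
native_sum = 100 + 100

# ctypes.c_int8 does no overflow checking and truncates the same way.
ctypes_sum = ctypes.c_int8(100 + 100).value

print(fixed_width_sum, native_sum, ctypes_sum)  # prints: -56 200 -56
```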

cc @jorisvandenbossche

@WillAyd
Member

WillAyd commented Mar 14, 2020

@rushabh-v is this still active? Can you merge master and try to get green?

@rushabh-v
Contributor Author

still fails

@rushabh-v
Contributor Author

Any updates?

@MarcoGorelli
Member

Not from me - I removed the 'good first issue' tag from the original issue, as I think there are some underlying issues that need to be solved here first, and they aren't so easy.

@jreback
Contributor

jreback commented Jun 14, 2020

@rushabh-v can you merge master

@jreback jreback added the ExtensionArray and Indexing labels Jun 14, 2020
@@ -354,6 +354,13 @@ def __init__(self, values: np.ndarray, mask: np.ndarray, copy: bool = False):
            )
        super().__init__(values, mask, copy=copy)

    def __iter__(self):
Contributor

I think we could actually move this to base masked and/or the base EA interface (maybe)
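
The body of the diff is not shown here, so the following is not the PR's implementation, just a rough sketch of the idea under discussion: a masked container whose __iter__ hands back native Python ints and a missing-value marker. MaskedInts and Int8Scalar are hypothetical stand-ins (for a masked integer array and a NumPy scalar respectively), and None stands in for pandas' NA.

```python
class Int8Scalar:
    """Hypothetical fixed-width scalar, standing in for a NumPy integer value."""

    def __init__(self, value: int):
        self.value = value

    def __int__(self) -> int:
        return self.value


class MaskedInts:
    """Hypothetical stand-in for a masked integer array (not pandas' IntegerArray)."""

    def __init__(self, data, mask):
        if len(data) != len(mask):
            raise ValueError("data and mask must have the same length")
        self._data = list(data)
        self._mask = list(mask)  # True marks a missing value

    def __iter__(self):
        for value, missing in zip(self._data, self._mask):
            # None stands in for the NA marker; int() converts to a native int.
            yield None if missing else int(value)


arr = MaskedInts([Int8Scalar(1), Int8Scalar(2), Int8Scalar(3)], [False, True, False])
native_values = list(arr)
print(native_values)  # prints: [1, None, 3]
```

Because the conversion lives entirely in __iter__, the same method could sit on a shared base class, which is roughly what the comment above suggests.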

@dsaxton dsaxton added the Stale label Sep 17, 2020
@jreback
Contributor

jreback commented Oct 24, 2020

closing in favor of #37377

@jreback jreback added this to the No action milestone Oct 24, 2020
@jreback jreback closed this Oct 24, 2020
Successfully merging this pull request may close these issues.

API: ExtensionArrays and conversion to "native" types (eg in tolist, to_dict, iteration, ..)
5 participants