Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Correct type inference for UInt64Index during access #29420

Merged
merged 13 commits into from
Nov 27, 2019
Merged

Correct type inference for UInt64Index during access #29420

merged 13 commits into from
Nov 27, 2019

Conversation

oguzhanogreden
Copy link
Contributor

@oguzhanogreden oguzhanogreden commented Nov 5, 2019


def test_uint64_keys_in_list():
# https://github.com/pandas-dev/pandas/issues/28023
bug = pd.Series(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could both these test functions be combined into 1 func? Seem very similar

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are indeed similar. I'll wait for a few more comments to decide on this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea pls do we prefer parametized tests as much as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revisiting the issues, I realized one of the tests would be redundant and removed that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Revisiting the issues, I realized one of the tests would be redundant and removed that.

How come? - it seems that test case was exactly covering issue #28023

Copy link
Member

@alimcmaster1 alimcmaster1 Nov 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

@oguzhanogreden oguzhanogreden Nov 7, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About the test function... Now that I'm rooting for a single test, I'll leave it as isinstance since I think the intention of test is clearer this way.

@alimcmaster1 alimcmaster1 added the Indexing Related to indexing on series/frames, not to indexes themselves label Nov 5, 2019
@alimcmaster1
Copy link
Member

Thanks for the PR!

@oguzhanogreden oguzhanogreden changed the title Indexing inference pr Correct type inference for UInt64Index during access Nov 6, 2019
@alimcmaster1
Copy link
Member

alimcmaster1 commented Nov 6, 2019

Test failures related to #29432

pandas/core/indexes/numeric.py Outdated Show resolved Hide resolved

assert isinstance(
bug.loc[[7606741985629028552, 17876870360202815256]].index, UInt64Index
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explicitly construct the expected index and use tm.assert_index_equal to verify they're the same:

result = s.loc[[7606741985629028552, 17876870360202815256]].index
expected = UInt64Index([7606741985629028552, 17876870360202815256])
tm.assert_index_equal(result, expected)

I'd rather not do a simple isinstance check here because it doesn't guard against potential precision loss with the values in the index, e.g. if someone makes a change where there's an intermediate coercion to Float64Index:

In [2]: idx = pd.UInt64Index([2**53, 2**53 + 1])

In [3]: idx
Out[3]: UInt64Index([9007199254740992, 9007199254740993], dtype='uint64')

In [4]: pd.UInt64Index(pd.Float64Index(idx))
Out[4]: UInt64Index([9007199254740992, 9007199254740992], dtype='uint64')

@pep8speaks
Copy link

pep8speaks commented Nov 10, 2019

Hello @oguzhanogreden! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-11-23 15:22:46 UTC

@oguzhanogreden
Copy link
Contributor Author

This will fail due to #29526.

@oguzhanogreden
Copy link
Contributor Author

Between two laptops and a rebase, git happened again. Sorry for the spam!

doc/source/whatsnew/v1.0.0.rst Outdated Show resolved Hide resolved
@jreback jreback added this to the 1.0 milestone Nov 20, 2019
doc/source/whatsnew/v1.0.0.rst Outdated Show resolved Hide resolved
pandas/tests/indexes/test_numeric.py Outdated Show resolved Hide resolved
@oguzhanogreden
Copy link
Contributor Author

@jschendel here you suggested that the What's New entry goes under Numeric. It's not clear to me why these are not under Indexing. Can you confirm that this is where the last 4 entries do belong?

@jschendel
Copy link
Member

jschendel commented Nov 26, 2019

The "Indexing" section is generally for operations that have to do with selecting data with an index, e.g. loc/iloc, and the associated index methods that support this, e.g. get_loc/get_indexer. For example, see the types of things tested in pandas/tests/indexing. The "Numeric" section deals more with issues related to numeric operations, e.g. computation, precision, inference, etc.

@jreback jreback merged commit 70f1c28 into pandas-dev:master Nov 27, 2019
@jreback
Copy link
Contributor

jreback commented Nov 27, 2019

thanks for the patch @oguzhanogreden

@oguzhanogreden oguzhanogreden deleted the indexing-inference-pr branch November 28, 2019 12:21
keechongtan added a commit to keechongtan/pandas that referenced this pull request Nov 29, 2019
…ndexing-1row-df

* upstream/master: (32 commits)
  DEPR: Series.cat.categorical (pandas-dev#29914)
  DEPR: infer_dtype default for skipna is now True (pandas-dev#29876)
  Fix broken asv (pandas-dev#29906)
  DEPR: Remove weekday_name (pandas-dev#29831)
  Fix mypy errors for pandas\tests\series\test_operators.py (pandas-dev#29826)
  CI: Setting path only once in GitHub Actions (pandas-dev#29867)
  DEPR: passing td64 data to DTA or dt64 data to TDA (pandas-dev#29794)
  CLN: remove unsupported sparse code from io.pytables (pandas-dev#29863)
  x.__class__ TO type(x) (pandas-dev#29889)
  DEPR: ftype, ftypes (pandas-dev#29895)
  REF: use named funcs instead of lambdas (pandas-dev#29841)
  Correct type inference for UInt64Index during access (pandas-dev#29420)
  CLN: follow-up to 29725 (pandas-dev#29890)
  CLN: trim unnecessary code in indexing tests (pandas-dev#29845)
  TST added test for groupby agg on mulitlevel column (pandas-dev#29772) (pandas-dev#29866)
  mypy fix (pandas-dev#29891)
  Typing annotations (pandas-dev#29850)
  Fix mypy error in pandas/tests.indexes.test_base.py (pandas-dev#29188)
  CLN: remove never-used kwargs, make kwargs explicit (pandas-dev#29873)
  TYP: Added typing to __eq__ functions (pandas-dev#29818)
  ...
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves
Projects
None yet
5 participants