Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Follow up PR: #28097 Simplify branch statement #29243

Merged
merged 1 commit into from
Jan 26, 2020

Conversation

proost
Copy link
Contributor

@proost proost commented Oct 27, 2019

xref #28097

This is follow up pr. eliminate branch statement. "grouper" variable is assigned when handle NAs.


if len(level_index):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not what i meant

can u eliminate this branch here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback

Ok. I eliminate branch. But I don't know how not to use branch and not conflict with test cases together. Could you give me a clue?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jreback i think the blocker here is clarifying for this contributor what you're asking for

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think CI logs are around but do we know why this change fails CI? I think @jreback point is that this seems unnecessary so unclear why it is causing the failures

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd
Because "Index.take"

            if allow_fill and fill_value is not None:
                msg = "Unable to fill values because {0} cannot contain NA"
                raise ValueError(msg.format(self.__class__.__name__))

cause fail.
Some Indexes use ".take" from "Index.take". If grouper = level_index.take(codes, fill_value=True) all the time and "level_index" use "take" from "Index.take", then raise exception.
So I changed check "level_index" empty or not. this check avoids from exception.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm OK. I guess that goes back to this line:

f"Unable to fill values because {cls_name} cannot contain NA"

So I guess we would have to allow fill_value being passed to Index types that can't hold NA for this to work.

I think I'd be OK with that but any objections @jreback?

>>> pd.Index(range(5)).take([3], fill_value=1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/pandas/core/indexes/base.py", line 901, in take
    raise ValueError(msg.format(self.__class__.__name__))
ValueError: Unable to fill values because RangeIndex cannot contain NA

>>> pd.Index(range(5)).astype(float).take([3], fill_value=1)
Float64Index([3.0], dtype='float64')

Copy link
Contributor Author

@proost proost Dec 11, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@WillAyd
You are right. If Some level index in "MultiIndex", which can't hold NA do not pass flll_value. can exist. So there is need to check.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@proost I think you can replace the if check to be if level_index._can_hold_na: to at least be more explicit

@proost proost force-pushed the update-groupby branch 2 times, most recently from e97d879 to a212065 Compare October 28, 2019 06:17
@jreback
Copy link
Contributor

jreback commented Jan 1, 2020

can you merge master and see if you can get passing


if len(level_index):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@proost I think you can replace the if check to be if level_index._can_hold_na: to at least be more explicit

Copy link
Member

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the explicitness here is an improvement so lgtm. @jreback

@jreback jreback added this to the 1.1 milestone Jan 26, 2020
@jreback jreback merged commit 8e1dd1e into pandas-dev:master Jan 26, 2020
@jreback
Copy link
Contributor

jreback commented Jan 26, 2020

thanks @proost this is not exactly what I was looking for, but its ok. thanks.

@proost proost deleted the update-groupby branch January 26, 2020 09:46
keechongtan added a commit to keechongtan/pandas that referenced this pull request Jan 27, 2020
…ndexing-1row-df

* upstream/master: (194 commits)
  DOC Remove Python 2 specific comments from documentation (pandas-dev#31198)
  Follow up PR: pandas-dev#28097 Simplify branch statement (pandas-dev#29243)
  BUG: DatetimeIndex.snap incorrectly setting freq (pandas-dev#31188)
  Move DataFrame.info() to live with similar functions (pandas-dev#31317)
  ENH: accept a dictionary in plot colors (pandas-dev#31071)
  PERF: add shortcut to Timestamp constructor (pandas-dev#30676)
  CLN/MAINT: Clean and annotate stata reader and writers (pandas-dev#31072)
  REF: define _get_slice_axis in correct classes (pandas-dev#31304)
  BUG: DataFrame.floordiv(ser, axis=0) not matching column-wise bheavior (pandas-dev#31271)
  PERF: optimize is_scalar, is_iterator (pandas-dev#31294)
  BUG: Series rolling count ignores min_periods (pandas-dev#30923)
  xfail sparse warning; closes pandas-dev#31310 (pandas-dev#31311)
  REF: DatetimeIndex.get_value wrap DTI.get_loc (pandas-dev#31314)
  CLN: internals.managers (pandas-dev#31316)
  PERF: avoid copies if possible in fill_binop (pandas-dev#31300)
  Add test for multiindex json (pandas-dev#31307)
  BUG: passing TDA and wrong freq to TimedeltaIndex (pandas-dev#31268)
  BUG: inconsistency between PeriodIndex.get_value vs get_loc (pandas-dev#31172)
  CLN: remove _set_subtyp (pandas-dev#31301)
  CI: Updated version of macos image (pandas-dev#31292)
  ...
@jorisvandenbossche jorisvandenbossche modified the milestones: 1.1, 1.0.1 Feb 5, 2020
@jorisvandenbossche
Copy link
Member

@meeseeksdev backport to 1.0.x

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants