using DataFrame.resample with 'agg' method on non-existant columns provides unexpected behavior #19552

Merged
jreback merged 1 commit into pandas-dev:master from the fix_16766 branch on Feb 7, 2018

@discort discort commented Feb 6, 2018

@@ -659,6 +659,21 @@ def f():
'B': {'rb': ['mean', 'std']}})
assert_frame_equal(result, expected, check_like=True)

def test_try_aggregate_non_existing_column(self):

Contributor Author

I'm not sure this test is needed, because the test_agg_misc test already covers this case. I added it to show the reviewer that the original issue is addressed. Let me know and I'll remove it.

Contributor

Good to keep it.

@TomAugspurger
Contributor

Looks like your change is causing failures elsewhere. I'm not personally familiar with this code, so I'm not sure what the best way forward is. Let me know if you need help though, and I can take a look.

@@ -646,6 +646,7 @@ Groupby/Resample/Rolling
- Bug in :func:`DataFrame.groupby` where tuples were interpreted as lists of keys rather than as keys (:issue:`17979`, :issue:`18249`)
- Bug in :func:`DataFrame.transform` where particular aggregation functions were being incorrectly cast to match the dtype(s) of the grouped data (:issue:`19200`)
- Bug in :func:`DataFrame.groupby` passing the `on=` kwarg, and subsequently using ``.apply()`` (:issue:`17813`)
- Bug in :func:`DataFrame.aggregate` on non-existant columns (:issue:`16766`)

Contributor

Say what the issue was. Also, I think the issue was with DataFrame.resample().aggregate, not DataFrame.aggregate, right?

So something like

Bug in :func:`DataFrame.resample` not raising a `KeyError` when aggregating a non-existent column.

@TomAugspurger TomAugspurger added the Resample resample method label Feb 6, 2018
@pep8speaks

pep8speaks commented Feb 6, 2018

Hello @discort! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 07, 2018 at 08:24 Hours UTC

@discort discort force-pushed the fix_16766 branch 2 times, most recently from 0d1ebe4 to 72f328e on February 7, 2018 at 06:38
@codecov

codecov bot commented Feb 7, 2018

Codecov Report

Merging #19552 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19552      +/-   ##
==========================================
+ Coverage    91.6%    91.6%   +<.01%     
==========================================
  Files         150      150              
  Lines       48750    48752       +2     
==========================================
+ Hits        44656    44660       +4     
+ Misses       4094     4092       -2
Flag        Coverage Δ
#multiple   89.97% <100%> (ø) ⬆️
#single     41.75% <0%> (-0.01%) ⬇️

Impacted Files           Coverage Δ
pandas/core/base.py      96.78% <100%> (+0.01%) ⬆️
pandas/util/testing.py   83.85% <0%> (+0.2%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 983d71f...d142bc1.

@discort
Contributor Author

discort commented Feb 7, 2018

@TomAugspurger

@jreback jreback added this to the 0.23.0 milestone Feb 7, 2018
@jreback jreback added the Error Reporting Incorrect or improved errors from pandas label Feb 7, 2018
@jreback jreback merged commit ccf9677 into pandas-dev:master Feb 7, 2018
@jreback
Contributor

jreback commented Feb 7, 2018

thanks! keep em coming!

@@ -392,6 +392,10 @@ def nested_renaming_depr(level=4):

elif isinstance(obj, ABCSeries):
    nested_renaming_depr()
elif isinstance(obj, ABCDataFrame) and \
        k not in obj.columns:
    raise ValueError(

Contributor

Actually, I think this should be a KeyError. Can you do another PR to fix it?
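
To illustrate the kind of check being discussed, here is a minimal standalone sketch. It is not the pandas implementation: the helper name `check_agg_columns`, its signature, and the error message are assumptions made for this example. It validates the keys of an aggregation dict against a list of column names and raises a `KeyError`, as the comment above suggests, rather than a `ValueError`.

```python
def check_agg_columns(columns, spec):
    """Raise KeyError if `spec` names a column that is not in `columns`.

    Hypothetical helper for illustration only; not part of pandas.
    """
    # Collect every requested key that does not match an existing column.
    missing = [key for key in spec if key not in columns]
    if missing:
        raise KeyError("Column(s) {} do not exist".format(missing))


columns = ["A", "B"]

# Every key names an existing column, so nothing is raised.
check_agg_columns(columns, {"A": "mean", "B": ["mean", "std"]})

# "C" is not a column, so a KeyError is raised instead of the key
# being silently ignored.
try:
    check_agg_columns(columns, {"A": "mean", "C": "sum"})
except KeyError as err:
    message = str(err)
```

A `KeyError` fits here because the failure is a lookup of a label that does not exist, which is how a plain `obj[k]` on a missing column would fail.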

jreback added a commit to jreback/pandas that referenced this pull request Feb 7, 2018
jreback added a commit to jreback/pandas that referenced this pull request Feb 7, 2018
jreback added a commit to jreback/pandas that referenced this pull request Feb 7, 2018
harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018
Successfully merging this pull request may close these issues.

using DataFrame.resample with 'agg' method on non-existant columns provides unexpected behavior
4 participants