using DataFrame.resample with 'agg' method on non-existant columns provides unexpected behavior #19552

discort · 2018-02-06T16:03:07Z

closes using DataFrame.resample with 'agg' method on non-existant columns provides unexpected behavior #16766
tests added / passed
passes git diff upstream/master -u -- "*.py" | flake8 --diff
whatsnew entry

discort · 2018-02-06T16:06:10Z

pandas/tests/test_resample.py

@@ -659,6 +659,21 @@ def f():
                                'B': {'rb': ['mean', 'std']}})
            assert_frame_equal(result, expected, check_like=True)

+    def test_try_aggregate_non_existing_column(self):


I'm not sure this test is needed, because `test_agg_misc` already covers this case. I added it to show the reviewer that the original issue is addressed. Let me know and I'll remove it.

Good to keep it.

TomAugspurger · 2018-02-06T17:42:44Z

Looks like your change is causing failures elsewhere. I'm not personally familiar with this code, so I'm not sure what the best way forward is. Let me know if you need help, though, and I can take a look.

TomAugspurger · 2018-02-06T17:40:29Z

doc/source/whatsnew/v0.23.0.txt

@@ -646,6 +646,7 @@ Groupby/Resample/Rolling
 - Bug in :func:`DataFrame.groupby` where tuples were interpreted as lists of keys rather than as keys (:issue:`17979`, :issue:`18249`)
 - Bug in :func:`DataFrame.transform` where particular aggregation functions were being incorrectly cast to match the dtype(s) of the grouped data (:issue:`19200`)
 - Bug in :func:`DataFrame.groupby` passing the `on=` kwarg, and subsequently using ``.apply()`` (:issue:`17813`)
+- Bug in :func:`DataFrame.aggregate` on non-existant columns (:issue:`16766`)


Say what the issue was. Also, I think the issue was with `DataFrame.resample().aggregate`, not `DataFrame.aggregate`, right?

So something like

Bug in :func:`DataFrame.resample` not raising a `KeyError` when aggregating a non-existent column.
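
For reference, the behavior discussed here can be reproduced with a small sketch. The frame, the column names `x`, `y`, `z`, and the helper `agg_missing_column` are illustrative assumptions, not the test added in this PR. Since the thread discusses both `ValueError` and `KeyError`, the helper catches either and reports which one the installed pandas raises.

```python
import pandas as pd

# Small frame with a daily DatetimeIndex; 'z' is deliberately absent.
df = pd.DataFrame(
    {"x": [1.0, 2.0, 3.0, 4.0], "y": [2.0, 4.0, 6.0, 8.0]},
    index=pd.date_range("2017-06-01", periods=4, freq="D"),
)


def agg_missing_column(frame):
    """Return the name of the exception raised when aggregating column 'z'.

    Illustrative helper (not part of pandas): returns None if nothing is raised.
    """
    try:
        frame.resample("2D").agg({"x": "mean", "z": "sum"})
    except (KeyError, ValueError) as exc:
        return type(exc).__name__
    return None


# Aggregating only existing columns works as usual.
ok = frame_ok = df.resample("2D").agg({"x": "mean"})
print(ok)
print(agg_missing_column(df))
```

On a pandas release that includes this fix and the follow-up requested later in this review, the second call should report a `KeyError` rather than silently returning a result.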

TomAugspurger · 2018-02-06T17:41:12Z

pandas/tests/test_resample.py

@@ -659,6 +659,21 @@ def f():
                                'B': {'rb': ['mean', 'std']}})
            assert_frame_equal(result, expected, check_like=True)

+    def test_try_aggregate_non_existing_column(self):


Good to keep it.

pep8speaks · 2018-02-06T18:21:30Z

Hello @discort! Thanks for updating the PR.

Cheers ! There are no PEP8 issues in this Pull Request. 🍻

Comment last updated on February 07, 2018 at 08:24 Hours UTC

codecov · 2018-02-07T06:38:30Z

Codecov Report

Merging #19552 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master   #19552      +/-   ##
==========================================
+ Coverage    91.6%    91.6%   +<.01%     
==========================================
  Files         150      150              
  Lines       48750    48752       +2     
==========================================
+ Hits        44656    44660       +4     
+ Misses       4094     4092       -2

Flag	Coverage Δ
#multiple	`89.97% <100%> (ø)`	⬆️
#single	`41.75% <0%> (-0.01%)`	⬇️

Impacted Files	Coverage Δ
pandas/core/base.py	`96.78% <100%> (+0.01%)`	⬆️
pandas/util/testing.py	`83.85% <0%> (+0.2%)`	⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 983d71f...d142bc1. Read the comment docs.

discort · 2018-02-07T09:26:40Z

@TomAugspurger

jreback · 2018-02-07T11:13:06Z

thanks! keep em coming!

jreback · 2018-02-07T11:14:32Z

pandas/core/base.py

@@ -392,6 +392,10 @@ def nested_renaming_depr(level=4):

                    elif isinstance(obj, ABCSeries):
                        nested_renaming_depr()
+                    elif isinstance(obj, ABCDataFrame) and \
+                            k not in obj.columns:
+                        raise ValueError(


Actually, I think this should be a `KeyError`. Can you do another PR to fix?
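
The distinction matters to callers: `KeyError` derives from `LookupError`, not from `ValueError`, so code that catches one will not catch the other. A minimal sketch of the check under discussion follows; `require_columns` and `error_type` are hypothetical helpers written for illustration and are not the code in `pandas/core/base.py`.

```python
def require_columns(available, spec):
    """Hypothetical helper: reject aggregation keys that are not columns."""
    missing = [key for key in spec if key not in available]
    if missing:
        # A missing label is a failed lookup, which is what KeyError signals.
        raise KeyError("Column(s) {} do not exist".format(missing))
    return spec


def error_type(available, spec):
    """Return the exception class name raised by require_columns, or None."""
    try:
        require_columns(available, spec)
    except LookupError as exc:
        return type(exc).__name__
    return None


print(error_type(["x", "y"], {"x": "mean", "z": "sum"}))
print(issubclass(KeyError, ValueError))
```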


TomAugspurger added the Resample resample method label Feb 6, 2018

discort force-pushed the fix_16766 branch from 1aec805 to 57388b2 Compare February 6, 2018 18:21

discort force-pushed the fix_16766 branch 2 times, most recently from 0d1ebe4 to 72f328e Compare February 7, 2018 06:38

fixed bug in df.aggregate passing non-existent columns

d142bc1

discort force-pushed the fix_16766 branch from 72f328e to d142bc1 Compare February 7, 2018 08:24

jreback added this to the 0.23.0 milestone Feb 7, 2018

jreback added the Error Reporting Incorrect or improved errors from pandas label Feb 7, 2018

jreback merged commit ccf9677 into pandas-dev:master Feb 7, 2018

jreback added a commit to jreback/pandas that referenced this pull request Feb 7, 2018

ERR: raise KeyError on invalid column name in aggregate

b2fe489

xref pandas-dev#19552

jreback mentioned this pull request Feb 7, 2018

ERR: raise KeyError on invalid column name in aggregate #19566

Merged

jreback added a commit to jreback/pandas that referenced this pull request Feb 7, 2018

ERR: raise KeyError on invalid column name in aggregate

36930dc

xref pandas-dev#19552

jreback added a commit to jreback/pandas that referenced this pull request Feb 7, 2018

ERR: raise KeyError on invalid column name in aggregate

9a07f9c

xref pandas-dev#19552

jreback added a commit that referenced this pull request Feb 7, 2018

ERR: raise KeyError on invalid column name in aggregate (#19566)

4e1fcba

xref #19552

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018

fixed bug in df.aggregate passing non-existent columns (pandas-dev#19552)

0359bd6

harisbal pushed a commit to harisbal/pandas that referenced this pull request Feb 28, 2018

ERR: raise KeyError on invalid column name in aggregate (pandas-dev#1…

e5fa17c

…9566) xref pandas-dev#19552
