Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Series.describe returns first and last for tz-aware datetimes #21332

Merged
merged 8 commits into from
Jul 6, 2018

Conversation

louispotok
Copy link
Contributor

@louispotok louispotok commented Jun 5, 2018

GH issue 21328

@@ -336,6 +336,31 @@ def test_describe(self):
index=['count', 'unique', 'top', 'freq'])
tm.assert_series_equal(result, expected)

start = Timestamp(2018, 1, 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can u add a new test and/or parametrize on the tz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@codecov
Copy link

codecov bot commented Jun 5, 2018

Codecov Report

Merging #21332 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21332      +/-   ##
==========================================
+ Coverage   91.93%   91.94%   +<.01%     
==========================================
  Files         159      160       +1     
  Lines       49733    49738       +5     
==========================================
+ Hits        45721    45730       +9     
+ Misses       4012     4008       -4
Flag Coverage Δ
#multiple 90.31% <100%> (ø) ⬆️
#single 41.99% <0%> (-0.01%) ⬇️
Impacted Files Coverage Δ
pandas/core/generic.py 96.45% <100%> (+0.24%) ⬆️
pandas/core/arrays/categorical.py 95.93% <0%> (-0.09%) ⬇️
pandas/core/groupby/groupby.py 92.65% <0%> (-0.02%) ⬇️
pandas/core/indexes/category.py 97.01% <0%> (ø) ⬆️
pandas/util/testing.py 85.33% <0%> (ø) ⬆️
pandas/core/groupby/categorical.py 95.45% <0%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d24a950...0af4758. Read the comment docs.

@@ -336,6 +336,25 @@ def test_describe(self):
index=['count', 'unique', 'top', 'freq'])
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("tz", [None, "US/Eastern"])
def test_describe_dt(self, tz):
if tz is None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can put this in pytest.mark.parametrize as well.

Copy link
Contributor Author

@louispotok louispotok Jun 6, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will do

@@ -336,6 +336,25 @@ def test_describe(self):
index=['count', 'unique', 'top', 'freq'])
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("tz", [None, "US/Eastern"])
def test_describe_dt(self, tz):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you make the test name a bit more descriptive, like test_describe_timezone_dtype

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite think that's the right name since the test also tests tz-naive datetimes. How about test_describe_dt_dtype?

@mroeschke
Copy link
Member

It would be good to add (or see if there's) a test for DataFrame.describe() with a datetime tz dtype column as well.

@louispotok
Copy link
Contributor Author

I looked through the existing tests for DataFrame.describe() (in frame/test_analytics.py). It looks like they are all testing logic about include/exclude, or what happens with column names - not the core logic of what happens when each Series is passed to describe_1d. So I don't think it makes sense to add a new test there for this functionality, which is all within the describe_1d call and therefore tested by the tests on Series.

@gfyoung gfyoung added Bug Timezones Timezone data dtype labels Jun 6, 2018
@louispotok
Copy link
Contributor Author

@jreback made the suggested change, ready to merge or further changes desired?

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good, some comments, can you add a note in 0.24.0 in bug fix section.

@@ -336,6 +336,24 @@ def test_describe(self):
index=['count', 'unique', 'top', 'freq'])
tm.assert_series_equal(result, expected)

@pytest.mark.parametrize("tz, name", [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use the tz_naive_fixture here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

(None, "tz-naive"),
("US/Eastern", "tz-aware")
])
def test_describe_dt(self, tz, name):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you name: test_describe_with_tz

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

def test_describe_dt(self, tz, name):
start = Timestamp(2018, 1, 1)
end = Timestamp(2018, 1, 5)
s = Series(date_range(start, end, tz=tz), name=name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a test for frame as well (in the appropraite file)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@louispotok louispotok force-pushed the tz-aware-describe-21328 branch from 96ac31e to a269f5f Compare June 19, 2018 17:57
@louispotok
Copy link
Contributor Author

louispotok commented Jun 19, 2018

Changes made:

  • add to whatsnew
  • use tz_naive_fixture
  • change name of series test

Ready for re-review

@@ -131,7 +131,7 @@ Datetimelike

- Fixed bug where two :class:`DateOffset` objects with different ``normalize`` attributes could evaluate as equal (:issue:`21404`)
- Bug in :class:`Index` with ``datetime64[ns, tz]`` dtype that did not localize integer data correctly (:issue:`20964`)
-
- Fixed bug where `describe` method on tz-aware datetimes did not show `first` and `last` result (:issue:`21328`)
Copy link
Member

@mroeschke mroeschke Jul 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of describe, could you adjust it to :meth:DataFrame.describe and :meth:Series.describe. It will render nicer in the whatsnew doc

@@ -417,6 +417,27 @@ def test_describe_timedelta_values(self):
"max 5 days 00:00:00 0 days 05:00:00")
assert repr(res) == exp_repr

def test_describe_tz_values(self, tz_naive_fixture):
tz = tz_naive_fixture
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment with the issue number, like GH #21332

@@ -336,6 +336,22 @@ def test_describe(self):
index=['count', 'unique', 'top', 'freq'])
tm.assert_series_equal(result, expected)

def test_describe_with_tz(self, tz_naive_fixture):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a comment with the issue number, like GH #21332

@mroeschke
Copy link
Member

Just a few documentation comments, one merge conflict, otherwise LGTM.

@louispotok louispotok force-pushed the tz-aware-describe-21328 branch from 9a7deca to dd13740 Compare July 6, 2018 00:53
@louispotok
Copy link
Contributor Author

Thanks @mroeschke . fixed.

@@ -295,6 +295,8 @@ Datetimelike
^^^^^^^^^^^^

- Fixed bug where two :class:`DateOffset` objects with different ``normalize`` attributes could evaluate as equal (:issue:`21404`)
- Bug in :class:`Index` with ``datetime64[ns, tz]`` dtype that did not localize integer data correctly (:issue:`20964`)
- Fixed bug where :meth:`DataFrame.describe` and :meth:`Series.describe` on tz-aware datetimes did not show `first` and `last` result (:issue:`21328`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more nit: Could you move this entry to the timezones section below?

Also the entry above this one exists in the timezone section already i.e. you can remove it.

Thanks for you patience.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, no problem.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mroeschke merge when you are happy with this.

@jreback jreback added this to the 0.24.0 milestone Jul 6, 2018
@mroeschke mroeschke merged commit 620abc4 into pandas-dev:master Jul 6, 2018
@mroeschke
Copy link
Member

Thanks @louispotok!

@louispotok louispotok deleted the tz-aware-describe-21328 branch July 7, 2018 01:42
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
…-dev#21332)

* Series.describe returns first and last for tz-aware datetimes

GH issue 21328

* parameterize tests

* parameterize names

* use tz_naive_fixture and fix top

* add tz describe test for df

* add bugfix to whatsnew

* fix formatting in whatsnew and add issue number to tests

* final bugfix
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Timezones Timezone data dtype
Projects
None yet
Development

Successfully merging this pull request may close these issues.

describe() does not report first and last on tz-aware datetime Series
4 participants