BUG: agg() function on groupby dataframe changes dtype of datetime64[ns] column to float64 #12992
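A minimal sketch of the behaviour the title describes, for readers who want to check their own pandas version. The frame and its values are made up for illustration, and the exact call that triggered the original report is not shown in this thread, so `agg({"time": "min"})` is an assumed stand-in. On a pandas build that includes a fix, the aggregated column should keep the input's datetime dtype rather than become `float64`.

```python
import pandas as pd

# Hypothetical data: a grouping column plus a datetime64 column.
df = pd.DataFrame({
    "class": ["A", "A", "B", "B"],
    "time": pd.to_datetime(
        ["2012-05-01", "2012-05-03", "2012-06-01", "2012-06-02"]
    ),
})

# Aggregate the datetime column per group (assumed form of the call).
result = df.groupby("class").agg({"time": "min"})

# The aggregated column should have the same dtype as the input column.
dtype_preserved = result["time"].dtype == df["time"].dtype
print(result["time"].dtype, dtype_preserved)
```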
can you rebase / update
Sorry, @jreback. This month I was far busier than expected; I'm planning to fix the bug next Sunday.
40922cb to 1236422
Current coverage is 85.30% (diff: 100%)
@@             master   #12992   diff @@
==========================================
  Files           139      139
  Lines         50138    50140     +2
  Methods           0        0
  Messages          0        0
  Branches          0        0
==========================================
+ Hits          42769    42770     +1
- Misses         7369     7370     +1
  Partials          0        0
can you rebase / update?
@@ -139,3 +139,5 @@ Bug Fixes
- Bug in ``NaT`` - ``Period`` raises ``AttributeError`` (:issue:`13071`)
- Bug in ``Period`` addition raises ``TypeError`` if ``Period`` is on right hand side (:issue:`13069`)
- Bug in ``pd.set_eng_float_format()`` that would prevent NaN's from formatting (:issue:`11981`)

- Bug in ``agg()`` function on groupby dataframe changes dtype of ``datetime64[ns]`` column to ``float64`` (:issue:`12821`)
Bug in .groupby(..).agg(..)
could change the dtype of ......
does this also address #12941? pls rebase / small update as well. ping on green.
Hi, I have fixed #12941 and rebased again.
pd.Timestamp('2012-05-01')]})

res = df.min()
tm.assert_attr_equal('dtype', df['foo'], res)
construct an expected result and use assert_series_equal
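A sketch of what that suggestion could look like, under some assumptions: the second timestamp is made up (the original frame is only partly visible in the hunk), and the public `pandas.testing` module is used here in place of the `tm` helper from the PR. Comparing against a full expected Series checks values, index and dtype in one assertion, instead of the dtype alone.

```python
import pandas as pd
import pandas.testing as pdt

# Hypothetical frame; only the first timestamp appears in the PR hunk.
df = pd.DataFrame({
    "foo": [pd.Timestamp("2012-05-01"), pd.Timestamp("2012-07-03")],
})

res = df.min()

# Build the expected result explicitly and compare the whole Series.
exp = pd.Series([pd.Timestamp("2012-05-01")], index=["foo"])
pdt.assert_series_equal(res, exp)
```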
@@ -318,3 +319,7 @@ Bug Fixes


- Bug in ``Categorical.remove_unused_categories()`` changes ``.codes`` dtype to platform int (:issue:`13261`)

- Bug in ``agg()`` function on groupby dataframe changes dtype of ``datetime64[ns]`` column to ``float64`` (:issue:`12821`)
Bug in ``.agg()`` on a groupby, changes the dtype of .....
ok looks good. pls rebase / squash (and build failed).
rebased again.
can you rebase and i'll review
can you rebase. ping when green.
@jreback, hi, agg function will eat up Name, is it a bug?
not sure, can u show a complete example
In [12]: df = pd.Series(range(5)).to_frame()

In [13]: df
Out[13]:
   0
0  0
1  1
2  2
3  3
4  4

In [14]: df.min()
Out[14]:
0    0
dtype: int64

In [15]: df.loc[0]
Out[15]:
0    0
Name: 0, dtype: int64
Giving a slightly clearer example. I suppose this could be done, though it's only for a very limited number of reducers, prob just min/max. For example, anything that does an aggregation like
@jreback Hi, I'll submit an issue later. All checks have passed, so do you think the PR is ok?
# test for `first` function
exp = df.loc[[0, 3, 4, 6]].set_index('class')

res = df.groupby('class').first()
need to test
df.groupby('class').time.first()
and so on (e.g. repeat with .agg
both forms as well)
if making a loop helps to simplify testing, pls do so
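One way the looped test could be sketched. The frame, its values and the reducer list are assumptions for illustration, not the PR's actual test data; `grouped["time"]` is used instead of attribute access, and `pandas.testing` stands in for `tm`. For each reducer it compares the direct method call with both `.agg` forms and checks that the datetime dtype survives.

```python
import pandas as pd
import pandas.testing as pdt

# Hypothetical data: a grouping column plus a datetime64 column.
df = pd.DataFrame({
    "class": ["A", "A", "B", "C"],
    "time": pd.to_datetime(
        ["2012-05-01", "2012-05-02", "2012-06-01", "2012-07-01"]
    ),
})
grouped = df.groupby("class")

checked = []
for func in ["first", "last", "min", "max"]:  # assumed reducer list
    # Direct method on the selected column, e.g. grouped["time"].first()
    direct = getattr(grouped["time"], func)()
    # Same reduction through .agg on the column and on the whole groupby
    via_series_agg = grouped["time"].agg(func)
    via_frame_agg = grouped.agg({"time": func})["time"]

    # The datetime dtype must be preserved, and all three forms must agree.
    assert direct.dtype == df["time"].dtype
    pdt.assert_series_equal(direct, via_series_agg)
    pdt.assert_series_equal(direct, via_frame_agg)
    checked.append(func)
```

The loop keeps the three call forms side by side, so adding another reducer is a one-word change.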
@jreback all checks have passed.
exp = pd.Series([pd.Timestamp('2012-05-01')], index=["foo"])
tm.assert_series_equal(res, exp)

# GH12941
put a small comment here (1-line) that explains what you are testing
lgtm. pls rebase and ping on green.
@@ -854,4 +854,9 @@ Bug Fixes
- Bugs in ``Index.difference`` and ``DataFrame.join`` raise in Python3 when using mixed-integer indexes (:issue:`13432`, :issue:`12814`)

- Bug in ``.to_excel()`` when DataFrame contains a MultiIndex which contains a label with a NaN value (:issue:`13511`)

- Bug in ``pd.read_csv`` in Python 2.x with non-UTF8 encoded, multi-character separated data (:issue:`3404`)
put bug fixes in blank spaces (or create them) in-between other fixes to avoid conflicts (IOW, don't put them at the end)
…f datetime64[ns] column to float64
thanks!
Ok
closes #12821
closes #12941