GroupBy.count() returns the grouping column as both index and column #5610
Comments
hmm.... these look right? are first/last just different?
IMO first should do the same as g.nth(0) and last the same as g.nth(-1), since, as mentioned in the larger PR, they are not aggregations (I think breaking these is on the roadmap for 0.14?). First and last are implemented as aggs atm. The original issue is that count includes name, which I also don't think it should. Will have a look at this, may be a simple fix. Related to cumsum etc. including the grouped-by columns (so there may be a generic fix in agg).
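For reference, a minimal sketch of the first/nth(0) and last/nth(-1) comparison, using a made-up frame (the column names A and B and the data are assumptions, not taken from the issue). It compares only the selected values, because the index returned by nth has differed between pandas versions.

```python
import pandas as pd

# Hypothetical data: two groups, no missing values.
df = pd.DataFrame({"A": ["a", "a", "b", "b"], "B": [1, 2, 3, 4]})
g = df.groupby("A")

# Values picked by the positional selectors.
nth_first = g.nth(0)["B"].tolist()
nth_last = g.nth(-1)["B"].tolist()

# Values picked by first/last.
agg_first = g.first()["B"].tolist()
agg_last = g.last()["B"].tolist()

print(nth_first, agg_first)
print(nth_last, agg_last)
```

Note that first and last skip missing values by default while nth does not, so the two can differ when a group starts or ends with NaN.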
hmm...though you already redefined first/last to be nth's...oh well...that sounds right to me
duh, count's applying when it should be agg-ing. PR shortly.
@hayd you are doing a PR for this one? or is it already out there but missing a reference?
apparently I did on my machine, rebased and pushed will see if it passes... mañana
gr8!
These seem completely wrong (I haven't changed anything yet to exclude the 'A' column)
What is wrong with these?
they are not grouping (indices should be 1, 3)
but these are non-aggregating functions?
hmm...you are right, then these should raise (rather than silently 'work').
Is fillna in the whitelist? I think shift is correct here though, it is shifting within the groups.
Why should they raise? Eg The use of
I guess these do effectively a transform by default (which is ok)
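A small sketch of what "shifting within the groups" and "effectively a transform" look like in practice. The frame below is made up for illustration (columns A and B are assumptions): the result keeps the original row index rather than collapsing to one row per group, and each group starts with a missing value.

```python
import pandas as pd

# Hypothetical data: two groups of two rows each.
df = pd.DataFrame({"A": ["a", "a", "b", "b"], "B": [1.0, 2.0, 3.0, 4.0]})

# Shift B by one position, independently inside each group of A.
shifted = df.groupby("A")["B"].shift()

# The result is aligned with the original rows, like a transform.
same_index = shifted.index.equals(df.index)

print(shifted)
print(same_index)
```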
How could I count the number of 'a' under the column name? I only need this number. What command should I use?
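One possible way to get that single number, sketched on a made-up frame. It assumes the column is literally called name and holds plain strings; both are guesses, since the question does not show the data.

```python
import pandas as pd

# Hypothetical data; the real column contents are not shown in the thread.
df = pd.DataFrame({"name": ["a", "b", "a", "c", "a"]})

# Compare each entry to 'a' and sum the resulting booleans.
n_a = int((df["name"] == "a").sum())

# Alternative: count every distinct value, then look up 'a'.
counts = df["name"].value_counts()

print(n_a)
print(counts["a"])
```

The boolean sum is enough when only one value matters; value_counts is handier when the counts for several values are needed.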
GroupBy.count() (with the default as_index=True) returns the grouping column both as index and as column, while other methods such as first and sum keep it only as the index (which is most logical, I think). This seems a minor inconsistency to me.