REF: use fused types for groupby_helper #28886

Merged: 5 commits, Oct 11, 2019

Conversation

jbrockmendel
Member

There will be some nice cleanups we can do after we bump Cython to 0.30 (which hasn't come out yet).

Also I think there is some na-checking code that we can share between the various fused-types functions after this.
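To make the idea of shared NA-checking concrete, here is a minimal sketch in plain Python rather than Cython. The helper name `is_missing` and its `is_datetimelike` flag are hypothetical and do not come from this PR; the only assumption carried over from pandas is that missing datetime-like values are stored as the minimum int64 value.

```python
import numpy as np

# Sentinel for missing datetime-like values stored as int64.
NPY_NAT = np.iinfo(np.int64).min


def is_missing(val, is_datetimelike=False):
    """Hypothetical shared NA check; not the helper used in pandas."""
    if isinstance(val, (int, np.integer)):
        # Plain integers have no NA value; only datetime-like int64
        # data treats the sentinel as missing.
        return bool(is_datetimelike and val == NPY_NAT)
    if isinstance(val, (float, np.floating)):
        # NaN is the only float that is not equal to itself.
        return bool(val != val)
    # Object data: treat None as missing in this sketch.
    return val is None
```

A single helper like this could be called from each dtype-specialized function instead of repeating the check in every template branch.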

Contributor

@jreback left a comment

lgtm. whitespace comment.

pandas/_libs/groupby_helper.pxi.in
@@ -548,9 +600,10 @@ def group_cummin(groupby_t[:, :] out,
def group_cummax(groupby_t[:, :] out,
groupby_t[:, :] values,
const int64_t[:] labels,
int ngroups,
int ngroups,
Contributor

weird indenting?

Member Author

Looks like it was tabs, just fixed

@jreback added the Dtype Conversions (Unexpected or buggy dtype conversions) and Groupby labels Oct 10, 2019
@jreback added this to the 1.0 milestone Oct 10, 2019
Member

@WillAyd left a comment

OK sounds good. I'd still prefer to keep as is and just add the uint template generation as required until 0.30 gets released, but am not strongly enough opposed to block if others like this

group_rank_float64 = group_rank["float64_t"]
group_rank_float32 = group_rank["float32_t"]
group_rank_int64 = group_rank["int64_t"]
# Note: we do not have a group_rank_object because that would require a
Member

FWIW I don't think we really even want this #19560

Member Author

OK. The comment seemed appropriate since we are using the same fused type. Could add a reference to 19560 in the comment?

for i in range(ncounts):
    for j in range(K):
        if nobs[i, j] == 0:
            if rank_t is int64_t:
Member

Do you see any harm in putting this condition in the object block as well? Not sure if this is covered by tests but could see someone mistakenly assuming that the gil and nogil blocks are identical when 0.30 gets released and missing this on refactor

Member Author

I like this idea, will do
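For readers following the thread, the sketch below shows in plain Python/NumPy what a dtype-dependent fill for empty groups could look like. It is not the PR's Cython code. The body of the `rank_t is int64_t` branch is not shown in the snippet above, so the choice of sentinels here (minimum int64 for integer output, NaN otherwise) is an assumption, and `fill_empty_groups` is a hypothetical name.

```python
import numpy as np

# Assumed sentinel for "missing" in int64 output (the value pandas uses for NaT).
NPY_NAT = np.iinfo(np.int64).min


def fill_empty_groups(out, nobs):
    """Hypothetical helper: mark cells with zero observations as missing.

    Modifies `out` in place and returns it.
    """
    ncounts, K = nobs.shape
    for i in range(ncounts):
        for j in range(K):
            if nobs[i, j] == 0:
                # The same dtype check runs for every dtype, so an object
                # path and a numeric path would stay structurally identical.
                if out.dtype == np.int64:
                    out[i, j] = NPY_NAT
                else:
                    out[i, j] = np.nan
    return out
```

Keeping the check in both paths, as suggested in the review, means a later refactor that merges the gil and nogil blocks would not silently drop the integer case.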

@jbrockmendel
Member Author

> I'd still prefer to keep as is and just add the uint template generation as required until 0.30 gets released

Since we're probably going to wait for a while after 0.30 is released before bumping ourselves, that is longer than I'd prefer.

As for using uint template generation in the interim, I did that before deciding that I am much more likely to screw up using tempita than I am using fused types.

Member

@WillAyd left a comment

lgtm @jreback

@jreback merged commit 3954fa7 into pandas-dev:master Oct 11, 2019
@jreback
Contributor

jreback commented Oct 11, 2019

thanks @jbrockmendel

we cannot wait for deps, who knows when they land

@jbrockmendel deleted the gbcy branch October 11, 2019 15:25
proost pushed a commit to proost/pandas that referenced this pull request Dec 19, 2019
bongolegend pushed a commit to bongolegend/pandas that referenced this pull request Jan 1, 2020