Support more numeric types in Groupby.apply with engine='jit' #13729
I ran into some issues compiling the shim library with
This means that to support shorter ints, we will need to do somewhat more involved development on the C++ side, which I think should be separated out into another PR. I'm updating the PR title to reflect that this PR itself only goes as far as rounding us out with
raised #13736
I'm not a huge fan of repeating the typecasting logic twice in Python and once in C++... we'll need to find a better architecture for this before we get too deep into adding more types.
_register_cuda_idx_reduction_caller("IdxMax", ty)
_register_cuda_idx_reduction_caller("IdxMin", ty)

_register_cuda_reduction_caller("Sum", types.int32, types.int64)
It feels like the typecasting rules were hard-coded twice: once here, and once in GroupSum. Is there a way to reuse a single mapping / type promotion function?
Yeah, this is a good point. There should only be one type mapping. Let me see what I can do here.
In the latest commit, I've refactored towards an attribute-generating function that checks the existing registry of CUDA functions for a match and returns the corresponding signature. I think this makes it the only place in Python where the signatures are written by hand. Let me know what you think.
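The registry-lookup idea described above can be sketched roughly as follows. This is a simplified illustration and not the code from the PR: `REGISTRY`, `register_reduction`, and `resolve_signature` are hypothetical names, type names are plain strings so the sketch runs without Numba or a GPU, and every entry other than the `Sum` int32 to int64 registration quoted in the diff is an assumption.

```python
# Hypothetical registry: reduction name -> {input dtype name: return dtype name}.
# Plain strings stand in for Numba types so this runs without Numba or a GPU.
REGISTRY = {}


def register_reduction(name, input_type, return_type):
    """Record one hand-written signature; the only place types are spelled out."""
    REGISTRY.setdefault(name.lower(), {})[input_type] = return_type


def resolve_signature(name, input_type):
    """Look up a reduction in the registry instead of re-encoding promotion rules.

    Returns a (return_type, input_type) pair, or None if no overload matches.
    """
    overloads = REGISTRY.get(name.lower(), {})
    if input_type not in overloads:
        return None
    return (overloads[input_type], input_type)


# Mirrors the registration shown in the diff: Sum over int32 returns int64.
register_reduction("Sum", "int32", "int64")
# The entries below are assumed for illustration only.
register_reduction("Sum", "int64", "int64")
register_reduction("Sum", "float64", "float64")

print(resolve_signature("sum", "int32"))
print(resolve_signature("sum", "int8"))
```

With this shape, the typing side never restates the promotion rule; it only asks the registry whether a matching overload exists, and an unsupported input type simply resolves to no signature.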
@bdice thanks for your review. I've thought a bit about having one centralized place for all of the function signatures that could be read directly from C++ or Python. Since the shim function library is built before the Python is imported, I suppose it should be the source of truth as well. This leads me towards some kind of solution that inspects the shim file at import and generates signatures dynamically from it. @gmarkall can
Much better to only define these once in Python. I'm okay with redefining in C++ and Python if that is needed. It is probably difficult to do this dynamically. I have one question, otherwise approving.
/merge
This PR adds additional numeric dtypes to GroupBy.apply with engine='jit'.