Support `args=` in `Series.apply` #9982
@brandon-b-miller I've been looking at this code a bit recently and we discussed some of this refactoring, so feel free to explicitly request me whenever you convert this from a draft.
I think this is getting there - cc @vyasr
rerun tests
One question (out of scope for this PR). Is it possible to enable users passing in kernels that they've already compiled, or is that too difficult? And if it is possible, should we at least throw a more friendly error if a user tries that? I remember testing this once and being surprised by the behavior.
Co-authored-by: Vyas Ramasubramani <[email protected]>
Codecov Report
@@ Coverage Diff @@
## branch-22.04 #9982 +/- ##
================================================
+ Coverage 10.37% 10.41% +0.03%
================================================
Files 119 122 +3
Lines 20149 20629 +480
================================================
+ Hits 2091 2148 +57
- Misses 18058 18481 +423
I have a couple of small questions, but I think this looks mostly good from my end. Happy to move this forward once other reviewers are happy.
offsets.append(col.offset)
launch_args += offsets
launch_args += list(args)
kernel.forall(len(self))(*launch_args)
If we always generate a kernel with `len(self)` tasks, do we really need to pass `len(self)` as one of the `launch_args`? AFAICT that's just used to avoid out-of-bounds accesses, but it looks like we always launch a grid with one thread per row, right?
I had taken the numba docs for `forall` to mean that inserting this check was a requirement for kernels that expect to be configured this way. Indeed, taking it out leads to numerous tests failing due to nulls being in the wrong places everywhere. My assumption was that something was happening inside `forall` that caused unpredictable behavior if this guard was not included. cc @gmarkall for more insight.
I guess maybe numba must be doing something where it generates only a limited set of kernels (maybe templates?) based on the block size and then dispatches to the closest size possible based on the argument to `forall`? I would be curious to learn more about how this works from @gmarkall. It sounds like you should ignore my suggestion to change anything here though.
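One way to see why the guard can be needed even with `forall(len(self))`: as far as I understand Numba's behavior, `forall` picks a block size itself and rounds the grid up to a whole number of blocks, so the launch can contain more threads than rows. The sketch below is plain Python that only simulates that arithmetic, with no GPU involved. The block size of 128 and the names `launch_shape` and `run_guarded` are assumptions for illustration, not values or functions taken from Numba or cuDF.

```python
import math


def launch_shape(ntasks, block_size):
    """Return (blocks, total_threads) for a 1D launch that covers ntasks."""
    blocks = math.ceil(ntasks / block_size)
    return blocks, blocks * block_size


def run_guarded(data, block_size):
    """Simulate a 1D kernel: every launched thread runs, the guard skips extras."""
    size = len(data)
    out = [None] * size
    _, total_threads = launch_shape(size, block_size)
    skipped = 0
    for i in range(total_threads):  # i plays the role of cuda.grid(1)
        if i < size:  # the bounds guard that `size` in launch_args enables
            out[i] = data[i] + 1
        else:
            skipped += 1  # without the guard this thread would index past the end
    return out, skipped


blocks, threads = launch_shape(1000, 128)
result, skipped = run_guarded(list(range(1000)), 128)
```

With 1000 rows and a hypothetical block size of 128, the launch covers 8 blocks, and the threads past row 999 are exactly the ones the guard has to turn away.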
Stellar work. With the current function names, the code reads like prose.
Co-authored-by: Michael Wang <[email protected]>
@gpucibot merge
Closes #9598
A lot of code was moved around but also slightly tweaked, making the diff a little harder to parse. Here's a summary of the changes:
- `Series.apply` used to simply turn the incoming scalar lambda function into a row UDF and then turn itself into a dataframe and run the code as normal. Now, it does its own separate unique processing and pipes through `Frame._apply` instead.
- `pipeline.py` was separated out into `row_function.py` and `lambda_function.py` which contain whatever is specific to each type of UDF, whereas everything that was common to both was migrated to `utils.py` and generalized as much as possible.
- `templates.py` area was created to hold all the templates and initializers needed to cat together the kernel that we need and a new template specific to series lambdas was created.
- `compile_or_get` and this function now expects a python function object it can call that will produce the right kernel. `DataFrame` and `Series` decide which one to use at the top level API.
- `_apply` from `Frame` to `IndexedFrame`