-
Notifications
You must be signed in to change notification settings - Fork 244
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add limited support for captured vars and athrow #5487
Conversation
Only the primitive type variables captured from a method are supported. athrow is supported only with SparkException and is converted to raise_error. Following additional changes have also been made: * A check to reject lambdas with void return type has been added. * An operand stack bug fix for consturctor calls has been added. This commit also simplifies the code that handles method calls. Signed-off-by: Sean Lee <[email protected]>
I have created it as a draft PR because we need to decide what to do about |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass thinking about the type of catalyst exception we should convert the exception to.
// Empty the stack and convert the internal representation of | ||
// org.apache.spark.SparkException object to RaiseError, then push it to the | ||
// stack. | ||
State(locals, List(RaiseError(top.asInstanceOf[Repr.SparkExcept].message)), cond, expr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe one approach here would be to code up a new catalyst node RaiseErrorV2
or RaiseSpecificError
or something like this, where it has the CPU eval
and doGenCode
, as it would be very much like RaiseError
, but it has an extra parameter which is the specific type of exception to throw. It would then instantiate this exception and throw the specific type.
For the gpu, we could replace this new node with the GPU specific type, and for the CPU it would work the way the caller intended. @jlowe @revans2 for some 👀
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After discussing with @revans2 we were thinking it's OK if the exceptions are not the same for now. Could we make a note of this in: https://github.com/NVIDIA/spark-rapids/blob/branch-22.06/docs/additional-functionality/udf-to-catalyst-expressions.md?
We may need to revisit this if someone needs SparkException
thrown in the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated udf-to-catalyst-expressions.md
udf-compiler/src/main/scala/com/nvidia/spark/udf/CatalystExpressionBuilder.scala
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@seanprime7 ok to take this PR off draft mode? |
build |
took this PR off draft mode. |
Thanks @seanprime7 |
* Add limited support for captured vars and athrow Only the primitive type variables captured from a method are supported. athrow is supported only with SparkException and is converted to raise_error. Following additional changes have also been made: * A check to reject lambdas with void return type has been added. * An operand stack bug fix for consturctor calls has been added. This commit also simplifies the code that handles method calls. Signed-off-by: Sean Lee <[email protected]> * Update copyright years * Catch RuntimeException instead of Throwable * Update udf-to-catalyst doc
Only the primitive type variables captured from a method are supported.
athrow is supported only with SparkException and is converted to raise_error.
Following additional changes have also been made:
This commit also simplifies the code that handles method calls.
Signed-off-by: Sean Lee [email protected]