Experimental support for BloomFilterAggregate expression in a reduction context [databricks] #8892
Conversation
Signed-off-by: Jason Lowe <[email protected]>
I missed a couple of things last time that I am mostly curious about. The latest changes in the integration tests LGTM.
tests/src/test/spark330/scala/com/nvidia/spark/rapids/BloomFilterAggregateQuerySuite.scala
Converting to draft, as it needs the fixes from #8944 and an upmerge to pick up the new tests there.
Generally looks good.
(ReductionAggExprContext,
  ContextChecks(TypeSig.BINARY, TypeSig.BINARY,
    Seq(ParamCheck("child", TypeSig.LONG, TypeSig.LONG),
      ParamCheck("estimatedItems", TypeSig.lit(TypeEnum.LONG), TypeSig.LONG),
nit: Technically Spark also checks that estimatedItems and estimatedBits are literals (actually foldable), > 0, and <= the configured maximum. So we could mark those as lit too, and then the docs would show that we fully support it instead of adding a comment that we do not.
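To make the checks concrete, here is a minimal Python sketch of the validation described above: the estimated-items argument must be a literal long value, strictly positive, and no larger than the configured maximum. The function name and error messages are hypothetical; this is not Spark's or the plugin's actual code.

```python
def check_bloom_filter_arg(estimated_items, max_allowed_items):
    """Hypothetical mirror of the checks Spark applies to BloomFilterAggregate
    arguments: the value must be a literal (foldable) long, > 0, and <= the
    configured maximum."""
    # In Spark this would be a foldability check on the expression tree;
    # here we simply require a plain integer literal.
    if not isinstance(estimated_items, int) or isinstance(estimated_items, bool):
        raise TypeError("estimatedItems must be a literal long value")
    if estimated_items <= 0:
        raise ValueError("estimatedItems must be positive")
    if estimated_items > max_allowed_items:
        raise ValueError(
            f"estimatedItems must be <= configured maximum {max_allowed_items}")
    return estimated_items
```

Marking the parameter as `TypeSig.lit(TypeEnum.LONG)` in the plugin would encode the literal requirement directly in the generated support docs rather than in a side comment.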
Relates to #7803. Depends on #8775. Closes #8955.
Implements GPU support for BloomFilterAggregate in a reduction context. This is used by Bloom-filter-optimized joins, which are available in Spark 3.3.0 and enabled by default in Spark 3.4.0.
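For readers unfamiliar with the aggregate, the sketch below shows conceptually what a Bloom filter aggregation computes: a reduction over a column of long values into a compact bitset that supports approximate membership queries. This is an illustrative toy (the class, bit counts, and hashing scheme are invented here), not the serialized format Spark or the plugin produces.

```python
import hashlib

class BloomFilter:
    """Toy Bloom filter: k hash functions setting bits in a fixed-size bitset."""
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions from salted hashes of the item.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def put(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # May return a false positive, but never a false negative.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

def bloom_filter_aggregate(values, num_bits=1024):
    """Reduce a sequence of values into a single Bloom filter, which is what
    BloomFilterAggregate does over a column in a reduction context."""
    bf = BloomFilter(num_bits=num_bits)
    for v in values:
        bf.put(v)
    return bf
```

In a Bloom-filter-optimized join, the optimizer builds such a filter over the smaller side's join keys and uses `might_contain`-style checks to prune rows from the larger side before the join executes.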