Enable build for Databricks 13.3 [databricks] #9677
Conversation
…replace_table_as_select
Follow the approach from #9508 to reduce bloat in the POMs.
Signed-off-by: Raza Jafri <[email protected]>
build
The latest failure is in the fastparquet compatibility test, which I could not reproduce on a Databricks 13.3 instance. Kicking the build again to see if it's reproducible.
build
I'm now able to reproduce the fastparquet failures, and it appears to be an issue with the fastparquet setup on Databricks 13.3: it's reading NaNs as nulls, whereas the GPU is reading NaNs as NaNs. I'm not sure yet why we're getting different fastparquet behavior in the DB 13.3 environment with an explicit install of fastparquet vs. what we get on the other Databricks environments.
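As a first step toward explaining the environment difference, something like the following could be run on each Databricks runtime (a minimal sketch, not part of this PR, assuming fastparquet, numpy, and pandas are all importable on the cluster) to compare the explicitly installed fastparquet against what the other environments provide:

# Hypothetical diagnostic, not from this PR: print the versions of fastparquet
# and the numeric stack it sits on, so the DB 13.3 environment can be compared
# against the other Databricks environments before digging into data differences.
import fastparquet
import numpy
import pandas

print("fastparquet:", fastparquet.__version__)
print("numpy:", numpy.__version__)
print("pandas:", pandas.__version__)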
build
build
build
build
341db failed deltalake cases.
build
build
pytest.param(FloatGen(nullable=False),
             marks=pytest.mark.xfail(is_databricks_runtime(),
                                     reason="https://github.com/NVIDIA/spark-rapids/issues/9778")),
I was thinking of including the following:
pytest.param(FloatGen(nullable=False),
             marks=pytest.mark.xfail(is_databricks_runtime(),
                                     reason="https://github.com/NVIDIA/spark-rapids/issues/9778")),
FloatGen(nullable=False, no_nans=True),
Not strictly in the purview of this change. I can add this as a follow-on.
LGTM, barring that single (optional) suggestion. Thanks for disabling the float-double tests.
build
build
Thanks! Also cc @NvTimLiu to help set up the nightly build later, thanks.
Thanks for merging, @pxLi. I built three times to make sure CI would not be flaky with heap/GC OOM or other problems, and it passed three times in a row. So we should be good with this enabled for premerge and nightly.
This PR builds on previous PRs to add Databricks 13.3 support to the spark-rapids plugin. Specifically, it adds the POM changes needed to build the plugin against Databricks 13.3.
Changes Made:
POM changes: all modules have been updated with a profile for 341db support
XFAIL failing tests: failing tests were marked with the pytest xfail marker, which should be removed once support for them is added
PythonUDAF: added support for PythonUDAF, similar to Spark 3.5 (a hedged sketch of this kind of Python aggregation follows at the end of this description)
Tests:
All the tests were updated
This PR is in draft mode because it should be merged only after #9644 is merged.
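For context on the PythonUDAF item above, here is a hedged sketch (not code from this PR; the column names and the mean aggregation are made up for illustration) of a grouped-aggregate pandas UDF, the kind of Python aggregation that the Spark 3.5-style PythonUDAF handling is presumably meant to cover on Databricks 13.3:

import pandas as pd
from pyspark.sql import SparkSession
from pyspark.sql.functions import pandas_udf

spark = SparkSession.builder.getOrCreate()

# Grouped-aggregate pandas UDF: it receives all values of a group as a pandas
# Series and returns a single scalar, so Spark treats it as a Python aggregate.
@pandas_udf("double")
def mean_udf(v: pd.Series) -> float:
    return float(v.mean())

df = spark.createDataFrame([(1, 1.0), (1, 2.0), (2, 3.0)], ["id", "v"])
df.groupBy("id").agg(mean_udf("v")).show()

Such an aggregation only runs on the GPU when the plugin recognizes the corresponding expression; otherwise it falls back to the CPU, which is presumably what the PythonUDAF support added here is meant to avoid.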