Support bucketing write for GPU #10957

firestarman · 2024-06-02T13:50:57Z

close #22

This PR adds GPU support for bucketed writes.

- Refactor the dynamic partition single writer and concurrent writer to reuse as much code as possible, then add the bucketing write logic to both of them.
- Update the bucket check done during plan overriding for the write commands, including InsertIntoHadoopFsRelationCommand, CreateDataSourceTableAsSelectCommand, InsertIntoHiveTable, and CreateHiveTableAsSelectCommand.
- From 330 on, Spark also supports HiveHash to generate the bucket IDs, in addition to Murmur3Hash, so the shim object GpuBucketingUtils is introduced to handle the version-specific differences.
- Add two functions (tagForHiveBucketingWrite and tagForBucketing) that do the overriding check for the two hashing functions separately. The Hive write nodes fall back to the CPU when HiveHash is chosen, because HiveHash is not supported on the GPU.
- Add basic tests for this new feature.
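For readers unfamiliar with how a bucket ID is derived from the two hash functions named above, here is a minimal sketch in plain Python. It is not code from this PR and does not touch the plugin. It assumes 32-bit integer bucket columns only, and it models the formulas as I understand them from Spark: `pmod(Murmur3Hash(cols), numBuckets)` with seed 42 for Spark-style bucketing, and `(HiveHash(cols) & Int.MaxValue) % numBuckets` for Hive-compatible bucketing. All function names below are hypothetical.

```python
# Illustrative only: models bucket-ID formulas for 32-bit integer columns.
# Assumption: formulas follow Spark's Murmur3Hash/HiveHash as described in the lead-in.

MASK32 = 0xFFFFFFFF
INT_MAX = 0x7FFFFFFF


def _to_signed(x):
    """Interpret the low 32 bits of x as a signed (Java-style) int."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def _rotl(x, r):
    """Rotate a 32-bit value left by r bits."""
    x &= MASK32
    return ((x << r) | (x >> (32 - r))) & MASK32


def murmur3_hash_int(value, seed):
    """Murmur3 x86_32 of one 32-bit int, returned as a signed int."""
    k1 = ((value & MASK32) * 0xCC9E2D51) & MASK32
    k1 = _rotl(k1, 15)
    k1 = (k1 * 0x1B873593) & MASK32
    h1 = (seed & MASK32) ^ k1
    h1 = _rotl(h1, 13)
    h1 = (h1 * 5 + 0xE6546B64) & MASK32
    h1 ^= 4  # input length in bytes for a single int
    h1 ^= h1 >> 16
    h1 = (h1 * 0x85EBCA6B) & MASK32
    h1 ^= h1 >> 13
    h1 = (h1 * 0xC2B2AE35) & MASK32
    h1 ^= h1 >> 16
    return _to_signed(h1)


def murmur3_bucket_id(columns, num_buckets, seed=42):
    """pmod(murmur3(columns), num_buckets); each column's hash seeds the next."""
    h = seed
    for c in columns:
        h = murmur3_hash_int(c, h)
    # Python's % is non-negative for a positive divisor, which matches pmod.
    return h % num_buckets


def hive_bucket_id(columns, num_buckets):
    """(hive_hash(columns) & Int.MaxValue) % num_buckets."""
    h = 0
    for c in columns:
        # HiveHash of an int is the int itself; columns combine as 31 * h + c.
        h = _to_signed(31 * h + c)
    return (h & INT_MAX) % num_buckets
```

For example, `hive_bucket_id([5], 4)` returns `1`, because the Hive hash of the integer 5 is 5. The two schemes generally place the same row in different buckets, which is why the override check has to treat them separately.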

Signed-off-by: Firestarman <[email protected]>


firestarman · 2024-06-02T14:06:31Z

Marking this as a draft for early review and to run the tests on Databricks via CI.

firestarman · 2024-06-02T14:06:37Z

build


firestarman · 2024-06-03T02:58:24Z

build


firestarman · 2024-06-03T03:45:21Z

build


firestarman · 2024-06-07T06:38:13Z

build

firestarman · 2024-06-11T05:07:50Z

build

revans2

I honestly didn't make it through the entire patch. It is very large. I'll try to find time to finish it soon.

What performance and scale testing have we done with this?

integration_tests/src/main/python/orc_write_test.py

integration_tests/src/main/python/parquet_write_test.py

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala

sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuFileFormatDataWriter.scala

firestarman · 2024-06-12T01:34:53Z

> I honestly didn't make it through the entire patch. It is very large. I'll try to find time to finish it soon.

Yeah, it is big. Thanks a lot for the review.

> What performance and scale testing have we done with this?

Not yet, since I thought it was not necessary for feature PRs. But I am happy to run NDS with this if you prefer.


firestarman · 2024-06-12T09:25:48Z

build

revans2 · 2024-06-12T13:53:06Z

> Not yet, since I thought it was not necessary for feature PRs. But I am happy to run NDS with this if you prefer.

I am not concerned much about NDS. We are not doing any bucketed writes/reads in NDS. If you think we should change that we can, but it would take some analysis to see how we wanted to do that. I am more concerned about how our performance compares to the CPU for similar situations. Especially for writes. We already know that reads should be much faster because we already are really good at joins and this should reduce the shuffle ahead of a join.
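To make the read-side argument concrete, here is a small sketch in plain Python of why co-bucketed tables can be joined without a shuffle. It is a toy model, not Spark or plugin code. It assumes integer keys and uses `key % num_buckets` as a stand-in for the real bucket hash. All names are hypothetical.

```python
from collections import defaultdict


def bucketize(rows, num_buckets):
    """Group (key, value) rows by a stand-in bucket function: key % num_buckets."""
    buckets = defaultdict(list)
    for key, value in rows:
        buckets[key % num_buckets].append((key, value))
    return buckets


def bucketed_join(left, right, num_buckets):
    """Inner-join two tables bucket by bucket, never comparing rows across buckets."""
    left_buckets = bucketize(left, num_buckets)
    right_buckets = bucketize(right, num_buckets)
    out = []
    for b in range(num_buckets):
        # Equal keys always hash to the same bucket on both sides,
        # so bucket b of the left table only needs bucket b of the right table.
        index = defaultdict(list)
        for key, value in right_buckets.get(b, []):
            index[key].append(value)
        for key, value in left_buckets.get(b, []):
            out.extend((key, value, rv) for rv in index.get(key, []))
    return out


def naive_join(left, right):
    """Reference inner join that compares every pair of rows."""
    return [(k, lv, rv) for k, lv in left for rk, rv in right if k == rk]
```

With `left = [(1, 'a'), (2, 'b'), (5, 'c')]` and `right = [(1, 'x'), (5, 'y')]`, `bucketed_join(left, right, 4)` produces the same rows as `naive_join(left, right)`. The per-bucket work is independent, which is the property that lets a bucketed read skip the exchange before a join.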

revans2

Still didn't get through everything, but I am getting closer.

integration_tests/src/main/python/parquet_write_test.py

integration_tests/src/main/python/orc_write_test.py

sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala

sql-plugin/src/main/scala/org/apache/spark/sql/rapids/GpuFileFormatDataWriter.scala


firestarman · 2024-06-13T03:28:40Z

build


revans2

I think I have made my way through all of the code now. But this is large enough I don't trust myself and would like at least one other person to review this too.

firestarman · 2024-06-14T00:59:30Z

build

revans2 · 2024-06-14T14:24:18Z

Looks like you had a few failures in databricks.

[2024-06-14T03:03:57.389Z] =================================== FAILURES ===================================
[2024-06-14T03:03:57.389Z] _____________________ test_buckets_write_fallback_for_map ______________________
[2024-06-14T03:03:57.389Z] [gw4] linux -- Python 3.8.10 /usr/bin/python
[2024-06-14T03:03:57.389Z] 
[2024-06-14T03:03:57.389Z] spark_tmp_path = '/tmp/pyspark_tests//0614-010744-asmgss3u-10-59-175-151-gw4-4457-1165034746/'
[2024-06-14T03:03:57.389Z] spark_tmp_table_factory = <conftest.TmpTableFactory object at 0x7f4e66d1d940>
[2024-06-14T03:03:57.389Z] 
[2024-06-14T03:03:57.389Z]     @allow_non_gpu('DataWritingCommandExec,ExecutedCommandExec,WriteFilesExec, SortExec')
[2024-06-14T03:03:57.389Z]     def test_buckets_write_fallback_for_map(spark_tmp_path, spark_tmp_table_factory):
[2024-06-14T03:03:57.389Z]         data_path = spark_tmp_path + '/ORC_DATA'
[2024-06-14T03:03:57.389Z]         gen_list = [["id", simple_string_to_string_map_gen], ["data", long_gen]]
[2024-06-14T03:03:57.389Z] >       assert_gpu_fallback_write(
[2024-06-14T03:03:57.389Z]             lambda spark, path: gen_df(spark, gen_list).selectExpr("id as b_id", "data").write
[2024-06-14T03:03:57.389Z]                 .bucketBy(4, "b_id").format('orc').mode('overwrite').option("path", path)
[2024-06-14T03:03:57.389Z]                 .saveAsTable(spark_tmp_table_factory.get()),
[2024-06-14T03:03:57.389Z]             lambda spark, path: spark.read.orc(path),
[2024-06-14T03:03:57.389Z]             data_path,
[2024-06-14T03:03:57.389Z]             'DataWritingCommandExec',
[2024-06-14T03:03:57.389Z]             conf={'spark.rapids.sql.format.orc.write.enabled': True})
[2024-06-14T03:03:57.389Z] 
[2024-06-14T03:03:57.389Z] ../../src/main/python/orc_write_test.py:237: 
[2024-06-14T03:03:57.389Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2024-06-14T03:03:57.389Z] ../../src/main/python/asserts.py:364: in assert_gpu_fallback_write
[2024-06-14T03:03:57.389Z]     with_cpu_session(lambda spark : write_func(spark, cpu_path), conf=conf)
[2024-06-14T03:03:57.389Z] ../../src/main/python/spark_session.py:147: in with_cpu_session
[2024-06-14T03:03:57.389Z]     return with_spark_session(func, conf=copy)
[2024-06-14T03:03:57.389Z] /usr/lib/python3.8/contextlib.py:75: in inner
[2024-06-14T03:03:57.389Z]     return func(*args, **kwds)
[2024-06-14T03:03:57.389Z] ../../src/main/python/spark_session.py:131: in with_spark_session
[2024-06-14T03:03:57.389Z]     ret = func(_spark)
[2024-06-14T03:03:57.389Z] ../../src/main/python/asserts.py:364: in <lambda>
[2024-06-14T03:03:57.389Z]     with_cpu_session(lambda spark : write_func(spark, cpu_path), conf=conf)
[2024-06-14T03:03:57.389Z] ../../src/main/python/orc_write_test.py:238: in <lambda>
[2024-06-14T03:03:57.389Z]     lambda spark, path: gen_df(spark, gen_list).selectExpr("id as b_id", "data").write
[2024-06-14T03:03:57.390Z] /databricks/spark/python/pyspark/instrumentation_utils.py:48: in wrapper
[2024-06-14T03:03:57.390Z]     res = func(*args, **kwargs)
[2024-06-14T03:03:57.390Z] /databricks/spark/python/pyspark/sql/readwriter.py:1520: in saveAsTable
[2024-06-14T03:03:57.390Z]     self._jwrite.saveAsTable(name)
[2024-06-14T03:03:57.390Z] /databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py:1321: in __call__
[2024-06-14T03:03:57.390Z]     return_value = get_return_value(
[2024-06-14T03:03:57.390Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z] a = ('xro639394', <py4j.clientserver.JavaClient object at 0x7f4eac089ee0>, 'o639393', 'saveAsTable')
[2024-06-14T03:03:57.390Z] kw = {}, converted = AnalysisException()
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z]     def deco(*a: Any, **kw: Any) -> Any:
[2024-06-14T03:03:57.390Z]         try:
[2024-06-14T03:03:57.390Z]             return f(*a, **kw)
[2024-06-14T03:03:57.390Z]         except Py4JJavaError as e:
[2024-06-14T03:03:57.390Z]             converted = convert_exception(e.java_exception)
[2024-06-14T03:03:57.390Z]             if not isinstance(converted, UnknownException):
[2024-06-14T03:03:57.390Z]                 # Hide where the exception came from that shows a non-Pythonic
[2024-06-14T03:03:57.390Z]                 # JVM exception message.
[2024-06-14T03:03:57.390Z] >               raise converted from None
[2024-06-14T03:03:57.390Z] E               pyspark.errors.exceptions.AnalysisException: Invalid call to exprId on unresolved object
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z] /databricks/spark/python/pyspark/errors/exceptions.py:234: AnalysisException
[2024-06-14T03:03:57.390Z] ---------------------------- Captured stderr setup -----------------------------
[2024-06-14T03:03:57.390Z] 2024-06-14 02:27:15 INFO     Running test 'src/main/python/orc_write_test.py::test_buckets_write_fallback_for_map[DATAGEN_SEED=1718328190, TZ=UTC, ALLOW_NON_GPU(DataWritingCommandExec,ExecutedCommandExec,WriteFilesExec, SortExec)]'
[2024-06-14T03:03:57.390Z] ------------------------------ Captured log setup ------------------------------
[2024-06-14T03:03:57.390Z] INFO     __pytest_worker_logger__:spark_init_internal.py:256 Running test 'src/main/python/orc_write_test.py::test_buckets_write_fallback_for_map[DATAGEN_SEED=1718328190, TZ=UTC, ALLOW_NON_GPU(DataWritingCommandExec,ExecutedCommandExec,WriteFilesExec, SortExec)]'
[2024-06-14T03:03:57.390Z] ----------------------------- Captured stdout call -----------------------------
[2024-06-14T03:03:57.390Z] ### CPU RUN ###
[2024-06-14T03:03:57.390Z] _____________________ test_buckets_write_fallback_for_map ______________________
[2024-06-14T03:03:57.390Z] [gw6] linux -- Python 3.8.10 /usr/bin/python
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z] spark_tmp_path = '/tmp/pyspark_tests//0614-010744-asmgss3u-10-59-175-151-gw6-4465-1724103483/'
[2024-06-14T03:03:57.390Z] spark_tmp_table_factory = <conftest.TmpTableFactory object at 0x7fccdbee9ee0>
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z]     @allow_non_gpu('DataWritingCommandExec,ExecutedCommandExec,WriteFilesExec, SortExec')
[2024-06-14T03:03:57.390Z]     def test_buckets_write_fallback_for_map(spark_tmp_path, spark_tmp_table_factory):
[2024-06-14T03:03:57.390Z]         data_path = spark_tmp_path + '/PARQUET_DATA'
[2024-06-14T03:03:57.390Z]         gen_list = [["id", simple_string_to_string_map_gen], ["data", long_gen]]
[2024-06-14T03:03:57.390Z] >       assert_gpu_fallback_write(
[2024-06-14T03:03:57.390Z]             lambda spark, path: gen_df(spark, gen_list).selectExpr("id as b_id", "data").write
[2024-06-14T03:03:57.390Z]                 .bucketBy(4, "b_id").format('parquet').mode('overwrite').option("path", path)
[2024-06-14T03:03:57.390Z]                 .saveAsTable(spark_tmp_table_factory.get()),
[2024-06-14T03:03:57.390Z]             lambda spark, path: spark.read.parquet(path),
[2024-06-14T03:03:57.390Z]             data_path,
[2024-06-14T03:03:57.390Z]             'DataWritingCommandExec',
[2024-06-14T03:03:57.390Z]             conf=writer_confs)
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z] ../../src/main/python/parquet_write_test.py:462: 
[2024-06-14T03:03:57.390Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2024-06-14T03:03:57.390Z] ../../src/main/python/asserts.py:364: in assert_gpu_fallback_write
[2024-06-14T03:03:57.390Z]     with_cpu_session(lambda spark : write_func(spark, cpu_path), conf=conf)
[2024-06-14T03:03:57.390Z] ../../src/main/python/spark_session.py:147: in with_cpu_session
[2024-06-14T03:03:57.390Z]     return with_spark_session(func, conf=copy)
[2024-06-14T03:03:57.390Z] /usr/lib/python3.8/contextlib.py:75: in inner
[2024-06-14T03:03:57.390Z]     return func(*args, **kwds)
[2024-06-14T03:03:57.390Z] ../../src/main/python/spark_session.py:131: in with_spark_session
[2024-06-14T03:03:57.390Z]     ret = func(_spark)
[2024-06-14T03:03:57.390Z] ../../src/main/python/asserts.py:364: in <lambda>
[2024-06-14T03:03:57.390Z]     with_cpu_session(lambda spark : write_func(spark, cpu_path), conf=conf)
[2024-06-14T03:03:57.390Z] ../../src/main/python/parquet_write_test.py:463: in <lambda>
[2024-06-14T03:03:57.390Z]     lambda spark, path: gen_df(spark, gen_list).selectExpr("id as b_id", "data").write
[2024-06-14T03:03:57.390Z] /databricks/spark/python/pyspark/instrumentation_utils.py:48: in wrapper
[2024-06-14T03:03:57.390Z]     res = func(*args, **kwargs)
[2024-06-14T03:03:57.390Z] /databricks/spark/python/pyspark/sql/readwriter.py:1520: in saveAsTable
[2024-06-14T03:03:57.390Z]     self._jwrite.saveAsTable(name)
[2024-06-14T03:03:57.390Z] /databricks/spark/python/lib/py4j-0.10.9.5-src.zip/py4j/java_gateway.py:1321: in __call__
[2024-06-14T03:03:57.390Z]     return_value = get_return_value(
[2024-06-14T03:03:57.390Z] _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z] a = ('xro957695', <py4j.clientserver.JavaClient object at 0x7fccf3a2aee0>, 'o957694', 'saveAsTable')
[2024-06-14T03:03:57.390Z] kw = {}, converted = AnalysisException()
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z]     def deco(*a: Any, **kw: Any) -> Any:
[2024-06-14T03:03:57.390Z]         try:
[2024-06-14T03:03:57.390Z]             return f(*a, **kw)
[2024-06-14T03:03:57.390Z]         except Py4JJavaError as e:
[2024-06-14T03:03:57.390Z]             converted = convert_exception(e.java_exception)
[2024-06-14T03:03:57.390Z]             if not isinstance(converted, UnknownException):
[2024-06-14T03:03:57.390Z]                 # Hide where the exception came from that shows a non-Pythonic
[2024-06-14T03:03:57.390Z]                 # JVM exception message.
[2024-06-14T03:03:57.390Z] >               raise converted from None
[2024-06-14T03:03:57.390Z] E               pyspark.errors.exceptions.AnalysisException: Invalid call to exprId on unresolved object
[2024-06-14T03:03:57.390Z] 
[2024-06-14T03:03:57.390Z] /databricks/spark/python/pyspark/errors/exceptions.py:234: AnalysisException
[2024-06-14T03:03:57.390Z] ---------------------------- Captured stderr setup -----------------------------
...
[2024-06-14T03:03:57.392Z] FAILED ../../src/main/python/orc_write_test.py::test_buckets_write_fallback_for_map[DATAGEN_SEED=1718328190, TZ=UTC, ALLOW_NON_GPU(DataWritingCommandExec,ExecutedCommandExec,WriteFilesExec, SortExec)] - pyspark.errors.exceptions.AnalysisException: Invalid call to exprId on unre...
[2024-06-14T03:03:57.392Z] FAILED ../../src/main/python/parquet_write_test.py::test_buckets_write_fallback_for_map[DATAGEN_SEED=1718328190, TZ=UTC, ALLOW_NON_GPU(DataWritingCommandExec,ExecutedCommandExec,WriteFilesExec, SortExec)] - pyspark.errors.exceptions.AnalysisException: Invalid call to exprId on unre...

firestarman · 2024-06-17T01:41:48Z

> Looks like you had a few failures in databricks.

~~Yeah, checking it ...~~

The tests can be fixed by changing the MapGen to BinaryGen.

This is probably not a Rapids bug. I checked the error stack, and it did not even reach any GPU code.

It seems to be related to the MapGen in Python. Maybe DB is doing some optimization for map data coming in from Python, but I am not 100% sure. There is also an existing issue related to MapGen: #10948.


firestarman · 2024-06-17T02:59:09Z

build

firestarman · 2024-06-17T03:47:45Z

The latest failure is due to the known issue #11070.

firestarman · 2024-06-17T11:57:08Z

build

integration_tests/src/main/python/asserts.py


firestarman · 2024-06-18T01:42:12Z

build


firestarman · 2024-06-18T01:50:44Z

build


firestarman · 2024-06-18T08:30:14Z

build

This PR adds GPU support for bucketed writes.

- Refactor the dynamic partition single writer and concurrent writer to reuse as much code as possible, then add the bucketing write logic to both of them.
- Update the bucket check done during plan overriding for the write commands, including InsertIntoHadoopFsRelationCommand, CreateDataSourceTableAsSelectCommand, InsertIntoHiveTable, and CreateHiveTableAsSelectCommand.
- From 330 on, Spark also supports HiveHash to generate the bucket IDs, in addition to Murmur3Hash, so the shim object GpuBucketingUtils is introduced to handle the version-specific differences.
- Add two functions (tagForHiveBucketingWrite and tagForBucketing) that do the overriding check for the two hashing functions separately. The Hive write nodes fall back to the CPU when HiveHash is chosen, because HiveHash is not supported on the GPU.

---------

Signed-off-by: Firestarman <[email protected]>

firestarman added 2 commits June 2, 2024 19:41

Support bucketed write

8f2be9b

Signed-off-by: Firestarman <[email protected]>

Merge branch 'branch-24.08' of github.com:NVIDIA/spark-rapids into write-bucketed

07d6539

Fix build errors for scala2.13

63e76ee

Signed-off-by: Firestarman <[email protected]>

fix a build error on db341

2257fa5

Signed-off-by: Firestarman <[email protected]>

firestarman marked this pull request as ready for review June 4, 2024 01:31

Add unit tests

e4ce9be

Signed-off-by: Firestarman <[email protected]>

firestarman added the feature request label Jun 11, 2024

revans2 reviewed Jun 11, 2024


firestarman added 5 commits June 12, 2024 11:52

Address some comments

df6b1f7

Signed-off-by: Firestarman <[email protected]>

Address more comments

77ef948

Signed-off-by: Firestarman <[email protected]>

unify the type checks

49cf174

Signed-off-by: Firestarman <[email protected]>

add a test for a fallback case

26ff72a

Signed-off-by: Firestarman <[email protected]>

add 343 shim line

1447c87

Signed-off-by: Firestarman <[email protected]>

revans2 reviewed Jun 12, 2024


firestarman added 3 commits June 13, 2024 02:58

Address new comments

3589c62

Signed-off-by: Firestarman <[email protected]>

Add tests for orc bucketing writes

ec0ff2b

Signed-off-by: Firestarman <[email protected]>

format fix

d5e44af

Signed-off-by: Firestarman <[email protected]>

Add more tests

37d559c

Signed-off-by: Firestarman <[email protected]>

revans2 previously approved these changes Jun 13, 2024


firestarman requested review from res-life and jlowe June 14, 2024 01:00

fix a test error on DB332+

14b4d02

Signed-off-by: Firestarman <[email protected]>

firestarman dismissed revans2’s stale review via 14b4d02 June 17, 2024 02:52

jlowe reviewed Jun 17, 2024


integration_tests/src/main/python/asserts.py (outdated)

Address comments

1a35d11

Signed-off-by: Firestarman <[email protected]>

correct the test names

b5ecc09

Signed-off-by: Firestarman <[email protected]>

sort data for fallback tests

e649175

Signed-off-by: Firestarman <[email protected]>

jlowe approved these changes Jun 18, 2024


firestarman merged commit 4b44903 into NVIDIA:branch-24.08 Jun 24, 2024
45 checks passed

firestarman deleted the write-bucketed branch August 6, 2024 01:55
