
[BUG] Spark 3.3.0 test failure: NoSuchMethodError org.apache.orc.TypeDescription.getAttributeValue #4031

Closed
tgravescs opened this issue Nov 4, 2021 · 9 comments · Fixed by #4408
Labels: audit_3.3.0 (Audit related tasks for 3.3.0), bug (Something isn't working), P1 (Nice to have for release)

tgravescs (Collaborator) commented Nov 4, 2021

The nightly build fails on the Spark 3.3.0 shim layer tests:

09:03:31 OrcScanSuite:
09:03:32 *** RUN ABORTED ***
09:03:32 java.lang.NoSuchMethodError: org.apache.orc.TypeDescription.getAttributeValue(Ljava/lang/String;)Ljava/lang/String;
09:03:32 at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.toCatalystType$1(OrcUtils.scala:103)
09:03:32 at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.$anonfun$toCatalystSchema$1(OrcUtils.scala:118)
09:03:32 at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
09:03:32 at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
09:03:32 at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
09:03:32 at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.toStructType$1(OrcUtils.scala:116)
09:03:32 at org.apache.spark.sql.execution.datasources.orc.OrcUtils$.toCatalystSchema(OrcUtils.scala:138)
09:03:32 at org.apache.spark.sql.execution.datasources.orc.OrcUtils$$anonfun$readSchema$5.applyOrElse(OrcUtils.scala:148)
09:03:32 at org.apache.spark.sql.execution.datasources.orc.OrcUtils$$anonfun$readSchema$5.applyOrElse(OrcUtils.scala:145)
09:03:32 at scala.collection.TraversableOnce.collectFirst(TraversableOnce.scala:148)
09:03:32 ...
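
(For reference, a minimal diagnostic sketch, not from the build, with a hypothetical object name: it checks which orc-core jar the JVM actually loaded and whether TypeDescription has the getAttributeValue method that Spark 3.3's OrcUtils calls. The method was added in ORC 1.7, so it is absent from 1.5.x jars.)

// Sketch: confirm which orc-core won the classpath
import org.apache.orc.TypeDescription

object WhichOrc {
  def main(args: Array[String]): Unit = {
    val clazz = classOf[TypeDescription]
    // The location the class was loaded from reveals which orc-core jar is in use
    println(clazz.getProtectionDomain.getCodeSource.getLocation)
    // false here means an ORC older than 1.7 (e.g. 1.5.x) is on the classpath
    val hasMethod = clazz.getMethods.exists(_.getName == "getAttributeValue")
    println(s"getAttributeValue present: $hasMethod")
  }
}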

tgravescs added the bug (Something isn't working), ? - Needs Triage (Need team to review and classify), and P0 (Must have for release) labels on Nov 4, 2021
tgravescs (Collaborator, Author) commented:

It looks like perhaps we are picking up the wrong ORC version somehow. Is the version we are using not shaded at this point, so that it mismatches with the Spark version?

tgravescs (Collaborator, Author) commented:

Note: Spark 3.3 upgraded to ORC 1.7 - https://issues.apache.org/jira/browse/SPARK-34112

tgravescs changed the title from "[BUG] Spark 3.3.0 test failure:" to "[BUG] Spark 3.3.0 test failure: NoSuchMethodError org.apache.orc.TypeDescription.getAttributeValue" on Nov 4, 2021
jlowe (Member) commented Nov 4, 2021

it looks like perhaps we are picking up wrong ORC version somehow.

This seems related to #3932. Since we're pulling the aggregator classes (and thus the sql-plugin classes and their dependencies) onto the classpath, we end up with ORC 1.5.8 on the classpath.

Salonijain27 added the audit_3.3.0 (Audit related tasks for 3.3.0) and P1 (Nice to have for release) labels and removed the ? - Needs Triage (Need team to review and classify) and P0 (Must have for release) labels on Nov 9, 2021
res-life (Collaborator) commented Dec 6, 2021

Spark301 depends on orc-core 1.5.10, and Spark330 depends on orc-core 1.7.1.
The plugin depends on orc-core 1.5.10 regardless of the Spark version. This overrides the dependency path plugin -> spark-sql 330 -> orc-core 1.7.1, because Maven's nearest-wins mediation gives the shorter path higher priority: plugin -> orc-core 1.5.10 has depth 1, while the transitive orc-core 1.7.1 via spark-sql has depth 2.
The error is thrown because Spark expects orc-core 1.7.1 but only 1.5.10 is on the classpath.
I tried upgrading to orc-core 1.7.1, but the compile failed because 1.7.1 removed some methods:

[INFO] Compiling 237 Scala sources and 28 Java sources to /home/chong/code/spark-rapids/sql-plugin/target/spark330/classes ...
[ERROR] [Error] /home/chong/code/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOrcScanBase.scala:320: too many arguments (3) for method readFileData: (x$1: org.apache.orc.impl.BufferChunkList, x$2: Boolean)org.apache.orc.impl.BufferChunkList
[ERROR] [Error] /home/chong/code/spark-rapids/sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOrcScanBase.scala:742: value withBufferSize is not a member of org.apache.orc.impl.DataReaderProperties.Builder
possible cause: maybe a semicolon is missing before `value withBufferSize'?

This problem can be simplified as follows:
The new orc-core added some methods and removed others.
The plugin and spark-sql both depend on orc-core, but on different versions.
It would be better for the plugin to depend on the same orc-core that spark-sql does.
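
(A minimal sketch of the mismatch, with a hypothetical object name: the call below compiles only against orc-core 1.7.x, where TypeDescription.getAttributeValue exists, and it is exactly the kind of call that throws java.lang.NoSuchMethodError at runtime when orc-core 1.5.x wins the classpath, as in the stack trace above.)

// Sketch: compiled against orc-core 1.7.x; fails to link at runtime on 1.5.x
import org.apache.orc.TypeDescription

object OrcLinkageSketch {
  def main(args: Array[String]): Unit = {
    val schema = TypeDescription.fromString("struct<a:int>")
    // Resolved at compile time against the 1.7 API, but linked lazily at runtime,
    // so a 1.5.x jar on the classpath surfaces here as NoSuchMethodError.
    println(schema.getAttributeValue("some.attribute")) // prints null when the attribute is absent
  }
}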

I don't think #3932 can fix this. @jlowe @tgravescs
And I think there are some risks if we always shade the old version.

jlowe (Member) commented Dec 6, 2021

#3932 would fix this, as the tests would only run against the dist jar, which has ORC shaded, so there would be no conflict. However, the tests would then be unable to access internal classes directly, as those are set up in the parallel-world packaging of the dist jar.

There are two ways to address the changing ORC version across the different Spark versions we support:

  • Keep the ORC version constant in the plugin, which requires shading it in the dist jar to avoid conflicting with the ORC version Spark provides at runtime
  • Always use the ORC version that Spark ships with, which requires shimming the ORC usage in the plugin because the ORC APIs are not compatible across the ORC versions

I'm fine if we want to try going the latter route, where we stop bundling ORC and use the provided one directly via shimmed classes. It would make it easier to handle the changing ORC versions in the tests (unless the tests also access ORC directly, in which case the tests themselves would need to use shims).
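
(A minimal sketch of that shim pattern, with hypothetical names rather than the plugin's actual code: each Spark-version shim is compiled against the ORC that its Spark ships, so version-specific ORC calls sit behind a common trait and the rest of the plugin never references an ORC API that may be missing at runtime.)

// Sketch: one trait, one implementation per Spark/ORC combination
trait OrcShims {
  def orcVersion: String
  // wrap any ORC call whose signature changed between 1.5.x and 1.7.x
  def readFileData(dataReader: AnyRef, chunks: AnyRef): AnyRef
}

// Built in the spark30x shim module against orc-core 1.5.x
class Spark30XOrcShims extends OrcShims {
  override def orcVersion: String = "1.5.x"
  override def readFileData(dataReader: AnyRef, chunks: AnyRef): AnyRef =
    chunks // placeholder: would invoke the 1.5.x readFileData signature here
}

// Built in the spark330 shim module against orc-core 1.7.x
class Spark330OrcShims extends OrcShims {
  override def orcVersion: String = "1.7.x"
  override def readFileData(dataReader: AnyRef, chunks: AnyRef): AnyRef =
    chunks // placeholder: would invoke the 1.7.x readFileData signature here
}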

res-life (Collaborator) commented Dec 10, 2021

@jlowe @tgravescs
Got the case to pass after rewriting the Scala OrcScanSuite test case as a Python test case.
Which is better: rewrite it or write an orc shim layer?

The details are as follows:

//  Scala test case (OrcScanSuite)
private val fileSplitsOrc = frameFromOrc("file-splits.orc")

testSparkResultsAreEqual("Test ORC count chunked by rows", fileSplitsOrc,
  new SparkConf().set(RapidsConf.MAX_READER_BATCH_SIZE_ROWS.key, "2048"))(frameCount)

#  Python test case after the rewrite (added to integration_tests/src/main/python/orc_test.py)
def test_orc_scan_orc_chunks():
    assert_gpu_and_cpu_are_equal_collect(
        read_orc_df("path/to/file-splits.orc"),
        conf={"spark.rapids.sql.reader.batchSizeRows": 2048})

# Run the Python test case
(base) [chong@chong-pc spark-rapids]$ ./integration_tests/run_pyspark_from_build.sh -k test_orc_scan_orc_chunks
+++ dirname ./integration_tests/run_pyspark_from_build.sh
++ cd ./integration_tests
++ pwd -P
+ SCRIPTPATH=/home/chong/code/spark-rapids/integration_tests
+ cd /home/chong/code/spark-rapids/integration_tests
++ echo
++ tr '[:upper:]' '[:lower:]'
+ [[ '' == \t\r\u\e ]]
+ [[ -z /home/chong/progs/sparks/spark-home ]]
+ echo 'WILL RUN TESTS WITH SPARK_HOME: /home/chong/progs/sparks/spark-home'
WILL RUN TESTS WITH SPARK_HOME: /home/chong/progs/sparks/spark-home
++ /home/chong/progs/sparks/spark-home/bin/pyspark --version
++ grep -v Scala
++ awk '/version\ [0-9.]+/{print $NF}'
+ VERSION_STRING=3.3.0-SNAPSHOT
+ VERSION_STRING=3.3.0
+ [[ -z 3.3.0 ]]
+ [[ -z '' ]]
+ SPARK_SHIM_VER=spark330
+ echo 'Detected Spark version 3.3.0 (shim version: spark330)'
Detected Spark version 3.3.0 (shim version: spark330)
+ '[' -d '' ']'
++ echo /home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar
+ CUDF_JARS=/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar
++ echo /home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar
+ PLUGIN_JARS=/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar
++ echo '/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar'
+ TEST_JARS='/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar'
++ echo /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar
+ UDF_EXAMPLE_JARS='/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
+ ALL_JARS='/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar /home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
+ echo 'AND PLUGIN JARS: /home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar /home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
AND PLUGIN JARS: /home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar /home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar /home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar
+ [[ '' != '' ]]
+ [[ '' != '' ]]
+ [[ '' == '' ]]
++ nvidia-smi --query-gpu=memory.free --format=csv,noheader
++ awk '{if (MAX < $1){ MAX = $1}} END {print int(MAX / (2.3 * 1024)) - 1}'
+ TEST_PARALLEL=19
+ echo 'AUTO DETECTED PARALLELISM OF 19'
AUTO DETECTED PARALLELISM OF 19
+ [[ '' != '' ]]
++ nproc
+ cpu_cores=16
+ [[ '' != '' ]]
++ awk '/MemFree/ { printf "%d\n", $2/1024 }' /proc/meminfo
+ free_mem_mib=35761
+ max_parallel_for_cpu_cores=15
+ max_parallel_for_free_memory=34
+ [[ TEST_PARALLEL -gt 15 ]]
+ echo 'set TEST_PARALLEL from 19 to 15 according to cpu cores 16'
set TEST_PARALLEL from 19 to 15 according to cpu cores 16
+ TEST_PARALLEL=15
+ [[ TEST_PARALLEL -gt 34 ]]
+ python -c 'import findspark'
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'findspark'
+ TEST_PARALLEL=0
+ echo 'findspark not installed cannot run tests in parallel'
findspark not installed cannot run tests in parallel
+ python -c 'import xdist.plugin'
+ echo 'FOUND xdist'
FOUND xdist
+ TEST_TYPE_PARAM=
+ [[ '' != '' ]]
+ [[ 0 -lt 2 ]]
+ TEST_PARALLEL_OPTS=()
+ MEMORY_FRACTION=1
+ RUN_DIR=/home/chong/code/spark-rapids/integration_tests/target/run_dir
+ mkdir -p /home/chong/code/spark-rapids/integration_tests/target/run_dir
+ cd /home/chong/code/spark-rapids/integration_tests/target/run_dir
+ LOCAL_ROOTDIR=/home/chong/code/spark-rapids/integration_tests
+ INPUT_PATH=/home/chong/code/spark-rapids/integration_tests
+ RUN_TESTS_COMMAND=("$SCRIPTPATH"/runtests.py --rootdir "$LOCAL_ROOTDIR" "$LOCAL_ROOTDIR"/src/main/python)
+ TEST_COMMON_OPTS=(-v -rfExXs "$TEST_TAGS" --std_input_path="$INPUT_PATH"/src/test/resources --color=yes $TEST_TYPE_PARAM "$TEST_ARGS" $RUN_TEST_PARAMS "$@")
+ NUM_LOCAL_EXECS=0
+ MB_PER_EXEC=1024
+ CORES_PER_EXEC=1
+ SPARK_TASK_MAXFAILURES=1
+ [[ 3.3.0 < 3.1.1 ]]
+ export 'PYSP_TEST_spark_driver_extraClassPath=/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar:/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
+ PYSP_TEST_spark_driver_extraClassPath='/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar:/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
+ export 'PYSP_TEST_spark_executor_extraClassPath=/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar:/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
+ PYSP_TEST_spark_executor_extraClassPath='/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar:/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
+ export 'PYSP_TEST_spark_driver_extraJavaOptions=-ea -Duser.timezone=UTC '
+ PYSP_TEST_spark_driver_extraJavaOptions='-ea -Duser.timezone=UTC '
+ export 'PYSP_TEST_spark_executor_extraJavaOptions=-ea -Duser.timezone=UTC'
+ PYSP_TEST_spark_executor_extraJavaOptions='-ea -Duser.timezone=UTC'
+ export PYSP_TEST_spark_ui_showConsoleProgress=false
+ PYSP_TEST_spark_ui_showConsoleProgress=false
+ export PYSP_TEST_spark_sql_session_timeZone=UTC
+ PYSP_TEST_spark_sql_session_timeZone=UTC
+ export PYSP_TEST_spark_sql_shuffle_partitions=12
+ PYSP_TEST_spark_sql_shuffle_partitions=12
+ export PYSP_TEST_spark_dynamicAllocation_enabled=false
+ PYSP_TEST_spark_dynamicAllocation_enabled=false
+ DB_DEPLOY_CONF=/databricks/common/conf/deploy.conf
+ [[ -f /databricks/common/conf/deploy.conf ]]
+ export PYSP_TEST_spark_task_maxFailures=1
+ PYSP_TEST_spark_task_maxFailures=1
+ (( NUM_LOCAL_EXECS > 0 ))
+ '[' -z '' ']'
+ [[ '' != *\-\-\m\a\s\t\e\r* ]]
+ export 'PYSP_TEST_spark_master=local[*,1]'
+ PYSP_TEST_spark_master='local[*,1]'
+ (( 0 > 0 ))
+ /home/chong/progs/sparks/spark-home/bin/spark-submit --jars '/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar,/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar,/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar,/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar,/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar,/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar' --driver-java-options '-ea -Duser.timezone=UTC ' /home/chong/code/spark-rapids/integration_tests/runtests.py --rootdir /home/chong/code/spark-rapids/integration_tests /home/chong/code/spark-rapids/integration_tests/src/main/python -v -rfExXs '' --std_input_path=/home/chong/code/spark-rapids/integration_tests/src/test/resources --color=yes '' -k test_orc_scan_orc_chunks
21/12/10 03:12:53 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
21/12/10 03:12:54 INFO SparkContext: Running Spark version 3.3.0-SNAPSHOT
21/12/10 03:12:54 INFO ResourceUtils: ==============================================================
21/12/10 03:12:54 INFO ResourceUtils: No custom resources configured for spark.driver.
21/12/10 03:12:54 INFO ResourceUtils: ==============================================================
21/12/10 03:12:54 INFO SparkContext: Submitted application: rapids spark plugin integration tests (python)
21/12/10 03:12:54 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
21/12/10 03:12:54 INFO ResourceProfile: Limiting resource is cpu
21/12/10 03:12:54 INFO ResourceProfileManager: Added ResourceProfile id: 0
21/12/10 03:12:54 INFO SecurityManager: Changing view acls to: chong
21/12/10 03:12:54 INFO SecurityManager: Changing modify acls to: chong
21/12/10 03:12:54 INFO SecurityManager: Changing view acls groups to: 
21/12/10 03:12:54 INFO SecurityManager: Changing modify acls groups to: 
21/12/10 03:12:54 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(chong); groups with view permissions: Set(); users  with modify permissions: Set(chong); groups with modify permissions: Set()
21/12/10 03:12:54 INFO Utils: Successfully started service 'sparkDriver' on port 39762.
21/12/10 03:12:54 INFO SparkEnv: Registering MapOutputTracker
21/12/10 03:12:54 INFO SparkEnv: Registering BlockManagerMaster
21/12/10 03:12:54 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
21/12/10 03:12:54 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
21/12/10 03:12:54 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
21/12/10 03:12:54 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-33226858-4018-40e6-bbfb-d3b12fc01356
21/12/10 03:12:54 INFO MemoryStore: MemoryStore started with capacity 397.5 MiB
21/12/10 03:12:54 INFO SparkEnv: Registering OutputCommitCoordinator
21/12/10 03:12:54 INFO Utils: Successfully started service 'SparkUI' on port 4040.
21/12/10 03:12:54 INFO SparkContext: Added JAR file:///home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar at spark://chong-pc:39762/jars/cudf-22.02.0-SNAPSHOT-cuda11.jar with timestamp 1639105974229
21/12/10 03:12:54 INFO SparkContext: Added JAR file:///home/chong/code/spark-rapids/dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar at spark://chong-pc:39762/jars/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar with timestamp 1639105974229
21/12/10 03:12:54 INFO SparkContext: Added JAR file:///home/chong/code/spark-rapids/udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar at spark://chong-pc:39762/jars/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar with timestamp 1639105974229
21/12/10 03:12:54 INFO SparkContext: Added JAR file:///home/chong/code/spark-rapids/udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar at spark://chong-pc:39762/jars/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar with timestamp 1639105974229
21/12/10 03:12:54 INFO SparkContext: Added JAR file:///home/chong/code/spark-rapids/udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar at spark://chong-pc:39762/jars/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar with timestamp 1639105974229
21/12/10 03:12:54 INFO ShimLoader: Loading shim for Spark version: 3.3.0-SNAPSHOT
21/12/10 03:12:54 INFO ShimLoader: Complete Spark build info: 3.3.0-SNAPSHOT, ssh://[email protected]:12051/nvspark/spark.git, HEAD, 510a5e779368bdd00fa821f7e4fdee78268ec91b, 2021-12-06T15:33:10Z
21/12/10 03:12:54 INFO ShimLoader: Forcing shim caller classloader update (default behavior). If it causes issues with userClassPathFirst, set spark.rapids.force.caller.classloader to false!
21/12/10 03:12:54 INFO ShimLoader: Falling back on ShimLoader caller's classloader org.apache.spark.util.MutableURLClassLoader@302f7971
21/12/10 03:12:54 INFO ShimLoader: Updating spark classloader org.apache.spark.util.MutableURLClassLoader@302f7971 with the URLs: jar:file:/home/chong/code/spark-rapids/dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar!/spark3xx-common/, jar:file:/home/chong/code/spark-rapids/dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar!/spark330/
21/12/10 03:12:54 INFO ShimLoader: Spark classLoader org.apache.spark.util.MutableURLClassLoader@302f7971 updated successfully
21/12/10 03:12:54 INFO RapidsPluginUtils: RAPIDS Accelerator build: {version=22.02.0-SNAPSHOT, user=chong, [email protected]:res-life/spark-rapids.git, date=2021-12-10T03:01:56Z, revision=61f51ffd6bf521f9c87451b04ec5a7d6011837f8, cudf_version=22.02.0-SNAPSHOT, branch=branch-22.02}
21/12/10 03:12:54 INFO RapidsPluginUtils: cudf build: {version=22.02.0-SNAPSHOT, user=, date=2021-12-09T09:54:03Z, revision=024003ca444f9d1a8374a1133337419f22cc880a, branch=HEAD}
21/12/10 03:12:54 WARN RapidsPluginUtils: RAPIDS Accelerator 22.02.0-SNAPSHOT using cudf 22.02.0-SNAPSHOT. To disable GPU support set `spark.rapids.sql.enabled` to false
21/12/10 03:12:54 INFO DriverPluginContainer: Initialized driver component for plugin com.nvidia.spark.SQLPlugin.
21/12/10 03:12:54 INFO Executor: Starting executor ID driver on host chong-pc
21/12/10 03:12:54 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): 'file:/home/chong/code/spark-rapids/integration_tests/target/dependency/cudf-22.02.0-SNAPSHOT-cuda11.jar,file:/home/chong/code/spark-rapids/integration_tests/../dist/target/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar,file:/home/chong/code/spark-rapids/integration_tests/target/rapids-4-spark-integration-tests*-spark330*.jar,file:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar,file:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar,file:/home/chong/code/spark-rapids/integration_tests/../udf-examples/target/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar,file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/cudf-22.02.0-SNAPSHOT-cuda11.jar,file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-javadoc.jar,file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/rapids-4-spark-integration-tests*-spark330*.jar,file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/rapids-4-spark_2.12-22.02.0-SNAPSHOT.jar,file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT.jar,file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/rapids-4-spark-udf-examples_2.12-22.02.0-SNAPSHOT-spark330tests.jar'
21/12/10 03:12:54 INFO RapidsExecutorPlugin: RAPIDS Accelerator build: {version=22.02.0-SNAPSHOT, user=chong, [email protected]:res-life/spark-rapids.git, date=2021-12-10T03:01:56Z, revision=61f51ffd6bf521f9c87451b04ec5a7d6011837f8, cudf_version=22.02.0-SNAPSHOT, branch=branch-22.02}
21/12/10 03:12:54 INFO RapidsExecutorPlugin: cudf build: {version=22.02.0-SNAPSHOT, user=, date=2021-12-09T09:54:03Z, revision=024003ca444f9d1a8374a1133337419f22cc880a, branch=HEAD}
21/12/10 03:12:55 INFO RapidsExecutorPlugin: Initializing memory from Executor Plugin
21/12/10 03:12:57 INFO GpuDeviceManager: Initializing RMM ARENA pool size = 47523.3125 MB on gpuId 0
21/12/10 03:12:57 INFO GpuDeviceManager: Using per-thread default stream
21/12/10 03:12:57 INFO ShimDiskBlockManager: Created local directory at /tmp/blockmgr-126f46c7-8dc3-4f86-96cb-8f228737032e
21/12/10 03:12:57 INFO RapidsBufferCatalog: Installing GPU memory handler for spill
21/12/10 03:12:57 INFO RapidsExecutorPlugin: The number of concurrent GPU tasks allowed is 1
21/12/10 03:12:57 INFO ExecutorPluginContainer: Initialized executor component for plugin com.nvidia.spark.SQLPlugin.
21/12/10 03:12:57 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 38269.
21/12/10 03:12:57 INFO NettyBlockTransferService: Server created on chong-pc:38269
21/12/10 03:12:57 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
21/12/10 03:12:57 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, chong-pc, 38269, None)
21/12/10 03:12:57 INFO BlockManagerMasterEndpoint: Registering block manager chong-pc:38269 with 397.5 MiB RAM, BlockManagerId(driver, chong-pc, 38269, None)
21/12/10 03:12:57 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, chong-pc, 38269, None)
21/12/10 03:12:57 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, chong-pc, 38269, None)
21/12/10 03:12:58 INFO SingleEventLogFileWriter: Logging events to file:/home/chong/code/spark-rapids/integration_tests/target/run_dir/eventlog_gw0/local-1639105974812.zstd.inprogress
21/12/10 03:12:58 WARN Plugin: Installing rapids UDF compiler extensions to Spark. The compiler is disabled by default. To enable it, set `spark.rapids.sql.udfCompiler.enabled` to true
============================= test session starts ==============================
platform linux -- Python 3.8.8, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- /opt/conda/bin/python3
cachedir: .pytest_cache
rootdir: /home/chong/code/spark-rapids/integration_tests, configfile: pytest.ini
plugins: forked-1.3.0, xdist-2.3.0
collecting ... 21/12/10 03:13:00 WARN SparkSession: Using an existing SparkSession; the static sql configurations will not take effect.
21/12/10 03:13:00 WARN SparkSession: Using an existing SparkSession; some spark core configurations may not take effect.
collected 13373 items / 13372 deselected / 1 selected

../../src/main/python/orc_test.py::test_orc_scan_orc_chunks 21/12/10 03:13:01 WARN package: Truncated the string representation of a plan since it was too large. This behavior can be adjusted by setting 'spark.sql.debug.maxToStringFields'.
PASSED       [100%]

===================== 1 passed, 13372 deselected in 4.83s ======================

jlowe (Member) commented Dec 10, 2021

Which is better: rewrite it or write an orc shim layer?

I think it will be better to use the ORC version that corresponds to the Spark version. We've had other issues with the shading of Hive in the plugin jar messing with GPU support of Hive UDFs, and I think things would just get simpler if we shade as little as possible.

So I'm +1 for stopping the shading of ORC/Hive classes and using shims when ORC APIs change between Spark versions, if we can get it to work. We need to double-check that we're OK with Spark installations that don't have Hive support compiled in. cc: @tgravescs in case he can think of any issues there, as I vaguely recall a problem we hit in the past where the Spark artifacts don't have a classifier for which ORC they are using (i.e., ORC with or without Hive support), and compiling against one could lead to class-not-found issues when running against the other ORC at runtime.

tgravescs (Collaborator, Author) commented:

Yes, we had issues with ORC when we built against (I believe) the standard version and the user had the nohive ORC. I don't remember the exact details.

I think the minimum Hive version Spark supports in 3.0 is Hive 2.3 or newer. They support Hive 3.0 as well. Those ship ORC versions 1.3.3 and 1.6.9, respectively. The ORC nohive profile then shades Hive inside of ORC.

Find more info here:
https://spark.apache.org/docs/latest/sql-data-sources-hive-tables.html#interacting-with-different-versions-of-hive-metastore

Are we only using ORC APIs that Spark also uses, including any changes between Spark versions? It seems like if we did that, we would be relatively safe, unless of course you get a CSP that has modified their Spark version, in which case I guess they could modify the plugin as well.

I think we need to be very careful about this, both now and if someone modifies it in the future. There are a lot of different ways ORC and Hive can be picked up, and I'm not sure how good they are about API compatibility.

sameerz (Collaborator) commented Jan 11, 2022

Requesting that we move the target version to 22.04, so we can get the fix for rapidsai/cudf#9964 into RAPIDS first.

cc: @GaryShen2008
