From 20bff54605a00ddc2ac15a2863fced8fc7a4eaac Mon Sep 17 00:00:00 2001
From: Renjie Liu
Date: Wed, 31 Jul 2024 23:20:21 +0800
Subject: [PATCH] Fix config format problem (#11278)

Signed-off-by: liurenjie1024
---
 docs/additional-functionality/advanced_configs.md       | 6 +-----
 .../main/scala/com/nvidia/spark/rapids/RapidsConf.scala | 8 ++++----
 2 files changed, 5 insertions(+), 9 deletions(-)

diff --git a/docs/additional-functionality/advanced_configs.md b/docs/additional-functionality/advanced_configs.md
index 3f92ece1056..9b39fb7478a 100644
--- a/docs/additional-functionality/advanced_configs.md
+++ b/docs/additional-functionality/advanced_configs.md
@@ -75,11 +75,7 @@ Name | Description | Default Value | Applicable at
 spark.rapids.sql.csv.read.float.enabled|CSV reading is not 100% compatible when reading floats.|true|Runtime
 spark.rapids.sql.decimalOverflowGuarantees|FOR TESTING ONLY. DO NOT USE IN PRODUCTION. Please see the decimal section of the compatibility documents for more information on this config.|true|Runtime
 spark.rapids.sql.delta.lowShuffleMerge.deletionVector.broadcast.threshold|Currently we need to broadcast deletion vector to all executors to perform low shuffle merge. When we detect the deletion vector broadcast size is larger than this value, we will fallback to normal shuffle merge.|20971520|Runtime
-spark.rapids.sql.delta.lowShuffleMerge.enabled|Option to turn on the low shuffle merge for Delta Lake. Currently there are some limitations for this feature:
-1. We only support Databricks Runtime 13.3 and Deltalake 2.4.
-2. The file scan mode must be set to PERFILE
-3. The deletion vector size must be smaller than spark.rapids.sql.delta.lowShuffleMerge.deletionVector.broadcast.threshold
-|false|Runtime
+spark.rapids.sql.delta.lowShuffleMerge.enabled|Option to turn on the low shuffle merge for Delta Lake. Currently there are some limitations for this feature: 1. We only support Databricks Runtime 13.3 and Deltalake 2.4. 2. The file scan mode must be set to PERFILE 3. The deletion vector size must be smaller than spark.rapids.sql.delta.lowShuffleMerge.deletionVector.broadcast.threshold |false|Runtime
 spark.rapids.sql.detectDeltaCheckpointQueries|Queries against Delta Lake _delta_log checkpoint Parquet files are not efficient on the GPU. When this option is enabled, the plugin will attempt to detect these queries and fall back to the CPU.|true|Runtime
 spark.rapids.sql.detectDeltaLogQueries|Queries against Delta Lake _delta_log JSON files are not efficient on the GPU. When this option is enabled, the plugin will attempt to detect these queries and fall back to the CPU.|true|Runtime
 spark.rapids.sql.fast.sample|Option to turn on fast sample. If enable it is inconsistent with CPU sample because of GPU sample algorithm is inconsistent with CPU.|false|Runtime
diff --git a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala b/sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala
index c529ced0ab0..1cb03958111 100644
--- a/sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala
+++ b/sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsConf.scala
@@ -2330,11 +2330,11 @@ val SHUFFLE_COMPRESSION_LZ4_CHUNK_SIZE = conf("spark.rapids.shuffle.compression.
 
   val ENABLE_DELTA_LOW_SHUFFLE_MERGE = conf("spark.rapids.sql.delta.lowShuffleMerge.enabled")
     .doc("Option to turn on the low shuffle merge for Delta Lake. Currently there are some " +
-      "limitations for this feature: \n" +
-      "1. We only support Databricks Runtime 13.3 and Deltalake 2.4. \n" +
-      s"2. The file scan mode must be set to ${RapidsReaderType.PERFILE} \n" +
+      "limitations for this feature: " +
+      "1. We only support Databricks Runtime 13.3 and Deltalake 2.4. " +
+      s"2. The file scan mode must be set to ${RapidsReaderType.PERFILE} " +
       "3. The deletion vector size must be smaller than " +
-      s"${DELTA_LOW_SHUFFLE_MERGE_DEL_VECTOR_BROADCAST_THRESHOLD.key} \n")
+      s"${DELTA_LOW_SHUFFLE_MERGE_DEL_VECTOR_BROADCAST_THRESHOLD.key} ")
     .booleanConf
     .createWithDefault(false)
 
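
Note: as the paired hunks suggest, advanced_configs.md is generated from each
config's .doc() text as a pipe-delimited markdown table, one config per line,
so the "\n" escapes embedded in this description split the row across several
physical lines and corrupted the table. Below is a minimal, hypothetical Scala
sketch of that failure mode; ConfEntryLike and renderRow are illustrative
stand-ins, not the plugin's real doc generator.

    // Illustrative only: mimics rendering one markdown table row per config.
    object DocRowSketch {
      // Hypothetical stand-in for a config entry: key, doc text, default value.
      final case class ConfEntryLike(key: String, doc: String, default: Any)

      // One pipe-delimited row per config. Any '\n' inside `doc` splits the
      // row across physical lines, and the markdown table no longer parses.
      def renderRow(e: ConfEntryLike): String =
        s"${e.key}|${e.doc}|${e.default}|Runtime"

      def main(args: Array[String]): Unit = {
        val broken = ConfEntryLike(
          "spark.rapids.sql.delta.lowShuffleMerge.enabled",
          "Currently there are some limitations for this feature: \n" +
            "1. We only support Databricks Runtime 13.3 and Deltalake 2.4. \n" +
            "2. The file scan mode must be set to PERFILE",
          false)
        println(renderRow(broken)) // row spans three lines: a malformed table

        // The patch's approach: keep the whole description on a single line.
        val fixed = broken.copy(doc = broken.doc.replace("\n", ""))
        println(renderRow(fixed))  // one physical line: a valid table row
      }
    }

This matches the diff above: the Scala change drops the "\n" escapes from the
.doc() string, and the regenerated docs row collapses back onto a single line.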