Commit 0cedde8: merged from branch-0.5

Squashed commit of the following:

commit dc66f03
commit f66c3ef
...
commit 72b2e12

Signed-off-by: Firestarman <[email protected]>
firestarman committed Mar 8, 2021
1 parent 3185811 commit 0cedde8
Showing 118 changed files with 2,980 additions and 11,743 deletions.
283 changes: 282 additions & 1 deletion CHANGELOG.md

Large diffs are not rendered by default.

10 changes: 0 additions & 10 deletions README.md
@@ -5,18 +5,8 @@ The RAPIDS Accelerator for Apache Spark provides a set of plugins for
[Apache Spark](https://spark.apache.org) that leverage GPUs to accelerate processing
via the [RAPIDS](https://rapids.ai) libraries and [UCX](https://www.openucx.org/).

![TPCxBB Like query results](./docs/img/tpcxbb-like-results.png "TPCxBB Like Query Results")

The chart above shows results from running ETL queries based off of the
[TPCxBB benchmark](http://www.tpc.org/tpcx-bb/default.asp). These are **not** official results in
any way. It uses a 10TB Dataset (scale factor 10,000), stored in parquet. The processing happened on
a two node DGX-2 cluster. Each node has 96 CPU cores, 1.5TB host memory, 16 V100 GPUs, and 512 GB
GPU memory.

To get started and try the plugin out use the [getting started guide](./docs/get-started/getting-started.md).

For more information about these benchmarks, see the [benchmark guide](./docs/benchmarks.md).

## Compatibility

The SQL plugin tries to produce results that are bit for bit identical with Apache Spark.
6 changes: 6 additions & 0 deletions api_validation/pom.xml
@@ -46,6 +46,12 @@
<spark.version>${spark311.version}</spark.version>
</properties>
</profile>
<profile>
<id>spark320</id>
<properties>
<spark.version>${spark320.version}</spark.version>
</properties>
</profile>
</profiles>

<dependencies>
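
As a usage sketch for the new profile: something like the following would build this module against Spark 3.2.0. The profile id comes from the diff above; the Maven goals and module flag are assumptions, not part of this commit.

```shell
# Activate the spark320 profile added above, from the repository root.
# Assumes spark320.version is defined in the parent pom (illustrative
# invocation, not taken from this commit).
mvn -pl api_validation -Pspark320 clean package
```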
8 changes: 4 additions & 4 deletions docs/FAQ.md
@@ -10,10 +10,10 @@ nav_order: 11

### What versions of Apache Spark does the RAPIDS Accelerator for Apache Spark support?

The RAPIDS Accelerator for Apache Spark requires version 3.0.0 or 3.0.1 of Apache Spark. Because the
plugin replaces parts of the physical plan that Apache Spark considers to be internal the code for
those plans can change even between bug fix releases. As a part of our process, we try to stay on
top of these changes and release updates as quickly as possible.
The RAPIDS Accelerator for Apache Spark requires version 3.0.0, 3.0.1, 3.0.2 or 3.1.1 of Apache
Spark. Because the plugin replaces parts of the physical plan that Apache Spark considers to be
internal the code for those plans can change even between bug fix releases. As a part of our
process, we try to stay on top of these changes and release updates as quickly as possible.

### Which distributions are supported?

4 changes: 3 additions & 1 deletion docs/additional-functionality/rapids-shuffle.md
@@ -257,7 +257,10 @@ In this section, we are using a docker container built using the sample dockerfi
| 3.0.1 | com.nvidia.spark.rapids.spark301.RapidsShuffleManager |
| 3.0.1 EMR | com.nvidia.spark.rapids.spark301emr.RapidsShuffleManager |
| 3.0.2 | com.nvidia.spark.rapids.spark302.RapidsShuffleManager |
| 3.0.3 | com.nvidia.spark.rapids.spark303.RapidsShuffleManager |
| 3.1.1 | com.nvidia.spark.rapids.spark311.RapidsShuffleManager |
| 3.1.2 | com.nvidia.spark.rapids.spark312.RapidsShuffleManager |
| 3.2.0 | com.nvidia.spark.rapids.spark320.RapidsShuffleManager |
2. Recommended settings for UCX 1.9.0+
```shell
@@ -270,7 +273,6 @@ In this section, we are using a docker container built using the sample dockerfi
--conf spark.executorEnv.UCX_MAX_RNDV_RAILS=1 \
--conf spark.executorEnv.UCX_MEMTYPE_CACHE=n \
--conf spark.executorEnv.UCX_IB_RX_QUEUE_LEN=1024 \
--conf spark.executorEnv.LD_LIBRARY_PATH=/usr/lib:/usr/lib/ucx \
--conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR}
```
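
As a usage sketch, a shuffle manager from the table above is selected through Spark's `spark.shuffle.manager` setting. The class name below is the 3.2.0 entry from the table; the application file and the exact set of flags are illustrative assumptions, not part of this commit.

```shell
# Wire in the RAPIDS shuffle manager matching the Spark version in use
# (here the 3.2.0 entry from the table above); other flags are illustrative.
spark-submit \
  --conf spark.shuffle.manager=com.nvidia.spark.rapids.spark320.RapidsShuffleManager \
  --conf spark.executor.extraClassPath=${SPARK_CUDF_JAR}:${SPARK_RAPIDS_PLUGIN_JAR} \
  my_app.py
```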

212 changes: 0 additions & 212 deletions docs/benchmarks.md

This file was deleted.

1 change: 1 addition & 0 deletions docs/configs.md
@@ -139,6 +139,7 @@ Name | SQL Function(s) | Description | Default Value | Notes
<a name="sql.expression.CreateNamedStruct"></a>spark.rapids.sql.expression.CreateNamedStruct|`named_struct`, `struct`|Creates a struct with the given field names and values|true|None|
<a name="sql.expression.CurrentRow$"></a>spark.rapids.sql.expression.CurrentRow$| |Special boundary for a window frame, indicating stopping at the current row|true|None|
<a name="sql.expression.DateAdd"></a>spark.rapids.sql.expression.DateAdd|`date_add`|Returns the date that is num_days after start_date|true|None|
<a name="sql.expression.DateAddInterval"></a>spark.rapids.sql.expression.DateAddInterval| |Adds interval to date|true|None|
<a name="sql.expression.DateDiff"></a>spark.rapids.sql.expression.DateDiff|`datediff`|Returns the number of days from startDate to endDate|true|None|
<a name="sql.expression.DateSub"></a>spark.rapids.sql.expression.DateSub|`date_sub`|Returns the date that is num_days before start_date|true|None|
<a name="sql.expression.DayOfMonth"></a>spark.rapids.sql.expression.DayOfMonth|`dayofmonth`, `day`|Returns the day of the month from a date or timestamp|true|None|
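
The new `DateAddInterval` row covers date-plus-interval arithmetic. A hedged sketch of the kind of query it handles (the spark-sql invocation and literal values are illustrative):

```shell
# DATE + INTERVAL arithmetic is the shape of expression covered by
# DateAddInterval; with the plugin enabled it is eligible to run on the GPU.
spark-sql -e "SELECT DATE '2021-03-08' + INTERVAL 5 DAYS"
```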
4 changes: 2 additions & 2 deletions docs/download.md
@@ -21,8 +21,8 @@ This release includes additional performance improvements, including
* Instructions on how to use [Alluxio caching](get-started/getting-started-alluxio.md) with Spark to
leverage caching.

The release is supported on Apache Spark 3.0.0, 3.0.1, 3.1.1, Databricks 7.3 ML LTS and Google Cloud
Platform Dataproc 2.0.
The release is supported on Apache Spark 3.0.0, 3.0.1, 3.0.2, 3.1.1, Databricks 7.3 ML LTS and
Google Cloud Platform Dataproc 2.0.

The list of all supported operations is provided [here](supported_ops.md).

1 change: 0 additions & 1 deletion docs/get-started/Dockerfile.cuda
@@ -35,7 +35,6 @@ RUN set -ex && \
ln -s /lib /lib64 && \
mkdir -p /opt/spark && \
mkdir -p /opt/spark/jars && \
mkdir -p /opt/tpch && \
mkdir -p /opt/spark/examples && \
mkdir -p /opt/spark/work-dir && \
mkdir -p /opt/sparkRapidsPlugin && \
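
For completeness, a sketch of building this image; the tag and build context are assumptions, not taken from this commit.

```shell
# Build the CUDA-enabled Spark image from the repository root
# (image tag is illustrative).
docker build -t spark-rapids-cuda -f docs/get-started/Dockerfile.cuda .
```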

