From 2b5adcf0f2076c9220c09d1eaca1ef77c7fc0b9b Mon Sep 17 00:00:00 2001
From: Fokko Driesprong
Date: Fri, 29 Sep 2023 10:57:35 +0200
Subject: [PATCH 1/3] Bump to Spark 3.4 and update docs

---
 README.md         | 12 +++++++-----
 docker/Dockerfile |  2 +-
 2 files changed, 8 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index fa286b1f7..eb3be92dd 100644
--- a/README.md
+++ b/README.md
@@ -26,18 +26,20 @@ more information, consult [the docs](https://docs.getdbt.com/docs/profile-spark)
 
 ## Running locally
 A `docker-compose` environment starts a Spark Thrift server and a Postgres database as a Hive Metastore backend.
-Note: dbt-spark now supports Spark 3.1.1 (formerly on Spark 2.x).
+Note: dbt-spark now supports Spark 3.4.1.
 
-The following command would start two docker containers
-```
+The following command starts two docker containers:
+
+```sh
 docker-compose up -d
 ```
+
 It will take a bit of time for the instance to start, you can check the logs of the two containers.
 If the instance doesn't start correctly, try the complete reset command listed below and then try start again.
 
 Create a profile like this one:
 
-```
+```yaml
 spark_testing:
   target: local
   outputs:
@@ -60,7 +62,7 @@ Connecting to the local spark instance:
 
 Note that the Hive metastore data is persisted under `./.hive-metastore/`, and the Spark-produced data under `./.spark-warehouse/`.
 To completely reset you environment run the following:
-```
+```sh
 docker-compose down
 rm -rf ./.hive-metastore/
 rm -rf ./.spark-warehouse/
diff --git a/docker/Dockerfile b/docker/Dockerfile
index bb4d378ed..52d28397a 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -2,7 +2,7 @@ ARG OPENJDK_VERSION=8
 FROM eclipse-temurin:${OPENJDK_VERSION}-jre
 
 ARG BUILD_DATE
-ARG SPARK_VERSION=3.3.2
+ARG SPARK_VERSION=3.4.1
 ARG HADOOP_VERSION=3
 
 LABEL org.label-schema.name="Apache Spark ${SPARK_VERSION}" \

From 0da818fe70a9ab742a9ba92b45e6165bc9593689 Mon Sep 17 00:00:00 2001
From: Fokko Driesprong
Date: Fri, 29 Sep 2023 16:44:49 +0200
Subject: [PATCH 2/3] Update docker/Dockerfile

---
 docker/Dockerfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docker/Dockerfile b/docker/Dockerfile
index 52d28397a..bb4d378ed 100644
--- a/docker/Dockerfile
+++ b/docker/Dockerfile
@@ -2,7 +2,7 @@ ARG OPENJDK_VERSION=8
 FROM eclipse-temurin:${OPENJDK_VERSION}-jre
 
 ARG BUILD_DATE
-ARG SPARK_VERSION=3.4.1
+ARG SPARK_VERSION=3.3.2
 ARG HADOOP_VERSION=3
 
 LABEL org.label-schema.name="Apache Spark ${SPARK_VERSION}" \

From 44118a34c2ae589d80f07dbeaaab979379cd40c2 Mon Sep 17 00:00:00 2001
From: Fokko Driesprong
Date: Fri, 29 Sep 2023 16:45:22 +0200
Subject: [PATCH 3/3] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index eb3be92dd..2d2586795 100644
--- a/README.md
+++ b/README.md
@@ -26,7 +26,7 @@ more information, consult [the docs](https://docs.getdbt.com/docs/profile-spark)
 
 ## Running locally
 A `docker-compose` environment starts a Spark Thrift server and a Postgres database as a Hive Metastore backend.
-Note: dbt-spark now supports Spark 3.4.1.
+Note: dbt-spark now supports Spark 3.3.2.
 
 The following command starts two docker containers: