From ee29d8adc60681e1619f857acf56090e148b3ac9 Mon Sep 17 00:00:00 2001
From: Mohammad Derakhshani
Date: Tue, 23 Mar 2021 10:14:27 -0700
Subject: [PATCH] spark 4.0.0-beta.1 changelog (#20050)

added Changelog entry for spark 4.0.0-beta.1
---
 .../azure-cosmos-spark_3-1_2-12/CHANGELOG.md  | 35 ++++++++++++++-----
 .../azure-cosmos-spark_3-1_2-12/README.md     | 14 +++++---
 .../docs/quick-start.md                       |  3 --
 3 files changed, 37 insertions(+), 15 deletions(-)

diff --git a/sdk/cosmos/azure-cosmos-spark_3-1_2-12/CHANGELOG.md b/sdk/cosmos/azure-cosmos-spark_3-1_2-12/CHANGELOG.md
index 3d7c59ef6400a..ba29f04da46ab 100644
--- a/sdk/cosmos/azure-cosmos-spark_3-1_2-12/CHANGELOG.md
+++ b/sdk/cosmos/azure-cosmos-spark_3-1_2-12/CHANGELOG.md
@@ -1,11 +1,30 @@
 ## Release History
 
-## 4.0.0-beta.1 (Unreleased)
-## 4.0.0-alpha.1
+## 4.0.0-beta.1 (2021-03-22)
+* Cosmos DB Spark 3.1.1 Connector Preview `4.0.0-beta.1` Release.
+### Features
+* Supports Spark 3.1.1 and Scala 2.12.
+* Integrated against the Spark3 DataSourceV2 API.
+* Developed from the ground up using the Cosmos DB Java V4 SDK.
+* Added support for Spark Query, Write, and Streaming.
+* Added support for Spark3 Catalog metadata APIs.
+* Added support for Java V4 Throughput Control.
+* Added support for different partitioning strategies.
+* Integrated against the Cosmos DB TCP protocol.
+* Added support for the Databricks automated Maven Resolver.
+* Added support for broadcasting CosmosClient caches to reduce bootstrapping RU throttling.
+* Added support for a unified Jackson ObjectNode to SparkRow converter.
+* Added support for Raw Json format.
+* Added support for Config Validation.
+* Added support for Spark application configuration consolidation.
+* Integrated against the Cosmos DB FeedRange API to support Partition Split Proofing.
+* Automated CI testing on Databricks and a Cosmos DB live endpoint.
+* Automated CI testing on the Cosmos DB Emulator.
 
-#### New Features
-* TBD
-#### Renames
-* TBD
-#### Key Bug Fixes
-* TBD
+### Known limitations
+* Spark structured streaming (micro batches) for consuming the change feed has been implemented but not yet fully tested end-to-end, so it should be considered experimental at this point.
+* No support for continuous processing (change feed) yet.
+* No perf tests or optimizations have been done yet; we will iterate on perf in the next preview releases, so usage of this preview should be limited to non-production environments.
+
+## 4.0.0-alpha.1 (2021-03-17)
+* Cosmos DB Spark 3.1.1 Connector Test Release.
diff --git a/sdk/cosmos/azure-cosmos-spark_3-1_2-12/README.md b/sdk/cosmos/azure-cosmos-spark_3-1_2-12/README.md
index 7e4237653c1de..7330d8df85798 100644
--- a/sdk/cosmos/azure-cosmos-spark_3-1_2-12/README.md
+++ b/sdk/cosmos/azure-cosmos-spark_3-1_2-12/README.md
@@ -32,11 +32,17 @@ https://github.com/Azure/azure-sdk-for-java/issues/new
 | ------------- | ------------- | -------------------- | ----------------------- |
 | 4.0.0-beta.1 | 3.1.1 | 8 | 2.12 |
 
-## Beta version package
+## Download
 
-Beta version built from `feature/cosmos/spark30` branch are available, you can refer to
- the [instruction](https://github.com/Azure/azure-sdk-for-java/blob/master/CONTRIBUTING.md#nightly-package-builds)
-to use beta version packages. 
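+The connector ships as a single jar and can be loaded through any of Spark's standard
+dependency mechanisms. For example, a self-managed PySpark session can resolve the jar
+from Maven at startup via Spark's standard `spark.jars.packages` setting; a minimal
+sketch (the application name is illustrative, not part of the connector):
+
+```python
+from pyspark.sql import SparkSession
+
+# Resolve the connector from Maven Central when the session starts;
+# the coordinates below are the ones published for this release.
+spark = (SparkSession.builder
+         .appName("cosmos-spark-quickstart")  # illustrative name
+         .config("spark.jars.packages",
+                 "com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.0.0-beta.1")
+         .getOrCreate())
+```
+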
+You can use the Maven coordinates of the jar to auto-install the Spark Connector on your Databricks Runtime 8 cluster from Maven:
+`com.azure.cosmos.spark:azure-cosmos-spark_3-1_2-12:4.0.0-beta.1`
+
+You can also integrate the Cosmos DB Spark Connector into your SBT project:
+```scala
+libraryDependencies += "com.azure.cosmos.spark" % "azure-cosmos-spark_3-1_2-12" % "4.0.0-beta.1"
+```
+
+The Cosmos DB Spark Connector is available on [Maven Central Repo](https://search.maven.org/artifact/com.azure.cosmos.spark/azure-cosmos-spark_3-1_2-12/4.0.0-beta.1/jar).
 
 ### General
 
diff --git a/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/quick-start.md b/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/quick-start.md
index e53d45ead0285..4a7870da78328 100644
--- a/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/quick-start.md
+++ b/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/quick-start.md
@@ -128,8 +128,5 @@ df = spark.read.format("cosmos.items").options(**cfg)\
 df.printSchema()
 ```
 
-Note when running queries unless if are interested to get back the raw json payload
-we recommend setting `spark.cosmos.read.inferSchemaEnabled` to be `true`.
-
 see [Schema Inference Configuration](https://github.com/Azure/azure-sdk-for-java/blob/feature/cosmos/spark30/sdk/cosmos/azure-cosmos-spark_3-1_2-12/docs/configuration-reference.md#schema-inference-config) for more detail.
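+
+For example, to opt in to schema inference on the read path, add the option to the
+same `cfg` dictionary used in the read example above (a minimal sketch; `cfg` is
+assumed to already hold your account endpoint, key, database, and container settings):
+
+```python
+# Assumes `spark` and `cfg` are defined as in the earlier quick-start steps.
+# With spark.cosmos.read.inferSchemaEnabled set to "true" the connector samples
+# documents and derives typed columns instead of returning the raw json payload.
+inference_cfg = {**cfg, "spark.cosmos.read.inferSchemaEnabled": "true"}
+
+df_typed = (spark.read.format("cosmos.items")
+            .options(**inference_cfg)
+            .load())
+df_typed.printSchema()
+```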