From d0b313077827c352a530e82d066f5fc864bae556 Mon Sep 17 00:00:00 2001
From: Sameer Raheja
Date: Tue, 20 Sep 2022 16:04:47 -0700
Subject: [PATCH 1/2] Update doc to indicate ORC and Parquet zstd write support

Signed-off-by: Sameer Raheja
---
 docs/compatibility.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/docs/compatibility.md b/docs/compatibility.md
index fc1dc559bb2..6e570b0d146 100644
--- a/docs/compatibility.md
+++ b/docs/compatibility.md
@@ -366,9 +366,10 @@ similar issue exists for writing dates as described
 to work for dates after the epoch as described
 [here](https://github.com/NVIDIA/spark-rapids/issues/140).
 
-The plugin supports reading `uncompressed`, `snappy` and `zlib` ORC files and writing `uncompressed`
- and `snappy` ORC files. At this point, the plugin does not have the ability to fall back to the
- CPU when reading an unsupported compression format, and will error out in that case.
+The plugin supports reading `uncompressed`, `snappy`, `zlib` and `zstd` ORC files and writing
+ `uncompressed`, `snappy` and `zstd` ORC files. At this point, the plugin does not have the ability
+ to fall back to the CPU when reading an unsupported compression format, and will error out in that
+ case.
 
 ### Push Down Aggregates for ORC
 
@@ -437,8 +438,8 @@ issue, turn off the ParquetWriter acceleration for timestamp columns by either s
 set `spark.sql.parquet.outputTimestampType` to `TIMESTAMP_MICROS` or `TIMESTAMP_MILLIS` to by
 -pass the issue entirely.
 
-The plugin supports reading `uncompressed`, `snappy` and `gzip` Parquet files and writing
-`uncompressed` and `snappy` Parquet files. At this point, the plugin does not have the ability to
+The plugin supports reading `uncompressed`, `snappy`, `gzip` and `zstd` Parquet files and writing
+`uncompressed`, `snappy` and `zstd` Parquet files. At this point, the plugin does not have the ability to
 fall back to the CPU when reading an unsupported compression format, and will error out in that
 case.
 

From 4786f26e28f82eb3010f6896f1cfedbedc69fdcd Mon Sep 17 00:00:00 2001
From: Sameer Raheja
Date: Wed, 21 Sep 2022 08:38:19 -0700
Subject: [PATCH 2/2] Remove mention of zstd write

Signed-off-by: Sameer Raheja
---
 docs/compatibility.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/docs/compatibility.md b/docs/compatibility.md
index 6e570b0d146..60179f5762e 100644
--- a/docs/compatibility.md
+++ b/docs/compatibility.md
@@ -367,9 +367,8 @@ to work for dates after the epoch as described
 [here](https://github.com/NVIDIA/spark-rapids/issues/140).
 
 The plugin supports reading `uncompressed`, `snappy`, `zlib` and `zstd` ORC files and writing
- `uncompressed`, `snappy` and `zstd` ORC files. At this point, the plugin does not have the ability
- to fall back to the CPU when reading an unsupported compression format, and will error out in that
- case.
+ `uncompressed` and `snappy` ORC files. At this point, the plugin does not have the ability to fall
+ back to the CPU when reading an unsupported compression format, and will error out in that case.
 
 ### Push Down Aggregates for ORC
 
@@ -439,7 +438,7 @@ set `spark.sql.parquet.outputTimestampType` to `TIMESTAMP_MICROS` or `TIMESTAMP_
 -pass the issue entirely.
 
 The plugin supports reading `uncompressed`, `snappy`, `gzip` and `zstd` Parquet files and writing
-`uncompressed`, `snappy` and `zstd` Parquet files. At this point, the plugin does not have the ability to
+`uncompressed` and `snappy` Parquet files. At this point, the plugin does not have the ability to
 fall back to the CPU when reading an unsupported compression format, and will error out in that
 case.
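
Given the final state of this series (zstd is read-only; writes support `uncompressed` and `snappy`), a job that must avoid the documented error-out behavior on write can pin the output codecs explicitly. A minimal sketch using Spark's standard codec properties (these property names come from Spark SQL itself, not from this patch):

```
# spark-defaults.conf sketch (assumed deployment-specific values):
# keep writes on codecs the plugin supports, while reads may still
# consume zstd, zlib, or gzip inputs per the docs above.
spark.sql.parquet.compression.codec   snappy
spark.sql.orc.compression.codec       snappy
```

The same settings can be passed per job via `--conf` on `spark-submit` instead of editing `spark-defaults.conf`.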