Releases · elastacloud/spark-excel

29 Jun 12:38

0.1.13

5dc22b4

v0.1.13 Latest

Welcome to the latest release of the Spark Excel reader. This release brings a few changes and bug fixes, including:

New parser option to disable formula evaluation. When disabled the formula itself is extracted from the sheet rather than being evaluated.
Fix for empty files. Parsing a file that contains only the header record now returns a single null record instead of raising an error.
Support for the latest Spark 3.4.x and 3.5.x releases
Updates to the Apache POI library and other dependencies
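
The empty-file fix can be pictured with a small plain-Python sketch. This is not the library's code: `parse_sheet` is a hypothetical helper that models a sheet as a list of rows whose first row is the header, and it only illustrates the behaviour described above, where a header-only sheet yields one all-null record.

```python
from typing import Any, Dict, List, Optional


def parse_sheet(rows: List[List[Any]]) -> List[Dict[str, Optional[Any]]]:
    """Hypothetical model of a reader: the first row is the header, the rest are data."""
    if not rows:
        raise ValueError("sheet has no header row")
    header = [str(name) for name in rows[0]]
    data = rows[1:]
    if not data:
        # Header-only sheet: return one record with every column set to None
        return [{name: None for name in header}]
    # Otherwise pair each data row with the header names
    return [dict(zip(header, record)) for record in data]


header_only = parse_sheet([["id", "name"]])
print(header_only)  # [{'id': None, 'name': None}]
```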

Another thanks goes out to @josecsotomorales for his help with this release.

Contributors

josecsotomorales

Assets 17

spark-excel-3.0.1_0.1.13.jar, 16.4 MB, 2024-06-29T12:38:03Z
spark-excel-3.0.2_0.1.13.jar, 16.4 MB, 2024-06-29T12:38:02Z
spark-excel-3.0.3_0.1.13.jar, 16.4 MB, 2024-06-29T12:38:01Z
spark-excel-3.1.2_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:59Z
spark-excel-3.2.1_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:58Z
spark-excel-3.2.4_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:56Z
spark-excel-3.3.0_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:55Z
spark-excel-3.3.1_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:53Z
spark-excel-3.3.2_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:51Z
spark-excel-3.3.3_0.1.13.jar, 16.4 MB, 2024-06-29T12:37:50Z
Source code (zip), 2024-06-29T12:30:09Z
Source code (tar.gz), 2024-06-29T12:30:09Z

25 Oct 15:02

dazfuller

0.1.12

c8d97d0

v0.1.12

It's been a while longer this time but we're back with the 0.1.12 release of the Excel data source for Apache Spark. A big thanks to @josecsotomorales who has contributed to this release.

This release introduces the following changes:

Update Apache POI to 5.2.3, bringing in support for new functions and features
Add the ability for users to specify values which should be treated as null (e.g. "N/A")
Handle Log4J conflicts across Spark versions
Spark 3.4.1 support
Spark 3.5.0 support
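
As an illustration of the null-value feature, here is a minimal plain-Python sketch of the idea. It is not the library's implementation and does not use its option name (which is not given in these notes): `normalise_nulls` is a hypothetical helper, and exact string matching against the markers is an assumption.

```python
from typing import Any, Iterable, Optional


def normalise_nulls(value: Any, null_values: Iterable[str] = ("N/A",)) -> Optional[Any]:
    """Return None when a string cell exactly equals one of the configured null markers."""
    if isinstance(value, str) and value in set(null_values):
        return None
    # Anything else (other strings, numbers, existing None) passes through unchanged
    return value


cells = ["42", "N/A", 3.5]
cleaned = [normalise_nulls(cell) for cell in cells]
print(cleaned)  # ['42', None, 3.5]
```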

We're always looking for people to help contribute, from code changes to feature suggestions, so if you feel like you can contribute then feel free to join in.

Contributors

josecsotomorales

Assets 15

29 May 09:22

dazfuller

0.1.11

c2511ab

v0.1.11

This release brings in support for Spark 3.4.0 (the latest version of the 3.4.x series at the time of release) along with a new feature to provide a per-row flag indicating if the row matches the provided or inferred schema.

Support for Spark 3.4
schemaMatchColumnName option for indicating if each row matches the schema
Update of Spark versions to match the releases available from Apache Spark and the versions used by supported Databricks Runtimes and Azure Synapse Analytics.
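
To show what a per-row schema-match flag looks like, here is a small plain-Python sketch. It does not reproduce the library's logic: `row_matches`, `flag_rows` and the column name `matchesSchema` are hypothetical, the schema is modelled as (name, Python type) pairs, and treating nulls as matching is an assumption.

```python
from typing import Any, Dict, List, Sequence, Tuple

Schema = Sequence[Tuple[str, type]]


def row_matches(row: Sequence[Any], schema: Schema) -> bool:
    """True when the row has one value per column and each non-null value has the expected type."""
    if len(row) != len(schema):
        return False
    return all(
        value is None or isinstance(value, expected)
        for value, (_, expected) in zip(row, schema)
    )


def flag_rows(rows: Sequence[Sequence[Any]], schema: Schema,
              flag_column: str = "matchesSchema") -> List[Dict[str, Any]]:
    """Turn each row into a record and append a boolean flag column."""
    names = [name for name, _ in schema]
    flagged = []
    for row in rows:
        record = dict(zip(names, row))
        record[flag_column] = row_matches(row, schema)
        flagged.append(record)
    return flagged


schema = [("id", int), ("name", str)]
rows = [[1, "a"], ["two", "b"], [3, None]]
flags = [record["matchesSchema"] for record in flag_rows(rows, schema)]
print(flags)  # [True, False, True]
```

With the library itself, the name of the flag column is supplied through the schemaMatchColumnName option; rows flagged false can then be filtered out or routed for inspection.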

N.B. When using Databricks, please check the Spark version used by the runtime to ensure the correct version of the package is used.
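
A small sketch of picking the matching JAR: the file-name pattern below is inferred from the asset names attached to the 0.1.13 release (for example spark-excel-3.3.2_0.1.13.jar), and `jar_name` is a hypothetical helper rather than part of the library. Not every Spark and library combination is published, so the result should be checked against the assets of the chosen release.

```python
import re


def jar_name(spark_version: str, library_version: str) -> str:
    """Build the expected asset name for a given Spark version and library version."""
    # Expect a full three-part Spark version such as "3.3.2"
    if not re.fullmatch(r"\d+\.\d+\.\d+", spark_version):
        raise ValueError(f"expected a version like 3.3.2, got {spark_version!r}")
    return f"spark-excel-{spark_version}_{library_version}.jar"


print(jar_name("3.3.2", "0.1.13"))  # spark-excel-3.3.2_0.1.13.jar
```

In a notebook, the running Spark version is usually available as `spark.version` on the active session, which can be passed in as the first argument.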

Assets 12

20 Aug 10:51

dazfuller

0.1.10

0e0c0a0

v0.1.10

Another minor increment, but bringing in Spark 3.3 support (which is big, right?)

Support for Spark 3.3
Introduces the thresholdBytesForTempFiles option, an alias for maxBytesForTempFiles that is clearer about what is happening.
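
One way to picture an aliased option is the plain-Python sketch below. It is an illustration only: `resolve_threshold` is a hypothetical helper, the precedence when both names are supplied is an assumption, and the default of 100_000_000 is the value stated in the 0.1.9 notes for maxBytesForTempFiles.

```python
from typing import Mapping

DEFAULT_THRESHOLD_BYTES = 100_000_000  # default stated in the 0.1.9 release notes


def resolve_threshold(options: Mapping[str, str]) -> int:
    """Read the temp-file threshold from either option name, falling back to the default."""
    # Assumed precedence: the newer, clearer name wins when both are supplied
    for key in ("thresholdBytesForTempFiles", "maxBytesForTempFiles"):
        if key in options:
            return int(options[key])
    return DEFAULT_THRESHOLD_BYTES


print(resolve_threshold({"maxBytesForTempFiles": "250000000"}))  # 250000000
```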

Assets 12

02 Jul 16:34

dazfuller

0.1.9

12b4459

v0.1.9

Minor release with a couple of significant updates.

Update scalatest to 3.2.11
Update Apache POI to version 5.2.2
Update Apache Commons IO to 2.11.0
Introduce new option for maxBytesForTempFiles
- Resolves issue for loading larger files by setting the maximum bytes before creating temp files
- Defaults the option to 100_000_000
- Can be overridden with options

Assets 10

05 Mar 21:57

dazfuller

0.1.8

22f2fee

v0.1.8

Upgrades to version 5.2.0 of the Apache POI library, bringing general improvements and support for additional formula types such as CONCAT.

Brings in support for additional types in user-defined schemas, reducing the need to cast data once it's been read

Brings in Spark version 3.1.3 and 3.2.1 to the build profiles.

The JAR is a little larger as additional Apache libraries need to be included to support the updated POI library. In addition, Azure Synapse uses an older version of commons-io which doesn't include features required by the updated POI library.

Assets 10

23 Oct 10:36

dazfuller

0.1.7

7b02ed4

v0.1.7

Provides a general clean-up of the build.sbt file and adds support for Spark 3.2.0.

Assets 8

08 Aug 09:42

dazfuller

0.1.6

d7ce3c6

v0.1.6

With no new ideas turning up in the last couple of weeks I've promoted the 0.1.6-SNAPSHOT release to final. This release includes the following changes.

Add Spark 3.0.3
Fix for cells which contain data that does not match the target schema
Better error reporting for parser options, suggesting which option the user might have meant if it has been mis-spelt
Some code cleanup
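
The "did you mean" style of error reporting can be sketched in a few lines of Python. This uses the standard library's `difflib.get_close_matches` as a stand-in technique and is not necessarily how the library does it; `suggest_option` and `unknown_option_message` are hypothetical helpers, and the option names are taken from elsewhere in these notes purely for illustration.

```python
import difflib
from typing import Optional, Sequence

# Option names mentioned in these release notes, used here only as sample data
KNOWN_OPTIONS = (
    "schemaMatchColumnName",
    "maxBytesForTempFiles",
    "thresholdBytesForTempFiles",
)


def suggest_option(name: str, known: Sequence[str] = KNOWN_OPTIONS) -> Optional[str]:
    """Return the closest known option name, or None when nothing is similar enough."""
    matches = difflib.get_close_matches(name, known, n=1, cutoff=0.6)
    return matches[0] if matches else None


def unknown_option_message(name: str) -> str:
    """Build an error message that includes a suggestion when one is available."""
    suggestion = suggest_option(name)
    if suggestion is None:
        return f"Unknown option '{name}'"
    return f"Unknown option '{name}', did you mean '{suggestion}'?"


print(unknown_option_message("schemaMatchColumName"))
```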

Assets 7

19 Jul 16:43

dazfuller

0.1.6-SNAPSHOT

c4fa957

v0.1.6-SNAPSHOT Pre-release

Snapshot release for 0.1.6. This includes:

Fix for cells which contain data that does not match the target schema
Better error reporting for parser options, suggesting which option the user might have meant if it has been mis-spelt
Some code cleanup

Assets 6

03 Jul 14:11

dazfuller

0.1.5

7700109

v0.1.5

Initial public release of the Spark Excel library. Please see the README.md file for a list of current capabilities.

JARs have been attached to the release. In future the plan is to publish them as well so that they can be installed using Maven coordinates.

Assets 6
