v2.3.2

Version 2.3.2 of azure-eventhubs-spark_2.11. Release notes:
- Bug fixes
  - Fixed data loss check in `getBatch`.
  - Excess warnings are no longer printed.
  - Prefetch count can no longer be set below the minimum allowed value.
  - Invalid offsets are detected in `translate`.
- Enhancements
  - Cached receivers are used. This allows receivers to be reused across batches, which dramatically improves receive times. This change required a move to epoch receivers, which means each Spark application requires its own consumer group in order to run properly.
  - `properties` has been added to the Structured Streaming schema.
  - `partition` has been added to the Structured Streaming schema.
  - Check for old fromSeqNos in DStreams. If a DStream falls behind (e.g. events expire from the Event Hub before they are consumed by Spark), then Spark will move to the earliest valid event. Previously, the job would fail in this case.
  - Added a retry mechanism for certain exceptions thrown in `getRunTimeInfo`. Exceptions are retried if the exception (or the inner exception) is an `EventHubException` and `getIsTransient` returns `true`.
  - All service API calls are now done asynchronously.
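The retry rule above can be sketched in a few lines. This is an illustration, not the connector's code: `EventHubException` and `getIsTransient` are stand-ins mirroring the real client API, and the retry loop itself is a hypothetical helper.

```scala
// Stand-in for the Event Hubs client's EventHubException.
class EventHubException(transient: Boolean, msg: String, cause: Throwable = null)
    extends Exception(msg, cause) {
  def getIsTransient: Boolean = transient
}

object RetrySketch {
  // True if the exception, or any exception in its cause chain, is a
  // transient EventHubException.
  private def isTransient(t: Throwable): Boolean = t match {
    case e: EventHubException                   => e.getIsTransient
    case other if other.getCause != null        => isTransient(other.getCause)
    case _                                      => false
  }

  // Re-invoke op up to maxAttempts times, but only for transient failures;
  // non-transient exceptions propagate immediately.
  def withRetries[T](maxAttempts: Int)(op: () => T): T =
    try op()
    catch {
      case t: Throwable if isTransient(t) && maxAttempts > 1 =>
        withRetries(maxAttempts - 1)(op)
    }
}
```

A non-transient exception (or a transient one that exhausts the attempt budget) simply propagates to the caller.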
  - A `ForeachWriter` implementation has been added. This `ForeachWriter` uses asynchronous sends and performs much better than the existing `EventHubsSink`.
  - `translate` has been optimized.
  - Javadocs have been added.
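The async-send idea behind the new `ForeachWriter` can be sketched without Spark: fire off each send as a `Future` in `process` and only wait for the outstanding sends when the writer closes. Spark's real `ForeachWriter` trait and the connector's client are stubbed here; `sendAsync` is a hypothetical stand-in for the client's asynchronous send call.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

// Simplified writer with the same open/process/close shape as
// Spark's ForeachWriter, parameterized on an async send function.
class AsyncEventHubWriter(sendAsync: String => Future[Unit]) {
  private val pending = ArrayBuffer.empty[Future[Unit]]

  def open(partitionId: Long, epochId: Long): Boolean = {
    pending.clear()
    true
  }

  // Kick off the send without blocking; later rows don't wait on earlier ones.
  def process(value: String): Unit =
    pending += sendAsync(value)

  // Only on close do we block until every outstanding send has completed.
  def close(errorOrNull: Throwable): Unit = {
    Await.result(Future.sequence(pending.toList), 30.seconds)
    ()
  }
}
```

Because rows are not sent one blocking call at a time, a batch of sends overlaps on the wire, which is where the performance win over a synchronous sink comes from.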
  - Only the necessary configs are sent to executors (the unneeded ones are trimmed by the driver before they're sent over).
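The trimming step amounts to filtering the driver's config map down to an allow-list before serializing it to executors. The key names below are invented for illustration; the connector maintains its own list.

```scala
object ConfTrimSketch {
  // Hypothetical allow-list of keys executors actually need.
  private val ExecutorKeys = Set(
    "eventhubs.connectionString",
    "eventhubs.consumerGroup",
    "eventhubs.prefetchCount")

  // Drop everything the executors never read before shipping the map over.
  def trimmed(conf: Map[String, String]): Map[String, String] =
    conf.filter { case (key, _) => ExecutorKeys.contains(key) }
}
```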
  - Connection string validation added in `EventHubsConf`.
  - Improved error messaging.
  - All singletons (`ClientConnectionPool` and `CachedEventHubsReceiver`) use globally valid keys.
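A "globally valid key" means the pooled singleton is keyed on attributes that fully identify the resource (e.g. namespace, event hub, and consumer group) rather than on anything process- or task-local, so every lookup for the same logical receiver hits the same instance. The key fields and the `Client` type below are stand-ins; `ClientConnectionPool`'s real key and client type differ.

```scala
import scala.collection.mutable

object PoolSketch {
  // Globally valid key: the triple fully identifies the connection.
  final case class PoolKey(namespace: String, eventHub: String, consumerGroup: String)

  // Stand-in for a pooled receiver/client.
  final class Client

  private val pool = mutable.Map.empty[PoolKey, Client]

  // Same key -> same shared instance, across the whole JVM.
  def borrow(key: PoolKey): Client =
    pool.synchronized(pool.getOrElseUpdate(key, new Client))
}
```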
  - Various documentation updates.
  - Various unit tests have been added.