
v2.3.2

@sabeegrewal sabeegrewal released this 05 Jul 21:41
· 128 commits to master since this release
3682b30

Version 2.3.2 of azure-eventhubs-spark_2.11. Release notes:

  • Bug fixes
    • Fixed data loss check in getBatch - excess warnings are no longer printed.
    • Prefetch count can no longer be set below the minimum allowed value.
    • Invalid offsets are detected in translate.
  • Enhancements
    • Cached receivers are now used. This allows receivers to be reused across batches, which dramatically improves receive times. This change
      required a move to epoch receivers, which means each Spark application requires its own consumer group in order to run properly.
    • properties has been added to the Structured Streaming schema
    • partition has been added to the Structured Streaming schema
    • Check for old fromSeqNos in DStreams. If a DStream falls behind (e.g. events expire from the Event Hub before they are consumed by Spark),
      then Spark will move to the earliest valid event. Previously, the job would fail in this case.
    • Add retry mechanism for certain exceptions thrown in getRunTimeInfo. Exceptions are retried if the exception (or the inner exception) is
      an EventHubException and getIsTransient returns true.
    • All service API calls are now made asynchronously.
    • A ForeachWriter implementation has been added. This ForeachWriter uses asynchronous sends and performs much better than the existing
      EventHubsSink.
    • translate has been optimized
    • javadocs have been added
    • Only the necessary configs are sent to executors (the unneeded ones are trimmed by the driver before they're sent over)
    • ConnectionString validation added in EventHubsConf
    • Improved error messaging
    • All singletons (ClientConnectionPool and CachedEventHubsReceiver) use globally valid keys.
  • Various documentation updates
  • Various unit tests have been added
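
The retry rule described above for getRunTimeInfo (retry only when the exception, or its inner exception, is an EventHubException whose getIsTransient returns true) can be illustrated with a minimal, self-contained sketch. This is not the connector's implementation. StubEventHubException is a hypothetical stand-in for the client's EventHubException so the example runs without the library, and the names TransientRetry, isRetryable, retry, and the maxAttempts parameter are assumptions made for illustration. The sketch also retries immediately, since the notes do not say whether the real code waits between attempts.

```java
import java.util.concurrent.Callable;

public class TransientRetry {

    // Hypothetical stand-in for the client's EventHubException.
    // Only the transient flag matters for this sketch.
    public static class StubEventHubException extends Exception {
        private final boolean isTransient;

        public StubEventHubException(boolean isTransient) {
            super("stub EventHubException, transient=" + isTransient);
            this.isTransient = isTransient;
        }

        public boolean getIsTransient() {
            return isTransient;
        }
    }

    // True when the exception itself is a stub EventHubException marked
    // transient, or when it is some other exception whose direct cause is one.
    public static boolean isRetryable(Throwable t) {
        if (t instanceof StubEventHubException) {
            return ((StubEventHubException) t).getIsTransient();
        }
        Throwable cause = t.getCause();
        return cause instanceof StubEventHubException
                && ((StubEventHubException) cause).getIsTransient();
    }

    // Runs the call up to maxAttempts times. A non-retryable failure is
    // rethrown at once; if every attempt fails with a retryable error,
    // the last failure is rethrown.
    public static <T> T retry(Callable<T> call, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return call.call();
            } catch (Exception e) {
                if (!isRetryable(e)) {
                    throw e;
                }
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice with a transient error, then succeeds on the third call.
        String result = retry(() -> {
            if (calls[0]++ < 2) {
                throw new StubEventHubException(true);
            }
            return "runtime info";
        }, 5);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

A production version would likely add a delay or backoff between attempts and cap the total time spent retrying; those details are left out here because the release notes do not describe them.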