-
Notifications
You must be signed in to change notification settings - Fork 175
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix the error when num of partitions increased #500
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proposed change updates the common path of a batch execution by getting the number of partitions for every single batch (partitionCount is not a lazy val anymore and this affects the execution of earliestAndLatest for every batch).
This is an overhead added to every batch execution for a scenario that would happen rarely. Therefore, instead of changing the common case scenario in batch execution we can add an exception handler which handles this scenario when needed.
I remove getting number of partitions for each batch part, then users should restart their jobs every time partition number increased |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add an exception handling scenario so that when the exception happens it gets the new partition size and adds the new partitions. In this way, the user doesn't have to restart when he/she adds a new partition.
core/src/main/scala/org/apache/spark/sql/eventhubs/EventHubsSource.scala
Show resolved
Hide resolved
core/src/main/scala/org/apache/spark/sql/eventhubs/EventHubsSource.scala
Outdated
Show resolved
Hide resolved
Hi @tilumi - have we done an upgrade testing with the change and verify whether the latest iteration works as expected? |
Dedicated EventHub support partition increase, but Spark job get failed due to checkpoint doesn't contain sequence number for new partitions:
This PR contains following change:
fromStartOfStream