Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[event-hubs] Error handling changes for SDK consistency #6444

Closed
richardpark-msft opened this issue Dec 6, 2019 · 4 comments · Fixed by #6465
Closed

[event-hubs] Error handling changes for SDK consistency #6444

richardpark-msft opened this issue Dec 6, 2019 · 4 comments · Fixed by #6465
Assignees
Labels
Client This issue points to a problem in the data-plane of the library. Event Hubs

Comments

@richardpark-msft
Copy link
Member

During our recent review for error handling in EventHubs a few spots showed up as being different than we'd like to change:

(will do)

  1. When claiming ownership do NOT eat errors unless they are specifically related to fighting for ownership over a blob (ie, etag related). Eating them can conceal underlying problems (storage account accessibility, etc..). This should be the standard for the CheckpointStore contract overall.
  2. User-thrown errors from the event handlers (except for processError) will be routed to the processError callback.

(might do)
There was an additional discussion about whether this should also terminate the processing loop - some languages do this. That is a larger issue that should be discussed with our architect (@bterlson)

@richardpark-msft richardpark-msft added this to the [2020] January milestone Dec 6, 2019
@richardpark-msft richardpark-msft self-assigned this Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage label Dec 6, 2019
@richardpark-msft richardpark-msft added Client This issue points to a problem in the data-plane of the library. triage labels Dec 6, 2019
@triage-new-issues triage-new-issues bot removed the triage label Dec 6, 2019
@chradek
Copy link
Contributor

chradek commented Dec 6, 2019

Yes this looks correct to me. Reiterating...all our checkpointstore implementations should throw errors instead of eating them when calling claimOwnership, unless the failure is due to an etag mismatch (since that will get indirectly handled as EventProcessor should see an OwnershipLost eventually)

@bterlson
Copy link
Member

bterlson commented Dec 6, 2019

I'll need more context to weigh in on the processing loop termination. Come find me to discuss 😀

@richardpark-msft
Copy link
Member Author

I've got a PR out for the first two points since those seem universal.

richardpark-msft added a commit that referenced this issue Dec 10, 2019
Prior to this change we couldn't tell the difference between "storage is failing for some reason" and "we're just losing to other processors that are attempting to claim partitions".

Now an empty list of ownerships is okay and just indicates that we should do nothing this time around.

Fixes #6444 minus the the change to exit the processing loop.
@richardpark-msft
Copy link
Member Author

Closing this now that the PR is complete.

After talking with @bterlson we weren't able to come up with a good reason in the current JS API why user errors would terminate the processing loop.

There isn't a great way for us to know which user errors might be considered fatal, which might be recoverable, etc.. and the cost of spinning down the loop and starting it up should be made by the user themselves.

PR#6555 addresses the first two points and the fact that processError is mandatory and their errors get routed to it should be enough to for the user to not mistakenly be failing in their event handlers and not notice.

@github-actions github-actions bot locked and limited conversation to collaborators Apr 12, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Client This issue points to a problem in the data-plane of the library. Event Hubs
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants