[BEAM-9476] KinesisIO retry LimitExceededException #10973
Thank you for your contribution! Have you seen PR #9765 and the discussion on this topic there? I think it tries to solve a similar problem, if I'm not mistaken. |
@aromanenko-dev thank you for the response. I had a quick look at the referenced PR and I suspect that may be solving a different issue, that being transient errors associated with hitting read throughput rate limiting. Some additional context on the issue we are running into: in our case, we have a large number of Kinesis streams in a single account. By default, Kinesis It is possible I am misinterpreting something in the other PR however. |
@ameihm0912 Yes, I see your point, and this is actually a different but similar issue imo. One of the options to overcome this could be using internal AWS Another option could be using Beam |
@aromanenko-dev interesting, that's definitely an option but as you mention I am unclear on what the implications of applying a global retry/backoff policy to all API requests would be. For this particular PR I tried to just focus on the |
@ameihm0912 Could you use Beam |
Also, please create a new Jira issue for this feature and name this PR and its commits using the Jira ID as a prefix, like |
@aromanenko-dev updated to use |
retest this please |
In general, it looks fine, thanks. Could you just add a unit test to test that this BackOff actually works?
@ameihm0912 kindly pinging to keep this PR from going stale |
@aromanenko-dev apologies for the delay here, I'll try to get that test added this week. |
@ameihm0912 Not a problem, thank you for continuing to work on this! |
@aromanenko-dev I have added a couple test cases to verify both the retry behavior and the retry limit behavior to |
@ameihm0912 Thanks, all is fine for me except that Spotless fails. Please, run |
During Kinesis stream setup DescribeStream is used. This API call has quota limits that can become problematic when attempting to configure multiple Kinesis streams in the same AWS account. This change modifies the caller of describeStream (listShards) such that transient failures are retried up to 10 times using FluentBackoff starting with a one second backoff (the time used to assess the quota). After ten attempts the exception will be thrown.
@aromanenko-dev this should be fixed up now |
@aromanenko-dev I'm not too sure why the tests are indicated as still failing here, they seemed to pass for me locally. I had a look at the Jenkins output as well and don't see anything obvious in there. |
Run Java PreCommit |
Run JavaPortabilityApi PreCommit |
Run Java PreCommit |
Run JavaPortabilityApi PreCommit |
@ameihm0912 I think it's not related to your changes, so I just reran these tests |
Looks like we still had a failure; if you'd like me to rebase this off current master let me know, I'm not sure if maybe that might help? |
Run Java PreCommit |
It's ok now, and the failures were caused by flakiness in some unrelated tests.
LGTM and thanks for your contribution!
During Kinesis stream setup DescribeStream is used. This API call has
quota limits that can become very problematic when attempting to
configure multiple Kinesis streams in the same AWS account.
This change modifies the caller of describeStream (listShards) such that
transient failures are retried up to 10 times with a one second delay in
between each try (the time used to assess the quota), at which point if
it has still failed the exception is thrown.
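The retry behavior described above can be sketched in plain Java as follows. This is only an illustration, not the code from this PR: the merged change uses Beam's FluentBackoff, whereas this sketch uses a hand-written loop, and the names `RetrySketch`, `TransientException`, and `callWithRetry`, as well as the 1.5 growth factor, are assumptions made for the example.

```java
import java.util.concurrent.Callable;

class RetrySketch {
  /** Hypothetical stand-in for a transient quota error such as LimitExceededException. */
  static class TransientException extends RuntimeException {
    TransientException(String message) {
      super(message);
    }
  }

  /**
   * Invokes the call, retrying on TransientException until maxAttempts is reached.
   * The wait starts at initialBackoffMillis and is multiplied by multiplier after each failure.
   * Once the attempt limit is hit, the last exception is rethrown to the caller.
   */
  static <T> T callWithRetry(
      Callable<T> call, int maxAttempts, long initialBackoffMillis, double multiplier)
      throws Exception {
    long backoff = initialBackoffMillis;
    for (int attempt = 1; ; attempt++) {
      try {
        return call.call();
      } catch (TransientException e) {
        if (attempt >= maxAttempts) {
          throw e; // give up: the attempt limit has been reached
        }
        Thread.sleep(backoff);
        backoff = (long) (backoff * multiplier);
      }
    }
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    // Simulated describeStream-style call that is rate limited twice, then succeeds.
    String result =
        callWithRetry(
            () -> {
              if (++calls[0] < 3) {
                throw new TransientException("rate limited");
              }
              return "shards";
            },
            10, // attempt limit, matching the ten attempts described above
            1L, // 1 ms here to keep the demo fast; the PR describes a one second start
            1.5);
    System.out.println(result + " after " + calls[0] + " calls"); // prints "shards after 3 calls"
  }
}
```

Only the exception type treated as transient is retried; any other failure propagates immediately, which keeps the retry scoped to the quota error rather than applying a global policy to every API request.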