Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add timeout to token refresh retry #123

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

DoumanAsh
Copy link

Fixes #122

Copy link
Owner

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking at this!

src/types.rs Outdated
@@ -66,6 +66,8 @@ impl HttpClient {
retries += 1;
if retries >= RETRY_COUNT {
return Err(err);
} else {
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No else necessary here, since the if block returns.

Please import sleep() and Duration (from std, not via Tokio).

How long does a loop iteration take for you without timeout? I wonder if 200ms is a good length or if we could go a bit shorter on the first retry and back off a little?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sleep from std?
I'm not sure if that's good idea in async code.

So in my case we had metadata server rejecting connections due sudden burst in which case multiple retries in short interval would be only even more harmful
I used 200ms just to make sure retries are done within 1 second

Copy link
Author

@DoumanAsh DoumanAsh Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How long does a loop iteration take for you without timeout?

Current logic (no timeout) performs retries within 10 millisecond
I do not think it is good idea for metadata server because google most likely has some burst limits as we started encountering it recently with GKE's scheduled pods

Copy link
Owner

@djc djc Dec 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sleep from std? I'm not sure if that's good idea in async code.

No, Duration from std, but would like to import sleep() so that you can use it without the tokio::time prefix at the callsite.

So in my case we had metadata server rejecting connections due sudden burst in which case multiple retries in short interval would be only even more harmful I used 200ms just to make sure retries are done within 1 second

Okay, how about we started with like 50ms and multiply by 2 on each iteration?

  • 50ms
  • 100ms
  • 200ms
  • 400ms
  • 800ms

That gets the total time to 1.5s.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok

@djc
Copy link
Owner

djc commented Dec 13, 2024

(If you rebase, the MSRV failure in CI should go away.)

@DoumanAsh
Copy link
Author

Rebase PR and applied comments
I'm not sure if it is good idea to use sleep in asyn code, so I kept tokio's sleep

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Token fetch retry mechanism should use some timeout
2 participants