-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix bugs around docker start retries #6326
Conversation
This handles a bug where we may start a container successfully, yet we fail due to retries and startContainer not being idempotent call. Here, we ensure that when starting a container fails with 500 error, the retry succeeds if container was started successfully.
} else if isDockerTransientError(createErr) && attempted < 5 { | ||
attempted++ | ||
time.Sleep(1 * time.Second) | ||
goto CREATE | ||
} | ||
|
||
return nil, recoverableErrTimeouts(createErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious to me why recoverableErrTimeout
are used. StartTask
doesn't special case them, and wraps them before returning. I considered removing it completely and doing more retries locally, but decided to be conservative and double check goal before making further changes to driver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have much context to add here other than that it is was brought over from pre-0.9 drivers:
Line 1701 in fcc4149
return nil, recoverableErrTimeouts(createErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Makes sense for more aggressive refactor follow up PR.
62d7fda
to
b5b445c
Compare
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
Noticed that docker driver attempts to retry container creation and start.
This fixes a couple of bugs, where we either don't retry transient errors, or we retry in such away that makes retry invocation fails.