docker: handle http 429 status codes #703
Conversation
Fixes: #618
Force-pushed from b40bf88 to 2967616.
Thanks for the reviews, @MartinBasti and @twaugh.
As we're already on it: we're not (yet) looking at the Retry-After header, but we can easily add support for it. The header can be either an HTTP date or a numerical value for the delay in seconds. Both can be exploited by a malicious server, so we have to set a maximum. Hence the questions:
- What should the maximum delay in seconds be?
- Should we attempt an exponential back-off for the values from the header as well?
Not sure. 120? 60?
I don't mind. I don't think it's mandatory, but it's possible it will give better behaviour.
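For reference, a minimal sketch of what parsing the header could look like; the function, its parameter names, and the idea of a caller-supplied cap are illustrative and not taken from this PR's diff:

```go
package docker

import (
	"net/http"
	"strconv"
	"time"
)

// parseRetryAfter interprets a Retry-After value, which may be either a
// delay in seconds or an HTTP date, and caps the result so a malicious
// server cannot stall the client indefinitely.
func parseRetryAfter(value string, fallback, max time.Duration) time.Duration {
	// Numerical form: a delay in seconds.
	if seconds, err := strconv.ParseInt(value, 10, 64); err == nil {
		return capDelay(time.Duration(seconds)*time.Second, max)
	}
	// HTTP-date form: compute the delay until that point in time.
	if t, err := http.ParseTime(value); err == nil {
		return capDelay(time.Until(t), max)
	}
	// Unparseable value: fall back to the regular back-off delay.
	return fallback
}

func capDelay(d, max time.Duration) time.Duration {
	if d < 0 {
		return 0
	}
	if d > max {
		return max
	}
	return d
}
```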
Force-pushed from 994ca9c to abb6f83.
LGTM
docker/docker_client.go
Outdated
// iterations.
delay = 2
for i := 0; i < 5; i++ {
    res, err = c.makeRequestToResolvedURLHelper(ctx, method, url, headers, stream, streamLen, auth, extraScope)
stream can be consumed only once; this fundamentally can’t be done, at least not at this level of abstraction.
Good point. Looking at the call sites, we should not perform any retry when stream != nil, which avoids the issue.
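A minimal sketch of that shape, with the stream restriction folded in; the closure-based helper below is an assumption for illustration, not the package's actual method signature:

```go
package docker

import (
	"context"
	"net/http"
	"time"
)

// retryOn429 calls attempt up to maxAttempts times and backs off
// exponentially whenever the registry answers with HTTP 429. Callers whose
// request carries a body stream should pass maxAttempts == 1, because a
// stream can be consumed only once and the request cannot be replayed.
func retryOn429(ctx context.Context, maxAttempts int, initialDelay time.Duration,
	attempt func() (*http.Response, error)) (*http.Response, error) {
	delay := initialDelay
	for i := 0; ; i++ {
		res, err := attempt()
		if err != nil || res.StatusCode != http.StatusTooManyRequests || i >= maxAttempts-1 {
			return res, err
		}
		res.Body.Close() // discard the 429 response before retrying
		select {
		case <-ctx.Done():
			return nil, ctx.Err()
		case <-time.After(delay):
		}
		delay *= 2 // exponential back-off: 2s, 4s, 8s, ...
	}
}
```

With the values discussed here, a caller would pass maxAttempts = 5 and initialDelay = 2*time.Second, or maxAttempts = 1 when stream != nil.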
@@ -627,9 +636,11 @@ func (c *dockerClient) getExtensionsSignatures(ctx context.Context, ref dockerRe
    return nil, err
}
defer res.Body.Close()
if res.StatusCode != http.StatusOK {
    return nil, errors.Wrapf(client.HandleErrorResponse(res), "Error downloading signatures for %s in %s", manifestDigest, ref.ref.Name())
Dropping client.HandleErrorResponse(res) seems rather undesirable.
Could you elaborate why? I found the output of HandleErrorResponse not very user-friendly and hard to parse. Is there some behavior or output you'd like to keep?
In that case, I'd switch all checks to that to have ~consistent error messaging.
- It’s the only way to get actual server-produced error messages; dumping the raw HTTP response body would be much worse. Sure, they are often useless, but at least they can be improved.
- The failures are machine-identifiable and converted to specific Go types. See how many different failures in vendor/github.com/docker/distribution/registry/api/v2/errors.go just map to http.StatusBadRequest, and see isManifestInvalidError.
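For illustration, a simplified sketch of the kind of type-based check being referred to; the package's real isManifestInvalidError is more thorough, the import paths are shown unvendored, and the exact assertions are an assumption:

```go
package docker

import (
	"github.com/docker/distribution/registry/api/errcode"
	v2 "github.com/docker/distribution/registry/api/v2"
)

// isManifestInvalid inspects an error produced by HandleErrorResponse:
// the registry's JSON error payload is converted into errcode.Errors,
// whose elements carry machine-readable error codes rather than just an
// HTTP status.
func isManifestInvalid(err error) bool {
	errs, ok := err.(errcode.Errors)
	if !ok || len(errs) == 0 {
		return false
	}
	ec, ok := errs[0].(errcode.Error)
	return ok && ec.ErrorCode() == v2.ErrorCodeManifestInvalid
}
```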
In that case, I'd switch all checks to that to have ~consistent error messaging.
Sure, as long as the API documents that this format of error reporting is used. That’s not the case for look-aside signatures, at the very least; I haven’t checked the other cases.
In that case, I'd switch all checks to that to have ~consistent error messaging.
That won't work for all responses:
error parsing HTTP 429 response body: invalid character '<' looking for beginning of value: "<html>\r\n<head><title>429 Too Many Requests</title></head>\r\n<body>\r\n<center><h1>429 Too Many Requests</h1></center>\r\n<hr><center>nginx/1.17.3</center>\r\n</body>\r\n</html>\r\n"
Quay.io is known to be non-compliant, what else is new?
That's news to me, at least. I'll try with nginx pointed to another registry...
I’m sorry, this was too flippant and not helpful. If we are centralizing error handling, adding a centralized workaround for Quay (e.g. “body starts with <html>”) would of course be an option. (Sure, technically it was an option before as well, but sprinkling that workaround all over the package would be too ugly.)
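A sketch of what such a centralized workaround might look like; the function name and its exact behavior are hypothetical, not part of this PR:

```go
package docker

import (
	"bytes"
	"fmt"
	"io"
	"net/http"
)

// errorFromRegistryResponse demonstrates a centralized workaround: when a
// non-compliant registry (or a front-end proxy such as nginx) returns an
// HTML error page instead of the documented JSON error body, fall back to
// the status line instead of surfacing a JSON parse error.
func errorFromRegistryResponse(res *http.Response) error {
	body, err := io.ReadAll(res.Body)
	if err != nil {
		return fmt.Errorf("reading error response body: %v", err)
	}
	if bytes.HasPrefix(bytes.TrimSpace(bytes.ToLower(body)), []byte("<html>")) {
		// Not the JSON format the registry API documents.
		return fmt.Errorf("registry returned status %q", res.Status)
	}
	// Otherwise hand the body to the regular JSON error parsing
	// (e.g. docker/distribution's HandleErrorResponse) — omitted here.
	return fmt.Errorf("registry returned status %q: %s", res.Status, string(body))
}
```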
I left this unchanged. Improving and further centralizing error handling sounds like a good idea but is not the intention of this PR. The new function was added to avoid too much boilerplate.
docker/docker_client.go
Outdated
    return res, err
}

func (c *dockerClient) makeRequestToResolvedURLHelper(ctx context.Context, method, url string, headers map[string][]string, stream io.Reader, streamLen int64, auth sendAuth, extraScope *authScope) (*http.Response, error) {
Helper says nothing, especially without any documentation about its purpose. …WithoutRetries, maybe?
Added a comment and renamed it to makeRequestToResolvedURLOnce.
Force-pushed from abb6f83 to 17c818f.
Updated. @mtrmac, mind having another look? I'll tackle the Retry-After header next.
Force-pushed from 26d1a7d to c9d6299.
We're now handling the Retry-After header.
Force-pushed from c9d6299 to 9dd1193.
LGTM. Thanks!
Force-pushed from 9dd1193 to d0b93a2.
Rebased on latest master.
Force-pushed from d0b93a2 to 7b5fe14.
Force-pushed from 90c6f20 to c58f11c.
LGTM
// ErrUnauthorizedForCredentials is returned when the status code returned is 401
ErrUnauthorizedForCredentials = errors.New("unable to retrieve auth token: invalid username/password")
// ErrTooManyRequests is returned when the status code returned is 429
ErrTooManyRequests = errors.New("too many request to registry")
github.com/pkg/errors.New() records a stack trace that leads back here. I'm guessing that the standard library's errors.New() would work better, perhaps using github.com/pkg/errors.WithStack() to wrap them before returning them in httpResponseToError().
Sounds like a nice improvement!
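A minimal sketch of the suggested split; httpResponseToError is the helper named above, but the specific cases and messages shown here are illustrative assumptions, not the PR's exact code:

```go
package docker

import (
	"errors" // standard library: sentinel values without embedded stack traces
	"net/http"

	perrors "github.com/pkg/errors"
)

var (
	// ErrUnauthorizedForCredentials is returned when the status code returned is 401.
	ErrUnauthorizedForCredentials = errors.New("unable to retrieve auth token: invalid username/password")
	// ErrTooManyRequests is returned when the status code returned is 429.
	ErrTooManyRequests = errors.New("too many requests to registry")
)

// httpResponseToError translates an unexpected HTTP status into an error;
// the stack trace is attached here, at the point of failure, rather than
// at package initialization.
func httpResponseToError(res *http.Response) error {
	switch res.StatusCode {
	case http.StatusOK:
		return nil
	case http.StatusTooManyRequests:
		return perrors.WithStack(ErrTooManyRequests)
	case http.StatusUnauthorized:
		return perrors.WithStack(ErrUnauthorizedForCredentials)
	default:
		return perrors.Errorf("invalid status code from registry %d (%s)", res.StatusCode, http.StatusText(res.StatusCode))
	}
}
```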
Consolidate checking the http-status codes to allow for more uniform error handling. Also treat code 429 (too many requests) as a known error instead of an invalid status code. When hitting 429, perform an exponential back-off starting at 2 seconds for at most 5 iterations. If the http.Response sets the `Retry-After` header, then use the provided value or date to compute the delay until the next attempt. Note that the maximum delay is 60 seconds.
Signed-off-by: Valentin Rothberg <[email protected]>
Force-pushed from c58f11c to d51a7ca.
Updated
LGTM
Consolidate checking the http-status codes to allow for more uniform
error handling. Also treat code 429 (too many requests) as a known
error instead of an invalid status code.
When hitting 429, perform an exponential back-off starting at 2 seconds
for at most 5 iterations.
Signed-off-by: Valentin Rothberg [email protected]