Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Translate AWS Rate limiting errors to 502 errors #5270

Merged
merged 4 commits into from
Sep 18, 2018

Conversation

catsby
Copy link
Contributor

@catsby catsby commented Sep 4, 2018

Builds off of and includes #5216

When interacting with AWS, if we hit any throttling errors return those to the users as a 502 "Bad Gateway" error code.

Supports:

  • Secret backend
  • Auth backend

@catsby catsby requested a review from chrishoffman September 4, 2018 20:55

// CheckAWSError will examine an error and convert to a logical error if
// approrpiate. If no appropriate error is found, return nil
func CheckAWSError(err error) error {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestions for better names for this function as well as AppendLogicalError are most welcome

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of comments on this file:

  • Do you need have CheckAWSError as a separate, exported function? The functionality seems like it could just be part of AppendLogicalError.
  • Is multierror the best response pattern for AppendLogicalError, or would errwrap be more appropriate? I view the logical error as adding context to the underlying AWS error (very much an errwrap concept), vs. multierror which is often used to accumulate various errors together
  • Depending on what you think of ^^^, maybe a single WrapAWSError would work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need have CheckAWSError as a separate, exported function?

CheckAWSError is for transforming an upstream error into a different error of our choosing. In this case (and only case I'm aware of), it's "AWS gave us 503 (or other throttle error) so we change that to a BadGateway). In short "do we want to change this error to something else?" If yes, that something gets returned. If no, then return nothing. Another way of doing it would be to return (error, bool), with bool indicating if a change was made. If false, then the error returned is the same as the function received.

The need for a separate method is shown in

"Error creating IAM user: %s", err)), awsutil.CheckAWSError(err)
, where we're already returning a error response, and we're using the returned error here to indicate a category of error. That error is later checked in response_util to control how Vault responds.

AppendLogicalError came up because in in auth/aws/backend we weren't using logical.ErrorResponse, so the original error wasn't going to be returned if I just used CheckAWSError. So, I wanted to return both errors, the original and the logical one, if necessary.

My first approach used errwrap however, either it's not suited or I did it wrong, because I wasn't able to get errwrap.Contains to find the logical.ErrUpstreamRateLimited when wrapping the errors. Using multierror did work, and is compatible with errwrap and honestly made more sense to me than errwrap.

@@ -168,7 +169,8 @@ func (b *backend) validateInstance(ctx context.Context, s logical.Storage, insta
},
})
if err != nil {
return nil, errwrap.Wrapf(fmt.Sprintf("error fetching description for instance ID %q: {{err}}", instanceID), err)
errW := errwrap.Wrapf(fmt.Sprintf("error fetching description for instance ID %q: {{err}}", instanceID), err)
return nil, awsutil.AppendLogicalError(errW)
Copy link
Contributor

@kalafut kalafut Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After err is wrapped, won't awsRequest.IsErrorThrottle() be unable to detect that it is a throttling error?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great catch, thanks. I modified this in b01aaad

if err != nil {
return logical.ErrorResponse(err.Error()), nil

b.clientMutex.Lock()
Copy link
Contributor

@kalafut kalafut Sep 5, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider making clientMutex an RWLock that can be upgraded from RLock to Lock if client is nil. This avoids synchronizing all of the read requests. You can see an example upgrade pattern (which shows up in various place in Vault) here: https://github.com/hashicorp/vault-plugin-secrets-azure/blob/master/client.go#L37

(Comment applies to the other uses of this lock too.)

STSClient, err := clientSTS(ctx, s)
if err != nil {
return logical.ErrorResponse(err.Error()), nil
if b.stsClient == nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guard with the lock?

)

// CheckAWSError will examine an error and convert to a logical error if
// approrpiate. If no appropriate error is found, return nil
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/approrpiate/appropriate

logical/error.go Outdated
@@ -20,6 +20,10 @@ var (
// ErrMultiAuthzPending is returned if the the request needs more
// authorizations
ErrMultiAuthzPending = errors.New("request needs further approval")

// ErrUpstreamRateLimited is returned when Vault recieves a rate limited
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/recieves/receives

@kalafut
Copy link
Contributor

kalafut commented Sep 6, 2018

You'll want to run make fmt with go 1.11 installed.

- bump aws iam and sts packages to v1.14.31 to get mocking interface
- promote the iam and sts clients to the aws backend struct, for mocking in tests
- this also promotes some functions to methods on the Backend struct, so
  that we can use the injected client

Generating creds requires reading config/root for credentials to contact
IAM. Here we make pathConfigRoot a method on aws/backend so we can clear
the clients on successful update of config/root path. Adds a mutex to
safely clear the clients
@catsby catsby force-pushed the aws-upstream-ratelimiting branch from 649010b to a3e59aa Compare September 7, 2018 15:38
@catsby catsby changed the title [WIP] Translate AWS Rate limiting errors to 502 errors Translate AWS Rate limiting errors to 502 errors Sep 10, 2018
if err != nil {
return logical.ErrorResponse(err.Error()), nil

b.clientMutex.RLock()
Copy link
Member

@vishalnayak vishalnayak Sep 10, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May it be a good idea to abstract out the nil checking and client setting in the backend struct? I see that all the functions that use either b.iamClient or b.stsClient should check those for nil and perform the logic of setting it in the backend. This seems easy to get wrong, or simply forget.

How about we push this logic in a function, say b.IAMClient() and b.STSClient() which return the b.iamClient and b.stsClient respectively if populated. If not, it can then do the locking to set the client and return the client. The calling functions will then have to use the returned client as opposed to b.[iam|sts]Client. That way calling functions can be sure that the client is a non-nil value.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable, I made the changes in 3c74986 please take a look

vishalnayak
vishalnayak previously approved these changes Sep 13, 2018
@vishalnayak vishalnayak added this to the 0.11.2 milestone Sep 17, 2018
jefferai
jefferai previously approved these changes Sep 18, 2018
Copy link
Member

@jefferai jefferai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just the one comment, LGTM!

@catsby catsby dismissed stale reviews from jefferai and vishalnayak via 3d468da September 18, 2018 16:46
@catsby catsby merged commit f39bafb into master Sep 18, 2018
@catsby catsby deleted the aws-upstream-ratelimiting branch September 18, 2018 20:26
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants