Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feature/ec2/imds: Fix Client's response handling and operation timeout race #1448

Merged
merged 3 commits into from
Oct 6, 2021

Conversation

jasdel
Copy link
Contributor

@jasdel jasdel commented Oct 5, 2021

Fixes #1253 race between reading a IMDS response, and the operationTimeout middleware cleaning up its timeout context. Changes the IMDS client to always buffer the response body received, before the result is deserialized. This ensures that the consumer of the operation's response body will not race with context cleanup within the middleware stack.

Also updates the IMDS client to respect passed in Context Deadline/Timeout. Updates the IMDS Client operations to not override the passed in Context's Deadline or Timeout options. If an Client operation is called with a Context with a Deadline or Timeout, the client will no longer override it with the client's default timeout.

@jasdel jasdel requested review from skmcgrail and skotambkar October 5, 2021 21:24
…imeout race

Fixes aws#1253 race between reading a IMDS response, and the
operationTimeout middleware cleaning up its timeout context. Changes the
IMDS client to always buffer the response body received, before the
result is deserialized. This ensures that the consumer of the
operation's response body will not race with context cleanup within the
middleware stack.
@jasdel jasdel force-pushed the fixup/IMDSBodyRace branch from 111715c to 3e25478 Compare October 5, 2021 23:09
@jasdel jasdel changed the title Fix IMDS client's response handling and operation timeout race feature/ec2/imds: Fix Client's response handling and operation timeout race Oct 5, 2021
Updates the IMDS Client operations to not override the passed in
Context's Deadline or Timeout options. If an Client operation is called
with a Context with a Deadline or Timeout, the client will no longer
override it with the client's default timeout.
@jasdel jasdel force-pushed the fixup/IMDSBodyRace branch from 3e25478 to dccd10a Compare October 5, 2021 23:13
Comment on lines 227 to 249
var cancelFn func()

ctx, cancelFn = context.WithTimeout(ctx, m.Timeout)
defer cancelFn()
if _, ok := ctx.Deadline(); !ok {
var cancelFn func()
ctx, cancelFn = context.WithTimeout(ctx, m.DefaultTimeout)
defer cancelFn()
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3 thank you for allowing timeouts to be externally configurable!

defer cancelFn()
if _, ok := ctx.Deadline(); !ok {
var cancelFn func()
ctx, cancelFn = context.WithTimeout(ctx, m.DefaultTimeout)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe this could use the constant defaultOperationTimeout directly and remove the timeout var entirely from operationTimeout struct? Seems m.DefaultTimeout will always be defaultOperationTimeout in the end.

Not sure though of the impact/necessity on tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having the DefaultTimeout member on operationTimeout middleware is handy if we want to change the timeout depending on the operation. I'm not aware of a need for this at the moment. Since operationTimeout is not exported, having the member is preferred so we can reuse the operationTimeout middleware later.

There is a test that uses a custom timeout, but I think this test needs to be tweaked a little bit to more closely align with the intention behind it. e.g. if there was a context with timeout use it, if there wasn't use a default. This test is relying on the behavior of a canceled context, and the configured nanosecond timeout input.

func TestOperationTimeoutMiddleware(t *testing.T) {
m := &operationTimeout{
DefaultTimeout: time.Nanosecond,
}
_, _, err := m.HandleInitialize(context.Background(), middleware.InitializeInput{},
middleware.InitializeHandlerFunc(func(
ctx context.Context, input middleware.InitializeInput,
) (
out middleware.InitializeOutput, metadata middleware.Metadata, err error,
) {
if err := sdk.SleepWithContext(ctx, time.Second); err != nil {
return out, metadata, err
}
return out, metadata, nil
}))
if err == nil {
t.Fatalf("expect error got none")
}
if e, a := "deadline exceeded", err.Error(); !strings.Contains(a, e) {
t.Errorf("expect %q error in %q", e, a)
}
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated operationTimeout to not set a default timeout, if DefaultTimeout is zero. I think this was the missing behavior in keeping DefaultTimeout member in operationTimeout.

Updates operationTimeout so that if DefaultTimeout is unset (aka zero)
operationTimeout will not set a default timeout on the context.
@jasdel jasdel force-pushed the fixup/IMDSBodyRace branch from 4eba5da to cdc4292 Compare October 6, 2021 17:31
@jasdel jasdel merged commit 0d3bd7a into aws:main Oct 6, 2021
@jasdel jasdel deleted the fixup/IMDSBodyRace branch October 6, 2021 18:18
jrichard8 pushed a commit to jrichard8/aws-sdk-go-v2 that referenced this pull request Feb 14, 2022
…t race (aws#1448)

Fixes aws#1253 race between reading a IMDS response, and the
operationTimeout middleware cleaning up its timeout context. Changes the
IMDS client to always buffer the response body received, before the
result is deserialized. This ensures that the consumer of the
operation's response body will not race with context cleanup within the
middleware stack.

Updates the IMDS Client operations to not override the passed in
Context's Deadline or Timeout options. If an Client operation is called
with a Context with a Deadline or Timeout, the client will no longer
override it with the client's default timeout.

Updates operationTimeout so that if DefaultTimeout is unset (aka zero)
operationTimeout will not set a default timeout on the context.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unexpected error from sdk: failed to decode <role> EC2 IMDS role credentials
3 participants