waitUntilFunctionUpdated (all waiters?) times out instead of throwing permissions error when permissions are missing #6699

Open
rix0rrr opened this issue Nov 27, 2024 · 1 comment
Labels
bug This issue is a bug. needs-triage This issue or PR still needs to be triaged.

rix0rrr commented Nov 27, 2024

Describe the bug

We were using waitUntilFunctionUpdated to wait for a function to stabilize after updating it. We have since moved to waitUntilFunctionUpdatedV2; I don't know whether this matters.

Since then, we have gotten reports from the Amplify team that tests were failing with the error:

TimeoutError: Resource is not in the expected state due to waiter status: TIMEOUT. Waiter has timed out.

We spent days looking over timings, comparing code between the v2 and v3 waiters, and theorizing about what might be causing the waiter to hit the timeout. We ultimately had to give up because we couldn't think of anything.

Then, 2 weeks later, at the least opportune time (blocked days for re:Invent), we figured out what the issue was: the policy they were using had permissions for lambda:GetFunction, but it needed permissions for lambda:GetFunctionConfiguration.

We could have known this immediately, but the waiter swallowed the permissions error and instead reported an "oh well the service doesn't seem to stabilize ¯\_(ツ)_/¯" error.

This is extremely unexpected behavior, and this poor error reporting has cost us many days of sweat and stress.

I understand you're probably doing this to proceed in the face of transient errors, but I would ask for one of the following behaviors, in order of preference:

  • If you catch a non-retryable error, throw it immediately instead of continuing to wait.
  • Upon throwing the TimedOut error, if you notice that you've been consistently getting the same error again and again over the period of the wait (not a single success), just throw that error instead.
  • Upon throwing the TimedOut error, include the (unique?) error messages of the errors you've seen in the error description.
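
To make the first and third requests concrete, here is a minimal sketch of a polling loop that rethrows non-retryable errors immediately and lists the distinct error messages it saw if it does time out. This is not the SDK's waiter implementation: pollUntil, PollOptions, PollTimeoutError and the isRetryable callback are all hypothetical names invented for illustration, and deciding which errors count as retryable is left to the caller.

```typescript
// Hypothetical sketch; none of these names come from the AWS SDK.
interface PollOptions {
  maxWaitMs: number; // overall deadline for the whole wait
  delayMs: number; // pause between attempts
  isRetryable: (err: unknown) => boolean; // caller decides what is transient
}

class PollTimeoutError extends Error {
  readonly seenErrors: string[];

  constructor(seenErrors: string[]) {
    super(
      seenErrors.length > 0
        ? "Timed out; errors seen while polling: " + seenErrors.join("; ")
        : "Timed out; resource never reached the expected state",
    );
    this.name = "PollTimeoutError";
    this.seenErrors = seenErrors;
  }
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function pollUntil<T>(
  probe: () => Promise<T>,
  isDone: (value: T) => boolean,
  opts: PollOptions,
): Promise<T> {
  const deadline = Date.now() + opts.maxWaitMs;
  const seen = new Set<string>();

  do {
    try {
      const value = await probe();
      if (isDone(value)) return value;
    } catch (err) {
      // First request: a non-retryable error (for example a permissions
      // error) is rethrown at once instead of being swallowed.
      if (!opts.isRetryable(err)) throw err;
      // Third request: remember distinct messages for the timeout report.
      seen.add(err instanceof Error ? err.message : String(err));
    }
    await sleep(opts.delayMs);
  } while (Date.now() < deadline);

  throw new PollTimeoutError(Array.from(seen));
}
```

With a loop shaped like this, a caller that marks access-denied errors as non-retryable would see the original error after the first attempt, and a wait that only ever hit retryable errors would at least name them in the timeout message.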

Regression Issue

  • Select this option if this issue appears to be a regression.

SDK version number

@aws-sdk/[email protected]

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v22.11.0

Reproduction Steps

Create and assume a role that has the statement:

{
  "Effect": "Deny",
  "Action": "lambda:GetFunctionConfiguration",
  "Resource": "*"
}

Run the following code:

import { Lambda, waitUntilFunctionUpdated } from '@aws-sdk/client-lambda';

async function main() {
  const client = new Lambda({ region: 'eu-west-1' });
  await waitUntilFunctionUpdated({ client, maxWaitTime: 30 }, {
    FunctionName: 'some-function-in-your-account',
  });
  console.log('OK');
}

main().catch(e => {
  console.error(e);
  process.exitCode = 1;
});

Observed Behavior

After 10 seconds, see:

Error [TimeoutError]: {"state":"TIMEOUT","reason":"Waiter has timed out"}

Expected Behavior

Immediately, see:

An error occurred (AccessDeniedException) when calling the GetFunctionConfiguration operation: ...etc...
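
Until the waiter reports such errors itself, one possible workaround is to call GetFunctionConfiguration once before starting the wait, so that a permissions problem is raised directly. The sketch below assumes the waiter polls that operation, as the missing permission suggests. waitWithPreflight and FunctionConfigReader are hypothetical names; the interface is a structural stand-in for the one client method used, so the sketch does not import the SDK.

```typescript
// Structural stand-in for the single Lambda client method this sketch needs.
// Assumption: the real client's getFunctionConfiguration is compatible with it.
interface FunctionConfigReader {
  getFunctionConfiguration(input: { FunctionName: string }): Promise<unknown>;
}

// Hypothetical helper: probe once, then delegate to the actual waiter.
async function waitWithPreflight<R>(
  client: FunctionConfigReader,
  functionName: string,
  wait: () => Promise<R>,
): Promise<R> {
  // If the caller lacks permission, this call rejects right away (for example
  // with an AccessDeniedException) instead of being hidden behind a timeout.
  await client.getFunctionConfiguration({ FunctionName: functionName });
  return wait();
}
```

In the reproduction above, the wait argument would be a function that calls waitUntilFunctionUpdated with the same client and function name. Note the trade-off: the preflight call also fails on a transient error that the waiter would have retried, and it only catches problems that are already present when the wait starts.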

Possible Solution

No response

Additional Information/Context

No response

@rix0rrr rix0rrr added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Nov 27, 2024