STS Client - Not authorized to perform sts:AssumeRoleWithWebIdentity #6225

Closed
3 tasks done
meenar-se opened this issue Jun 26, 2024 · 22 comments
Labels: bug, closing-soon, p1, queued


meenar-se commented Jun 26, 2024

Describe the bug

We have deployed an application to EKS that uses the SSM client to read its configuration from SSM Parameter Store.
With @aws-sdk/client-ssm v3.596.0 we get the error Not authorized to perform sts:AssumeRoleWithWebIdentity.

Detailed error stack:

{
        "name": "AccessDenied",
        "$fault": "client",
        "$metadata": {
            "httpStatusCode": 403,
            "requestId": "1d07a44e-af6b-4256-8f8d-f02e828023e7",
            "attempts": 1,
            "totalRetryDelay": 0
        },
        "Type": "Sender",
        "Code": "AccessDenied",
        "message": "Not authorized to perform sts:AssumeRoleWithWebIdentity"
    }

But the same code works fine with v3.577.0.

SDK version number

Which JavaScript Runtime is this issue in?

Node.js

Details of the browser/Node.js/ReactNative version

v20.15.0

Reproduction Steps

Deploy this test project https://github.com/meenar-se/aws-sdk-issue
Call the API
curl --location --request GET 'localhost:8080/ping'

Expected behavior:
Get the values from SSM parameter

Actual Behavior:
Getting Access Denied Exception

{"err":{"type":"STSServiceException","message":"Not authorized to perform sts:AssumeRoleWithWebIdentity","stack":"AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n    at throwDefaultError (/fastify-sts-example/node_modules/@smithy/smithy-client/dist-cjs/index.js:839:20)\n    at /fastify-sts-example/node_modules/@smithy/smithy-client/dist-cjs/index.js:848:5\n    at de_CommandError (/fastify-sts-example/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:476:14)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)","name":"AccessDenied","$fault":"client","$metadata":{"httpStatusCode":403,"requestId":"1c9069d9-04ad-4e93-85b2-330b10f232c8","attempts":1,"totalRetryDelay":0},"Type":"Sender","Code":"AccessDenied"},"msg":"Error getting SSM Parameter:"}

Observed Behavior

Initially we were using @aws-sdk/client-ssm v3.577.0, which has the peer dependency @aws-sdk/client-sts v3.577.0,
and there were no issues. As soon as we upgraded our project to the latest version, we started facing this issue.

If we go back to v3.577.0, there is no issue.

It seems to be an issue with the @aws-sdk/client-sts library, which is added as a peer dependency to all of the client libraries.

Expected Behavior

We expect the SDK to obtain the credentials and connect to AWS services.

Possible Solution

No response

Additional Information/Context

No response

meenar-se added the bug and needs-triage labels Jun 26, 2024
@RanVaknin
Contributor

Hi @meenar-se ,

Thanks for reaching out. This is interesting. The error you are seeing is a permissions error and is usually unrelated to the SDK version.

From the description of the problem it sounds like it started happening after upgrading to a more recent version. Does rolling back to version 3.577.0 solve this issue for you? If so, can you please add this middleware hook to your snippet and share the logged request and response both for 3.577.0 and the latest version?

fastify.get('/ping', async function handler (request, reply) {
    fastify.log.info("ping method started")
    const client = new SSMClient({})

    client.middlewareStack.add(next => async (args) => {
        console.log(args.request)
        const response = await next(args);
        console.log(response);
        return response;
    }, {step: 'finalizeRequest'})

// rest of the code 

That would allow us to examine exactly what changed in the request to get a better idea of where the discrepancy in behavior might be coming from.
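
A minimal sketch of the same hook as a standalone helper, assuming the goal is simply to see what goes out and what comes back. `createLoggingMiddleware` is a hypothetical name, not an SDK export. Unlike the snippet above, it also logs when the next handler rejects, so a failing call still leaves a trace.

```javascript
// Hypothetical helper (not part of the SDK): wraps the next handler in the
// middleware stack and logs the outgoing request, the response, and any error.
function createLoggingMiddleware(log = console.log) {
  return (next) => async (args) => {
    log("request:", args.request);
    try {
      const response = await next(args);
      log("response:", response);
      return response;
    } catch (error) {
      // Log and rethrow so the caller still sees the original failure.
      log("error:", error.name, error.message);
      throw error;
    }
  };
}

// Usage with a real client, as in the snippet above:
//   client.middlewareStack.add(createLoggingMiddleware(), { step: "finalizeRequest" });
```

Keeping the middleware in a function of its own makes it easy to attach the identical hook to both deployments being compared.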

Thanks again,
Ran~

RanVaknin self-assigned this Jun 26, 2024
RanVaknin added the response-requested and p2 labels and removed the needs-triage label Jun 26, 2024
@meenar-se
Author

meenar-se commented Jun 28, 2024

Yes @RanVaknin, rolling back to 3.577.0 resolves the issue. It's working perfectly fine.

With the latest version, 3.596.0, I added the middleware hook and tried again. It does not even print the request and response logs.

I tried the following options:

  1. step as 'finalizeRequest' - logs are not printed
  2. step as 'build' - logs are not printed

But with version 3.577.0, the middleware logs of the request and response are printed.

github-actions bot removed the response-requested label Jun 29, 2024
@RanVaknin
Contributor

Hi @meenar-se ,

I'm not sure why you are seeing this. Perhaps this is an issue with Fastify?

I'm able to deploy an EKS pod using the latest version of the SDK and call SSM just fine:

$ kubectl get pods                                                          
NAME                                       READY   STATUS             RESTARTS        AGE
repro                                      1/1     Running            0               45m

$ kubectl exec --stdin --tty repro -- /bin/bash                                                 

bash-5.2# cd repro/

bash-5.2# npm ls
[email protected] /repro
`-- @aws-sdk/[email protected]

bash-5.2# cat sample.mjs 
import { SSMClient, GetParameterCommand } from "@aws-sdk/client-ssm";
const client = new SSMClient({})

try {
  const command = new GetParameterCommand({Name: 'some-name', WithDecryption:true});
  const response = await client.send(command);
  console.log(response)
} catch (error) {
    console.log(error)
}

bash-5.2# node sample.mjs 
{
  '$metadata': {
    httpStatusCode: 200,
    requestId: 'REDACTED',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  Parameter: {
    ARN: 'arn:aws:ssm:us-east-1:REDACTED:parameter/some-name',
    DataType: 'text',
    LastModifiedDate: 2024-07-03T20:13:37.758Z,
    Name: 'some-name',
    Type: 'String',
    Value: 'some-value',
    Version: 1
  }
}

The error you are seeing is a permissions issue, where the IAM role that gets assumed using the IRSA token does not have the right permissions to call getParameter. I can't think of any SDK update that would change the permissions of an assumed role, as this is an upstream resource that lives in your AWS account.

> But with the version 3.577.0 middleware logs of request and response are printed

I'm not sure why this would not get logged, as this middleware just uses console.log, which prints to stdout. Also, if you are getting a response, you should at least see the request part of the middleware, as it fires before the response returns.

How are you deploying your code changes? Are you using an image from ECR? I would make sure you are deploying the correct image with the correct tag to your pod. You should also run cat src/app.ts to verify that whatever is in your pod is the most up-to-date snippet, as there is no reason you would see a service response but not the middleware logs.

Thanks,
Ran~

RanVaknin added the response-requested label Jul 3, 2024

This issue has not received a response in 1 week. If you still think there is a problem, please leave a comment to avoid the issue from automatically closing.

github-actions bot added the closing-soon label Jul 14, 2024
@meenar-se
Author

@RanVaknin - Even after removing Fastify and trying the simple example shown in your comment, we get the same error.

github-actions bot removed the closing-soon and response-requested labels Jul 15, 2024
@RanVaknin
Contributor

Hi @meenar-se ,

Can you please configure the logger to capture the raw request and response? You can do it like so:

const client = new SSMClient({logger: console})

Please add this to your client creation code, then examine and share the output from these logs for both deployments (the old version, and the newer version that doesn't work).

This logger should capture the implicit AssumeRoleWithWebIdentity call that is failing. By examining the outgoing request we can see what part of the request changed.
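
When logs from two deployments end up side by side, it can help to tag every line with the SDK version that produced it. The following is a sketch under that assumption: `createTaggedLogger` is a hypothetical helper, and the tag value is whatever label you choose. It only relies on the `logger` option accepting an object with `debug`, `info`, `warn`, and `error` methods, which is why `console` works there.

```javascript
// Hypothetical helper: builds an object exposing the debug/info/warn/error
// methods that a `logger` option calls, prefixing each line with a tag.
function createTaggedLogger(tag, sink = console) {
  const make = (level) => (...content) => sink[level](`[${tag}]`, ...content);
  return {
    debug: make("debug"),
    info: make("info"),
    warn: make("warn"),
    error: make("error"),
  };
}

// Usage with a real client (the tag is an arbitrary label, chosen here for illustration):
//   const client = new SSMClient({ logger: createTaggedLogger("sdk-3.577.0") });
```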

Thanks,
Ran~

RanVaknin added the response-requested label Jul 15, 2024
@meenar-se
Author

meenar-se commented Jul 21, 2024

@RanVaknin - Here are the logs:

calling with args {"middlewareStack":{},"input":{"Name":"ssm-name","WithDecryption":true}}

@aws-sdk/credential-provider-node - defaultProvider::fromSSO
@smithy/property-provider -> Skipping SSO provider in default chain (inputs do not include SSO fields).
@aws-sdk/credential-provider-node - defaultProvider::fromIni
@aws-sdk/credential-provider-ini - fromIni
    default isAssumeRoleWithSourceProfile source_profile=web_token
@aws-sdk/credential-provider-ini - resolveAssumeRoleCredentials (STS)
@aws-sdk/credential-provider-ini - finding credential resolver using source_profile=[web_token]
@aws-sdk/credential-provider-web-identity - fromTokenFile
@aws-sdk/credential-provider-web-identity - fromWebToken
@aws-sdk/client-sts::resolveRegion accepting first of: undefined (provider) us-east-1 (parent client) us-east-1 (STS default)
endpoints Initial EndpointParams: {
  "UseGlobalEndpoint": false,
  "UseFIPS": false,
  "Region": "us-east-1",
  "UseDualStack": false
}
endpoints evaluateCondition: booleanEquals($UseGlobalEndpoint, true) = false
endpoints evaluateCondition: isSet($Endpoint) = false
endpoints evaluateCondition: isSet($Region) = true
endpoints evaluateCondition: aws.partition($Region) = {
  "dnsSuffix": "amazonaws.com",
  "dualStackDnsSuffix": "api.aws",
  "implicitGlobalRegion": "us-east-1",
  "name": "aws",
  "supportsDualStack": true,
  "supportsFIPS": true,
  "description": "US East (N. Virginia)"
}

Note: We are using IRSA to grant kubernetes workloads access to AWS.

@RabahZeineddine

RabahZeineddine commented Aug 5, 2024

I'm having the same issue. Even downgrading the SDK version didn't work.

AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity\n    at throwDefaultError (/node_modules/.pnpm/@[email protected]/node_modules/@smithy/smithy-client/dist-cjs/index.js:839:20)\n    at /node_modules/.pnpm/@[email protected]/node_modules/@smithy/smithy-client/dist-cjs/index.js:848:5\n    at de_CommandError (/node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:478:14)\n    at process.processTicksAndRejections (node:internal/process/task_queues:95:5)\n    at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20\n    at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/core/dist-cjs/index.js:165:18\n    at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38\n    at async /node_modules/."

@RanVaknin
Contributor

RanVaknin commented Aug 5, 2024

Hi @meenar-se ,

Thanks.

I don't see the underlying call made by the provider. We might be able to see why it is failing by overriding the provider. I'm not sure whether the one being triggered is fromTokenFile or fromWebToken. Can you please try both and see whether you get logs?

import { fromTokenFile } from "@aws-sdk/credential-providers";

const client = new SSMClient({
    region: "us-east-1",
    credentials: fromTokenFile({
      clientConfig: {
        logger: console
      }
    })
});

or

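import { readFileSync } from "node:fs";
import { SSMClient } from "@aws-sdk/client-ssm";
import { fromWebToken } from "@aws-sdk/credential-providers";

// fromWebToken needs the role ARN and token passed in explicitly. Reading them
// from the IRSA environment variables is an assumption about your pod setup.
const client = new SSMClient({
    region: "us-east-1",
    credentials: fromWebToken({
      roleArn: process.env.AWS_ROLE_ARN,
      webIdentityToken: readFileSync(process.env.AWS_WEB_IDENTITY_TOKEN_FILE, "utf8"),
      clientConfig: {
        logger: console
      }
    })
});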

Hi @RabahZeineddine this is unlikely tied to an SDK version. If the role you are assuming does not have the correct permissions, you will see an AccessDenied. You can enable logs the same way I show above, and examine the API call that the SDK is making under the hood to call AssumeRoleWithWebIdentity. The way you set up IRSA might be incorrect and inadvertently lead to this issue.
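
One way to check the IRSA side without touching the SDK is to look at what the pod would present to STS. The sketch below assumes the pod is configured through the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables. `decodeJwtPayload` and `describeWebIdentity` are hypothetical helper names. The token is decoded for inspection only, with no signature verification.

```javascript
const fs = require("node:fs");

// Decode the payload (second segment) of a JWT without verifying the signature.
// Inspection only: never use this to decide whether a token can be trusted.
function decodeJwtPayload(token) {
  const segments = token.trim().split(".");
  if (segments.length !== 3) {
    throw new Error("not a JWT: expected three dot-separated segments");
  }
  return JSON.parse(Buffer.from(segments[1], "base64url").toString("utf8"));
}

// Summarize the role ARN and the token claims found in the environment.
// `readFile` is injectable so the function can be exercised without a real token file.
function describeWebIdentity(env = process.env, readFile = (p) => fs.readFileSync(p, "utf8")) {
  const roleArn = env.AWS_ROLE_ARN;
  const tokenFile = env.AWS_WEB_IDENTITY_TOKEN_FILE;
  if (!roleArn || !tokenFile) {
    return { roleArn, tokenFile, claims: null };
  }
  const { iss, sub, aud, exp } = decodeJwtPayload(readFile(tokenFile));
  return { roleArn, tokenFile, claims: { iss, sub, aud, exp } };
}

// Usage inside the pod:
//   console.log(describeWebIdentity());
```

The `sub` claim of a Kubernetes service account token has the form `system:serviceaccount:<namespace>:<service-account>`, so it can be compared directly with whatever the role's trust policy expects.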

Thanks,
Ran~

RanVaknin added response-requested and removed response-requested labels Aug 5, 2024
@RabahZeineddine

@RanVaknin I'm not using the SSMClient; we are using the S3 and Athena clients.
It started after a dependency upgrade, with no changes to the IRSA setup, so it is weird, right?

The IRSA configuration is managed with Terraform, so no one changed it.

@RanVaknin
Contributor

RanVaknin commented Aug 5, 2024

Hi @RabahZeineddine ,

The SSMClient in my snippet was just to show that you can override the credential provider on the client you are creating to enable logging for it.

In your case it will be

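import { readFileSync } from "node:fs";
import { S3Client } from "@aws-sdk/client-s3";
import { fromWebToken } from "@aws-sdk/credential-providers";

// fromWebToken needs the role ARN and token passed in explicitly. Reading them
// from the IRSA environment variables is an assumption about your pod setup.
const client = new S3Client({
    region: "us-east-1",
    credentials: fromWebToken({
      roleArn: process.env.AWS_ROLE_ARN,
      webIdentityToken: readFileSync(process.env.AWS_WEB_IDENTITY_TOKEN_FILE, "utf8"),
      clientConfig: {
        logger: console
      }
    })
});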

> it started after dependencies upgrade... no changes to the IRSA ... so it is weird right?

Which dependencies did you upgrade? The SDK version? If the SDK is to blame here, and it indeed started breaking after an upgrade, rolling back to an older version should fix the problem.

To move forward, we need either to identify the offending version that caused the change in behavior, or to see the request made to AssumeRoleWithWebIdentity that is failing. You should be able to see that request if you override the credential provider as shown in my previous comment and configure the logger on it.

Thanks,
Ran~

RanVaknin added response-requested and removed response-requested labels Aug 5, 2024
@RabahZeineddine

RabahZeineddine commented Aug 6, 2024

@RanVaknin check the logs

@aws-sdk/credential-provider-node defaultProvider::fromEnv
@aws-sdk/credential-provider-env fromEnv
@aws-sdk/credential-provider-node defaultProvider::fromSSO
@aws-sdk/credential-provider-node defaultProvider::fromIni
@aws-sdk/credential-provider-ini fromIni
@aws-sdk/credential-provider-node defaultProvider::fromProcess
@aws-sdk/credential-provider-process fromProcess
@aws-sdk/credential-provider-node defaultProvider::fromTokenFile
@aws-sdk/credential-provider-web-identity fromTokenFile
@aws-sdk/credential-provider-web-identity fromWebToken
@aws-sdk/client-sts::resolveRegion accepting first of: undefined (provider) sa-east-1 (parent client) us-east-1 (STS default)
endpoints Initial EndpointParams: {
  "UseGlobalEndpoint": false,
  "UseFIPS": false,
  "Region": "sa-east-1",
  "UseDualStack": false
}
endpoints evaluateCondition: booleanEquals($UseGlobalEndpoint, true) = false
endpoints evaluateCondition: isSet($Endpoint) = false
endpoints evaluateCondition: isSet($Region) = true
endpoints evaluateCondition: aws.partition($Region) = {
  "dnsSuffix": "amazonaws.com",
  "dualStackDnsSuffix": "api.aws",
  "implicitGlobalRegion": "us-east-1",
  "name": "aws",
  "supportsDualStack": true,
  "supportsFIPS": true,
  "description": "South America (Sao Paulo)"
}
endpoints assign: PartitionResult := {
  "dnsSuffix": "amazonaws.com",
  "dualStackDnsSuffix": "api.aws",
  "implicitGlobalRegion": "us-east-1",
  "name": "aws",
  "supportsDualStack": true,
  "supportsFIPS": true,
  "description": "South America (Sao Paulo)"
}
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseDualStack, true) = false
endpoints evaluateCondition: stringEquals($Region, aws-global) = false
endpoints Resolving endpoint from template: {
  "url": "https://sts.{Region}.{PartitionResult#dnsSuffix}",
  "properties": {},
  "headers": {}
}
endpoints Resolved endpoint: {
  "headers": {},
  "properties": {},
  "url": "https://sts.sa-east-1.amazonaws.com/"
}

{
  clientName: 'STSClient',
  commandName: 'AssumeRoleWithWebIdentityCommand',
  input: {
    RoleArn: 'arn:aws:iam::<account-id>:role/<role-name>',
    RoleSessionName: 'aws-sdk-js-session-<session-id>',
    WebIdentityToken: '***SensitiveInformation***',
    ProviderId: undefined,
    PolicyArns: undefined,
    Policy: undefined,
    DurationSeconds: undefined
  },
  error: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
      at throwDefaultError (/node_modules/.pnpm/@[email protected]/node_modules/@smithy/smithy-client/dist-cjs/index.js:838:20)
      at /node_modules/.pnpm/@[email protected]/node_modules/@smithy/smithy-client/dist-cjs/index.js:847:5
      at de_CommandError (/node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:478:14)
      at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/core/dist-cjs/index.js:165:18
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38
      at async /node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22
      at async /node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:1383:43
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/property-provider/dist-cjs/index.js:97:27
      at async coalesceProvider (/node_modules/.pnpm/@[email protected]/node_modules/@smithy/property-provider/dist-cjs/index.js:124:18)
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/property-provider/dist-cjs/index.js:142:18
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/core/dist-cjs/index.js:82:17
      at async /node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22
    '$fault': 'client',
    '$metadata': {
      httpStatusCode: 403,
      requestId: '<requestId>',
      extendedRequestId: undefined,
      cfId: undefined,
      attempts: 1,
      totalRetryDelay: 0
    },
    Type: 'Sender',
    Code: 'AccessDenied'
  },
  metadata: {
    httpStatusCode: 403,
    requestId: '<requestId>',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  }
}
{
  clientName: 'AthenaClient',
  commandName: 'StartQueryExecutionCommand',
  input: {
    QueryString: '<query>',
    ResultConfiguration: {
      OutputLocation: 's3://<bucket>',
      EncryptionConfiguration: [Object],
      AclConfiguration: [Object]
    }
  },
  error: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
      at throwDefaultError (/node_modules/.pnpm/@[email protected]/node_modules/@smithy/smithy-client/dist-cjs/index.js:838:20)
      at /node_modules/.pnpm/@[email protected]/node_modules/@smithy/smithy-client/dist-cjs/index.js:847:5
      at de_CommandError (/node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:478:14)
      at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/middleware-serde/dist-cjs/index.js:35:20
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/core/dist-cjs/index.js:165:18
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/middleware-retry/dist-cjs/index.js:320:38
      at async /node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22
      at async /node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:1383:43
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/property-provider/dist-cjs/index.js:97:27
      at async coalesceProvider (/node_modules/.pnpm/@[email protected]/node_modules/@smithy/property-provider/dist-cjs/index.js:124:18)
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/property-provider/dist-cjs/index.js:142:18
      at async /node_modules/.pnpm/@[email protected]/node_modules/@smithy/core/dist-cjs/index.js:82:17
      at async /node_modules/.pnpm/@[email protected]/node_modules/@aws-sdk/middleware-logger/dist-cjs/index.js:34:22
    '$fault': 'client',
    '$metadata': {
      httpStatusCode: 403,
      requestId: '<requestId>',
      extendedRequestId: undefined,
      cfId: undefined,
      attempts: 1,
      totalRetryDelay: 0
    },
    Type: 'Sender',
    Code: 'AccessDenied'
  },
  metadata: {
    httpStatusCode: 403,
    requestId: '<requestId>',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  }
}

@RabahZeineddine

@RanVaknin after checking and debugging, I found that an internal update had changed the namespace of the IRSA setup, which broke this application. I just fixed that, updated to the latest aws-sdk version, and it is working!

Thank you for your help
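
The namespace matters because a service account token's `sub` claim has the form `system:serviceaccount:<namespace>:<service-account>`, and role trust policies commonly pin that value. Here is a rough sketch of checking a trust policy document against a subject. `subjectAllowed` is a hypothetical helper and the policy contents used with it are placeholders. It only looks at `StringEquals` and `StringLike` conditions on keys ending in `:sub`, and ignores principals, audiences, and every other condition operator.

```javascript
// Convert an IAM-style wildcard pattern ("*" = any run, "?" = one character) to a RegExp.
function wildcardToRegExp(pattern) {
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\*/g, ".*").replace(/\?/g, ".") + "$");
}

// Returns true when some Allow statement for sts:AssumeRoleWithWebIdentity
// explicitly matches the given subject through a ":sub" condition key.
function subjectAllowed(trustPolicy, subject) {
  const asArray = (v) => (v === undefined ? [] : Array.isArray(v) ? v : [v]);
  return asArray(trustPolicy.Statement).some((statement) => {
    if (statement.Effect !== "Allow") return false;
    if (!asArray(statement.Action).includes("sts:AssumeRoleWithWebIdentity")) return false;
    const condition = statement.Condition || {};
    const subValues = (operator) =>
      Object.entries(condition[operator] || {})
        .filter(([key]) => key.endsWith(":sub"))
        .flatMap(([, value]) => asArray(value));
    return (
      subValues("StringEquals").includes(subject) ||
      subValues("StringLike").some((pattern) => wildcardToRegExp(pattern).test(subject))
    );
  });
}
```

A policy pinned to `system:serviceaccount:team-a:app` would stop matching as soon as the workload moves to another namespace, which is consistent with the failure described above.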

@RanVaknin
Contributor

Hey @RabahZeineddine ,
Glad this worked for you.

@meenar-se let me know if you are able to do the same sort of debugging.

Thanks,
Ran~

RanVaknin added response-requested and removed response-requested labels Aug 6, 2024
@meenar-se
Author

meenar-se commented Aug 9, 2024

@RanVaknin - fromWebToken is the one that is being triggered. fromTokenFile throws an error. Below are the detailed logs.

fromTokenFile:: {"message":"{"name":"CredentialsProviderError","tryNextLink":true}"}

fromWebToken: I tried the fromWebToken method, and below are the detailed logs:

endpoints Initial EndpointParams: {
"UseGlobalEndpoint": false,
"UseFIPS": false,
"Region": "us-east-1",
"UseDualStack": false
}
endpoints evaluateCondition: booleanEquals($UseGlobalEndpoint, true) = false
endpoints evaluateCondition: isSet($Endpoint) = false
endpoints evaluateCondition: isSet($Region) = true
endpoints evaluateCondition: aws.partition($Region) = {
"dnsSuffix": "amazonaws.com",
"dualStackDnsSuffix": "api.aws",
"implicitGlobalRegion": "us-east-1",
"name": "aws",
"supportsDualStack": true,
"supportsFIPS": true,
"description": "US East (N. Virginia)"
}
endpoints assign: PartitionResult := {
"dnsSuffix": "amazonaws.com",
"dualStackDnsSuffix": "api.aws",
"implicitGlobalRegion": "us-east-1",
"name": "aws",
"supportsDualStack": true,
"supportsFIPS": true,
"description": "US East (N. Virginia)"
}
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseDualStack, true) = false
endpoints evaluateCondition: stringEquals($Region, aws-global) = false
endpoints Resolving endpoint from template: {
"url": "https://sts.{Region}.{PartitionResult#dnsSuffix}",
"properties": {},
"headers": {}
}
endpoints Resolved endpoint: {
"headers": {},
"properties": {},
"url": "https://sts.us-east-1.amazonaws.com/"
}
{
clientName: 'STSClient',
commandName: 'AssumeRoleWithWebIdentityCommand',
input: {
RoleArn: 'arn:aws:iam::<account_id>:role/<role_name>',
RoleSessionName: 'session',
WebIdentityToken: '***SensitiveInformation***',
ProviderId: undefined,
PolicyArns: undefined,
Policy: undefined,
DurationSeconds: undefined
},
error: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
at throwDefaultError (/fexample/node_modules/@smithy/smithy-client/dist-cjs/index.js:840:20)
at /fexample/node_modules/@smithy/smithy-client/dist-cjs/index.js:849:5
at de_CommandError (/fexample/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:478:14)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
'$fault': 'client',
'$metadata': {
httpStatusCode: 403,
requestId: '8659fc10-ff63-4056-b411-90f01a01a098',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
Type: 'Sender',
Code: 'AccessDenied'
},
metadata: {
httpStatusCode: 403,
requestId: '8659fc10-ff63-4056-b411-90f01a01a098',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
}
}

Note: I tried again with v3.577.0 and it works fine. With the latest version, v3.624.0, I get the Access Denied error.

github-actions bot removed the response-requested label Aug 10, 2024
@RanVaknin
Contributor

RanVaknin commented Aug 12, 2024

Hi @meenar-se ,

Can you please have these logs enabled in both versions and see what changes in the request logs?

If something in the SDK code changed between the versions that influences the way the request is sent, you should be able to see what has changed in the outgoing request part of the logs.
If the request in both versions is identical, then you have an issue with the way you are deploying your application to EKS.
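
A small sketch of that comparison, assuming the `input` objects from the two logged `AssumeRoleWithWebIdentityCommand` entries have been copied out by hand. `diffInputs` is a hypothetical helper. It skips `RoleSessionName` and `WebIdentityToken` by default, since the session name seen in these logs carries a per-call suffix and the token is redacted in the logger output.

```javascript
// Return the keys whose values differ between two logged command inputs.
function diffInputs(oldInput, newInput, ignore = ["RoleSessionName", "WebIdentityToken"]) {
  const keys = new Set([...Object.keys(oldInput), ...Object.keys(newInput)]);
  const changes = {};
  for (const key of keys) {
    if (ignore.includes(key)) continue;
    // JSON.stringify gives a cheap structural comparison for plain logged values;
    // a key that is missing and a key set to undefined compare as equal.
    if (JSON.stringify(oldInput[key]) !== JSON.stringify(newInput[key])) {
      changes[key] = { old: oldInput[key], new: newInput[key] };
    }
  }
  return changes;
}

// Usage: paste the two logged inputs and print what changed.
//   console.log(diffInputs(inputFromOldVersion, inputFromNewVersion));
```

An empty result means the two SDK versions sent the same request parameters, which points away from the SDK and toward the deployment.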

commandName: 'AssumeRoleWithWebIdentityCommand',
input: {
RoleArn: 'arn:aws:iam::<account_id>:role/<role_name>',
RoleSessionName: 'session',
WebIdentityToken: '***SensitiveInformation***',
ProviderId: undefined,
PolicyArns: undefined,
Policy: undefined,
DurationSeconds: undefined
},

You redacted the RoleArn. Does it change between the two requests? Maybe a different role ARN is configured.

> Note: Again tried with v3.577.0 its working fine and also tried with the latest version v3.624.0 getting Access Denied error.

Can you tell me how you are upgrading and downgrading versions? Do you have an ECR image you are using to roll the version back and forward? Are you just exec-ing into the pod and using npm install @aws-sdk/client-ssm@version to move between versions? This part is crucial because I have tried reproducing the reported behavior with my own cluster and I'm not seeing this issue. This leads me to believe that you are deploying two different applications and that the version is not the only thing that differs.

You didn't address the previous suggestion:

> How are you deploying your code changes? are you using an Image from ECR? I would make sure you are deploying the correct image with the correct tag to your pod.
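
To rule out a stale image, one option is to have the running application report which package versions are actually installed. This is a minimal sketch, assuming a conventional `node_modules` layout under the application directory. `installedVersions` is a hypothetical helper, and it only sees the top-level copy of each package, not nested or deduplicated copies.

```javascript
const fs = require("node:fs");
const path = require("node:path");

// Read the version field from each package's manifest under <appDir>/node_modules.
function installedVersions(packageNames, appDir = process.cwd()) {
  const result = {};
  for (const name of packageNames) {
    const manifest = path.join(appDir, "node_modules", name, "package.json");
    try {
      result[name] = JSON.parse(fs.readFileSync(manifest, "utf8")).version;
    } catch {
      // Missing or unreadable manifest: report the package as not found.
      result[name] = null;
    }
  }
  return result;
}

// Usage at application startup:
//   console.log(installedVersions(["@aws-sdk/client-ssm", "@aws-sdk/client-sts"]));
```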

Thanks,
Ran~

RanVaknin added the response-requested label Aug 12, 2024
@meenar-se
Author

@RanVaknin - We performed a detailed analysis on our end today, and below is the outcome.
We noticed that this issue starts with version v3.587.0, and we suspect it could be caused by this fix.

We also tried version v3.583.0, and it works fine. The releases between v3.583.0 and v3.587.0 do not have any changes related to the required packages.

Observations:
Below is the content of our AWS credentials file.
v3.587.0 is using the app role from the profile [default], which is incorrect.
v3.583.0 is using the pod role from the profile [web_token].

@example-poc-xxx:$ cat /.aws/credentials
[default]
source_profile=web_token
role_arn=arn:aws:iam::xxxx:role/example-poc

[web_token]
web_identity_token_file=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
role_arn=arn:aws:iam::xxxx:role/example-poc20240619222733960600000002 
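
For reference, here is a sketch of how this file reads under the shared-credentials convention, where `source_profile` names the profile whose credentials are used to assume the `role_arn` of the referring profile. It illustrates that convention only and is not the SDK's resolution code. `parseProfiles` and `describeChain` are hypothetical names, and the parser handles just the subset of INI syntax shown above.

```javascript
// Minimal INI parser for [section] headers and key=value lines.
function parseProfiles(text) {
  const profiles = {};
  let current = null;
  for (const raw of text.split(/\r?\n/)) {
    const line = raw.trim();
    if (!line || line.startsWith("#") || line.startsWith(";")) continue;
    const header = line.match(/^\[(.+)\]$/);
    if (header) {
      current = profiles[header[1].trim()] = {};
    } else if (current && line.includes("=")) {
      const index = line.indexOf("=");
      current[line.slice(0, index).trim()] = line.slice(index + 1).trim();
    }
  }
  return profiles;
}

// List the STS calls implied by a profile, innermost source first.
// Assumption of this sketch: a profile with web_identity_token_file is a chain root.
function describeChain(profiles, name, seen = new Set()) {
  if (seen.has(name)) throw new Error(`circular source_profile reference at [${name}]`);
  seen.add(name);
  const profile = profiles[name];
  if (!profile) throw new Error(`profile [${name}] not found`);
  if (profile.web_identity_token_file) {
    return [{ profile: name, call: "AssumeRoleWithWebIdentity", roleArn: profile.role_arn }];
  }
  if (profile.source_profile) {
    return [
      ...describeChain(profiles, profile.source_profile, seen),
      { profile: name, call: "AssumeRole", roleArn: profile.role_arn },
    ];
  }
  return []; // static keys or another credential type: no STS call implied
}

// Usage:
//   console.log(describeChain(parseProfiles(fileContents), "default"));
```

For the file above, the sketch lists `AssumeRoleWithWebIdentity` for the `[web_token]` role first, followed by `AssumeRole` for the `[default]` role. A web identity call that carries the `[default]` role ARN would not fit that reading.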

Logs from version 3.587.0

@aws-sdk/credential-provider-node - defaultProvider::fromSSO
@smithy/property-provider -> Skipping SSO provider in default chain (inputs do not include SSO fields).
@aws-sdk/credential-provider-node - defaultProvider::fromIni
@aws-sdk/credential-provider-ini - fromIni
default isAssumeRoleWithSourceProfile source_profile=web_token
@aws-sdk/credential-provider-ini - resolveAssumeRoleCredentials (STS)
@aws-sdk/credential-provider-ini - finding credential resolver using source_profile=[web_token]
@aws-sdk/credential-provider-web-identity - fromTokenFile
@aws-sdk/credential-provider-web-identity - fromWebToken
@aws-sdk/client-sts::resolveRegion
endpoints Initial EndpointParams: {
"UseGlobalEndpoint": false,
"UseFIPS": false,
"Region": "us-east-1",
"UseDualStack": false
}
endpoints evaluateCondition: booleanEquals($UseGlobalEndpoint, true) = false
endpoints evaluateCondition: isSet($Endpoint) = false
endpoints evaluateCondition: isSet($Region) = true
endpoints evaluateCondition: aws.partition($Region) = {
"dnsSuffix": "amazonaws.com",
"dualStackDnsSuffix": "api.aws",
"implicitGlobalRegion": "us-east-1",
"name": "aws",
"supportsDualStack": true,
"supportsFIPS": true,
"description": "US East (N. Virginia)"
}
endpoints assign: PartitionResult := {
"dnsSuffix": "amazonaws.com",
"dualStackDnsSuffix": "api.aws",
"implicitGlobalRegion": "us-east-1",
"name": "aws",
"supportsDualStack": true,
"supportsFIPS": true,
"description": "US East (N. Virginia)"
}
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseDualStack, true) = false
endpoints evaluateCondition: stringEquals($Region, aws-global) = false
endpoints Resolving endpoint from template: {
"url": "https://sts.{Region}.{PartitionResult#dnsSuffix}",
"properties": {},
"headers": {}
}
endpoints Resolved endpoint: {
"headers": {},
"properties": {},
"url": "https://sts.us-east-1.amazonaws.com/"
}
{
clientName: 'STSClient',
commandName: 'AssumeRoleWithWebIdentityCommand',
input: {
RoleArn: 'arn:aws:iam::xxxx:role/example-poc',
RoleSessionName: 'aws-sdk-js-session-1723518165718',
WebIdentityToken: '***SensitiveInformation***',
ProviderId: undefined,
PolicyArns: undefined,
Policy: undefined,
DurationSeconds: undefined
},
error: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
at throwDefaultError (/fastify-sts-example/node_modules/@smithy/smithy-client/dist-cjs/index.js:840:20)
at /fastify-sts-example/node_modules/@smithy/smithy-client/dist-cjs/index.js:849:5
at de_CommandError (/fastify-sts-example/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:478:14)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
'$fault': 'client',
'$metadata': {
httpStatusCode: 403,
requestId: 'c0aa6878-959a-4a79-8fba-c5df8a4a5675',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
Type: 'Sender',
Code: 'AccessDenied'
},
metadata: {
httpStatusCode: 403,
requestId: 'c0aa6878-959a-4a79-8fba-c5df8a4a5675',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
}
}
{
clientName: 'SSMClient',
commandName: 'GetParameterCommand',
input: {
Name: '/test/example-output',
WithDecryption: true
},
error: AccessDenied: Not authorized to perform sts:AssumeRoleWithWebIdentity
at throwDefaultError (/fastify-sts-example/node_modules/@smithy/smithy-client/dist-cjs/index.js:840:20)
at /fastify-sts-example/node_modules/@smithy/smithy-client/dist-cjs/index.js:849:5
at de_CommandError (/fastify-sts-example/node_modules/@aws-sdk/client-sts/dist-cjs/index.js:478:14)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
'$fault': 'client',
'$metadata': {
httpStatusCode: 403,
requestId: 'c0aa6878-959a-4a79-8fba-c5df8a4a5675',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
},
Type: 'Sender',
Code: 'AccessDenied'
},
metadata: {
httpStatusCode: 403,
requestId: 'c0aa6878-959a-4a79-8fba-c5df8a4a5675',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
}
}

Logs from 3.583.0

@aws-sdk/credential-provider-node
@aws-sdk/credential-provider-node
@aws-sdk/credential-provider-ini
{"level":30,"time":1723520503652,"pid":1,"hostname":"example-poc","msg":"ping method started"}
{"level":30,"time":1723520503657,"pid":1,"hostname":"example-poc-7ffcb48747-lzfgr","msg":"middeware stack request:, undefined"}
@aws-sdk/credential-provider-ini
@aws-sdk/credential-provider-web-identity
@aws-sdk/credential-provider-web-identity
@aws-sdk/client-sts::resolveRegion
endpoints Initial EndpointParams: {
"UseGlobalEndpoint": false,
"UseFIPS": false,
"Region": "us-east-1",
"UseDualStack": false
}
endpoints evaluateCondition: booleanEquals($UseGlobalEndpoint, true) = false
endpoints evaluateCondition: isSet($Endpoint) = false
endpoints evaluateCondition: isSet($Region) = true
endpoints evaluateCondition: aws.partition($Region) = {
"dnsSuffix": "amazonaws.com",
"dualStackDnsSuffix": "api.aws",
"implicitGlobalRegion": "us-east-1",
"name": "aws",
"supportsDualStack": true,
"supportsFIPS": true,
"description": "US East (N. Virginia)"
}
endpoints assign: PartitionResult := {
"dnsSuffix": "amazonaws.com",
"dualStackDnsSuffix": "api.aws",
"implicitGlobalRegion": "us-east-1",
"name": "aws",
"supportsDualStack": true,
"supportsFIPS": true,
"description": "US East (N. Virginia)"
}
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseFIPS, true) = false
endpoints evaluateCondition: booleanEquals($UseDualStack, true) = false
endpoints evaluateCondition: stringEquals($Region, aws-global) = false
endpoints Resolving endpoint from template: {
"url": "https://sts.{Region}.{PartitionResult#dnsSuffix}",
"properties": {},
"headers": {}
}
endpoints Resolved endpoint: {
"headers": {},
"properties": {},
"url": "https://sts.us-east-1.amazonaws.com/"
}
{
clientName: 'STSClient',
commandName: 'AssumeRoleWithWebIdentityCommand',
input: {
RoleArn: 'arn:aws:iam::xxxx:role/example-poc20240619222733960600000002',
RoleSessionName: 'aws-sdk-js-session-1723520503676',
WebIdentityToken: '***SensitiveInformation***',
ProviderId: undefined,
PolicyArns: undefined,
Policy: undefined,
DurationSeconds: undefined
},
output: {
Credentials: {
AccessKeyId: '<AccessKeyId>',
SecretAccessKey: '***SensitiveInformation***',
SessionToken: '<sessiontoken>',
Expiration: 2024-08-13T04:41:43.000Z
},
SubjectFromWebIdentityToken: 'system:serviceaccount:example-apis:example-poc-aws',
AssumedRoleUser: {
AssumedRoleId: '<AccessKeyId>:aws-sdk-js-session-1723520503676',
Arn: 'arn:aws:sts::xxxx:assumed-role/example-poc20240619222733960600000002/aws-sdk-js-session-1723520503676'
},
Provider: 'arn:aws:iam::xxxxx:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/xxxxx',
Audience: 'sts.amazonaws.com'
},
metadata: {
httpStatusCode: 200,
requestId: '2d4c0c3e-fdec-481d-adc8-75942d524052',
extendedRequestId: undefined,
cfId: undefined,
attempts: 1,
totalRetryDelay: 0
}
}

Testing & Deployment:

  1. We are using the same EKS app to test both versions, one after another.
  2. We are not upgrading or downgrading the libraries on the pod directly. For each version we change the dependency in our source code:
    • npm uninstall @aws-sdk/client-ssm
    • Delete the package-lock.json
    • npm install @aws-sdk/client-ssm@required-version
  3. We are not using ECR. We upload our Docker images to our private registry and deploy from there. We have verified the Docker images and tags, and there are no issues with them.

@meenar-se
Author

@kuhe - could you please take a look at this issue? We suspect something broke after this fix.

@RanVaknin
Contributor

RanVaknin commented Aug 15, 2024

Hi @meenar-se ,

While we investigate this internally, can you explain why you are using an INI file in the first place?

In an EKS context, the SDK should be able to operate without an INI file. The EKS pod should be injected with the role ARN environment variable and populated with the token file, and the EKS provider will automatically use those to make the underlying AssumeRoleWithWebIdentity call.

Can you try working around this by removing the INI altogether? Or is there a specific use case for that INI file to exist on your pod's file system?

Thanks,
Ran~
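
As a rough illustration of the point above, the following sketch checks whether the two values a pod needs for web identity are present. It assumes the standard variable names `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` (plus the optional `AWS_ROLE_SESSION_NAME`). The `readWebIdentityEnv` helper and its types are hypothetical names for this example, not part of the SDK, and the SDK's own provider does more than this.

```typescript
// Hypothetical helper, not an SDK API: reports whether the environment
// carries the web-identity settings that EKS is expected to inject.
type Env = Record<string, string | undefined>;

interface WebIdentityEnv {
  roleArn: string;
  tokenFile: string;
  roleSessionName?: string;
}

// Returns the settings found in `env`, or null when either required
// variable is missing or empty.
function readWebIdentityEnv(env: Env): WebIdentityEnv | null {
  const roleArn = env.AWS_ROLE_ARN;
  const tokenFile = env.AWS_WEB_IDENTITY_TOKEN_FILE;
  if (!roleArn || !tokenFile) {
    return null;
  }
  return { roleArn, tokenFile, roleSessionName: env.AWS_ROLE_SESSION_NAME };
}
```

Calling `readWebIdentityEnv(process.env)` inside the pod and logging the result is one quick way to confirm the injection happened before looking at any INI configuration.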

@karthikeyanjp

karthikeyanjp commented Aug 22, 2024

Hi @RanVaknin

Here is our INI file:

[default]
source_profile=web_token
role_arn=<app-role-arn>

[web_token]
web_identity_token_file=<token-file-location>
role_arn=<federation-role-arn>

The federation-role-arn has only one policy, which allows it to assume app-role-arn.

With earlier versions of the SDK, we were able to chain roles and assume them one after another. In the above example, federation-role-arn was assumed first, followed by app-role-arn.

But with recent SDK versions, it fails because it assumes app-role-arn first. Even if we remove role_arn from the default profile, it assumes federation-role-arn. The chaining no longer works, and we cannot make it assume other roles like app-role-arn.
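
To make the expected ordering concrete, here is a minimal sketch that parses a profile file like the one above and lists the role ARNs in the order the comment expects them to be assumed (innermost `source_profile` first). This is not the SDK's implementation. `parseProfiles` and `assumeOrder` are hypothetical helpers for this example, and the parser handles only `[section]` headers and `key=value` lines.

```typescript
type Profile = Record<string, string | undefined>;
type Profiles = Record<string, Profile>;

// Minimal INI parser: [section] headers and key=value lines only.
function parseProfiles(ini: string): Profiles {
  const profiles: Profiles = {};
  let current: Profile | undefined;
  for (const raw of ini.split(/\r?\n/)) {
    const line = raw.trim();
    if (line === "" || line.startsWith("#") || line.startsWith(";")) continue;
    const section = /^\[(.+)\]$/.exec(line);
    if (section) {
      current = {};
      profiles[(section[1] ?? "").trim()] = current;
      continue;
    }
    const eq = line.indexOf("=");
    if (eq > 0 && current) {
      current[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
    }
  }
  return profiles;
}

// Walks source_profile links from `start` and returns the role ARNs in
// assumption order: the role of the innermost source profile comes first.
function assumeOrder(profiles: Profiles, start: string): string[] {
  const chain: string[] = [];
  const seen = new Set<string>();
  let name: string | undefined = start;
  while (name !== undefined) {
    if (seen.has(name)) throw new Error(`cycle in source_profile chain at "${name}"`);
    seen.add(name);
    const profile: Profile | undefined = profiles[name];
    if (!profile) throw new Error(`profile "${name}" not found`);
    if (profile.role_arn) chain.push(profile.role_arn);
    name = profile.source_profile;
  }
  return chain.reverse();
}
```

For the file shown above, `assumeOrder(parseProfiles(text), "default")` yields the federation role followed by the app role, which matches the behavior described for the earlier SDK versions.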

@kuhe
Contributor

kuhe commented Sep 13, 2024

A fix for credential assume-role chaining was released in https://github.com/aws/aws-sdk-js-v3/releases/tag/v3.651.1.

The option of having a final credential_source with no role_arn was preserved.


github-actions bot commented Oct 9, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.
