mirrord crashes when process runs AWS STS Request #757

Closed
eyalb181 opened this issue Nov 17, 2022 · 9 comments
Assignees: aviramha
Labels: bug, user

Comments

@eyalb181 (Member)

Bug Description

See the gist linked under Steps to Reproduce.

Steps to Reproduce

Gist, including logs and repro steps, here:
https://gist.github.com/mentos1386/749f2b60c57adbb201a610700b1a0066#file-repro-ts

Backtrace

No response

Relevant Logs

No response

Your operating system and version

TBD (some Linux)

Local process

npx

Local process version

No response

Additional Info

No response

eyalb181 added the bug and user labels on Nov 17, 2022
@mentos1386

Your operating system and version:

Manjaro Linux

Local process version:

node --version
v19.0.1

npm --version
8.19.2

@mentos1386 commented Nov 21, 2022

I'm debugging this a bit more. It doesn't seem to be an aws-sdk-specific issue: the first request "hangs" while the rest of them work. This happens even if the first request is made to some other hostname.

Not sure what's going on here.

import { STSClient, GetCallerIdentityCommand } from '@aws-sdk/client-sts';
import { request } from 'https';

const createRequest = (name: string) => {
  const req = request(
    {
      hostname: 'sts.amazonaws.com',
      port: 443,
      path: '/',
      method: 'GET',
    },
    res => {
      console.log(`request ${name} done!`);
    },
  );
  req.end();
};

const main = async () => {
  createRequest('first');
  createRequest('second');

  const stsClient = new STSClient({});

  await stsClient.send(new GetCallerIdentityCommand({}));
  console.log('aws sts done!');
};

main();

Logs

✓ layer initialized
✓ agent running
  ✓ agent pod created
  ✓ pod is ready
request second done!
aws sts done!
Error: Client network socket disconnected before secure TLS connection was established
    at connResetException (node:internal/errors:705:14)
    at TLSSocket.onConnectEnd (node:_tls_wrap:1594:19)
    at TLSSocket.emit (node:events:525:35)
    at TLSSocket.emit (node:domain:489:12)
    at endReadableNT (node:internal/streams/readable:1358:12)
    at processTicksAndRejections (node:internal/process/task_queues:83:21) {
  code: 'ECONNRESET',
  path: null,
  host: 'encrypted.google.com',
  port: 443,
  localAddress: undefined
}

@mentos1386 commented Nov 21, 2022

Might this be related to #564? I'm not sure why I'm experiencing this issue even though the e2e test [1] should catch it.

[1] https://github.com/metalbear-co/mirrord/blob/main/tests/node-e2e/outgoing/test_outgoing_traffic_many_requests.mjs

Edit:
Huh, if I remove the STS client command, both the first and the second request work fine. Maybe it's something weird that the aws-sdk is doing?
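
For reference, a minimal sketch of that reduced case (the repro above with the aws-sdk STS call removed) might look like this; the snippet is illustrative and not part of the original report:

import { request } from 'https';

// Same request helper as in the repro above, without the aws-sdk STS call.
const createRequest = (name: string) => {
  const req = request(
    {
      hostname: 'sts.amazonaws.com',
      port: 443,
      path: '/',
      method: 'GET',
    },
    res => {
      console.log(`request ${name} done!`);
      res.resume(); // drain the body so the socket can be released
    },
  );
  req.end();
};

// With the STS call removed, both requests reportedly complete.
createRequest('first');
createRequest('second');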

@aviramha (Member)

Edit: Huh, if I remove the STS client command, both the first and the second request work fine. Maybe it's something weird that the aws-sdk is doing?

My assumption was (and still is) that there's a file that needs to be loaded remotely which we're not loading; I'm not sure why it produces this weird error, though.

@mentos1386

Reading the file seems to work fine.

I looked at the source [1], and running just the readFileSync code works okay for me: I get the contents, but one of the requests still fails.

[1] https://github.com/aws/aws-sdk-js-v3/blob/main/packages/credential-provider-web-identity/src/fromTokenFile.ts#L27-L42
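
A minimal sketch of that isolated file-read check might look like the following, assuming the token path comes from the AWS_WEB_IDENTITY_TOKEN_FILE environment variable as in the linked fromTokenFile.ts; the snippet itself is illustrative and not taken from the issue:

import { readFileSync } from 'fs';

// Path that the web-identity credential provider reads the token from.
const tokenFile = process.env.AWS_WEB_IDENTITY_TOKEN_FILE;
if (!tokenFile) {
  throw new Error('AWS_WEB_IDENTITY_TOKEN_FILE is not set');
}

// Roughly what fromTokenFile.ts does: synchronously read the token contents.
const token = readFileSync(tokenFile, { encoding: 'ascii' });
console.log(`read ${token.length} characters from ${tokenFile}`);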

@aviramha (Member)

I remember that in the trace you sent, there was a file-contents read at the same time as the SSL error; that's why I thought it might be the problem.
@eyalb181 is working on reproducing it on our end so we can fix it. Sorry for the inconvenience and thanks for providing more information!

@aviramha aviramha self-assigned this Nov 28, 2022
@aviramha (Member) commented Nov 28, 2022

Hey,
I ran the same repro and it worked:

{
  '$metadata': {
    httpStatusCode: 200,
    requestId: 'dc6ff972-186a-4cac-a2a9-9e610489852e',
    extendedRequestId: undefined,
    cfId: undefined,
    attempts: 1,
    totalRetryDelay: 0
  },
  UserId: 'censor',
  Account: 'censor',
  Arn: 'arn:aws:sts::11:assumed-role/eksctl-cluster-1-nodegroup-ng-6f1-NodeInstanceRole-v/i-111'
}

I think we might have fixed it in our last release, where we moved agent creation to the CLI.
Can you please re-test?

@mentos1386

@aviramha I can still recreate the issue on 3.11.2. Did you ever manage to reproduce it?

@aviramha (Member)

@aviramha I can still recreate the issue on 3.11.2. Did you ever manage to reproduce it?

No. I also couldn't test it on an older version, since the "aarch" build was broken :(
I used your Dockerfile and instructions on a vanilla EKS cluster.
Do you want to do an interactive debugging session, maybe?

bors bot pushed a commit that referenced this issue Dec 13, 2022
Closes #757

Co-authored-by: Aviram Hassan <[email protected]>
bors bot closed this as completed in 501808c on Dec 13, 2022