Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

X-RAY creates new session with hardcoded region #167

Closed
polamayster opened this issue Jul 24, 2019 · 13 comments
Closed

X-RAY creates new session with hardcoded region #167

polamayster opened this issue Jul 24, 2019 · 13 comments

Comments

@polamayster
Copy link
Contributor

polamayster commented Jul 24, 2019

There is probably a reason why new session is created with specific region us-west-2 (is X-Ray available only in this region?):

def _create_xray_client(self, ip='127.0.0.1', port='2000'):
    session = botocore.session.get_session()
    url = 'http://%s:%s' % (ip, port)
    return session.create_client('xray', endpoint_url=url,
                                 region_name='us-west-2',
                                 config=Config(signature_version=UNSIGNED)
                                 )

but can it be changed to:

boto3.client('xray',...)

so existing default boto3 session could be reused under the hood?
Usecase: application using IAM role to not throttle on call to /v2/credentials/ in each process/worker when those credentials were already retrieved and stored in session?
If so, I can create MR with a change.

@polamayster polamayster changed the title X-RAY cretes new session with hardcoded region X-RAY creates new session with hardcoded region Jul 24, 2019
@chanchiem
Copy link
Contributor

chanchiem commented Jul 31, 2019

Hey polamayster,

The X-Ray client here is only used to obtain the Sampling Rules and Targets for centralized sampling. It communicates with the local X-Ray Daemon in order to fetch this, rather than the actual service endpoint itself (the Daemon is responsible for fetching this data). The region here should not affect communications with the Daemon, so we should safely be able to remove the hard coded region without any issues.

As for the session, I do believe that the call to get_session() actually returns a new instance of a session and the credentials are re-fetched in each instantiation. Are you experiencing any specific issues related to this? Here is the code to the new session instantiation in Botocore.

@polamayster
Copy link
Contributor Author

polamayster commented Jul 31, 2019

Hi chanchiem!

Thanks for a response. Yes we are experiencing a specific issue related to credentials retrieval (throttling) associated with new session instantiation each time.
Actually region is just a tip of the iceberg, that caught our attention.
Quick question: can

def _create_xray_client(self, ip='127.0.0.1', port='2000'):
    session = botocore.session.get_session()
    url = 'http://%s:%s' % (ip, port)
    return session.create_client('xray', endpoint_url=url,
                                 region_name='us-west-2',
                                 config=Config(signature_version=UNSIGNED)
                                 )

be changed to:

def _create_xray_client(self, ip='127.0.0.1', port='2000', region='us-west-2'):
    url = 'http://%s:%s' % (ip, port)
    return boto3.client('xray', endpoint_url=url,
                        region_name=region,
                        config=Config(signature_version=UNSIGNED)
                        )

explanation:
Turned out the real problem we faced is that _create_xray_client(self, ip='127.0.0.1', port='2000') is creating new session by calling session = botocore.session.get_session() which in turns does return Session(env_vars) and created client tries to get credentials from /v2/credentials/ for IAM role and when it is done in many gunicorn workers inside many docker containers running on ECS repeatedly results in throttling - 429 response code (we can't do gunicorn's preload_app() as it is not compatible with worker_class = 'gevent').

Just wondering why x-ray client needs separate session, why can't it reuse existing session by calling boto3.client('xray', ...) which results in return _get_default_session().client(*args, **kwargs)?
To overcome throttling by x-ray client trying to create session and grab credentials for IAM role in each worker we did something like this:

credentials_provider = ContainerProvider()
session = get_session()
session._credentials = credentials_provider.load()  # RefreshableCredentials
boto3.setup_default_session(botocore_session=session)
# monkey patch botocore session to always return existing session with credentials and not new one each time (x-ray)
botocore.session.get_session = lambda: session

@chanchiem
Copy link
Contributor

chanchiem commented Aug 14, 2019

Hey Polamayster,

We don't see an issue with re-using a default session, but there is one issue. Our SDK uses botocore as a dependency rather than boto3, so we'd ideally want a solution that can be accomplished in botocore. It looks like the default session is something that's implemented outside botocore in boto3.

Since the issue you're having is related to credential gathering, and our client doesn't use credentials (this part is handled by our Daemon), I wonder if a possible solution would be to predefine the credentials to a null value.

Maybe something like this

def _create_xray_client(self, ip='127.0.0.1', port='2000'):
    session = botocore.session.get_session()
    url = 'http://%s:%s' % (ip, port)
    return session.create_client('xray', endpoint_url=url,
                                 region_name='us-west-2',
                                 config=Config(signature_version=UNSIGNED),
                                 aws_access_key_id='', aws_secret_access_key=''
                                 )

Is it possible for you to test this in your configuration to see if it attempts to fetch credentials? I was able to test it on mine and it seems to have worked.

@polamayster
Copy link
Contributor Author

Hi chanchiem,

Thanks a lot for your comment! I will give your solution a try. Concern about boto3 dependency is understood.

@polamayster
Copy link
Contributor Author

polamayster commented Sep 16, 2019

Hi @chanchiem,

Sorry for a long silence, I finally got a chance to test your solution with aws_access_key_id='', aws_secret_access_key='' and looks like it works fine, so I can't see credentials = self.get_credentials() calls in logs originated from aws_xray_sdk anymore.
Would you consider to make aws_access_key_id='', aws_secret_access_key='' change in aws-xray-sdk-python or should I open a PR with your fix and some tests?

Thanks a lot once again!

@chanchiem
Copy link
Contributor

We would gladly accept PRs!

Glad the solution worked out for you!

Thanks,
Chan Chiem Jeffery Saeteurn

@polamayster
Copy link
Contributor Author

Side note: I was adding a fix and associated tests and turned out this issue is reproducible if botocore version < 1.12.68 (fix: boto/botocore@90ed2de, https://github.com/boto/botocore/blob/90ed2deadd9f8f64a3ba115b4c546019a89ef79e/botocore/session.py#L812:L813 in particular) and as

'botocore>=1.11.3',
it affects aws-xray-sdk-python clients with botocore version in this range.

So proposed fix if still relevant/appropriate regarding aws-xray-sdk-python vs botocore would be guarding against unnecessary credentials retrieval for clients with older versions of botocore.

@chanchiem, @haotianw465(https://github.com/aws/aws-xray-sdk-python/blame/master/aws_xray_sdk/core/sampling/connector.py#L163) what do you think?

@polamayster
Copy link
Contributor Author

polamayster commented Oct 15, 2019

So should fix still be implemented or issue closed with botocore version upgrade as a solution?

@chanchiem
Copy link
Contributor

I think since it's an issue that only exists in older versions of botocore, we should upgrade the supported version of botocore internally so long as it doesn't add any breaking changes for our current instrumentor. Were you able to find out why the particular version range fixes the issue?

@polamayster
Copy link
Contributor Author

Let me reiterate, because looks like I wasn't clear in explanation:
botocore starting version 1.12.68 does not require credentials for a client with config=Config(signature_version=UNSIGNED): boto/botocore@90ed2de#diff-29bf4883de496fd71c0deda976cc480fR812

if config is not None and config.signature_version is UNSIGNED:
    credentials = None

and this it used in x-ray-sdk for client creation:

def _create_xray_client(self, ip='127.0.0.1', port='2000', region='us-west-2'):
    url = 'http://%s:%s' % (ip, port)
    return boto3.client('xray', endpoint_url=url,
                        region_name=region,
                        config=Config(signature_version=UNSIGNED)
                        )

but as aws-xray-sdk-python/setup.py has botocore>=1.11.3 specified it has no affect for a range of versions that satisfy this requirement (from 1.11.3 to 1.12.67).
I do not see any breaking changes introduced by newer version of botocore neither in changelog, nor by running application with upgraded botocore version.

@chanchiem
Copy link
Contributor

Thanks for letting us know! I think since we want to be sure we accept the current version range and not constrain our customers to only use up to 1.12.67, we can use the solution of manually setting the credentials to an empty string: aws_access_key_id='', aws_secret_access_key=''

Can you confirm that it worked for versions less than as well as greater than and equal to 1.12.68?

@polamayster
Copy link
Contributor Author

Yes, credentials set to empty strings worked for botocore versions below and above 1.12.68.
Thanks on update, I will add a fix and tests shortly.

@polamayster
Copy link
Contributor Author

tests fail due to #179

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants