Skip to content
This repository has been archived by the owner on Mar 5, 2024. It is now read-only.

kiam-server still accesses to STS global endpoint when setting --region #368

Closed
offzale opened this issue Jan 23, 2020 · 6 comments
Closed
Labels
Milestone

Comments

@offzale
Copy link

offzale commented Jan 23, 2020

Kiam-server is still accessing the global endpoint when setting the --region argument, instead of using the STS regional endpoint. Even setting both --region argument and AWS_REGION environment variable (as #344 suggests) to the region value, us-east-1.

After limiting the egress to the internet (allowing access to AWS Metadata) and setting the necessary VPC Endpoints to access to AWS Services (including regional STS endpoint), we get this error in the server:

"level":"error","msg":"error requesting credentials: RequestError: send request failed\ncaused by: Post https://sts.amazonaws.com/: dial tcp 54.239.29.25:443: connect: connection timed out"

We would expect kiam-server to use the regional endpoint instead, sts.us-east-1.amazonaws.com. Is there any reason for kiam-server to still access the global endpoint? or maybe we are missing some configuration?

Kiam-server version: 3.5

@pingles
Copy link
Contributor

pingles commented Oct 16, 2020

This definitely seems like a bug, not sure what's happening.

If someone is able to write some tests around it to try and prove the error and help fix I'd really appreciate. I'll try and take a look when I can but can't promise when.

@pingles
Copy link
Contributor

pingles commented Oct 20, 2020

So having re-read the code, I wonder if the issue is actually the ordering of this:

https://github.com/uswitch/kiam/blob/master/pkg/aws/sts/gateway.go#L88

If assumeRoleArn is set, then Kiam creates a session to retrieve credentials for the server role, to then use ahead of all future calls.

Line 89 doesn't use the Region information when doing this, so it will end up defaulting to the global endpoint.

https://github.com/uswitch/kiam/blob/master/pkg/aws/sts/gateway.go#L89

Does that sound plausible given your setup @offzale?

@randomvariable @Joseph-Irving is there a reason it would need to use the global endpoint for retrieving server credentials, or can those also be forwarded to the local endpoint?

@randomvariable
Copy link

Doesn't matter which endpoint is used as long as it's reachable. The credentials will always be globally scoped.

Hitting the global endpoint becomea an issue in Govcloud/Secret partitions though, so ideally would use the local endpoint.

@pingles
Copy link
Contributor

pingles commented Oct 20, 2020

Awesome, thanks Naadir. I'll try and refactor it around, add some tests but sounds like this may have been the cause of the noted behaviour.

pingles added a commit that referenced this issue Oct 20, 2020
* Removed most of the region config setup from sts.DefaultGateway into a configBuilder, added more tests around configBuilder to confirm behaviour
* Changed server to request server credentials with the server assume role after configuring for region, should address #368
* Regional endpoint adds a us-iso prefix to handle airgapped regions addressing #410
* Updated version of AWS SDK to 1.35
pingles added a commit that referenced this issue Oct 23, 2020
* Improve region endpoint resolver
Pull out into separate type and add some more tests about handling of gov, fips and chinese regions.
Additionally removes comparison against SDK regions and instead relies on DNS resolution to verify.
* Removed most of the region config setup from sts.DefaultGateway into a configBuilder, added more tests around configBuilder to confirm behaviour
* Changed server to request server credentials with the server assume role after configuring for region, should address #368
* Regional endpoint adds a us-iso prefix to handle airgapped regions addressing #410
* Updated version of AWS SDK to 1.35
* Add server tests
* Refactorings and more tests
* Created a Kiam Server integration test to make it easier to test
* Extracted server and gateway into builders, tidies complicated construction and makes it easier to test
* Move the cacheSize metric from a counter to a gauge and define in metrics.go and outside of DefaultCache func, removes duplicate panic in tests
* KiamGatewayBuilder adds WithMaxRetries to configure the gRPC retry behaviour (by default, doesn't retry operations) and potentially helps address #425
@pingles
Copy link
Contributor

pingles commented Oct 23, 2020

I've just pushed a change in that I believe should address this (and added tests to confirm the resolver is working correctly).

It'll be in v4.0 whenever we push that but if you want to try something in a pre-prod environment you can take the latest cut (just don't depend on that for prod 😄 )

@pingles
Copy link
Contributor

pingles commented Oct 23, 2020

I'm going to close now, please re-open if you still see the same behaviour with the updated code.

@pingles pingles closed this as completed Oct 23, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

3 participants