-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use AWS SDK to access EC2 Metadata #6779
Conversation
Some code cleanup: * Use a field for setting EC2 metadata instead of env-vars in testing; but keep environment variables for backward compatibility reasons * Update tests to use testify
5ddb75d
to
b55bc64
Compare
} | ||
} | ||
|
||
func TestNetworkFingerprint_notAWS(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I removed this test as it's exactly the same as TestEnvAWSFingerprint_nonAws
test above.
} | ||
|
||
const aws_routes = ` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not obvious to me why we use json string here instead of writing the struct values directly?
res, err := client.Get(metadataURL + k) | ||
if err != nil { | ||
resp, err := ec2meta.GetMetadata(k) | ||
if awsErr, ok := err.(awserr.RequestFailure); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RequestFailures occur when we have a non-200 http code so it's the equivalent to res.StatusCode != http.StatusOK
part below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM one question about checking *url.Error
if !f.isAWS() { | ||
ec2meta := ec2MetaClient(f.endpoint, timeout) | ||
|
||
if !ec2meta.Available() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
if awsErr, ok := err.(awserr.RequestFailure); ok { | ||
f.logger.Debug("could not read attribute value", "attribute", k, "error", awsErr) | ||
continue | ||
} else if awsErr, ok := err.(awserr.Error); ok { | ||
// if it's a URL error, assume we're not in an AWS environment | ||
// TODO: better way to detect AWS? Check xen virtualization? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this TODO and check solved by the introduction of your ec2meta.Available()
check?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe so. ec2meta.Available()
does the same check as before, namely check a metadata entry (instance-id or ami-id).
I'm not fully sure what this protect against and how we expect it to behave. If the earlier check Available metadata lookup succeeded, but looking up another metadata failed with a connection error, it feels to me we ought to retry rather than give up completely?!
Either way, I decided to leave this alone /shrug.
Fix a regression where we accidentally started treating non-AWS environments as AWS environments, resulting in bad networking settings. Two factors some at play: First, in [1], we accidentally switched the ultimate AWS test from checking `ami-id` to `instance-id`. This means that nomad started treating more environments as AWS; e.g. Hetzner implements `instance-id` but not `ami-id`. Second, some of these environments return empty values instead of errors! Hetzner returns empty 200 response for `local-ipv4`, resulting into bad networking configuration. This change fix the situation by restoring the check to `ami-id` and ensuring that we only set network configuration when the ip address is not-empty. Also, be more defensive around response whitespace input. [1] #6779
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
One step to address #6744 .
Update AWS Fingerprinter to use the AWS SDK library rather than a hand-rolling httpClient invocations.
I maintained the seemingly arbitrary 2 second timeout (the SDK uses 5 second as timeout) mostly to avoid slowing node node registration outside AWS.
Also, I kept honoring
AWS_URL_ENV
, an environment variable that was used in tests only and never documented, just in case it's used in the wild.Lastly, one commit cleans up the tests to use testify and other minor changes.
A subsequent PR will update AWS and go-discover vendored libraries.