Cleanup AWS EC2 eventual consistency warnings #9637

Merged
merged 2 commits into from
Jul 29, 2020

hakman
Member

@hakman hakman commented Jul 27, 2020

When a cluster is created, there are warnings such as:

W0727 09:19:16.521016    5204 executor.go:128] error running task "AutoscalingGroup/nodes-ap-northeast-2c.e2e-8e9eb293c0-ff1eb.test-cncf-aws.k8s.io" (9m58s remaining to succeed): error creating AutoscalingGroup: ValidationError: You must use a valid fully-formed launch template. Value (nodes.e2e-8e9eb293c0-ff1eb.test-cncf-aws.k8s.io) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name
	status code: 400, request id: 56bfbee6-2dad-4c52-902e-5a8ddf9f5734

This happens because of AWS EC2 eventual consistency issues:
https://docs.aws.amazon.com/AWSEC2/latest/APIReference/query-api-troubleshooting.html#eventual-consistency

@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 27, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hakman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jul 27, 2020
@k8s-ci-robot k8s-ci-robot added area/provider/aws Issues or PRs related to aws provider size/M Denotes a PR that changes 30-99 lines, ignoring generated files. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jul 27, 2020
@hakman hakman force-pushed the aws-eventual-consistency branch from 159c185 to 0876165 Compare July 27, 2020 16:33
@johngmyers
Member

It's a good question whether we prefer the individual tasks to sleep or the task scheduler to retry after a delay.

@hakman
Member Author

hakman commented Jul 27, 2020

It's a good question whether we prefer the individual tasks to sleep or the task scheduler to retry after a delay.

The task already retries until it succeeds; the problem is the useless warnings. Unless there is a way to tell the task to just retry without printing a warning, it seems best to keep this logic in each individual task.

@hakman
Member Author

hakman commented Jul 27, 2020

/retest

@johngmyers
Member

We could create a new type which, if embedded in an error, causes fi.RunTasks() to print at info level instead of warning.
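
A minimal sketch of that suggestion, not the code from this PR. The names `notReadyError`, `errValidation`, and `logLevelFor` are hypothetical and chosen only for illustration; how `fi.RunTasks()` actually inspects errors is not shown in this conversation.

```go
package main

import (
	"errors"
	"fmt"
)

// notReadyError is a hypothetical marker type. A task would return it when a
// dependency has not propagated yet, as opposed to a genuine failure.
type notReadyError struct {
	reason string
}

func (e *notReadyError) Error() string { return e.reason }

// errValidation is a plain error standing in for a genuine failure.
var errValidation = errors.New("ValidationError: invalid launch template")

// logLevelFor reports how an executor might log a task error: "info" if a
// notReadyError appears anywhere in the error chain, "warning" otherwise.
func logLevelFor(err error) string {
	var nr *notReadyError
	if errors.As(err, &nr) {
		return "info"
	}
	return "warning"
}

func main() {
	// Wrapping with %w keeps the marker discoverable through errors.As.
	wrapped := fmt.Errorf("creating AutoscalingGroup: %w",
		&notReadyError{reason: "waiting for IAM Instance Profile to be propagated"})
	fmt.Println(logLevelFor(wrapped))
	fmt.Println(logLevelFor(errValidation))
}
```

Using `errors.As` rather than a direct type assertion means the marker is still recognized when a task wraps it with extra context.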

@hakman hakman force-pushed the aws-eventual-consistency branch from 07ea166 to 75c5c34 Compare July 28, 2020 03:16

if awsup.AWSErrorCode(err) == "ValidationError" {
	message := awsup.AWSErrorMessage(err)
	if strings.Contains(message, "not authorized") || strings.Contains(message, "Invalid IamInstance") {
Member Author

I have never seen these warnings in my tests. I would like to see whether the handling for these errors is still needed.

@hakman
Member Author

hakman commented Jul 28, 2020

We could create a new type which, if embedded in an error, causes fi.RunTasks() to print at info level instead of warning.

I think it should be better now.

I0728 03:21:56.274518    6572 executor.go:103] Tasks: 83 done / 85 total; 2 can run
I0728 03:21:57.196367    6572 executor.go:129] Task "AutoscalingGroup/nodes-ca-central-1b.e2e-4b6cbc8b91-ff1eb.test-cncf-aws.k8s.io" not ready: Waiting for IAM Instance Profile to be propagated
I0728 03:21:57.196407    6572 executor.go:129] Task "AutoscalingGroup/master-ca-central-1b.masters.e2e-4b6cbc8b91-ff1eb.test-cncf-aws.k8s.io" not ready: Waiting for IAM Instance Profile to be propagated
I0728 03:21:57.196416    6572 executor.go:147] No progress made, sleeping before retrying 2 task(s)
I0728 03:22:07.196668    6572 executor.go:103] Tasks: 83 done / 85 total; 2 can run
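
The behaviour visible in this log (tasks reported as not ready, then a sleep when a round makes no progress) can be sketched roughly as follows. This is an illustration under stated assumptions, not the kops executor: `task`, `errNotReady`, and `runTasks` are hypothetical names, and the sleep is injected as a function so the example runs quickly.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotReady is a hypothetical sentinel meaning "retry later, nothing is wrong".
var errNotReady = errors.New("not ready")

// task is a hypothetical stand-in for an executor task.
type task struct {
	name string
	run  func() error
}

// runTasks runs pending tasks in rounds until all succeed or maxRounds is
// exhausted. Every failed task is retried; only the log prefix differs
// between "not ready" (I) and other errors (W). When a round completes no
// task, it calls sleep before retrying. It returns the number of rounds used.
func runTasks(tasks []task, maxRounds int, sleep func()) (int, error) {
	pending := tasks
	for round := 1; round <= maxRounds; round++ {
		var retry []task
		for _, t := range pending {
			err := t.run()
			if err == nil {
				continue // task finished
			}
			if errors.Is(err, errNotReady) {
				fmt.Printf("I task %q not ready: %v\n", t.name, err)
			} else {
				fmt.Printf("W error running task %q: %v\n", t.name, err)
			}
			retry = append(retry, t)
		}
		if len(retry) == 0 {
			return round, nil
		}
		if len(retry) == len(pending) {
			fmt.Printf("I no progress made, sleeping before retrying %d task(s)\n", len(retry))
			sleep()
		}
		pending = retry
	}
	return maxRounds, fmt.Errorf("%d task(s) did not succeed after %d rounds", len(pending), maxRounds)
}

func main() {
	attempts := 0
	asg := task{name: "AutoscalingGroup/nodes", run: func() error {
		attempts++
		if attempts < 3 { // simulate propagation finishing on the third attempt
			return fmt.Errorf("waiting for IAM Instance Profile to be propagated: %w", errNotReady)
		}
		return nil
	}}
	rounds, err := runTasks([]task{asg}, 5, func() { time.Sleep(10 * time.Millisecond) })
	fmt.Println(rounds, err)
}
```

In this sketch the retry policy stays in one place and individual tasks only classify their errors, which is the trade-off discussed earlier in the thread.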

@hakman
Member Author

hakman commented Jul 28, 2020

/retest

@hakman hakman force-pushed the aws-eventual-consistency branch from 75c5c34 to d4dcf3e Compare July 28, 2020 07:06
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jul 28, 2020
@hakman hakman force-pushed the aws-eventual-consistency branch 2 times, most recently from c7b0d46 to 4c04a65 Compare July 28, 2020 14:33
@hakman hakman changed the title WIP: Cleanup AWS EC2 eventual consistency warnings Cleanup AWS EC2 eventual consistency warnings Jul 28, 2020
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jul 28, 2020
@hakman
Member Author

hakman commented Jul 28, 2020

/retest

@hakman hakman force-pushed the aws-eventual-consistency branch from 4c04a65 to 3da7927 Compare July 28, 2020 16:04
@hakman
Member Author

hakman commented Jul 28, 2020

/retest

@hakman
Member Author

hakman commented Jul 29, 2020

/assign @johngmyers

upup/pkg/fi/cloudup/awstasks/autoscalinggroup.go (outdated, resolved)
upup/pkg/fi/cloudup/awstasks/launchconfiguration.go (outdated, resolved)
upup/pkg/fi/cloudup/awstasks/route.go (outdated, resolved)
upup/pkg/fi/cloudup/awstasks/route.go (outdated, resolved)
@hakman hakman force-pushed the aws-eventual-consistency branch from 3da7927 to 3a11207 Compare July 29, 2020 20:38
@johngmyers
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 29, 2020
@k8s-ci-robot k8s-ci-robot merged commit be78301 into kubernetes:master Jul 29, 2020
@k8s-ci-robot k8s-ci-robot added this to the v1.19 milestone Jul 29, 2020
@hakman hakman deleted the aws-eventual-consistency branch July 30, 2020 02:45
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/provider/aws Issues or PRs related to aws provider cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
3 participants