Continue if asg instance is unknown #9900
Conversation
@@ -768,6 +768,9 @@ func awsBuildCloudInstanceGroup(c AWSCloud, cluster *kops.Cluster, ig *kops.Inst
	for _, i := range g.Instances {
		id := aws.StringValue(i.InstanceId)
		if instances[id] == nil {
Perhaps put this after the following if block, as instances[""] is unlikely to be non-nil. It should probably be after the LifecycleStateTerminating check as well. Would we want to klog.Warningf for this as well?
Makes sense to move the function. Done.
This scenario is so common that I would find warnings to be noisy. It would happen consistently through a rolling update for example.
We consistently get the "ignoring instance as it is terminating" warning during rolling updates. I consider it more informative than noisy, but to be fair it should be info instead of warning.
I suppose in this case we don't have anything informative to log. This is a semantic change: ignoring these instances will cause them not to contribute to satisfying the "enough instances in group" cluster validation check (which might be good), but it will also prevent them from being updated if they're stuck in that state (and on an old spec).
e2f9da0 to f6abac3
I'll let this proceed on the assumption that it isn't likely that instances would get stuck in this state.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johngmyers, olemarkus. The full list of commands accepted by this bot can be found here. The pull request process is described here.

Needs approval from an approver in each of these files. Approvers can indicate their approval by writing
Most likely this comes from ASG thinking the instance is inService, but it is terminating when we run describe from EC2. kops get instances should maybe also show instance status: terminating at some point, but that would also require changes in the rolling update code, as it currently assumes we filter terminating/terminated nodes.