make node names unique in tests #1598

Merged Mar 30, 2022 (1 commit)

Conversation

tzneal
Contributor

@tzneal tzneal commented Mar 30, 2022

1. Issue, if available:

N/A

2. Description of changes:

  • make node names unique with a sequential ID
  • if the node already exists, log a debug message
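
A minimal sketch of the first bullet, assuming a process-wide atomic counter is enough to keep test node names distinct. The names `nodeSeq` and `uniqueNodeName` are hypothetical and are not taken from the Karpenter source; the actual change may generate names differently.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// nodeSeq is a hypothetical process-wide counter used only for this sketch.
var nodeSeq uint64

// uniqueNodeName returns prefix followed by a sequential ID. Because the
// counter is incremented atomically, two calls never return the same name
// within one process, even when called from concurrent goroutines.
func uniqueNodeName(prefix string) string {
	id := atomic.AddUint64(&nodeSeq, 1)
	return fmt.Sprintf("%s-%d", prefix, id)
}
```

Calling `uniqueNodeName("test-node")` twice yields two different names, so a fake cloud provider used in tests would no longer hand out the same node name for instances in different zones.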

3. How was this change tested?

Unit tests & scaling up/down inflate on EKS.

4. Does this change impact docs?

  • Yes, PR includes docs updates
  • Yes, issue opened: link to issue
  • No

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@tzneal tzneal requested a review from a team as a code owner March 30, 2022 16:13
@netlify

netlify bot commented Mar 30, 2022

Deploy Preview for karpenter-docs-prod canceled.

Name Link
🔨 Latest commit 8b286f8
🔍 Latest deploy log https://app.netlify.com/sites/karpenter-docs-prod/deploys/624481aae7a811000910ffa8

@@ -158,7 +158,9 @@ func (p *Provisioner) launch(ctx context.Context, node *scheduling.Node) error {
 	// ourselves to enforce the binding decision and enable images to be pulled
 	// before the node is fully Ready.
 	if _, err := p.coreV1Client.Nodes().Create(ctx, k8sNode, metav1.CreateOptions{}); err != nil {
-		if !errors.IsAlreadyExists(err) {
+		if errors.IsAlreadyExists(err) {
+			logging.FromContext(ctx).Debugf("node %s already registered", k8sNode.Name)
Contributor Author

Added this log because it would have saved me a few hours this morning. If a test somehow ends up with a duplicate node name, no error is reported, since the same thing can happen normally if kubelet comes up and registers the node first. As a result, the test doesn't fail with a "node already exists" error, and if it does fail, it looks like something is really wrong. In my case, a zonal topology spread was trying to schedule a node in "test-zone-2", but the node already existed in "test-zone-1". No error occurred, but the test for zone skew failed.
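
The failure mode described above can be sketched with the standard library alone. Everything here is a hypothetical stand-in: `fakeNodes` replaces the Kubernetes node client, `errAlreadyExists` replaces the API server's AlreadyExists status error, and `register` only mirrors the shape of the diff as far as it is visible in this PR, assuming the already-exists case is tolerated rather than returned.

```go
package main

import (
	"errors"
	"fmt"
)

// errAlreadyExists is a hypothetical stand-in for an AlreadyExists API error.
var errAlreadyExists = errors.New("already exists")

// fakeNodes is a hypothetical in-memory node store mapping name to zone.
type fakeNodes struct {
	zones map[string]string
}

// create registers a node, or returns errAlreadyExists if the name is taken.
func (f *fakeNodes) create(name, zone string) error {
	if _, ok := f.zones[name]; ok {
		return fmt.Errorf("node %q: %w", name, errAlreadyExists)
	}
	f.zones[name] = zone
	return nil
}

// register tolerates an already-exists error but reports it through debugf,
// so a duplicate name leaves a trace. Any other error is returned.
func register(f *fakeNodes, name, zone string, debugf func(format string, args ...interface{})) error {
	if err := f.create(name, zone); err != nil {
		if errors.Is(err, errAlreadyExists) {
			debugf("node %s already registered", name)
			return nil
		}
		return fmt.Errorf("creating node %s: %w", name, err)
	}
	return nil
}
```

Registering "node-1" in "test-zone-1" and then again in "test-zone-2" returns no error either time, and the stored zone stays "test-zone-1". Without the `debugf` call, nothing would indicate that the second registration was a no-op, which is how a zone skew assertion can fail with no obvious cause.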

Contributor

@bwagner5 bwagner5 left a comment

I would guess the comment is correct that this is rare, but did you check this in a real cluster to make sure that debug doesn't spam in the common case?

otherwise, lgtm

@tzneal
Contributor Author

tzneal commented Mar 30, 2022

> I would guess the comment is correct that this is rare, but did you check this in a real cluster to make sure that debug doesn't spam in the common case?
>
> otherwise, lgtm

Yup, nothing normally logs in a real cluster in my testing. I'm not sure how often kubelet would beat us to the node creation, but it doesn't seem likely:

karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:05.858Z	INFO	controller.provisioning	Batched 5 pods in 1.071391856s	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:06.009Z	DEBUG	controller.provisioning	Discovered security groups: [sg-00c5fed6615db429d sg-063c542ddd4538485]	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:06.011Z	DEBUG	controller.provisioning	Discovered kubernetes version 1.21	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:06.062Z	DEBUG	controller.provisioning	Discovered ami-05f60cea94ba81bec for query /aws/service/eks/optimized-ami/1.21/amazon-linux-2/recommended/image_id	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:06.227Z	DEBUG	controller.provisioning	Created launch template, Karpenter-tnealt-karpenter-demo-1937467200527745945	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:09.255Z	INFO	controller.provisioning	Launched instance: i-0983db01fa351a7a6, hostname: ip-192-168-24-90.us-west-2.compute.internal, type: t3a.2xlarge, zone: us-west-2b, capacityType: on-demand	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:09.273Z	INFO	controller.provisioning	Created node with 5 pods requesting {"cpu":"5125m","pods":"8"} from types c1.xlarge, c3.2xlarge, c4.2xlarge, c6i.2xlarge, c5d.2xlarge and 205 other(s)	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:09.326Z	INFO	controller.provisioning	Waiting for unschedulable pods	{"commit": "8b286f8", "provisioner": "default"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:09.326Z	DEBUG	controller.selection	Relaxing soft constraints for pod since it previously failed to schedule, adding: toleration for PreferNoSchedule taints	{"commit": "8b286f8", "pod": "default/inflate-6b88c9fb68-qlm94"}
karpenter-84894b6b77-7j5gb controller 2022-03-30T16:13:10.327Z	INFO	controller.provisioning	Batched 1 pods in 1.000529468s	{"commit": "8b286f8", "provisioner": "default"}

If it does log, it will be Debug only.

@tzneal tzneal merged commit 9e31a3f into aws:main Mar 30, 2022
@tzneal tzneal deleted the make-node-names-unique-in-tests branch March 30, 2022 17:27
@suket22 suket22 mentioned this pull request May 23, 2022