Karpenter fails to create node object if an instance does not have PrivateDnsName still an issue #1282

Closed
bwmetcalf opened this issue Feb 7, 2022 · 3 comments · Fixed by #1286
Labels: bug, burning

Comments

@bwmetcalf

Version

Karpenter: v0.5.4

Kubernetes: v1.21.5

Actual Behavior

We have seen this issue twice in the past week while running 0.5.4, even though #1067 says it was fixed in 0.5.4.

#1: got instance i-0eaa445d1c9aa01bb but PrivateDnsName was not set
#2: got instance i-0eaa445d1c9aa01bb but PrivateDnsName was not set
#3: got instance i-0eaa445d1c9aa01bb but PrivateDnsName was not set     {"commit": "7e79a67", "provisioner": "gitlab-runner-provisioner-karpenter-provisioner"}

Resource Specs and Logs

Same issue as #1067

bwmetcalf added the bug label Feb 7, 2022
@ellistarn
Contributor

To be clear from that issue, this is an unfortunate race condition: EC2 sometimes takes a long time to propagate the private DNS name. If you're using Karpenter-owned launch templates, the node should still register and work, since Karpenter injects all the expected labels into the userdata. If you're using a custom launch template, make sure you include karpenter.sh/provisioner-name as a label passed to the kubelet.

I think we could explore increasing the retries/timeout on EC2, as well.
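For the custom launch template case mentioned above, here is a minimal sketch of how the label could be rendered into user data. It assumes an EKS-optimized AMI whose `/etc/eks/bootstrap.sh` accepts `--kubelet-extra-args`; the helper names `nodeLabelsFlag` and `userData` and the cluster name are hypothetical and are not taken from Karpenter's code.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// nodeLabelsFlag renders labels as a single kubelet --node-labels flag.
// Pairs are sorted so the output is deterministic.
// Hypothetical helper, not part of Karpenter.
func nodeLabelsFlag(labels map[string]string) string {
	pairs := make([]string, 0, len(labels))
	for k, v := range labels {
		pairs = append(pairs, k+"="+v)
	}
	sort.Strings(pairs)
	return "--node-labels=" + strings.Join(pairs, ",")
}

// userData builds a bootstrap script for a custom launch template.
// Assumption: the AMI ships /etc/eks/bootstrap.sh (EKS-optimized AMI).
func userData(clusterName string, labels map[string]string) string {
	return fmt.Sprintf(
		"#!/bin/bash\n/etc/eks/bootstrap.sh %s --kubelet-extra-args '%s'\n",
		clusterName, nodeLabelsFlag(labels),
	)
}
```

Calling `userData("my-cluster", map[string]string{"karpenter.sh/provisioner-name": "gitlab-runner-provisioner-karpenter-provisioner"})` would produce a script that passes the provisioner label to the kubelet, so the node carries the label even if Karpenter never created the node object itself.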

ellistarn added the burning label Feb 7, 2022
@felix-zhe-huang
Contributor

Is there a known way to reproduce this bug consistently?

felix-zhe-huang self-assigned this Feb 7, 2022
felix-zhe-huang added and removed the bug label Feb 7, 2022
@felix-zhe-huang
Contributor

PR #1286 doubles the number of get-instance retries to 6, one second apart.
This doubles the window in which a private DNS name can be assigned before Karpenter raises an error.
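The retry behavior described above can be sketched roughly as follows. This is an illustration only, not the code from PR #1286: the `instance` type, the `describe` callback, and `getInstanceWithRetry` are hypothetical stand-ins for the real EC2 call, and only the attempt count (6) and the one-second spacing come from the comment.

```go
package main

import (
	"fmt"
	"time"
)

// instance is a hypothetical stand-in for the EC2 instance description.
type instance struct {
	ID             string
	PrivateDNSName string
}

// getInstanceWithRetry calls describe up to attempts times, sleeping delay
// between tries, and returns the first instance that has a private DNS name.
// If every attempt fails, it returns the error from the last attempt.
func getInstanceWithRetry(describe func() (*instance, error), attempts int, delay time.Duration) (*instance, error) {
	if attempts < 1 {
		attempts = 1 // always try at least once
	}
	var lastErr error
	for i := 0; i < attempts; i++ {
		if i > 0 {
			time.Sleep(delay)
		}
		inst, err := describe()
		if err != nil {
			lastErr = err
			continue
		}
		if inst.PrivateDNSName != "" {
			return inst, nil
		}
		// Same wording as the log lines quoted in this issue.
		lastErr = fmt.Errorf("got instance %s but PrivateDnsName was not set", inst.ID)
	}
	return nil, lastErr
}
```

With `attempts` set to 6 and `delay` set to `time.Second`, the sleeps between attempts add up to about five seconds, compared with about two seconds for 3 attempts. An instance whose DNS name shows up on the fourth describe call would succeed in the first case and fail in the second.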

felix-zhe-huang linked a pull request Feb 8, 2022 that will close this issue