Constraint validation incorrectly excluding daemonset pods in provisioning #1084

Closed
felix-zhe-huang opened this issue Jan 4, 2022 · 6 comments · Fixed by #1155
Assignees
Labels
api (Issues that require API changes) · bug (Something isn't working) · burning (Time sensitive issues) · documentation (Improvements or additions to documentation)

Comments

@felix-zhe-huang
Contributor

felix-zhe-huang commented Jan 4, 2022

Version

Karpenter: v0.5.3

Kubernetes: v1.21

Expected Behavior

Daemonset pods should be included in the provisioning process.

Actual Behavior

Daemonset pods with a NotIn node selector are excluded when the node constraints do not specify a value for the corresponding key. Further investigation suggests the constraint validation logic is incorrect.

This error also applies to application pods.
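To make the failure mode concrete, here is a minimal sketch (in Go, using the upstream Kubernetes API types) of the two inputs involved: a daemonset pod carrying a NotIn node affinity term on a custom label, and a provisioner whose constraints never mention that label key. The label keys and values (example.com/pool, the zone label) are hypothetical and chosen only for illustration.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Node affinity term a daemonset pod might carry: run anywhere EXCEPT nodes
	// labeled example.com/pool=restricted. The key is a hypothetical custom label.
	affinity := &corev1.NodeAffinity{
		RequiredDuringSchedulingIgnoredDuringExecution: &corev1.NodeSelector{
			NodeSelectorTerms: []corev1.NodeSelectorTerm{{
				MatchExpressions: []corev1.NodeSelectorRequirement{{
					Key:      "example.com/pool",
					Operator: corev1.NodeSelectorOpNotIn,
					Values:   []string{"restricted"},
				}},
			}},
		},
	}

	// Labels the provisioner will apply to nodes it creates. Note that the
	// custom key above is absent, which is the case the issue describes.
	provisionerLabels := map[string]string{
		"topology.kubernetes.io/zone": "us-west-2a", // hypothetical
	}

	fmt.Println(affinity, provisionerLabels)
}
```

With inputs shaped like this, the daemonset pod was excluded from provisioning calculations even though nothing in the provisioner's own constraints conflicts with its requirement.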

Steps to Reproduce the Problem

Resource Specs and Logs

@felix-zhe-huang added the bug, api, and burning labels on Jan 4, 2022
@felix-zhe-huang self-assigned this on Jan 4, 2022
@ellistarn
Contributor

Discussed offline.

Currently, we treat the NotIn operator for unknown labels (i.e., labels that are neither well known nor custom) as invalid. We do this because we can't know the labels that may be applied to the node by other operators. This causes daemonsets with NotIn rules to be incorrectly ignored in calculations.

I think it probably makes sense to start treating this case as schedulable; users must then be aware of other operators and avoid this edge case through proper configuration.
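A minimal sketch of the proposed semantics follows, assuming a hypothetical isSatisfiable helper over the Kubernetes NodeSelectorRequirement type; it is illustrative only, not Karpenter's actual validation code. The idea is that a NotIn requirement on a key the provisioner does not set is treated as schedulable rather than invalid.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// isSatisfiable reports whether a single NotIn requirement can be met by a node
// whose labels the provisioner controls. Illustrative sketch only.
func isSatisfiable(req corev1.NodeSelectorRequirement, nodeLabels map[string]string) bool {
	current, known := nodeLabels[req.Key]
	if req.Operator != corev1.NodeSelectorOpNotIn {
		return true // other operators omitted for brevity
	}
	if !known {
		// Old behavior: treat as invalid and drop the pod from calculations.
		// Proposed behavior: treat as schedulable and rely on configuration to
		// avoid clashes with labels applied by other operators.
		return true
	}
	for _, v := range req.Values {
		if v == current {
			return false
		}
	}
	return true
}

func main() {
	req := corev1.NodeSelectorRequirement{
		Key:      "example.com/pool", // hypothetical custom label
		Operator: corev1.NodeSelectorOpNotIn,
		Values:   []string{"restricted"},
	}
	fmt.Println(isSatisfiable(req, map[string]string{})) // true under the proposed behavior
}
```

The trade-off is the one noted above: if another operator later applies that label with a conflicting value, the daemonset pod may not actually land on the node, so configuration has to guard against that edge case.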

@felix-zhe-huang
Contributor Author

felix-zhe-huang commented Jan 4, 2022

I think it probably makes sense to start treating this case as schedulable; users must then be aware of other operators and avoid this edge case through proper configuration.

I agree. We should put some best practices in the document.

@felix-zhe-huang added the documentation label on Jan 4, 2022
@felix-zhe-huang
Contributor Author

felix-zhe-huang commented Jan 8, 2022

Discussed offline; we agreed that the current requirements implementation is becoming too complex to sustain. It may not be a bad idea to refactor requirements entirely. I will start working on it.

@felix-zhe-huang
Contributor Author

felix-zhe-huang commented Jan 15, 2022

The refactored requirements enable many new possibilities for improvement. For example, we can now test whether two different groups of requirements can coexist, which allows us to build larger pod schedules when grouping pods. The requirements also support the Exists and DoesNotExist operators, which aligns with Kubernetes' design.

The PR needs to be cleaned up. We also need to discuss whether we allow labels outside WellKnownLabels.
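For readers following along, here is a rough sketch of the coexistence check described above, written against the Kubernetes NodeSelectorRequirement type with a hypothetical compatible helper; it is not the implementation in the linked PR, just an illustration of intersecting two requirements on the same key and of how Exists and DoesNotExist interact.

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// compatible reports whether two requirements on the same label key can be
// satisfied by a single node. Rough sketch of the coexistence idea only.
func compatible(a, b corev1.NodeSelectorRequirement) bool {
	switch {
	case a.Operator == corev1.NodeSelectorOpExists && b.Operator == corev1.NodeSelectorOpDoesNotExist,
		a.Operator == corev1.NodeSelectorOpDoesNotExist && b.Operator == corev1.NodeSelectorOpExists:
		return false // one pod requires the label, the other forbids it
	case a.Operator == corev1.NodeSelectorOpIn && b.Operator == corev1.NodeSelectorOpIn:
		// Both restrict to explicit value sets; they coexist only if the sets overlap.
		allowed := map[string]bool{}
		for _, v := range a.Values {
			allowed[v] = true
		}
		for _, v := range b.Values {
			if allowed[v] {
				return true
			}
		}
		return false
	default:
		return true // remaining operator combinations omitted for brevity
	}
}

func main() {
	a := corev1.NodeSelectorRequirement{Key: "example.com/team", Operator: corev1.NodeSelectorOpIn, Values: []string{"a", "b"}}
	b := corev1.NodeSelectorRequirement{Key: "example.com/team", Operator: corev1.NodeSelectorOpIn, Values: []string{"b", "c"}}
	fmt.Println(compatible(a, b)) // true: both can run on a node labeled example.com/team=b
}
```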

@Moser-ss

Hey,
Is this issue fixed by PR #1155? I already tried version 0.6.1 and still see the same behavior I reported in issue #1180 (comment).

@HenryYanTR

I am getting the same issue with 0.6.3.
