Limit "provisioner does not exist" logging and fix startup reconciliation bug #517
Conversation
if errors.IsNotFound(err) {
	// Queue and batch a reconcile request for a non-existent, empty provisioner
	// This will reduce the number of repeated error messages about a provisioner not existing
	c.Batcher.Add(&v1alpha3.Provisioner{})
This feels like a nasty hack, but I don't have a better suggestion.
:)
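To make the coalescing concrete, here is a minimal standalone sketch of the batching idea, not Karpenter's actual Batcher type (the key, window, and flush callback are made up): duplicate triggers for the same, possibly non-existent, provisioner collapse into one flush per window, so the error is logged once per batch rather than once per pod event.

package main

import (
	"fmt"
	"sync"
	"time"
)

// Batcher is a sketch only, not Karpenter's Batcher: it coalesces repeated
// triggers for the same key into a single flush per window.
type Batcher struct {
	mu      sync.Mutex
	pending map[string]bool
	window  time.Duration
	flush   func(key string)
}

func NewBatcher(window time.Duration, flush func(string)) *Batcher {
	return &Batcher{pending: map[string]bool{}, window: window, flush: flush}
}

// Add records a key; only the first Add within a window schedules a flush,
// so duplicate events for a (possibly non-existent) provisioner collapse.
func (b *Batcher) Add(key string) {
	b.mu.Lock()
	defer b.mu.Unlock()
	if b.pending[key] {
		return
	}
	b.pending[key] = true
	time.AfterFunc(b.window, func() {
		b.mu.Lock()
		delete(b.pending, key)
		b.mu.Unlock()
		b.flush(key)
	})
}

func main() {
	b := NewBatcher(200*time.Millisecond, func(string) {
		fmt.Println("provisioner does not exist") // logged once per batch
	})
	// 50 pod events arrive for a provisioner that does not exist.
	for i := 0; i < 50; i++ {
		b.Add("")
	}
	time.Sleep(300 * time.Millisecond)
}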
pkg/controllers/allocation/filter.go (outdated)
	return nil
}
if name == provisioner.Name {
if !ok && v1alpha3.DefaultProvisioner.Name == provisioner.Name {
If we invert the predicates, it reads a bit more smoothly:
if ok && provisioner.Name == name
if !ok && provisioner.Name == v1alpha3.DefaultProvisioner.Name
If we could somehow assume that provisioner.Name was always populated, we could even remove the ok bit.
I'm not sure it's worth assuming that. Maybe we could, but it feels like something that could be forgotten at some point. IMO the ok isn't too dirty, and we should probably just leave it.
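For reference, a hypothetical sketch of the shape being discussed, keeping the ok check; the helper name and signature are made up and the real filter.go differs:

// Hypothetical helper, not the actual filter.go: true when this provisioner
// should handle the pod, with the predicates written in the inverted order
// suggested above. name/ok come from looking up the pod's provisioner selector.
func matchesProvisioner(provisioner *v1alpha3.Provisioner, name string, ok bool) bool {
	if ok && provisioner.Name == name {
		return true // pod explicitly selected this provisioner
	}
	if !ok && provisioner.Name == v1alpha3.DefaultProvisioner.Name {
		return true // unselected pods fall through to the default provisioner
	}
	return false
}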
Nice job going red!
Issue, if available:
N/A
Description of changes:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
Previous Log:
New Log (50 pod scale-up w/ no provisioner - 2 times, one per pod batch + 1):
New Log (1000 pod scale-up w/ no provisioner - prints 7 times, one per pod batch + 1):
Why is it "per pod batch +1" you ask?
It comes down to how the batcher and work queue interact. Karpenter watches for pod updates and adds the associated provisioner to the reconcile work queue as the events come in, so a Deployment scale-up enqueues the same provisioner many times. Reconcile waits for a batch before starting and then reconciles cluster state for the provisioner named in the request. Since Reconcile(..) starts almost immediately (depending on the number of unique provisioners ahead in the queue and the number of configured reconcile workers, currently 4 for Karpenter), some pods are added to the work queue after the reconcile has already triggered. The work queue has no way of knowing whether the in-flight reconcile accounted for those pods, so it dequeues one more request after the batch ends; and since no provisioner requests are enqueued after that reconcile kicks off, no more are dequeued.
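For illustration, the dedup-plus-one behavior can be seen with client-go's work queue directly (this is not Karpenter code, though controller-runtime is built on the same queue): identical requests added while an item is waiting are coalesced, but a request added while that item is being processed is re-queued once it is marked done, which is where the extra reconcile per batch comes from.

package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	q := workqueue.New()

	// 50 pod events all resolve to the same provisioner request.
	for i := 0; i < 50; i++ {
		q.Add("default-provisioner")
	}
	fmt.Println("queued:", q.Len()) // 1 — duplicates were coalesced

	item, _ := q.Get() // a reconcile worker picks up the request
	// An event arriving while that reconcile is in flight is recorded again,
	// because the queue can't know whether the running reconcile saw it.
	q.Add("default-provisioner")
	q.Done(item) // the pending duplicate is re-queued here
	fmt.Println("queued after Done:", q.Len()) // 1 — the "+1" reconcile
}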