[WIP] add node metrics controller #701
Conversation
✔️ Deploy Preview for karpenter-docs-prod canceled. 🔨 Explore the source changes: fd532e0 🔍 Inspect the deploy log: https://app.netlify.com/sites/karpenter-docs-prod/deploys/614df149f1d159000714b234
Looks really good, and very readable!
Would love to review this. Can you give me until Friday?
Sure
node := &v1.Node{}
if err := c.KubeClient.Get(ctx, req.NamespacedName, node); err != nil {
    if errors.IsNotFound(err) {
        // Pass `nil` to UpdateCount rather than a "zeroed" node.
The typical reconciler pattern is to return reconcile.Result{} (terminate), since it only occurs in a race condition where the object is deleted after the watch event arrives but before the loop is executed.
I can potentially see some value in tracking this case, but doing so with node = nil seems awkward and confusing to folks who might be looking at Prometheus.
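For concreteness, a minimal sketch of that conventional pattern, reusing the Controller and KubeClient names from this PR (the surrounding plumbing is assumed):

import (
    "context"

    v1 "k8s.io/api/core/v1"
    "k8s.io/apimachinery/pkg/api/errors"
    "sigs.k8s.io/controller-runtime/pkg/reconcile"
)

func (c *Controller) Reconcile(ctx context.Context, req reconcile.Request) (reconcile.Result, error) {
    node := &v1.Node{}
    if err := c.KubeClient.Get(ctx, req.NamespacedName, node); err != nil {
        if errors.IsNotFound(err) {
            // The node was deleted after the watch event arrived but before
            // this loop ran; terminate without requeueing.
            return reconcile.Result{}, nil
        }
        return reconcile.Result{}, err
    }
    // ... update metrics for the live node ...
    return reconcile.Result{}, nil
}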
    }
}
counter.UpdateCount(
    logging.WithLogger(ctx, logging.FromContext(ctx).Named("Count")),
Why redecorate the logger with a new name? Simpler to just counter.Update(ctx, node)?
For(&v1.Node{}).
WithOptions(
    controller.Options{
        RateLimiter: workqueue.NewMaxOfRateLimiter(
I know Karpenter defines this in other controllers, but I'd just leave this nil and rely on the default backoff retry settings. Not a bad idea to keep the concurrency though: controller.Options{MaxConcurrentReconciles: 4}.
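A sketch of that wiring; the Register method name and manager plumbing here are assumptions for illustration:

import (
    v1 "k8s.io/api/core/v1"
    controllerruntime "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/controller"
    "sigs.k8s.io/controller-runtime/pkg/manager"
)

func (c *Controller) Register(m manager.Manager) error {
    return controllerruntime.NewControllerManagedBy(m).
        For(&v1.Node{}).
        // RateLimiter left nil: controller-runtime's default backoff applies.
        WithOptions(controller.Options{MaxConcurrentReconciles: 4}).
        Complete(c)
}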
[]string{
    metrics.ProvisionerLabel,
    metricArchLabel,
    metricConditionDiskPressureLabel,
Consider leaving out DiskPressure, MemoryPressure, and PIDPressure, since they're outside of what Karpenter can help with. I'd keep these metrics focused on Karpenter's domain.
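For illustration, the trimmed slice; the memory- and PID-pressure label names below are guesses at this PR's naming, not confirmed identifiers:

[]string{
    metrics.ProvisionerLabel,
    metricArchLabel,
    // Dropped as outside Karpenter's domain:
    //   metricConditionDiskPressureLabel,
    //   metricConditionMemoryPressureLabel,
    //   metricConditionPIDPressureLabel,
}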
Namespace: metrics.KarpenterNamespace,
Subsystem: "cluster",
Name:      "node_count",
Help:      "Count of cluster nodes. Broken out by topology and status.",
Nit: "cluster nodes" is a bit redundant. Consider something like "Nodes by topology and status".
nodeCountGaugeVec = prometheus.NewGaugeVec(
    prometheus.GaugeOpts{
        Namespace: metrics.KarpenterNamespace,
        Subsystem: "cluster",
Thoughts on calling this capacity? Conceptually:
- "Karpenter Capacity Metrics" = "How good of a job is Karpenter doing managing my capacity?"
- "Karpenter Controller Metrics" = "How healthy is the karpenter process itself?"
karpenterapi "github.com/awslabs/karpenter/pkg/apis/provisioning/v1alpha4" | ||
"github.com/awslabs/karpenter/pkg/metrics" | ||
"github.com/prometheus/client_golang/prometheus" | ||
coreapi "k8s.io/api/core/v1" |
Kubernetes golang conventions recommend calling this v1, or corev1 if there's a conflict.
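That is, as two alternatives in sketch form:

// Without a conflict in the file:
import v1 "k8s.io/api/core/v1"

// With another v1 import already present:
import corev1 "k8s.io/api/core/v1"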
"reflect" | ||
"strings" | ||
|
||
karpenterapi "github.com/awslabs/karpenter/pkg/apis/provisioning/v1alpha4" |
Imports are recommended to be of the form v1alpha4, or provisioningv1alpha4 (walking up the chain) if there's a conflict. Typically you'd only alias an import if there was some sort of conflict. This gives developers a common convention and avoids alias sprawl for the same concept. It does mean that package names (and hierarchy) are critical, but it's a good sign of project organization if this convention yields English-readable names. FWIW, I think you've done excellently re: package naming in this PR.
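A sketch of the convention applied to this import:

import (
    // Preferred: matches the package name.
    "github.com/awslabs/karpenter/pkg/apis/provisioning/v1alpha4"

    // Only if v1alpha4 collides with another import, walk up the chain:
    // provisioningv1alpha4 "github.com/awslabs/karpenter/pkg/apis/provisioning/v1alpha4"
)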
// UpdateCount updates the emitted metric based on the node's current status relative to the
// past status. If the data for `node` cannot be populated then `nil` should be passed as the
// argument.
func UpdateCount(ctx context.Context, name types.NamespacedName, node *coreapi.Node) {
Consider counter.Update() instead of counter.UpdateCount(), which stutters. You can also get the name from the node object via node.Name (every Kubernetes object has it). In this case, nodes are a cluster-global resource -- a namespace for nodes will never make sense.
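A sketch of the suggested shape; keying the tracking map by node.Name instead of a NamespacedName is an assumption that follows from nodes being cluster-scoped:

func Update(ctx context.Context, node *v1.Node) {
    currLabels := getLabels(node)
    pastLabels, isKnown := prometheusLabelsFor[node.Name]
    // ... same switch over past/current labels as before ...
}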
func UpdateCount(ctx context.Context, name types.NamespacedName, node *coreapi.Node) {
    currLabels := getLabels(node)
    pastLabels, isKnown := prometheusLabelsFor[name]
    switch {
I see now why you're passing node as nil if not found.
This is a bit tricky, since there's no guarantee that your controller will catch the delete event (events are not guaranteed to arrive exactly once -- one could be delivered five times or never). The controller pattern is best effort and employs a combination of intervals (e.g. the resync interval) and watches.
It feels a bit dirty, but I might just statelessly recompute everything on some interval. Even at 10k-node scale, this should be pretty quick. If you're unfamiliar with client.Client, all objects are kept in a cache and synced with incoming events (a.k.a. ListWatch), so it's cheap to call client.List(nodes).
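A sketch of that stateless recompute; updateAll is a hypothetical helper name, reusing the KubeClient, nodeCountGaugeVec, and getLabels names from this PR:

import (
    "context"

    v1 "k8s.io/api/core/v1"
)

// updateAll lists every node from the client's cache and rebuilds the gauge
// from scratch, so a missed delete event can never leave a stale series behind.
func (c *Controller) updateAll(ctx context.Context) error {
    nodes := &v1.NodeList{}
    // client.Client serves List from the informer cache (ListWatch), so this
    // is cheap even at 10k nodes.
    if err := c.KubeClient.List(ctx, nodes); err != nil {
        return err
    }
    nodeCountGaugeVec.Reset() // drop every previously emitted series
    for i := range nodes.Items {
        nodeCountGaugeVec.With(getLabels(&nodes.Items[i])).Inc()
    }
    return nil
}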
prometheusLabelsFor[name] = currLabels

if err := decrementNodeCount(pastLabels); err != nil {
    logging.FromContext(ctx).Warnf("Failed to decrement previous count for updated node [labels=%s]: error=%s", pastLabels, err.Error())
I'd prefer we continue to follow the convention of Errorf instead of Warnf.
if node == nil {
    return labels
}
Consider just inlining all of this:
labels := prometheus.Labels{
    key: value,
    ...
}
prometheusLabelsFor[name] = currLabels

if err := decrementNodeCount(pastLabels); err != nil {
    logging.FromContext(ctx).Warnf("Failed to decrement previous count for updated node [labels=%s]: error=%s", pastLabels, err.Error())
I was testing this code out and I think there's a false assumption in here that the node properties are set. Properties like arch, os, and instance type may not be set at all.
I'm getting floods of this message:
karpenter-controller-9f4878f7b-8rt9b manager 2021-09-27T21:54:04.263Z WARN controller.NodeMetrics.Count Failed to decrement previous count for updated node [labels=map[arch: instancetype: os: provisioner:default region: zone:]]: error=inconsistent label cardinality: expected 10 label values but got 6 in prometheus.Labels{"arch":"", "instancetype":"", "os":"", "provisioner":"default", "region":"", "zone":""} {"commit": "5655c94"}
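One way to sidestep that cardinality error is to seed every registered label key with an empty default before filling in what the node actually has. A sketch: labelNames, ProvisionerNameLabelKey, and the exact key lookups are assumptions for illustration, not confirmed identifiers from this PR.

func getLabels(node *v1.Node) prometheus.Labels {
    labels := prometheus.Labels{}
    // Seed every key registered on the GaugeVec so the cardinality always
    // matches, even when a node property is missing.
    for _, key := range labelNames {
        labels[key] = ""
    }
    labels[metrics.ProvisionerLabel] = node.Labels[karpenterapi.ProvisionerNameLabelKey]
    labels[metricArchLabel] = node.Labels[v1.LabelArchStable]
    // ... populate the remaining keys the same way ...
    return labels
}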
There are some gaps in this approach. Closing this PR and will open a new PR with the revised approach.
1. Issue, if available:
#612 "Emit Metrics for Karpenter"
This PR does not fully resolve the issue. More changes will be needed.
2. Description of changes:
Add a controller that watches node events and updates counters across multiple dimensions, e.g. provisioner, readiness, and instance type.
3. Does this change impact docs?
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.