-
Notifications
You must be signed in to change notification settings - Fork 452
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[HPA] Add targetCPUUtilization field to collector config #1066
Conversation
@moh-osman3 please sign the CLA |
if otelcol.Spec.TargetCPUUtilization != nil { | ||
cpuTarget = *otelcol.Spec.TargetCPUUtilization | ||
} else { | ||
cpuTarget = defaultCPUTarget |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the default value of `otelcol.Spec.TargetCPUUtilization should be set in the defaulting webhook.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! moved the setting of default to opentelemetrycollector_webhook.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems without this nil check, unit tests will begin failing with
--- FAIL: TestHPA (0.00s)
--- FAIL: TestHPA/v2 (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1cf2364]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a comment about where to add in the default behavior, also a question about testing
@@ -129,6 +129,10 @@ func (r *OpenTelemetryCollector) validateCRDSpec() error { | |||
return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, minReplicas should be one or more") | |||
} | |||
|
|||
if r.Spec.TargetCPUUtilization != nil && (*r.Spec.TargetCPUUtilization < int32(1) || *r.Spec.TargetCPUUtilization > int32(99)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also be setting a default in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the review! Setting the default in opentelemetrycollector_webhook.go now
tests/e2e/autoscale/00-assert.yaml
Outdated
kind: HorizontalPodAutoscaler | ||
metadata: | ||
name: simplest-collector | ||
spec: | ||
minReplicas: 1 | ||
maxReplicas: 2 | ||
|
||
metrics: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep a test for the default behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a new OpenTelemetryCollector that does not set targetCPUUtilization
for the default.
@@ -62,6 +62,12 @@ func (r *OpenTelemetryCollector) Default() { | |||
one := int32(1) | |||
r.Spec.Replicas = &one | |||
} | |||
|
|||
// if autoscaling is enabled then set default targetCPUUtilization | |||
if r.Spec.MaxReplicas != nil && r.Spec.TargetCPUUtilization == nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this guard needed. Perhaps we could default always.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm based on your comment below I think I included the guard bc I was worried the default might override a user set value (and I also think it makes it clear this setting is only used when autoscaling is enabled). Removed the guard here and the ones in your comment below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems this guard and the next comment's are needed otherwise the e2e tests begin to fail
@@ -119,13 +120,23 @@ func setAutoscalerSpec(params Params, autoscalingVersion autodetect.AutoscalingV | |||
} else { | |||
updated.(*autoscalingv2beta2.HorizontalPodAutoscaler).Spec.MinReplicas = &one | |||
} | |||
if params.Instance.Spec.TargetCPUUtilization != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this check needed? The defaulting webhook should make sure the value is always set.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
} else { | ||
updated.(*autoscalingv2.HorizontalPodAutoscaler).Spec.MaxReplicas = *params.Instance.Spec.MaxReplicas | ||
if params.Instance.Spec.MinReplicas != nil { | ||
updated.(*autoscalingv2.HorizontalPodAutoscaler).Spec.MinReplicas = params.Instance.Spec.MinReplicas | ||
} else { | ||
updated.(*autoscalingv2.HorizontalPodAutoscaler).Spec.MinReplicas = &one | ||
} | ||
if params.Instance.Spec.TargetCPUUtilization != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the same question as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed.
linting failed |
I think that was because I had some spaces instead of tabs in the webhook file. |
@kevinearls could you please review as well? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@moh-osman3 CI failed |
@pavolloffay Sorry for delay I was on PTO last week! Yeah I noticed CI was failing due to unit test error
This was resolved by adding back the nil check in I also noticed that after I applied e54ef2a my e2e tests are failing but unable to figure out the exact issue. I think I'm a little confused why the default set in the webhook isn't being picked up in |
if otelcol.Spec.TargetCPUUtilization != nil { | ||
cpuTarget = *otelcol.Spec.TargetCPUUtilization | ||
} else { | ||
cpuTarget = int32(90) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that the default CPU target utilization is defined in two places: defaulting webhook and here. There should be only one place where defaults are defined - preferably in the defaulting webhook.
if params.Instance.Spec.TargetCPUUtilization != nil { | ||
updated.(*autoscalingv2beta2.HorizontalPodAutoscaler).Spec.Metrics[0].Resource.Target.AverageUtilization = params.Instance.Spec.TargetCPUUtilization | ||
} else { | ||
updated.(*autoscalingv2beta2.HorizontalPodAutoscaler).Spec.Metrics[0].Resource.Target.AverageUtilization = &ninety |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same here - the default is defined in two places
the PR needs to be rebased |
b7c0650
to
d2fa170
Compare
// TargetCPUUtilization sets the target average CPU used across all replicas. | ||
// If average CPU exceeds this value, the HPA will scale up. Defaults to 90 percent. | ||
// +optional | ||
TargetCPUUtilization *int32 `json:"targetCPUUtilization,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kevinearls could you please review this PR.
I am curious if this should be moved into .spec.autoscaler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pavolloffay @moh-osman3 Yes, sorry I didn't comment on this earlier. spec.autoscaler is the best place for this (and in the future we should figure out how to move maxreplicas.)
In the future we should put any other metrics in spec.autoscaler too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the future we should put any other metrics in spec.autoscaler too.
We should put this comment somewhere in the CRD or book a ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for pointing this out! I moved TargetCPUUtilization
to the appropriate place in the spec now. I also created #1115 to track moving MaxReplicas
and MinReplicas
to .spec.autoscaler
. I don't mind taking this issue on in a followup PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Implementation for #1065
Adding targetCPUUtilization (percentage) to the collector config instead of hard coding the value.