Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add autoscale option to enable support for Horizontal Pod Autoscaling #746

Merged
merged 10 commits into from
Mar 8, 2022

Conversation

binjip978
Copy link
Contributor

@binjip978 binjip978 commented Feb 26, 2022

Resolves #729

@binjip978 binjip978 marked this pull request as ready for review February 27, 2022 09:52
@binjip978 binjip978 requested review from a team and bogdandrutu February 27, 2022 09:52
}

if r.Spec.MinReplicas != nil && *r.Spec.MinReplicas > *r.Spec.MaxReplicas {
return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, minReplicas could be more than maxReplicas")
Copy link
Contributor

@avadhut123pisal avadhut123pisal Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minReplicas could be more than maxReplicas

The error message is incorrect, it should be like "MinReplicas value must not be greater than MaxReplicas value".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

labels["app.kubernetes.io/name"] = naming.Collector(otelcol)

annotations := Annotations(otelcol)
var cpuTarget int32 = 90
Copy link
Contributor

@avadhut123pisal avadhut123pisal Mar 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

var cpuTarget int32 = 90

The value of the cpuTarget is hardcoded here. I think we should take the input of TargetCPUUtilizationPercentage as a part of OpenTelemetryCollectorSpec.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that make sense to do, but I initially tried to be consistent with jaeger operator https://github.com/jaegertracing/jaeger-operator/pull/856/files where it also hardcoded.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine for the first version.

I would recommend adding a comment and extracting the value to a constant.

Signed-off-by: Sergei Semenchuk <[email protected]>
@@ -33,10 +33,22 @@ type OpenTelemetryCollectorSpec struct {
// +optional
Args map[string]string `json:"args,omitempty"`

// Autoscale turns on/off the autoscale feature. By default, it's enabled if the Replicas field is not set.
// +optional
Autoscale *bool `json:"autoscale,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be set to true in the defaulting webhook if replicas is nil?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure, right now the default is one replica, if you don't specify replicas, I tried to not change default behaviour


// MaxReplicas sets an upper bound to the autoscaling feature. When autoscaling is enabled and no value is provided, a default value is used.
// +optional
MaxReplicas *int32 `json:"maxReplicas,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a proposal to make the config/CR simpler.

Could we add only MaxReplicas field in this PR to support autostacling?

If the max replicas is set the operator would create HPA. In the validating webhook it would check if maxReplicas is higher than replicas.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that MaxReplicas can be nil (e.g. for infinite scaling) in this case we need Autoscale flag to enable the HPA.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably we can do It (remove autoscale), https://kubernetes.io/docs/reference/kubernetes-api/workload-resources/horizontal-pod-autoscaler-v1/#HorizontalPodAutoscalerSpec
because maxReplicas is required:

maxReplicas (int32), required
upper limit for the number of pods that can be set by the autoscaler; cannot be smaller than MinReplicas.

labels["app.kubernetes.io/name"] = naming.Collector(otelcol)

annotations := Annotations(otelcol)
var cpuTarget int32 = 90
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's fine for the first version.

I would recommend adding a comment and extracting the value to a constant.

}
params.Log.V(2).Info("created", "hpa.name", desired.Name, "hpa.namespace", desired.Namespace)
continue
} else if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This construct can be made easier to read e.g.

if err != nil {
 if isNotFound(err) {
 // create
 } else {
  return fmt.Errorf("failed to get %w", err)
 }
}

or the following if isNotFound can be called with nil (I think it can)

if isNotFound(err){
} else err != nil {
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

Copy link
Member

@pavolloffay pavolloffay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just some minor comments

@@ -109,5 +109,16 @@ func (r *OpenTelemetryCollector) validateCRDSpec() error {
}
}

// validate autoscale with horizontal pod autoscaler
if r.Spec.MaxReplicas != nil {
if r.Spec.MaxReplicas == nil || *r.Spec.MaxReplicas < int32(1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r.Spec.MaxReplicas == nil can be removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

return fmt.Errorf("the OpenTelemetry Spec autoscale configuration is incorrect, maxReplicas should be defined and more than one")
}

if r.Spec.Replicas != nil && *r.Spec.Replicas > *r.Spec.MaxReplicas {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

while we are at this, could you please set the Replicas in the defaulting webhook if it is null and remove other code that sets the replicas outside of the defaulting webhook?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure that I got this right, do you ask to set Spec.Replicas equal to 1 in Default() if r.Spec.Replicas == nil? Because I can see where we set Spec.Replicas outside of it, only in tests and here:

	// if autoscale is enabled, use replicas from current Status
		if params.Instance.Spec.MaxReplicas != nil {
			currentReplicas := existing.Status.Replicas
			// if replicas (minReplicas from HPA perspective) is bigger than
			// current status use it.
			if *params.Instance.Spec.Replicas > currentReplicas {
				currentReplicas = *params.Instance.Spec.Replicas
			}
			updated.Spec.Replicas = &currentReplicas
		}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replicas should be set to 1 in deployment if it is nil.

It's fine I will take a look at it in another PR.

Signed-off-by: Sergei Semenchuk <[email protected]>
@pavolloffay pavolloffay merged commit cd4fa9b into open-telemetry:main Mar 8, 2022
ItielOlenick pushed a commit to ItielOlenick/opentelemetry-operator that referenced this pull request May 1, 2024
…open-telemetry#746)

* add autoscale, minReplicas maxReplicas to collector types

Signed-off-by: Sergei Semenchuk <[email protected]>

* manage hpa with min/max replicas if autoscale is enablad

Signed-off-by: Sergei Semenchuk <[email protected]>

* fix linter

Signed-off-by: Sergei Semenchuk <[email protected]>

* use status if autoscale is true

Signed-off-by: Sergei Semenchuk <[email protected]>

* add validation for autoscale

Signed-off-by: Sergei Semenchuk <[email protected]>

* remove minReplicas

Signed-off-by: Sergei Semenchuk <[email protected]>

* remove autoscale parameter, trigger HPA if maxReplicas is not nil

Signed-off-by: Sergei Semenchuk <[email protected]>

* use collector replicas for minReplicas in HPA if HPA status is currently
lower

Signed-off-by: Sergei Semenchuk <[email protected]>

* remove nil check

Signed-off-by: Sergei Semenchuk <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Collector replicas count and HorizontalPodAutoscaler
3 participants