Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent operator from overriding .Spec.Replicas #979

Merged

Conversation

jpkrohling
Copy link
Contributor

Fixes #963 by using the current .Spec.Replicas from the Deployment if the desired number of replicas isn't explicitly set.

Signed-off-by: Juraci Paixão Kröhling [email protected]

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
@codecov
Copy link

codecov bot commented Mar 19, 2020

Codecov Report

Merging #979 into master will decrease coverage by 0.02%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #979      +/-   ##
==========================================
- Coverage   64.47%   64.44%   -0.03%     
==========================================
  Files          82       82              
  Lines        6536     6526      -10     
==========================================
- Hits         4214     4206       -8     
+ Misses       2179     2178       -1     
+ Partials      143      142       -1     
Impacted Files Coverage Δ
pkg/deployment/collector.go 100.00% <100.00%> (+0.99%) ⬆️
pkg/deployment/ingester.go 100.00% <100.00%> (ø)
pkg/deployment/query.go 100.00% <100.00%> (ø)
pkg/inventory/deployment.go 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4567d43...009b087. Read the comment docs.

@objectiser
Copy link
Contributor

@jpkrohling Once reviewed, would you be able to provide the same fix for ingester, so we don't run into the same issue with adding HPA to that?

Signed-off-by: Juraci Paixão Kröhling <[email protected]>
@jpkrohling
Copy link
Contributor Author

@objectiser great point. I just went ahead and changed the query and ingester to use the same approach as the collector when it comes to the replica size.

@@ -88,12 +88,6 @@ func (c *Collector) Get() *appsv1.Deployment {
// see https://github.com/jaegertracing/jaeger-operator/issues/334
sort.Strings(options)

replicaSize := c.jaeger.Spec.Collector.Replicas
if replicaSize == nil || *replicaSize < 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we no longer have to check on nill. We might remove the pointer and use plain int.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand your comment. The Deployment's Spec.Replica has the same type as our .Spec.Collector.Replicas, and nil means "not set".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c.jaeger.Spec.Collector.Replicas is a pointer. The question is whether we can change it to plain int

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The deployment's .Spec.Replicas is also a pointer. If we change our side to plain int, it suddenly has a value, like 0, which is not the same as nil. Once we get this ambiguity, our fix in the inventory package won't work anymore.

// if we have a nil value for the replicas in the desired deployment
// but we have a specific value in the current deployment, we override the desired with the current
// as this might have been written by an HPA
if tp.Spec.Replicas != nil && v.Spec.Replicas == nil {
v.Spec.Replicas = tp.Spec.Replicas
}

@@ -488,6 +488,21 @@ func TestAutoscalersDisabledByExplicitOption(t *testing.T) {
assert.Len(t, a, 0)
}

func TestAutoscalersSetMaxReplicas(t *testing.T) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what is going on. Could you please explain this test and is the name of the test correct?

Does the autoscaler set the number of replicas? The replicas are set directly in the CR and then autoscaler has the same value.

Copy link
Contributor Author

@jpkrohling jpkrohling Mar 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is covering a hole in coverage that I found when working on this PR. Basically, when MaxReplicas is set in our CR, we pass it along to the HPA's MaxReplicas value. If we don't have an explicit value, we set an hard upper bound, 100.

Replicas set directly in the deployment CR are seen as "initial replicas", which are then overridden by the autoscaler as needed. So, it's possible to have the autoscaler set for 2 as MinReplicas, 10 as MaxReplicas and 5 as the deployment's Replicas. In that case, it first gets set to 5, then scales down to 2 or up to 10 based on the actual demand. The problem in this CR is that we were explicitly setting a default value for Replicas in the deployments, which would cause deployments that were autoscaled to be scaled back to 1 when no Replicas were set at our CR level.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this func test all replica fields Replicas, MinReplicas and MaxReplicas?

It would help me to understand the behavior if we had all combinations in the single test via:

tests := []struct{

replicas int,
minReplicas int,
maxReplicas int,
//...
//expected values for these fields
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I should have mentioned that the behavior I described is the Deployment's and HPA's. With our operator as part of the mix, setting an explicit value for Replicas will disable HPA, as we understand that people would want a fixed amount of replicas. I just checked and we have a test describing this case (TestAutoscalersDisabledByExplicitReplicaSize)

@jpkrohling jpkrohling merged commit c6fa7bc into jaegertracing:master Mar 20, 2020
jpkrohling added a commit to jpkrohling/jaeger-operator that referenced this pull request Dec 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

During CRD reconciliation operator scales down replica set to 1 when HPA is enabled
3 participants