Deployment migration error when changing from jaeger-operator v1.18.1 to higher. #1564

Open
DWonMtl opened this issue Sep 30, 2021 · 0 comments
Labels
bug Something isn't working

DWonMtl commented Sep 30, 2021

Describe the bug

We ran into a Deployment resource error while migrating jaeger-operator from version v1.18.0 to v1.24.0.

time="2021-09-29T14:57:01Z" level=error msg="failed to apply the changes" error="Deployment.apps \"tracingstack-2a1696c6-sg-cce2e81c\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"app\":\"jaeger\", \"app.kubernetes.io/component\":\"all-in-one\", \"app.kubernetes.io/instance\":\"tracingstack-2a1696c6-sg-cce2e81c\", \"app.kubernetes.io/managed-by\":\"jaeger-operator\", \"app.kubernetes.io/name\":\"tracingstack-2a1696c6-sg-cce2e81c\", \"app.kubernetes.io/part-of\":\"jaeger\", \"tracing.fleet.ubisoft.com/stack-hash\":\"2a1696c6\", \"tracing.fleet.ubisoft.com/stack-name\":\"tracingstack-2a1696c6\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable" execution="2021-09-29 14:57:01.8305477 +0000 UTC" instance=tracingstack-2a1696c6-sg-cce2e81c namespace=fleet-system

This is related to the fix in #1153, which itself addresses issue #629.
Issue #1531 is requesting a similar fix.

The migration breaks because deployment.spec.selector is an immutable field.
Therefore, as soon as jaeger.spec.allInOne.labels is not identical to the labels that were previously hard-coded by the jaeger-operator (see here),
the migration of the Deployment resource from v1.18.1 to v1.19.0 fails.
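
To illustrate the failure mode, here is a minimal sketch of the comparison that matters. It assumes, as the log above suggests, that v1.19.0 builds the selector from the default labels plus the user labels. The label values and the selectorChanged helper are hypothetical and not part of the operator.

```go
package main

import (
	"fmt"
	"reflect"
)

// selectorChanged reports whether the desired selector labels differ from the
// ones stored on the existing Deployment. Any difference makes the update
// fail with "field is immutable".
func selectorChanged(current, desired map[string]string) bool {
	return !reflect.DeepEqual(current, desired)
}

func main() {
	// Hypothetical selector as created by the old operator: default labels only.
	current := map[string]string{
		"app":                         "jaeger",
		"app.kubernetes.io/component": "all-in-one",
	}
	// Hypothetical selector computed after the upgrade: defaults plus a user
	// label taken from jaeger.spec.allInOne.labels.
	desired := map[string]string{
		"app":                         "jaeger",
		"app.kubernetes.io/component": "all-in-one",
		"example.com/stack-name":      "demo",
	}

	fmt.Println(selectorChanged(current, current)) // false: update is accepted
	fmt.Println(selectorChanged(current, desired)) // true: update is rejected
}
```

With no user labels the two maps are identical, which would explain why the upgrade only fails for Jaeger resources that set jaeger.spec.allInOne.labels.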

I can see that @abstulo in #629 was proposing a solution that wouldn't have caused any migration errors, since the proposal was to:

  1. Build the deployment selector labels from the default labels only.
  2. Merge the jaeger.spec.allInOne.labels with the default labels into the deployment.spec.template.metadata.labels.

Example:

	return &appsv1.Deployment{
		...
		Spec: appsv1.DeploymentSpec{
			Selector: &metav1.LabelSelector{
				MatchLabels: a.labels(),                                 // <--- same as before
			},
			Template: corev1.PodTemplateSpec{
				ObjectMeta: metav1.ObjectMeta{
					Labels: mergeLabels(commonSpec.Labels, a.labels()),  // <--- merge between the default selector labels and the user defined labels
					...
				},
			},
		},
	}

where mergeLabels is something like:

func mergeLabels(labelsList ...map[string]string) map[string]string {
	result := map[string]string{}
	for _, labels := range labelsList {
		for k, v := range labels {
			result[k] = v
		}
	}
	return result
}
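
The argument order matters with this helper: later maps win on key conflicts, so the default labels should come last to make sure the pod template still matches the selector. The sketch below shows this with hypothetical label values; isSubset is an illustrative helper, not operator code.

```go
package main

import "fmt"

// mergeLabels merges the given maps; later maps win on key conflicts.
func mergeLabels(labelsList ...map[string]string) map[string]string {
	result := map[string]string{}
	for _, labels := range labelsList {
		for k, v := range labels {
			result[k] = v
		}
	}
	return result
}

// isSubset reports whether every key/value pair of sub is present in super.
// A Deployment is only valid if its selector is a subset of the pod labels.
func isSubset(sub, super map[string]string) bool {
	for k, v := range sub {
		if got, ok := super[k]; !ok || got != v {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"app": "jaeger"}                // hypothetical defaults
	user := map[string]string{"app": "custom", "team": "tracing"} // hypothetical user labels

	// Defaults last: the conflicting user value for "app" is overridden.
	podLabels := mergeLabels(user, selector)
	fmt.Println(podLabels["app"], podLabels["team"]) // jaeger tracing
	fmt.Println(isSubset(selector, podLabels))       // true

	// Defaults first: the user value wins and the selector no longer matches.
	fmt.Println(isSubset(selector, mergeLabels(selector, user))) // false
}
```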

That being said, changing back to what it was or implementing the suggestion above now will also cause the same migration error for Deployments created by v1.19.0 or later, since this will again modify the LabelSelector.

Therefore, a solution that will always guarantee no migration error would be to:

  1. Fetch the current Deployment from Kubernetes.
  2. If it exists:
    2.1 Reuse the labels already in deployment.spec.selector.matchLabels.
  3. If it doesn't exist:
    3.1 Use the standard selector labels a.labels().
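
The steps above can be sketched roughly as follows. To stay self-contained, the sketch uses a simplified, hypothetical deployment type instead of appsv1.Deployment, and a nil pointer stands in for "not found". In the operator, the lookup would presumably be a Get through the Kubernetes client. The names selectorLabelsFor and copyLabels are illustrative only.

```go
package main

import "fmt"

// deployment is a simplified, hypothetical stand-in for appsv1.Deployment
// that keeps only the selector labels relevant to this sketch.
type deployment struct {
	selectorLabels map[string]string
}

// copyLabels returns a copy so callers cannot mutate the source map.
func copyLabels(in map[string]string) map[string]string {
	out := make(map[string]string, len(in))
	for k, v := range in {
		out[k] = v
	}
	return out
}

// selectorLabelsFor reuses the selector of an existing Deployment, if any,
// and falls back to the default labels (a.labels() in the operator) otherwise.
func selectorLabelsFor(existing *deployment, defaults map[string]string) map[string]string {
	if existing != nil && len(existing.selectorLabels) > 0 {
		return copyLabels(existing.selectorLabels)
	}
	return copyLabels(defaults)
}

func main() {
	defaults := map[string]string{"app": "jaeger"} // hypothetical default labels

	// Step 3: no Deployment exists yet, so the defaults are used.
	fmt.Println(selectorLabelsFor(nil, defaults)) // map[app:jaeger]

	// Step 2: a Deployment created with extra selector labels keeps them.
	existing := &deployment{
		selectorLabels: map[string]string{"app": "jaeger", "team": "tracing"},
	}
	fmt.Println(selectorLabelsFor(existing, defaults)) // map[app:jaeger team:tracing]
}
```

Whichever selector is reused, the pod template labels would still need to include every selector label, otherwise the Deployment is rejected for a different reason.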

This is not a big issue, since the workaround is as simple as deleting the faulty Deployment and letting the jaeger-operator re-create it.
For now it only affects the AllInOne strategy, which we only use for local development.
However, if #1531 is done the same way, it might cause some headaches for our team 😁.

To Reproduce
Steps to reproduce the behavior:

  1. Deploy the jaeger-operator version 1.18.1.
  2. Deploy a Jaeger custom resource with valid and additional jaeger.spec.allInOne.labels.
  3. Wait for it to stabilize.
  4. Update the jaeger-operator to version 1.19.0. You should see a similar error:
time="2021-09-29T14:57:01Z" level=error msg="failed to apply the changes" error="Deployment.apps \"tracingstack-2a1696c6-sg-cce2e81c\" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{\"app\":\"jaeger\", \"app.kubernetes.io/component\":\"all-in-one\", \"app.kubernetes.io/instance\":\"tracingstack-2a1696c6-sg-cce2e81c\", \"app.kubernetes.io/managed-by\":\"jaeger-operator\", \"app.kubernetes.io/name\":\"tracingstack-2a1696c6-sg-cce2e81c\", \"app.kubernetes.io/part-of\":\"jaeger\", \"tracing.fleet.ubisoft.com/stack-hash\":\"2a1696c6\", \"tracing.fleet.ubisoft.com/stack-name\":\"tracingstack-2a1696c6\"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable" execution="2021-09-29 14:57:01.8305477 +0000 UTC" instance=tracingstack-2a1696c6-sg-cce2e81c namespace=fleet-system

Expected behavior
Updating from one version to another should ideally not require any external modification to the operator-owned resources.
In this case, the only way we can fix it is by deleting either the whole Jaeger custom resource or the faulty Deployment resource.

Screenshots
N/A

Version (please complete the following information):

  • OS: Linux
  • Jaeger version: 1.18
  • Deployment: Kubernetes

What troubleshooting steps did you try?

  • I looked at the diff of the deployment selector labels between versions v1.18.0 and v1.24.0 in order to understand why they were different.

Additional context
No additional context

@DWonMtl DWonMtl added the bug Something isn't working label Sep 30, 2021