The "managedFields" isn't restored #5701

Closed
ywk253100 opened this issue Dec 16, 2022 · 7 comments

@ywk253100
Contributor

When restoring data from a backup, Velero doesn't restore the managedFields that existed in the original data.

managedFields records the history of which controller/tool is responsible for each field/list entry in the object, and this history is used to determine the result of the next operation when using server-side apply patches, as described in https://kubernetes.io/docs/reference/using-api/server-side-apply/
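
As a rough illustration (the object and field names here are just an example), an object created with kubectl server-side apply carries an entry with kubectl as the field manager, roughly like this:

kubectl get cm example -o yaml --show-managed-fields
...
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        f:some_key: {}
    manager: kubectl
    operation: Apply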

Not only does Velero intentionally drop this valuable information, but after restoring an object Velero itself becomes the manager of all fields, preventing any other controller from deleting fields or removing list entries.

In order for the system to work properly when using server-side apply after a restore, Velero needs to restore the managedFields that existed in the original objects (removing any trace of Velero itself).

Vote on this issue!

This is an invitation to the Velero community to vote on issues; you can see the project's top-voted issues listed here.
Use the "reaction smiley face" at the top right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"
@reasonerjt
Contributor

@ywk253100
Thanks for the write-up.
We should also double-check Velero's current choice to clear most of a resource's metadata.

Maybe we should keep most of the fields by default and only clean up the ones that cause problems.
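
As a rough sketch of that idea (the split below is an assumption for illustration, not a description of Velero's current behavior), the metadata that must be cleared on restore is mostly cluster-assigned state, while managedFields is a candidate to carry over from the backup:

metadata:
  creationTimestamp: ...   # cluster-assigned, regenerated on restore
  resourceVersion: ...     # cluster-assigned, regenerated on restore
  uid: ...                 # cluster-assigned, regenerated on restore
  managedFields: [...]     # ownership history, candidate to preserve from the backup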

@ywk253100 ywk253100 modified the milestones: 1.10.1, v1.11 Jan 4, 2023
@reasonerjt
Contributor

#2416 was also trying to address the problem with managedFields

@ywk253100
Contributor Author

Not restoring managedFields causes an issue when patching objects with server-side apply:

  1. Create a configmap (an example manifest is sketched after these steps):
kubectl apply -f configmap.yaml --server-side
  2. Back up the configmap.
  3. Delete the configmap and restore it.
  4. After the restore, the manager becomes velero-server:
kubectl get cm cm3 -o yaml --show-managed-fields
apiVersion: v1
data:
  player_initial_lives: "4"
  ui_properties_file_name: user-interface.properties
kind: ConfigMap
metadata:
  creationTimestamp: "2023-01-17T08:39:35Z"
  labels:
    velero.io/backup-name: my-backup-02
    velero.io/restore-name: my-restore-01
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:data:
        .: {}
        f:player_initial_lives: {}
        f:ui_properties_file_name: {}
      f:metadata:
        f:labels:
          .: {}
          f:velero.io/backup-name: {}
          f:velero.io/restore-name: {}
    manager: velero-server
    operation: Update
    time: "2023-01-17T08:39:35Z"
  name: cm3
  namespace: default
  resourceVersion: "28857226"
  uid: c5a2dac4-c3b3-4925-908f-5cd415901f3e
  5. Modify the configmap and apply it again; the apply fails with a conflict error:
kubectl apply -f workload/configmap.yaml --server-side
error: Apply failed with 1 conflict: conflict with "velero-server" using v1: .data.player_initial_lives
Please review the fields above--they currently have other managers. Here
are the ways you can resolve this warning:
* If you intend to manage all of these fields, please re-run the apply
  command with the `--force-conflicts` flag.
* If you do not intend to manage all of the fields, please edit your
  manifest to remove references to the fields that should keep their
  current managers.
* You may co-own fields by updating your manifest to match the existing
  value; in this case, you'll become the manager if the other manager(s)
  stop managing the field (remove it from their configuration).
See https://kubernetes.io/docs/reference/using-api/server-side-apply/#conflicts
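
For completeness, a configmap.yaml matching the output above could look like the following (reconstructed from the data shown, so the exact values are only an example):

apiVersion: v1
kind: ConfigMap
metadata:
  name: cm3
  namespace: default
data:
  player_initial_lives: "4"
  ui_properties_file_name: user-interface.properties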

@sbueringer

It would be nice to get this fixed. This can lead to significant issues with Cluster API (we use server-side apply very extensively).

@ywk253100
Contributor Author

@sbueringer The plan is to fix this in v1.10, which should be consumed by the TKG H release.
We didn't see any issues during testing after restoration (upgrade/scale/delete of WL clusters) in the TKG G release. And per my understanding, the recommended way for controllers is to always "force" conflicts, so they should not be impacted by this issue. Correct me if I'm wrong.

@sbueringer

sbueringer commented Feb 13, 2023

The plan is to fix this in v1.10, which should be consumed by the TKG H release.

Sounds good!

And per my understanding, the recommended way for controllers is to always "force" conflicts, so they should not be impacted by this issue.

This is correct and Cluster API uses force conflict everywhere.

The issue is a bit more nuanced.

Consider the following example:

  • A ClusterClass-based cluster has been created
    • Just as an example, we have similar effects with legacy clusters
  • The Cluster API controller creates MachineDeployments for the cluster
    • The Cluster API controller now owns ~ all fields of the MachineDeployment
  • Velero backup
  • Reset mgmt cluster
  • Velero restore
  • Initially, "velero-server" owns ~ all fields of the MachineDeployment
  • After a reconcile of the Cluster API controller, the Cluster API controller (field manager "capi-topology") now also owns ~ all fields of the MachineDeployment
  • Important to note: the fields are now owned by "velero-server" and "capi-topology"
  • Now let's assume a user wants to unset an annotation on the MD or an optional field like "nodeDrainTimeout"
  • The user removes the corresponding config from Cluster.spec.topology
  • The Cluster API controller reconciles and doesn't set the annotation or the optional field anymore
  • Usually at this point the annotation / field would be dropped, but because they are also owned by "velero-server" they are just kept.

So, tl;dr: Velero backup/restore leads to co-ownership of all fields by "velero-server" and "capi-topology". It's still possible for the Cluster API controller to change fields (that's the part where force conflicts is relevant), but it is impossible to unset fields (including labels/annotations).
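
To make the co-ownership concrete, the restored MachineDeployment ends up with two managedFields entries claiming overlapping fields, conceptually something like this (abridged and purely illustrative):

managedFields:
- manager: velero-server
  operation: Update
  fieldsV1:
    f:metadata:
      f:annotations:
        f:example-annotation: {}
- manager: capi-topology
  operation: Apply
  fieldsV1:
    f:metadata:
      f:annotations:
        f:example-annotation: {}

When capi-topology stops applying the annotation, server-side apply only drops capi-topology's claim; the annotation itself stays because velero-server still owns it.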

If a user wants to correct this, they essentially have to either clean up the managed fields manually or overwrite all fields with different values so that, on the next reconcile, the Cluster API controller can get sole ownership of the fields.
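
One way to do that manual cleanup (this approach is described in the server-side apply documentation linked above; the resource and name here are just an example) is to strip the managedFields with a non-apply patch, after which the next apply from the controller re-establishes clean ownership:

kubectl patch machinedeployment my-md --type=merge -p '{"metadata":{"managedFields":[{}]}}'

Overwriting managedFields with a list containing a single empty entry removes all existing entries; setting it to an empty list would not.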

@ywk253100
Contributor Author

Fixed by #5853
