Restoring pods in an Istio enabled namespace causes them to error #2997

Closed
Samze opened this issue Oct 9, 2020 · 6 comments
@Samze

Samze commented Oct 9, 2020

What steps did you take and what happened:

  1. Install Istio: istioctl install --set profile=default
  2. Install Velero: velero install
  3. Create a namespace: kubectl create namespace nginx-example
  4. Enable Istio on the namespace: kubectl label namespace nginx-example istio-injection=enabled
  5. Apply the Velero non-PVC example: kubectl apply -f https://raw.githubusercontent.com/vmware-tanzu/velero/main/examples/nginx-app/base.yaml
  6. Back up the namespace: velero backup create nginx-backup --include-namespaces nginx-example
  7. Velero reports that the backup was successful.
  8. Simulate a disaster: kubectl delete namespaces nginx-example
  9. Restore: velero restore create --from-backup nginx-backup
  10. Velero reports that the restore was successful.
  11. Check the pods in the namespace and note that they are crashing.
  12. Check the pod logs and note an error in the Istio container.
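
For convenience, the steps above can be collected into one script. This is only a sketch: the `run` and `reproduce` helpers and the `DRY_RUN` variable are not part of the original report. They are added here so the commands can be reviewed before anything touches a cluster. The `--wait` flags are also an addition, so that the namespace is not deleted before the backup has finished. The last command is one assumed way to do step 11.

```shell
#!/bin/sh
# Reproduction sketch. DRY_RUN, run() and reproduce() are illustrative helpers,
# not part of the original report. Commands are only printed by default;
# set DRY_RUN=0 to actually execute them against the current kube context.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

reproduce() {
  # Steps 1-2: install Istio and Velero (Velero provider flags omitted, as in the report).
  run istioctl install --set profile=default
  run velero install
  # Steps 3-5: create the namespace, enable injection, deploy the example app.
  run kubectl create namespace nginx-example
  run kubectl label namespace nginx-example istio-injection=enabled
  run kubectl apply -f https://raw.githubusercontent.com/vmware-tanzu/velero/main/examples/nginx-app/base.yaml
  # Step 6: back up. --wait is an addition so the script blocks until completion.
  run velero backup create nginx-backup --include-namespaces nginx-example --wait
  # Steps 8-9: delete the namespace, then restore it from the backup.
  run kubectl delete namespaces nginx-example
  run velero restore create --from-backup nginx-backup --wait
  # Step 11: inspect the restored pods.
  run kubectl get pods --namespace nginx-example
}
```

Calling `reproduce` with the default `DRY_RUN=1` prints the nine commands without running them. For step 12, something like `kubectl logs <POD_NAME> --namespace nginx-example --all-containers=true` shows the logs of every container in a pod, including the Istio ones; `<POD_NAME>` is a placeholder.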

What did you expect to happen:
The pods to be restored successfully.

Anything else you would like to add:
This was also raised on the istio repo here: istio/istio#27675

It appears that Istio injection happens twice on a restore: once from Velero and once from Istio.

(You can potentially work around this by excluding the pod resource and letting the deployment recreate the pods; however, this means that volumes for pods won't be backed up.)

Environment:

  • Istio version: (istioctl version) client version: 1.7.2, control plane version: 1.7.3, data plane version: 1.7.3 (9 proxies)
  • Velero version (use velero version): v1.5.1
  • Velero features (use velero client config get features): <NOT_SET>
  • Kubernetes version (use kubectl version): Server: 1.16, Client: 1.19
  • Kubernetes installer & version: GKE 1.16
  • Cloud provider or hardware configuration: GKE
  • OS (e.g. from /etc/os-release):

Vote on this issue!

This is an invitation to the Velero community to vote on issues, you can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.

  • 👍 for "I would like to see this bug fixed as soon as possible"
  • 👎 for "There are more important bugs to focus on right now"

Thanks Sam and @teddyking

@carlisia
Contributor

@Samze thank you for reporting this.

This issue seems to have enough information to try to replicate and evaluate an outcome.

nrb self-assigned this Oct 21, 2020
@nrb
Contributor

nrb commented Oct 21, 2020

Thanks for the detailed reproduction steps @Samze. I'm able to confirm this with Istio 1.7.3, and from reading the Istio issue, I think it's definitely a result of Velero applying the object "as-is" (which already has the configuration once) and then Istio seeing the object being recreated and trying to configure it too.

This isn't necessarily unique to Istio and Velero. There are a number of operators/controllers that don't expect Velero to restore resources with their labels and annotations (though we do remove status), so they re-process the restored objects.

One potential fix from our side: remove specific Istio annotations via a Pod RestoreItemAction plugin.

I do think istio/istio#25931 is the more correct fix, though.

nrb added the Size/M (Medium amount of work) and Restore labels and removed the Needs investigation label Oct 21, 2020
nrb unassigned nrb Oct 21, 2020
@venksel

venksel commented Dec 3, 2020

Facing the exact same issue; currently using the workaround of excluding Pod resources at restore time. I did upgrade to the latest version, 1.5.2. Any resolution to this problem would be a great help.

@MRostanski

Could you provide an example of the command to exclude the pod resources that works for you, @venksel?
Also, this issue calls for specific documentation on how to proceed with an Istio-controlled cluster (because Istio is an operator, and we also experience race conditions with an operator).
One solution/workaround is to scale down the operator to 0, as suggested for example in https://banzaicloud.com/blog/vault-backup-velero/, or to set up the restore order policy so that operators are restored last, but again, I don't have a working example.

@eleanor-millman
Contributor

Closing because the Istio issue has been closed, so please check to see if they have resolved the issue. If not, then we suggest you reopen the Istio issue.

@venksel

venksel commented May 11, 2022

> Could you provide an example of the command to exclude the pod resources that works for you, @venksel? Also, this issue calls for specific documentation on how to proceed with an Istio-controlled cluster (because Istio is an operator, and we also experience race conditions with an operator). One solution/workaround is to scale down the operator to 0, as suggested for example in https://banzaicloud.com/blog/vault-backup-velero/, or to set up the restore order policy so that operators are restored last, but again, I don't have a working example.

Here is the command:
velero restore create --from-backup <BACKUP_NAME> --include-namespaces default --exclude-resources='pods'
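
Applied to the reproduction from the original report, the same workaround might look like the sketch below. The backup name nginx-backup and the namespace nginx-example come from the report; the `restore_without_pods` function, its arguments, and the `DRY_RUN` variable are hypothetical and added only for illustration. As noted earlier in the thread, excluding pods has a cost for pod volumes, so treat this as a stopgap.

```shell
# restore_without_pods BACKUP NAMESPACE
# Hypothetical helper (not a Velero command): builds a restore that skips Pod
# objects, so the Deployment recreates the pods and Istio injects the sidecar
# only once. Prints the command by default; set DRY_RUN=0 to run it.
restore_without_pods() {
  backup="$1"
  namespace="$2"
  set -- velero restore create --from-backup "$backup" \
    --include-namespaces "$namespace" --exclude-resources pods
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}
```

For example, `restore_without_pods nginx-backup nginx-example` prints the restore command for the namespace from the report. After a real run, `kubectl get pods --namespace nginx-example` should show whether the Deployment has recreated the pods.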
