Fix Edge cases for already exists resources #5223
Comments
Would we suggest that the maintainer of this admission webhook return a more explicit 409?
@reasonerjt It is not reproducible on earlier versions only because the specific conditions that trigger the error for this particular resource type aren't present there. The problem itself is more generic. I've seen it in the past when working on unrelated code that checked for an IsAlreadyExists error as a way of determining whether something already existed. What other problems are you referring to? We have tested a fix internally that should resolve this, and I think @shubham-pampattiwar will be submitting it as a PR soon. Basically, if the Create fails with an error other than IsAlreadyExists, we do a Get call to see whether the object actually exists. If it does, we treat the failure as an IsAlreadyExists error. If not, we add the error to the restore as we currently do. What problems are you seeing with this approach?
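For readers following along, the Create-then-Get fallback described in the comment above could look roughly like the sketch below. This is not the code from the PR. The `client` interface, `fakeClient`, `createOrDetectExisting`, and the two sentinel errors are all hypothetical stand-ins so the example runs without a cluster; real code would presumably check the create error with `apierrors.IsAlreadyExists` from `k8s.io/apimachinery/pkg/api/errors`.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical sentinel errors standing in for the API server's
// AlreadyExists and NotFound responses.
var (
	errAlreadyExists = errors.New("already exists")
	errNotFound      = errors.New("not found")
)

// client is a hypothetical, minimal stand-in for a Kubernetes resource client.
type client interface {
	Create(name string) error
	Get(name string) error
}

// createOrDetectExisting tries Create first. If Create fails with anything
// other than AlreadyExists, it falls back to Get and treats a successful Get
// as "already exists". Otherwise the original Create error is returned.
func createOrDetectExisting(c client, name string) (alreadyExists bool, err error) {
	createErr := c.Create(name)
	if createErr == nil {
		return false, nil
	}
	if errors.Is(createErr, errAlreadyExists) {
		return true, nil
	}
	// Create failed for another reason, e.g. an admission webhook rejected it.
	if getErr := c.Get(name); getErr == nil {
		return true, nil
	}
	return false, createErr
}

// fakeClient simulates a cluster in which a webhook rejects every Create
// before the API server can answer with AlreadyExists.
type fakeClient struct {
	existing      map[string]bool
	webhookDenies bool
}

func (f *fakeClient) Create(name string) error {
	if f.webhookDenies {
		return errors.New("admission webhook denied the request")
	}
	if f.existing[name] {
		return errAlreadyExists
	}
	f.existing[name] = true
	return nil
}

func (f *fakeClient) Get(name string) error {
	if f.existing[name] {
		return nil
	}
	return errNotFound
}

func main() {
	c := &fakeClient{existing: map[string]bool{"present": true}, webhookDenies: true}

	exists, err := createOrDetectExisting(c, "present")
	fmt.Println("present:", exists, err)

	exists, err = createOrDetectExisting(c, "missing")
	fmt.Println("missing:", exists, err)
}
```

Returning the original Create error when the Get also fails keeps the webhook's message in the restore errors, which matches the "add the error to the restore like we currently do" part of the comment.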
@sseago +1. @reasonerjt, please take a look at the proposed PR #5239 as a fix for this issue.
Which admission webhook triggered this error?
@blackpiglet This was triggered by |
@blackpiglet The issue is that the admission webhook validation happens before the normal create call, which means we don't get an IsAlreadyExists error. In this case the resource already exists, so we don't want a more self-explanatory error message. We don't want the error message at all. We want to recognize that the resource is already in the cluster (just as we do when we get an IsAlreadyExists error), and then apply our existing resource policy: either patch it or warn (if the object differs from the one in the backup), or do nothing (if the object is the same as in the backup).
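To make the three outcomes in the comment above concrete, here is a minimal sketch of the decision once the object is known to be in the cluster. Everything in it is an assumption for illustration: the `action` type, `decideExisting`, and the `updatePolicy` flag are hypothetical names, objects are modeled as plain string maps, and `reflect.DeepEqual` is a simplification of whatever comparison the restore code actually performs.

```go
package main

import (
	"fmt"
	"reflect"
)

// action is a hypothetical label for what the restore does with an object
// that is already present in the cluster.
type action string

const (
	actionSkip  action = "skip"  // same as in the backup: do nothing
	actionPatch action = "patch" // different, and the policy allows updating
	actionWarn  action = "warn"  // different, and the policy does not allow updating
)

// decideExisting assumes the caller has already established that the object
// exists. updatePolicy is a hypothetical flag meaning the existing resource
// policy permits patching the in-cluster object.
func decideExisting(inCluster, fromBackup map[string]string, updatePolicy bool) action {
	if reflect.DeepEqual(inCluster, fromBackup) {
		return actionSkip
	}
	if updatePolicy {
		return actionPatch
	}
	return actionWarn
}

func main() {
	// Illustrative values only.
	backup := map[string]string{"driver": "example.csi.driver", "deletionPolicy": "Retain"}
	same := map[string]string{"driver": "example.csi.driver", "deletionPolicy": "Retain"}
	changed := map[string]string{"driver": "example.csi.driver", "deletionPolicy": "Delete"}

	fmt.Println(decideExisting(same, backup, false))
	fmt.Println(decideExisting(changed, backup, true))
	fmt.Println(decideExisting(changed, backup, false))
}
```

None of the three branches produces a restore error, which is the point of the comment: an object that is already present should end in a skip, a patch, or a warning.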
What steps did you take and what happened:
[A clear and concise description of what the bug is, and what commands you ran.]
On OpenShift 4.11 (Kubernetes 1.24) we are seeing an error (not a warning) when restoring a volumesnapshotclass that already exists. This is not reproducible on earlier versions. Restore fails with the following logs. The restore controller records this as an error because the create call itself does not complete due to admission webhook validation. Had the create call executed and Velero received an IsAlreadyExists error, this would have been a warning.
What did you expect to happen:
Restore should complete without errors.
The following information will help us better understand what's going on:
If you are using velero v1.7.0+:
Please use
velero debug --backup <backupname> --restore <restorename>
to generate the support bundle, and attach it to this issue. For more options, please refer to velero debug --help
If you are using earlier versions:
Please provide the output of the following commands (Pasting long output into a GitHub gist or other pastebin is fine.)
kubectl logs deployment/velero -n velero
velero backup describe <backupname>
or kubectl get backup/<backupname> -n velero -o yaml
velero backup logs <backupname>
velero restore describe <restorename>
or kubectl get restore/<restorename> -n velero -o yaml
velero restore logs <restorename>
Anything else you would like to add:
[Miscellaneous information that will assist in solving the issue.]
Environment:
Velero version (use velero version):
Velero features (use velero client config get features):
Kubernetes version (use kubectl version):
OS (e.g. from /etc/os-release):
Vote on this issue!
This is an invitation to the Velero community to vote on issues. You can see the project's top voted issues listed here.
Use the "reaction smiley face" up to the right of this comment to vote.