skip forbidden error for restore priorityClass #5292

cleverhu · 2022-09-06T12:50:33Z

Thank you for contributing to Velero!

Please add a summary of your change

skip forbidden error for restore priorityClass

Does your change fix a particular issue?

Fixes #5288

Please indicate you've done the following:

Accepted the DCO. Commits without the DCO will delay acceptance.
Created a changelog file or added /kind changelog-not-required as a comment on this pull request.
Updated the corresponding documentation in site/content/docs/main.

Signed-off-by: cleverhu <[email protected]>

sseago · 2022-09-06T13:17:41Z

First, to clarify the error you got -- the priorityClass you're trying to restore that fails -- is there already a priorityClass in the cluster with the same name? If so, then this is a case where Velero should normally just skip over it, as the resource is already in the cluster (warning, but not erroring out, if the in-cluster version is not identical to the backup verson). If that's true, then we already have a fix merged from #5239 -- which will be in 1.10 and is also targeted for 1.9.2.

The issue is that Velero uses the error return (looking for already exists error) to determine whether the reason for failure is that the resource is already in the cluster, and any other error type would normally indicate a different failure that should fail a restore. The problem is that in some cases, there is other ingress validation which is failing before checking whether the resource already exists, so Velero is working with incomplete information when determining whether to error out or not. The PR linked above fixes this for all resource types, not just priority classes.

If the above description of the problem is correct, we don't need this PR as it's already been fixed. Also, this fix introduces a different bug. If you're trying to restore a completely different priority class (with a different name), then erroring out is the correct thing to do, since the user is trying to create a second default priority class, which is not allowed by cluster rules. Also, this PR would discard any error on priority class restore and not just "too many default classes" errors.

cleverhu · 2022-09-06T14:35:39Z

First, to clarify the error you got -- the priorityClass you're trying to restore that fails -- is there already a priorityClass in the cluster with the same name? If so, then this is a case where Velero should normally just skip over it, as the resource is already in the cluster (warning, but not erroring out, if the in-cluster version is not identical to the backup verson). If that's true, then we already have a fix merged from #5239 -- which will be in 1.10 and is also targeted for 1.9.2.

The issue is that Velero uses the error return (looking for already exists error) to determine whether the reason for failure is that the resource is already in the cluster, and any other error type would normally indicate a different failure that should fail a restore. The problem is that in some cases, there is other ingress validation which is failing before checking whether the resource already exists, so Velero is working with incomplete information when determining whether to error out or not. The PR linked above fixes this for all resource types, not just priority classes.

If the above description of the problem is correct, we don't need this PR as it's already been fixed. Also, this fix introduces a different bug. If you're trying to restore a completely different priority class (with a different name), then erroring out is the correct thing to do, since the user is trying to create a second default priority class, which is not allowed by cluster rules. Also, this PR would discard any error on priority class restore and not just "too many default classes" errors.

Thanks for your advice, I also think that it is not reasonable to add many priority errors, which is inconvenient for later program changes. However, this will not introduce new bugs. The 'restore a completely different priority class (with a different name)' mentioned above will lead to data loss. The program verifies whether the resource exists at the end. I checked the PR integrated above, and I found that the previous priority detection function has completely lost its function. I will propose a PR for everyone to discuss whether it is equivalent to the previous program.

cleverhu · 2022-09-06T16:20:37Z

The implementation is not friendly, so it needs to be closed. It has been handed over to another PR: #5293

github-actions bot assigned cleverhu Sep 6, 2022

github-actions bot requested review from blackpiglet and dsu-igeek September 6, 2022 12:50

skip forbidden error for restore priorityClass

eb86804

Signed-off-by: cleverhu <[email protected]>

cleverhu force-pushed the skip-priority-class-already-error branch from 7bcecb3 to eb86804 Compare September 6, 2022 12:58

github-actions bot added the has-changelog label Sep 6, 2022

cleverhu mentioned this pull request Sep 6, 2022

optimize code for restore exists resources #5293

Merged

3 tasks

cleverhu closed this Sep 6, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

skip forbidden error for restore priorityClass #5292

skip forbidden error for restore priorityClass #5292

cleverhu commented Sep 6, 2022

sseago commented Sep 6, 2022

cleverhu commented Sep 6, 2022

cleverhu commented Sep 6, 2022 •

edited

Loading

skip forbidden error for restore priorityClass #5292

skip forbidden error for restore priorityClass #5292

Conversation

cleverhu commented Sep 6, 2022

Please add a summary of your change

Does your change fix a particular issue?

Please indicate you've done the following:

sseago commented Sep 6, 2022

cleverhu commented Sep 6, 2022

cleverhu commented Sep 6, 2022 • edited Loading

cleverhu commented Sep 6, 2022 •

edited

Loading