Don't fail backup deletion if downloading tarball fails #2993

Contributor

@zubron zubron commented Oct 7, 2020

Previously, we would always attempt to download the tarball for a backup
for processing DeleteItemAction plugins, even if there weren't any.
This caused an issue for some users in the case where the backup tarball
had been deleted from object storage as the backup deletion would fail.

Now, we only attempt to download the tarball in the case where there are
DeleteItemAction plugins. If downloading that tarball fails, we log
the error, skip the processing of the DeleteItemAction plugins and
proceed with the rest of the deletion.

Fixes #2980

Signed-off-by: Bridget McErlean [email protected]

@zubron zubron force-pushed the fix-deletion-of-cloud-deleted-backups-2980 branch 2 times, most recently from 6cf52a6 to df96e0c Compare October 7, 2020 21:53
@zubron zubron marked this pull request as ready for review October 7, 2020 21:54
@zubron zubron force-pushed the fix-deletion-of-cloud-deleted-backups-2980 branch from df96e0c to 0808786 Compare October 7, 2020 21:55
@zubron zubron changed the title from "Don't fail backup if downloading tarball fails" to "Don't fail backup deletion if downloading tarball fails" Oct 13, 2020
@zubron zubron force-pushed the fix-deletion-of-cloud-deleted-backups-2980 branch from 0808786 to 383f3ac Compare October 19, 2020 20:40
@zubron zubron force-pushed the fix-deletion-of-cloud-deleted-backups-2980 branch from 383f3ac to d83759b Compare November 5, 2020 15:04
Member

@ashish-amarnath ashish-amarnath left a comment

I think the correct fix is to not delete the backup from the object store if there were any errors in the process of deleting the backup.

Refer to #3094 (comment).

@zubron
Contributor Author

zubron commented Nov 19, 2020

@ashish-amarnath #3094 is related, but I think this PR and the issue it addresses are different enough that they should be considered separately. This PR only deals with the case where there are DeleteItemAction plugins to run and the backup tarball doesn't exist. Backup deletion failing because of missing cloud resources is a change in behaviour from 1.4, so this PR restores the previous behaviour. For example, if someone manually removes the backup tarball from their object storage, or the tarball was never written in the first place, then without this change any attempt to delete those backups through Velero will always result in an error and the deletion will fail.

In #3094, the issue is that Velero should not have deleted any backup resources in the first place if there was an error. They are related, in that the error described in #2980 will appear on following deletion attempts, but I still think they are distinct issues. I think the issue of orphaned resources described in #3094 would have existed prior to the introduction of DeleteItemActions.

All that said, I think we need both this change and the one you mentioned in #3094 😄

if err != nil {
    log.WithError(err).Errorf("Unable to download tarball for backup %s, skipping associated DeleteItemAction plugins", backup.Name)
} else {
    defer closeAndRemoveFile(backupFile, c.logger)
Contributor

Should this defer be in the else block? I don't think the backupFile variable exists here, unless I'm reading something wrong.

Contributor Author

It's defined on line 307. I put the defer here thinking that we shouldn't attempt to close and remove the file unless it was successfully created, but I should check the implementation of both functions. downloadToTempFile might create the file and not write to it, or closeAndRemoveFile might be able to handle the case where the file doesn't exist; in that case it would be safe to move the defer to the outer scope where backupFile was defined.

Contributor Author

downloadToTempFile returns a nil file pointer when there is an error, so closeAndRemoveFile would panic if it were passed that nil. I can leave the defer in the else block or update closeAndRemoveFile to handle the case where the file is nil. I'm leaning towards the latter option but let me know what you think!

Contributor

Got it, thanks for the detailed reply. I was actually confusing the ifs and thought the else was part of the len(actions) > 0 condition.

I think updating closeAndRemoveFile to handle a nil file is a good idea. That way, we can move the defer right after the downloadToTempFile call, which is more idiomatic Go, where the defer is put right after the creation or fetching of the resource.

The other option is a short comment above the defer statement where it's at now to explain why, but I think updating closeAndRemoveFile is the better choice.
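
A rough sketch of the nil-safe approach discussed in this thread is below. It is simplified and partly assumed: the real closeAndRemoveFile takes a logger argument (as seen in the diff), which is dropped here in favour of `fmt.Printf`, and this `downloadToTempFile` is a hypothetical stand-in with a `fail` flag rather than the real download code. `os.CreateTemp` requires Go 1.16 or later.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// closeAndRemoveFile closes and deletes a temporary file. It returns early
// when file is nil, so callers can defer it before checking the download error.
func closeAndRemoveFile(file *os.File) {
	if file == nil {
		return
	}
	if err := file.Close(); err != nil {
		fmt.Printf("error closing %s: %v\n", file.Name(), err)
	}
	if err := os.Remove(file.Name()); err != nil {
		fmt.Printf("error removing %s: %v\n", file.Name(), err)
	}
}

// downloadToTempFile is a hypothetical stand-in for the real download. Like
// the function described in the thread, it returns a nil file on error.
func downloadToTempFile(fail bool) (*os.File, error) {
	if fail {
		return nil, errors.New("tarball not found")
	}
	return os.CreateTemp("", "backup-*.tar.gz")
}

// process places the defer directly after the call that acquires the file,
// which is safe only because closeAndRemoveFile tolerates a nil file.
func process(fail bool) error {
	backupFile, err := downloadToTempFile(fail)
	defer closeAndRemoveFile(backupFile)
	if err != nil {
		return err
	}
	_, err = backupFile.WriteString("placeholder contents")
	return err
}

func main() {
	fmt.Println("missing tarball:", process(true))
	fmt.Println("tarball present:", process(false))
}
```

With the nil check in place, the defer no longer needs to live inside the else block, and the cleanup sits next to the acquisition it belongs to.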

@nrb
Contributor

nrb commented Nov 19, 2020

I think we need both fixes as well. If nothing else, this is an optimization: we won't try to download the backup tarball when we don't need to act on it.

@zubron zubron force-pushed the fix-deletion-of-cloud-deleted-backups-2980 branch from d83759b to feaff55 Compare November 23, 2020 17:06
@github-actions github-actions bot requested a review from nrb November 23, 2020 17:06
@zubron zubron force-pushed the fix-deletion-of-cloud-deleted-backups-2980 branch from feaff55 to fd6035c Compare November 24, 2020 20:32
Contributor

@nrb nrb left a comment

Thanks for your patience @zubron :)

@ashish-amarnath Could you please revisit your review?

Member

@ashish-amarnath ashish-amarnath left a comment

🚀

@ashish-amarnath ashish-amarnath merged commit a877354 into vmware-tanzu:main Nov 30, 2020
@nrb nrb added this to the v1.5.3 milestone Dec 17, 2020
georgettica pushed a commit to georgettica/velero that referenced this pull request Dec 23, 2020
…#2993)

* Don't fail backup if downloading tarball fails

Previously, we would always attempt to download the tarball for a backup
for processing DeleteItemAction plugins, even if there weren't any.
This caused an issue for some users in the case where the backup tarball
had been deleted from object storage as the backup deletion would fail.

Now, we only attempt to download the tarball in the case where there are
DeleteItemAction plugins. If downloading that tarball fails, we log
the error, skip the processing of the DeleteItemAction plugins and
proceed with the rest of the deletion.

Signed-off-by: Bridget McErlean <[email protected]>

* Skip file removal in closeAndRemoveFile if nil

Signed-off-by: Bridget McErlean <[email protected]>
@haslersn

When will this fix be released?

zubron added a commit that referenced this pull request Jan 14, 2021
georgettica pushed a commit to georgettica/velero that referenced this pull request Jan 26, 2021
vadasambar pushed a commit to vadasambar/velero that referenced this pull request Feb 3, 2021
dharmab pushed a commit to dharmab/velero that referenced this pull request May 25, 2021
ywk253100 pushed a commit to ywk253100/velero that referenced this pull request Jun 29, 2021
gyaozhou pushed a commit to gyaozhou/velero-read that referenced this pull request May 14, 2022
Development

Successfully merging this pull request may close these issues.

Unable to delete backup if cloud resources have already been deleted (velero v1.5.1, similar to #308)
5 participants