
Restoring R4.0_rc3 backups >53.0 GB fail with error #3468

Closed
mossy-nw opened this issue Jan 16, 2018 · 13 comments
Labels
C: core
P: major (Priority: major. Between "default" and "critical" in severity.)
T: bug (Type: bug report. A problem or defect resulting in unintended behavior in something that exists.)
Milestone
Release 4.0

Comments

@mossy-nw

mossy-nw commented Jan 16, 2018

Qubes OS version:

R4.0_rc3

Affected TemplateVMs:

dom0


Steps to reproduce the behavior:

Attempt to restore backups of 53.0 GB or greater created in R4.0_rc3

[user@dom0 ~]$ qvm-backup-restore -d backupVMname /path/to/backup/qubes-backup-2018-01-15T123456
  • backups are stored on an ext4+LUKS encrypted external USB 3.0 disk, mounted via qvm-block
  • backups were made to the same ext4+LUKS encrypted external USB 3.0 disk, mounted via qvm-block, using:
[user@dom0 ~]$ qvm-backup -x excludedVM1 -x excludedVM2 ... -x excludedVM8 -d backupVMname /path/to/backup

Expected behavior:

Backups are restored successfully regardless of size, so long as there is enough room on disk.

Actual behavior:

Restores fail for backups of 53.0 GB and 138.0 GB data size, while a backup of 46.8 GB is restored successfully without error. After listing the VMs to be restored, the following errors are given:

qubesadmin.backup: Extracting data: 53.0 GiB to restore
qvm-backup-restore: error: failed to decrypt /var/tmp/restoren7i4il66/vm2/private.img.001.enc: b'scrypt: Input is not valid scrypt-encrypted block\n'

and

qubesadmin.backup: Extracting data: 138.0 GiB to restore
qvm-backup-restore: error: failed to decrypt /var/tmp/restorea750o7q1/vm1/root.img.000.enc: b'scrypt: Input is not valid scrypt-encrypted block\n'

General notes:

  • This has been verified on two different machines, including one brand new R4.0_rc3 install.
  • I'm not sure where the size cutoff is; I did restore a backup of ~40 GB, but didn't test anything between 46.8 and 53 GB.
  • A larger backup made in Qubes R3.2 restores successfully (though it seems to take much longer to restore than it did on R3.2).
  • A final thought--size on disk (in VM) of 138.0 GB backup is reported as 84GB, 53.0GB as 42GB and 46.8GB as 33GB, even though

Related issues:

I don't think these are directly relevant, but they are recent issues re: backup-restore: #3446, #3211

@andrewdavidwong
Member

Just checking:

A final thought--size on disk (in VM) of 138.0 GB backup is reported as 84GB, 53.0GB as 42GB and 46.8GB as 33GB, even though

Was your sentence prematurely cut off here?

andrewdavidwong added the T: bug, C: core, and P: major labels Jan 17, 2018
andrewdavidwong added this to the Release 4.0 milestone Jan 17, 2018
@mossy-nw
Author

I'm sorry, yes--"even though compression was not selected, it appears as though the backups are being compressed?"

@marmarek
Member

Are you sure it isn't a disk space issue? Try to observe disk space usage during the restore process, for example using watch df -h.
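
For example, in a second dom0 terminal while the restore runs (the 2-second interval is arbitrary; /var/tmp is where the restore stages its temporary files, per the error messages above):

[user@dom0 ~]$ watch -n 2 df -h /var/tmp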

@mossy-nw
Author

I'm sure it isn't disk space (df -h reports 195GB free for the 53GB backup, and the restore fails shortly after listing the VM names, immediately after indicating the backup size). But I will confirm as you suggest. Note also that backup-restore fails with the same error if --verify-only is used.
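
For reference, the verify-only runs were of this form (the path shown is illustrative):

[user@dom0 ~]$ qvm-backup-restore --verify-only -d backupVMname /path/to/backup/qubes-backup-2018-01-15T123456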

@marmarek
Member

Can you say something more about this backup's structure? I'd like to reproduce the problem. Is it many small VMs, or a few large ones? Any custom templates included? Any standalone VMs?

@marmarek
Member

Does it also fail when you restore only selected VMs? You can either use -x, or list VMs to restore as additional arguments.
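
For example, either of these should work (VM names are placeholders):

[user@dom0 ~]$ qvm-backup-restore -d backupVMname /path/to/backup/qubes-backup-2018-01-15T123456 vm1 vm2
[user@dom0 ~]$ qvm-backup-restore -d backupVMname -x vm3 -x vm4 /path/to/backup/qubes-backup-2018-01-15T123456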

@mossy-nw
Author

mossy-nw commented Jan 17, 2018

OK, I can confirm it fails while free (dom0) disk space remains at ~195GB throughout the (brief) backup-restore attempt. The external backup disk also has >270GB free, in case that matters (I don't think it does).

Apologies, after running this test I see additional error messages that I didn't note earlier:

qubesadmin.backup: -> Restoring finalbackedupVM...
qubesadmin.backup: Extracting data: 53.0 GiB to restore
qubesadmin.backup.extract: ERROR: unable to extract files for /var/tmp/restoreazajscpg/vm2/private.img.000, tar output:
  

  gzip: stdin: unexpected end of file

qubesadmin.backup: Error extracting data: failed to decrypt /var/tmp/restoreazajscpg/vm2/private.img.001.enc: b'scrypt: Input is not valid scrypt-encrypted block\n'
qubesadmin.backup: -> Done.
qubesadmin.backup: -> Please install updates for all the restored templates.
[dom0 appvms]$ Traceback (most recent call last):
  File "/bin/qubesd-query", line 9, in <module>
    load_entry_point('qubes==4.0.15', 'console_scripts', 'qubesd-query')()
  File "/usr/lib/python3.5/site-packages/qubes/tools/qubesd_query.py", line 100, in main
    returncode = loop.run_until_complete(coro)
  File "/usr/lib64/python3.5/asyncio/base_events.py", line 467, in run_until_complete
    return future.result()
  File "/usr/lib64/python3.5/asyncio/futures.py", line 294, in result
    raise self._exception
  File "/usr/lib64/python3.5/asyncio/tasks.py", line 240, in _step
    result = coro.send(None)
  File "/usr/lib/python3.5/site-packages/qubes/tools/qubesd_query.py", line 77, in qubesd_client
    sys.stdout.flush()
BrokenPipeError: [Errno 32] Broken pipe
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken 

The additional messages don't appear when using --verify-only (which I was doing for most of my tests reproducing this error, because of the pain of having to remove all the blank restored/renamed VMs). So perhaps some of the additional errors are due to many of the VMs needing to be renamed? I passed --rename-conflicting for the example above.
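
For completeness, the failing run above was of this form (path illustrative):

[user@dom0 ~]$ qvm-backup-restore --rename-conflicting -d backupVMname /path/to/backup/qubes-backup-2018-01-15T123456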

mossy-nw reopened this Jan 17, 2018
@mossy-nw
Author

mossy-nw commented Jan 17, 2018

OK, I accidentally hit the "close and comment" button; sorry for the confusion, obviously this is not resolved : )

Please let me know if you would like me to send/post any full logs.

@mossy-nw
Author

mossy-nw commented Jan 17, 2018

Re: backup structure, I tried several different things, but here is the simplest (53GB) that failed -- it takes at least a few hours to make each backup, so I haven't extensively tested different scenarios.

  • it is ~36 AppVMs (no TemplateVMs) ranging in size from 14MB to ~5GB (but I've also had this fail with a large AppVM of ~35GB included)
    -- let me know if you would like me to test with a few large VMs
  • some are based on custom--or at least renamed--TemplateVMs (based on qubes-template-debian-9 and qubes-template-fedora-26... no custom kernels or anything, just cloned stock Qubes templates with additional packages or Debian repositories added), but a few are based on whonix-ws
    -- note that I have successfully restored the same AppVMs (just fewer of them) based on these custom templates by passing --ignore-missing
  • yes, I used -x to exclude all TemplateVMs and a number of AppVMs
  • no standalone VMs

HTH, thanks for looking into this!

@marmarek
Member

Does it also fail when you restore only selected VMs? You can either use -x, or list VMs to restore as additional arguments.

How about this? Can you successfully restore a >53GB backup by doing it in parts - for example, first restore VMs summing to 30GB, then the remaining 23GB? I want to check whether this depends on the backup archive size or on the amount of data to restore.

PS I haven't managed to reproduce it yet, using various combinations of 100MB, 2GB and 28GB VMs. I've tried to restore from an archive in dom0, in a VM, on an external USB stick, and also from artificially slowed-down media (500K/s).

@mossy-nw
Author

mossy-nw commented Jan 19, 2018

Sorry you're having trouble reproducing this. But yes, it looks like it depends on the amount of data to restore, not the archive size (I only tested restoring a single ~30GB VM from the 53GB backup, but --verify-only and a live restore both work). This is great news, since it seems to mean that the large backups I've made are still good!

@artemist

artemist commented Jan 28, 2018

If I'm understanding this correctly, I had a similar problem earlier with a 200GB VM. As things currently stand, you need free space equal to twice the size of the backup on the target drive in order to restore fully. The reason is that the encrypted pieces of the backup file are pulled into dom0 faster than they are decrypted, verified, and written into the LVM image representing the new virtual machine (where the restored VM will actually store its data). To make matters worse, when the temporary files are deleted, the disk space is not freed until you manually trim the filesystem.

I fixed the second problem and restored my 200GB VM by running sudo mount -o remount,rw,discard / in a dom0 terminal. This makes dom0 free disk space for use by the new AppVM (and other AppVMs) as soon as the temporary files are deleted.
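
In case it helps anyone else, the commands would be roughly as follows (the fstrim line is a one-off manual trim as an alternative to the persistent discard mount option, assuming dom0's root filesystem supports discard):

[user@dom0 ~]$ sudo mount -o remount,rw,discard /   # free space as soon as temporary files are deleted
[user@dom0 ~]$ sudo fstrim -v /                     # or: trim already-freed blocks once, manually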

@mossy-nw
Author

mossy-nw commented Feb 4, 2018

That might be the problem, but I think I always had plenty of free space (3-4x the backup size). In any case, I switched to R4.0_rc4 and was able to restore a 94GB backup with no apparent problem. Since it no longer seems to be an issue, and @marmarek kindly suggested the workaround of restoring in parts, I'll go ahead and close this issue. Thanks!
