Raw send on encrypted datasets does not work when copying snapshots back #10523
Comments
I have disabled the
So the Line 2711 in ae7b167
which gets translated into an EIO here: Line 2160 in 3c42c9e
From the 2 mac checks it is the "local" one that fails: |
I have also seen this. This problem has existed in the master branch for some months now. For example, zfs 0.8.0-699 (build from 20 March 2020), -796 (28 May 2020), -873 (build from 25 June 2020) and zfs 0.8.0-897 (build from 8 July 2020). After an attempt to mount the re-constructed filesystem fails, a zpool status -v immediately identifies the re-constructed
Unfortunately the defect introduced is NOT detected by a zpool scrub! A scrub after the receive and before the mount attempt will not show a problem.
After rolling back the faulty snapshot update (to @base) and restarting the scrub, the corruption disappears, as might be expected. |
The observation that zpool scrub will not detect the problem is particularly worrying. One can quickly test a filesystem dataset via a mount command, but what about when you can't use mount? For example, when the dataset is a volume? |
I believe that the reported "corruption" is not actually a corruption (that's why scrub does not find any problems). The problem is reported because the mac-mismatch reports an zfs/module/os/linux/zfs/zio_crypt.c Line 1212 in 10fa254
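To illustrate why a MAC mismatch can surface as an I/O error even though the data blocks themselves are intact, here is a toy model of the two MAC checks mentioned above (portable and local). Everything in it is an assumption made for illustration: toy_objset_t, toy_verify_macs, toy_open_objset and TOY_ECKSUM are hypothetical names, not the real structures or functions in zio_crypt.c, and the MAC length is arbitrary.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

#define TOY_MAC_LEN 32	/* arbitrary length, for illustration only */

/* Hypothetical error codes for this sketch, not the kernel's values. */
enum { TOY_OK = 0, TOY_ECKSUM = 1 };

/* Toy stand-in for the stored objset header; not the real on-disk layout. */
typedef struct toy_objset {
	uint8_t portable_mac[TOY_MAC_LEN];
	uint8_t local_mac[TOY_MAC_LEN];
} toy_objset_t;

/*
 * Compare the stored MACs against freshly calculated ones.
 * A mismatch in either one is reported as a checksum-style error.
 */
int
toy_verify_macs(const toy_objset_t *stored, const uint8_t *calc_portable,
    const uint8_t *calc_local)
{
	assert(stored != NULL && calc_portable != NULL && calc_local != NULL);

	if (memcmp(stored->portable_mac, calc_portable, TOY_MAC_LEN) != 0)
		return (TOY_ECKSUM);
	if (memcmp(stored->local_mac, calc_local, TOY_MAC_LEN) != 0)
		return (TOY_ECKSUM);
	return (TOY_OK);
}

/* The caller turns the checksum-style error into EIO, as seen on mount. */
int
toy_open_objset(const toy_objset_t *stored, const uint8_t *calc_portable,
    const uint8_t *calc_local)
{
	int err = toy_verify_macs(stored, calc_portable, calc_local);

	return (err == TOY_ECKSUM ? EIO : err);
}
```

In this model a stale local MAC alone is enough to make the open fail with EIO, which matches the observation in this thread that the "local" check is the one that fails while scrub reports nothing.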
|
I did a small test using a volume rather than a filesystem. The rebuilt volume appears under /dev/zvol and the contents |
@felixdoerre after talking this issue over with @tcaputi we've got a good idea how to go about fixing this. As you mentioned in an earlier comment, the core issue here is that the user space accounting is present when it's not expected to be. This leads to the subsequent mount failure due to a checksum error when verifying the local mac. What we can do to handle this specific case is to clear the |
When sending raw encrypted datasets the user space accounting is present when it's not expected to be. This leads to the subsequent mount failure due to a checksum error when verifying the local mac. Fix this by clearing the OBJSET_FLAG_USERACCOUNTING_COMPLETE and resetting the local mac. This allows the user accounting to be correctly updated on first mount using the normal upgrade process. Closes openzfs#10523. Signed-off-by: George Amanakis <[email protected]>
I bumped recently into this issue. Given the excellent discussion/debug done here, I decided to give it a try. |
When sending raw encrypted datasets the user space accounting is present when it's not expected to be. This leads to the subsequent mount failure due to a checksum error when verifying the local mac. Fix this by clearing the OBJSET_FLAG_USERACCOUNTING_COMPLETE and resetting the local mac. This allows the user accounting to be correctly updated on first mount using the normal upgrade process. Reviewed-By: Brian Behlendorf <[email protected]> Reviewed-By: Tom Caputi <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #10523 Closes #11221
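A minimal sketch of the fix described in this commit message, assuming a simplified header with only a flags word and a local MAC. The names sketch_objset_phys_t, sketch_reset_accounting and sketch_needs_upgrade, the flag's bit position and the MAC length are all hypothetical and chosen for illustration; the actual patch is not reproduced here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN 32	/* arbitrary length, for illustration only */

/* Hypothetical bit position; stands in for OBJSET_FLAG_USERACCOUNTING_COMPLETE. */
#define SKETCH_FLAG_USERACCOUNTING_COMPLETE (1ULL << 0)

/* Toy stand-in for the objset header; not the real on-disk layout. */
typedef struct sketch_objset_phys {
	uint64_t flags;
	uint8_t local_mac[SKETCH_MAC_LEN];
} sketch_objset_phys_t;

/*
 * Clear the "accounting complete" flag and reset (zero) the local MAC,
 * leaving all other flag bits untouched.
 */
void
sketch_reset_accounting(sketch_objset_phys_t *osp)
{
	assert(osp != NULL);

	osp->flags &= ~SKETCH_FLAG_USERACCOUNTING_COMPLETE;
	memset(osp->local_mac, 0, sizeof (osp->local_mac));
}

/* With the flag cleared, the first mount would run the accounting upgrade. */
int
sketch_needs_upgrade(const sketch_objset_phys_t *osp)
{
	assert(osp != NULL);

	return ((osp->flags & SKETCH_FLAG_USERACCOUNTING_COMPLETE) == 0);
}
```

The point of doing both steps together is that the stale local MAC is never compared against accounting data it no longer matches, and the cleared flag lets the normal upgrade path rebuild the accounting on first mount.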
Good point, I'm reopening this. |
It smells like a design flaw. The dataset seems to contain surplus metadata (by which I mean anything beyond what is strictly necessary to mount and access the data), and some layer above physical I/O uses it to decide not to access the data, reporting an error that is then incorrectly escalated or handled. If this really IS a design flaw, I hope that a well coordinated effort will be able to carefully update the design, which will pay off hugely in future development. ZFS should easily be capable of using recursive incremental snapshots of (possibly) encrypted datasets that can be sent/received in raw form to safely and efficiently synchronize sets bi-directionally. Best regards. |
Definitely something that needs to be looked into. I just ran into this issue when testing the received encrypted snapshots on our backup system. I'm wondering right now whether zfs encryption is even usable with raw sends in this form. Is there any sort of ETA on when a fix might be available? |
I would not recommend making any plans that assume raw send works reliably
on any particular timetable, given the amount of time people seem willing
to invest in fixing it.
|
Raw receiving a snapshot back to the originating dataset is currently impossible because of user accounting being present in the originating dataset. One solution would be resetting user accounting when raw receiving on the receiving dataset. However, to recalculate it we would have to dirty all dnodes, which may not be preferable on big datasets. Instead, we rely on the os_phys flag OBJSET_FLAG_USERACCOUNTING_COMPLETE to indicate that user accounting is incomplete when raw receiving. Thus, on the next mount of the receiving dataset the local mac protecting user accounting is zeroed out. The flag is then cleared when user accounting of the raw received snapshot is calculated. Reviewed-by: Brian Behlendorf <[email protected]> Signed-off-by: George Amanakis <[email protected]> Closes #12981 Closes #10523 Closes #11221 Closes #11294 Closes #12594 Issue #11300
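The flow in this commit message can be sketched roughly as follows, under stated assumptions. A single boolean, accounting_incomplete, stands in for whatever state the OBJSET_FLAG_USERACCOUNTING_COMPLETE flag encodes, since the exact bit handling is not shown in this thread. The names model_objset_t, model_raw_receive and model_mount, the used_bytes field and the MAC length are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAC_LEN 32	/* arbitrary length, for illustration only */

/* Toy objset: one marker, a local MAC, and a single accounting counter. */
typedef struct model_objset {
	bool accounting_incomplete;	/* hypothetical stand-in for the flag state */
	uint8_t local_mac[MODEL_MAC_LEN];
	uint64_t used_bytes;		/* stand-in for user accounting data */
} model_objset_t;

/*
 * A raw receive only marks accounting as incomplete. It does not dirty
 * every dnode to recalculate it, which is the cost the commit avoids.
 */
void
model_raw_receive(model_objset_t *os)
{
	assert(os != NULL);

	os->accounting_incomplete = true;
}

/*
 * On the next mount, a marked dataset gets its local MAC zeroed and its
 * accounting recalculated, then the marker is cleared.
 * Returns true if accounting had to be rebuilt.
 */
bool
model_mount(model_objset_t *os, uint64_t recalculated_bytes)
{
	assert(os != NULL);

	if (!os->accounting_incomplete)
		return (false);

	memset(os->local_mac, 0, sizeof (os->local_mac));
	os->used_bytes = recalculated_bytes;
	os->accounting_incomplete = false;
	return (true);
}
```

Deferring the work to mount time keeps the receive itself cheap on big datasets, at the price of the first mount after a raw receive doing the recalculation.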
Describe the problem you're observing
With --enable-debug the example already crashes at the zfs receive testpool/source. This produces the following kernel log: