Receive may skip objects in FREEOBJECTS record #6694
Comments
@jgottula I guess you may have to give each chunk a .gz extension to make github happy. Amended example above. This way to receive we can just do
Yeah I was just thinking about that. Will have something that GitHub can't complain about within a few minutes here.
@nedbass Okay, here's what I came up with. (heavy sigh, grumbling, etc)
Okay, so, I added some I did this on a tree at 0c484ab (pretty much the most recent commit on master) where I also un-reverted #6576. When doing the incremental receive from
Based on this output, you can tell that Looking at the I'm reasonably convinced that this particular line added in PR #3542 should probably never have existed. I did take a look at e6d3a84, which messed with the loop condition/increment slightly, but I'm pretty sure it wouldn't have had any effects relevant to this. If I'm right about that @nedbass What do you think? |
When receiving a FREEOBJECTS record, receive_freeobjects() incorrectly skips a freed object in some cases. This leaves an object allocated on disk on the receiving side which is unallocated on the sending side, which may cause receiving subsequent incremental streams to fail.

The bug was caused by an incorrect increment of the object index variable when the current object being freed doesn't exist. The increment is incorrect because incrementing the object index is handled by a call to dmu_object_next() in the increment portion of the for loop statement.

Fixes openzfs#6694
Signed-off-by: Ned Bass <[email protected]>

@jgottula nice work. I came to the same conclusion and cooked up a patch. I don't think #5532 or #6564 were related to this issue, however. I came up with a simple reproducer for this bug, but I decided it was too specific to add as a test case for the test suite. @behlendorf let me know if you disagree and would like to see this added to the test suite.
When receiving a FREEOBJECTS record, receive_freeobjects() incorrectly skips a freed object in some cases. Specifically, this happens when the first object in the range to be freed doesn't exist, but the second object does. This leaves an object allocated on disk on the receiving side which is unallocated on the sending side, which may cause receiving subsequent incremental streams to fail.

The bug was caused by an incorrect increment of the object index variable when the current object being freed doesn't exist. The increment is incorrect because incrementing the object index is handled by a call to dmu_object_next() in the increment portion of the for loop statement.

Add test case that exposes this bug.

Fixes openzfs#6694
Signed-off-by: Ned Bass <[email protected]>
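The failure mode described in this commit message can be illustrated with a small, self-contained simulation. This is only a sketch under stated assumptions, not the ZFS code: the object table is a plain boolean array, and next_allocated() and free_range() are hypothetical names invented for the illustration. next_allocated() is assumed to behave like dmu_object_next() in one respect only, namely that it advances to the next allocated object strictly after the current one. The extra_increment flag switches the suspect increment on and off.

```c
#include <stdbool.h>

#define NOBJ 16	/* size of the simulated object table (illustrative) */

/*
 * Hypothetical stand-in for dmu_object_next(): advance *obj to the next
 * allocated object strictly after *obj. Returns 0 on success, or -1 if
 * no allocated object follows.
 */
static int
next_allocated(const bool *alloc, int *obj)
{
	for (int i = *obj + 1; i < NOBJ; i++) {
		if (alloc[i]) {
			*obj = i;
			return (0);
		}
	}
	return (-1);
}

/*
 * Free every allocated object in [first, first + count) and return how
 * many were freed. With extra_increment set, a missing object triggers
 * an additional obj++ before the loop's own increment step runs.
 */
static int
free_range(bool *alloc, int first, int count, bool extra_increment)
{
	int freed = 0;
	int next_err = 0;

	for (int obj = first; obj < first + count && next_err == 0;
	    next_err = next_allocated(alloc, &obj)) {
		if (!alloc[obj]) {
			if (extra_increment)
				obj++;	/* the suspect increment */
			continue;	/* increment step still runs */
		}
		alloc[obj] = false;	/* simulate freeing the object */
		freed++;
	}
	return (freed);
}
```

Under these assumptions, with objects 5 and 6 allocated and object 4 a hole, free_range(alloc, 4, 4, true) frees only object 6 and leaves object 5 allocated: the manual increment moves the index from 4 to 5, and the loop's increment step then searches strictly after 5. Calling free_range(alloc, 4, 4, false) on the same starting table frees both objects.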
@nedbass Thanks for your help! I've been doing some testing with (master + this fix patch + #6576 un-reverted), and I'm now encountering no errors when receiving datasets that were previously problematic. I'll see whether any other unanticipated issues crop up, or if I can finally get all of my data off of this dying pool. 😛
@nedbass Well, this fix did resolve a number of problematic datasets that previously wouldn't receive. But I'm still running into errors with a few particular datasets. Those ones are still giving me the unhelpful "cannot receive incremental stream: incompatible embedded data stream feature with encrypted receive" error message indicative of some kind of

(Note that with the latest receives I'm doing with these datasets, I've been making sure to first destroy the partially-received dataset already on the destination pool, if any, and then starting over with a fresh replication stream of that dataset, since the problem here involved previously received snapshots failing to free objects.)

I'll try to do some digging soon to determine if these errors look like they're still due to some variation on this failure-to-free-objects situation, or if they're due to that changed-dnodesize situation from #6366, or if this is something else entirely. (I am still using a tree with #6576 un-reverted for these receives, so presumably it's not a dnodesize thing...)
Okay so here's the excerpt from
Will have to do some more investigation on this. Unfortunately, the 3 datasets of mine that are still being affected by this error all contain somewhat-private data, so I'll have to take a close look myself before I go uploading anything. (And as I don't yet know whether this is a #6366 problem or a #6694 problem, I'll probably just keep posting here for now.)
When receiving a FREEOBJECTS record, receive_freeobjects() incorrectly skips a freed object in some cases. Specifically, this happens when the first object in the range to be freed doesn't exist, but the second object does. This leaves an object allocated on disk on the receiving side which is unallocated on the sending side, which may cause receiving subsequent incremental streams to fail.

The bug was caused by an incorrect increment of the object index variable when the current object being freed doesn't exist. The increment is incorrect because incrementing the object index is handled by a call to dmu_object_next() in the increment portion of the for loop statement.

Add test case that exposes this bug.

Reviewed-by: George Melikov <[email protected]>
Reviewed-by: Giuseppe Di Natale <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Ned Bass <[email protected]>
Closes openzfs#6694
Closes openzfs#6695
As reported in #6366 by @jgottula, it appears receive_freeobjects() may have incorrectly skipped over freeing some objects in a FREEOBJECTS record. This left an object allocated on disk on the receiving side which was unallocated in the corresponding snapshot on the sending side, which in turn caused receiving a subsequent incremental stream to fail.

I am opening this new issue to track the problem, as it seems fundamentally different than #6366.

@jgottula, the problem object 101 was unlinked in between snapshots pool/Game/TF2/History/ServerLinux@20160625a and pool/Game/TF2/History/ServerLinux@20160707a. For debugging purposes it would be useful to attach a full send of 20160625a and an incremental for 20160707a. To attach directly in pieces maybe try the split command, rather than using an external site (mega.nz is blocked on my network) e.g.