Add a mechanism to wait for delete queue to drain #9707
Conversation
Codecov Report
@@ Coverage Diff @@
## master #9707 +/- ##
==========================================
- Coverage 79% 66% -14%
==========================================
Files 384 363 -21
Lines 121788 117604 -4184
==========================================
- Hits 96808 77558 -19250
- Misses 24980 40046 +15066
Continue to review full report at Codecov.
From my perspective this is ready to go. But if possible I'd like to get a second approving review.
@jgallag88 and I will take another look.
mutex_enter(&dd->dd_activity_lock);
dd->dd_activity_count++;

dsl_dataset_long_hold(ds, FTAG);
Do we need a long hold? dsl_dir_cancel_waiters waits for all the threads doing zfs_ioc_wait_fs to wake up and finish using the dataset, which seems like enough to allow the dataset to be safely destroyed.
I think we could probably rewrite this to work correctly without taking long holds for each waiter, but I don't know if there would be a significant benefit. I would want to go through each case where we look at the long holds and fix them up to include the waiter count, which we basically get for free by just taking long holds for the waiters.
"long holds" are normally used to prevent the dataset from being destroyed (but allow other manipulation of the dataset, e.g. renaming), but in this case we are not preventing the destruction but rather forcing destruction (and unmounting) to cancel the waiters. The long-hold ensures that the dataset_t / dsl_dir_t do not go away while we are waiting, without having a "real" hold (which must be short-lived because the pool config lock blocks spa_sync). An alternative mechanism for keeping the dataset around could be developed but this seemed simpler. Do we have a comment explaining all that?
I think a lot of that is summed up in the comment that explains what long_holds are, though the specific case for the activity holds isn't written down anywhere. I'll add a comment.
"We get a long-hold here so that the dsl_dataset_t and dsl_dir_t aren't evicted while we're waiting."
My thinking is: any thread that wants to evict/destroy a dsl_dataset_t needs to wake up any waiters that exist, which it does by calling dsl_dir_cancel_waiters. By the time dsl_dir_cancel_waiters has returned, the threads that were doing the waiting have woken up and finished using the dsl_dataset_t and dsl_dir_t, and the dataset can be safely destroyed, even without having taken a long hold. Let me know if I'm missing something.
I don't think this is a huge deal either way, but I thought it might be simpler to not have to compare the number of long holds to the number of waiters when we are checking whether we can unmount / destroy a dataset.
I think that is correct; we could safely switch to that approach if needed. I'm just not sure it is enough of an improvement to be worth reworking a change that is otherwise ready to go.
That's fair. I'm OK with this as is.
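(For context, a minimal sketch of the waiter pattern under discussion might look like the following. It is not the PR's actual implementation: the dd_activity_cv condition variable, the dd_activity_cancelled flag, and the deleteq_empty() check are placeholders chosen for the example, while dd_activity_lock, dd_activity_count, and the long-hold calls come from the diff above.)

/*
 * Illustrative sketch only, not the PR's exact code.
 */
static int
wait_for_activity(dsl_dataset_t *ds, dsl_dir_t *dd, boolean_t *waited)
{
	int error = 0;

	/* Long hold keeps the dsl_dataset_t/dsl_dir_t from being evicted. */
	dsl_dataset_long_hold(ds, FTAG);

	mutex_enter(&dd->dd_activity_lock);
	dd->dd_activity_count++;

	while (!deleteq_empty(ds) && !dd->dd_activity_cancelled) {
		*waited = B_TRUE;
		/* cv_wait_sig() returns 0 if interrupted by a signal. */
		if (cv_wait_sig(&dd->dd_activity_cv,
		    &dd->dd_activity_lock) == 0) {
			error = SET_ERROR(EINTR);
			break;
		}
	}

	/* Let dsl_dir_cancel_waiters() know this waiter is finished. */
	if (--dd->dd_activity_count == 0)
		cv_broadcast(&dd->dd_activity_cv);
	mutex_exit(&dd->dd_activity_lock);

	dsl_dataset_long_rele(ds, FTAG);
	return (error);
}

The long hold is what lets the waiter sleep without holding the pool config lock, while dsl_dir_cancel_waiters() can still force every waiter out before a destroy or unmount proceeds.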
Codecov Report
@@ Coverage Diff @@
## master #9707 +/- ##
==========================================
- Coverage 79.28% 79.15% -0.13%
==========================================
Files 385 385
Lines 122467 122616 +149
==========================================
- Hits 97094 97057 -37
- Misses 25373 25559 +186
Continue to review full report at Codecov.
@pcd1193182 would you mind rebasing this to resolve the conflict?
@pcd1193182 there appears to be a locking problem in the updated PR which needs to be resolved. Aside from that this looks good.
Updated to "Revision Needed" until the VERIFY failure mentioned above is resolved.
Thanks, merged.
Add a mechanism to wait for delete queue to drain

When doing redacted send/recv, many workflows involve deleting files
that contain sensitive data. Because of the way zfs handles file
deletions, snapshots taken quickly after a rm operation can sometimes
still contain the file in question, especially if the file is very
large. This can result in issues for redacted send/recv users who
expect the deleted files to be redacted in the send streams, and not
appear in their clones.

This change duplicates much of the zpool wait related logic into a
zfs wait command, which can be used to wait until the internal
deleteq has been drained. Additional wait activities may be added in
the future.

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Gallagher <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#9707
Motivation and Context
When doing redacted send/recv, many workflows involve deleting files that contain sensitive data. Because of the way zfs handles file deletions, snapshots taken quickly after a rm operation can sometimes still contain the file in question, especially if the file is very large. This can result in issues for redacted send/recv users who expect the deleted files to be redacted in the send streams, and not appear in their clones.
Description
This PR duplicates much of the zpool wait related logic into a zfs wait command. I would be open to a proposal to combine a bunch of this logic if people would prefer that, though some changes would need to be made to correctly dispatch the waiting in the kernel into the appropriate dsl_dir_wait or spa_wait as needed.
How Has This Been Tested?
New zfs-test added, and tested manually with files being held open by file descriptors. Verified that if there is nothing in the delete queue, the command returns immediately.
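For illustration, a manual test along those lines might look like the following (the pool, dataset, and file names are hypothetical; the -t deleteq activity is the one added by this PR):

# Hold the file open so the unlink only moves it to the ZFS delete queue.
sleep 1000 < /tank/fs/bigfile &
holder=$!
rm /tank/fs/bigfile

# Close the last open descriptor, then wait for the delete queue to drain
# before snapshotting, so the snapshot no longer references the file.
kill $holder
zfs wait -t deleteq tank/fs
zfs snapshot tank/fs@after-rm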