
Add a mechanism to wait for delete queue to drain #9707

Merged: 10 commits from the dq_wait branch merged into openzfs:master on Apr 1, 2020

Conversation

@pcd1193182 (Contributor):

Motivation and Context

When doing redacted send/recv, many workflows involve deleting files that contain sensitive data. Because of the way ZFS handles file deletions, snapshots taken quickly after an rm operation can sometimes still contain the file in question, especially if the file is very large. This can result in issues for redacted send/recv users who expect the deleted files to be redacted in the send streams and not to appear in their clones.

Description

This PR duplicates much of the zpool wait-related logic into a zfs wait command (see the usage sketch below). I would be open to a proposal to consolidate this shared logic if people would prefer that, though some changes would be needed so that the kernel correctly dispatches the wait to either dsl_dir_wait or spa_wait as appropriate.
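For concreteness, here is a minimal sketch of driving the new wait from a C program through libzfs_core. The lzc_wait_fs() entry point, the ZFS_WAIT_DELETEQ activity, and the dataset name tank/secure reflect my reading of this PR and should be treated as assumptions rather than a definitive interface description.

#include <stdio.h>
#include <libzfs_core.h>
#include <sys/fs/zfs.h>

/*
 * Wait for the delete queue of tank/secure (placeholder dataset name) to
 * drain, e.g. after rm'ing sensitive files and before snapshotting for a
 * redacted send.  lzc_wait_fs() and ZFS_WAIT_DELETEQ are assumed to be
 * the interfaces added by this change.
 */
int
main(void)
{
	boolean_t waited;
	int err;

	if (libzfs_core_init() != 0)
		return (1);

	err = lzc_wait_fs("tank/secure", ZFS_WAIT_DELETEQ, &waited);
	if (err == 0)
		printf("deleteq %s\n", waited ?
		    "drained after waiting" : "was already empty");
	else
		fprintf(stderr, "lzc_wait_fs: error %d\n", err);

	libzfs_core_fini();
	return (err != 0);
}

From the shell, the equivalent would simply be running zfs wait (optionally restricted with -t deleteq) on the filesystem in question, per the new zfs-wait.8 man page as I read it.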

How Has This Been Tested?

A new test was added to the ZFS Test Suite, and the change was exercised manually with files held open by file descriptors (see the sketch below). I also verified that the command returns immediately when there is nothing in the delete queue.
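To illustrate the manual scenario, here is a rough user-space sketch (illustrative only; the path and timing are made up). A file that is unlinked while still open stays on the delete queue until its last reference goes away, so a zfs wait issued from another shell should block until this program exits.

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Keep a deleted file pinned on the delete queue.  While this program
 * sleeps, a concurrent "zfs wait" on the containing filesystem should
 * block; once the descriptor is closed the entry is processed and the
 * wait returns.  The path is a placeholder.
 */
int
main(void)
{
	int fd = open("/tank/secure/bigfile", O_RDONLY);

	if (fd < 0) {
		perror("open");
		return (1);
	}
	if (unlink("/tank/secure/bigfile") != 0)	/* file now lives only on the deleteq */
		perror("unlink");
	sleep(60);					/* hold the last open reference */
	close(fd);
	return (0);
}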

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Performance enhancement (non-breaking change which improves efficiency)
  • Code cleanup (non-breaking change which makes code smaller or more readable)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (a change to man pages or other documentation)

Checklist:

@pcd1193182 added the labels Type: Feature (Feature request or new feature) and Status: Code Review Needed (Ready for review and testing) on Dec 10, 2019
@codecov bot commented Dec 10, 2019:

Codecov Report

Merging #9707 into master will decrease coverage by 14%.
The diff coverage is 25%.


@@            Coverage Diff             @@
##           master    #9707      +/-   ##
==========================================
- Coverage      79%      66%     -14%     
==========================================
  Files         384      363      -21     
  Lines      121788   117604    -4184     
==========================================
- Hits        96808    77558   -19250     
- Misses      24980    40046   +15066
Flag Coverage Δ
#kernel 68% <40%> (-11%) ⬇️
#user 47% <5%> (-20%) ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@pcd1193182 force-pushed the dq_wait branch 4 times, most recently from 7a08d34 to 892dd68, on February 26, 2020 at 16:54
@pcd1193182 force-pushed the dq_wait branch 2 times, most recently from 8e7307d to ae6f413, on March 6, 2020 at 18:44
@behlendorf requested a review from jgallag88 on March 12, 2020 at 17:59
@behlendorf (Contributor):

From my perspective this is ready to go. But if possible I'd like to get a second approving review.

@behlendorf added the Status: Accepted (Ready to integrate: reviewed, tested) label and removed the Status: Code Review Needed (Ready for review and testing) label on Mar 17, 2020
@ahrens (Member) commented Mar 17, 2020:

@jgallag88 and I will take another look.

mutex_enter(&dd->dd_activity_lock);
dd->dd_activity_count++;

dsl_dataset_long_hold(ds, FTAG);
Contributor:

Do we need a long hold? dsl_dir_cancel_waiters waits for all the threads doing zfs_ioc_wait_fs to wake up and finish using the dataset, which seems like enough to allow the dataset to be safely destroyed.

Contributor (Author):

I think we could probably rewrite this correctly to work without having long-holds for each waiter, but I don't know if there would be a significant benefit? I would want to go through each case where we look at the longholds and fix them up to include the waiters count, which we basically get for free by just having long-holds for the waiters.

Member:

"long holds" are normally used to prevent the dataset from being destroyed (but allow other manipulation of the dataset, e.g. renaming), but in this case we are not preventing the destruction but rather forcing destruction (and unmounting) to cancel the waiters. The long-hold ensures that the dataset_t / dsl_dir_t do not go away while we are waiting, without having a "real" hold (which must be short-lived because the pool config lock blocks spa_sync). An alternative mechanism for keeping the dataset around could be developed but this seemed simpler. Do we have a comment explaining all that?

Contributor (Author):

I think a lot of that is summed up in the comment that explains what long_holds are, though the specific case for the activity holds isn't written down anywhere. I'll add a comment.

Contributor:

We get a long-hold here so that the dsl_dataset_t and dsl_dir_t
aren't evicted while we're waiting.

My thinking is: any thread that wants to evict/destroy a dsl_dataset_t needs to wake up any waiters that exist, which it does by calling dsl_dir_cancel_waiters. By the time dsl_dir_cancel_waiters has returned, the threads that were doing the waiting have woken up and finished using the dsl_dataset_t and dsl_dir_t, and the dataset can be safely destroyed, even without having taken a long hold. Let me know if I'm missing something.

I don't think this is a huge deal either way, but I thought it might be simpler to not have to compare the number of long holds to the numbers of waiters when we are checking whether we can unmount / destroy a dataset.

Contributor (Author):

I think that is correct, we could safely switch to that approach if needed. I'm just not sure if it is enough better to be worth reworking a change that is otherwise ready to go.

Contributor:

That's fair. I'm OK with this as is.
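For readers following this thread, here is a condensed user-space analogue (using pthreads) of the wait/cancel pattern being discussed. The names are simplified and the real dsl_dir.c code differs in detail; this only shows why cancel-then-drain is sufficient on its own, and why the long hold is a convenient extra guarantee rather than a strict requirement.

#include <pthread.h>
#include <stdbool.h>

/*
 * Simplified analogue, not the actual OpenZFS code: waiters bump a count
 * and sleep on a condvar until the tracked activity finishes or they are
 * cancelled; teardown sets the cancelled flag, wakes everyone, and then
 * waits for the waiter count to reach zero before the object may be
 * destroyed.
 */
typedef struct {
	pthread_mutex_t	lock;
	pthread_cond_t	cv;
	int		waiters;	/* threads currently waiting */
	bool		cancelled;	/* teardown has begun */
	bool		in_progress;	/* e.g. "deleteq is non-empty" */
} activity_t;

/* Returns true if the activity completed, false if the wait was cancelled. */
static bool
activity_wait(activity_t *a)
{
	bool ok;

	pthread_mutex_lock(&a->lock);
	a->waiters++;
	while (a->in_progress && !a->cancelled)
		pthread_cond_wait(&a->cv, &a->lock);
	ok = !a->cancelled;
	a->waiters--;
	pthread_cond_broadcast(&a->cv);	/* let a cancelling thread re-check */
	pthread_mutex_unlock(&a->lock);
	return (ok);
}

/* Called before destroy/unmount; returns once no waiters remain. */
static void
activity_cancel_waiters(activity_t *a)
{
	pthread_mutex_lock(&a->lock);
	a->cancelled = true;
	pthread_cond_broadcast(&a->cv);
	while (a->waiters > 0)
		pthread_cond_wait(&a->cv, &a->lock);
	pthread_mutex_unlock(&a->lock);
}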

@codecov-io commented Mar 19, 2020:

Codecov Report

Merging #9707 into master will decrease coverage by 0.12%.
The diff coverage is 76.77%.


@@            Coverage Diff             @@
##           master    #9707      +/-   ##
==========================================
- Coverage   79.28%   79.15%   -0.13%     
==========================================
  Files         385      385              
  Lines      122467   122616     +149     
==========================================
- Hits        97094    97057      -37     
- Misses      25373    25559     +186     
Flag Coverage Δ
#kernel 79.63% <81.44%> (+0.01%) ⬆️
#user 65.96% <54.54%> (+0.75%) ⬆️
Impacted Files Coverage Δ
include/sys/dsl_dir.h 100.00% <ø> (ø)
module/zfs/dsl_dir.c 88.75% <68.42%> (-1.10%) ⬇️
cmd/zfs/zfs_main.c 82.57% <69.44%> (-0.13%) ⬇️
lib/libzfs/libzfs_dataset.c 76.39% <71.42%> (-0.07%) ⬇️
module/zfs/dsl_dataset.c 92.59% <77.77%> (-0.09%) ⬇️
module/zfs/zfs_ioctl.c 86.31% <88.00%> (+0.01%) ⬆️
lib/libzfs_core/libzfs_core.c 86.45% <100.00%> (+0.27%) ⬆️
module/os/linux/zfs/zfs_dir.c 82.91% <100.00%> (-0.31%) ⬇️
module/os/linux/zfs/zfs_vfsops.c 75.75% <100.00%> (+0.02%) ⬆️
module/zfs/dsl_destroy.c 93.60% <100.00%> (+0.20%) ⬆️
... and 71 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@behlendorf (Contributor):

@pcd1193182 would you mind rebasing this to resolve the conflict?

@behlendorf (Contributor):

@pcd1193182 there appears to be a locking problem in the updated PR which needs to be resolved. Aside from that this looks good.

[ 3660.685503] VERIFY(dsl_pool_config_held(ds->ds_dir->dd_pool)) failed
[ 3660.692369] PANIC at dmu_objset.c:648:dmu_objset_from_ds()
[ 3660.698814] Showing stack for process 23044
[ 3660.704305] CPU: 0 PID: 23044 Comm: zfs Kdump: loaded Tainted: P           OE  ------------   3.10.0-1062.18.1.el7.x86_64 #1
[ 3660.715582] Hardware name: Amazon EC2 m5d.large/, BIOS 1.0 10/16/2017
[ 3660.721736] Call Trace:
[ 3660.725989]  [<ffffffff89f7b416>] dump_stack+0x19/0x1b
[ 3660.809423]  [<ffffffffc06ba54b>] spl_dumpstack+0x2b/0x30 [spl]
[ 3660.815267]  [<ffffffffc06ba619>] spl_panic+0xc9/0x110 [spl]
[ 3660.907000]  [<ffffffffc07f585b>] dmu_objset_from_ds+0x22b/0x260 [zfs]
[ 3660.913129]  [<ffffffffc0840cdd>] dsl_dir_activity_in_progress+0x5d/0x260 [zfs]
[ 3660.938156]  [<ffffffffc0846042>] dsl_dir_wait+0x82/0xe0 [zfs]
[ 3660.944041]  [<ffffffffc08f10a4>] zfs_ioc_wait_fs+0x164/0x240 [zfs]
[ 3660.949960]  [<ffffffffc08f9554>] zfsdev_ioctl_common+0x1a4/0x5b0 [zfs]
[ 3660.962239]  [<ffffffffc09374d4>] zfsdev_ioctl+0x54/0xf0 [zfs]
[ 3660.967950]  [<ffffffff89a5fde0>] do_vfs_ioctl+0x3a0/0x5a0
[ 3660.973500]  [<ffffffff89a60081>] SyS_ioctl+0xa1/0xc0
[ 3660.978901]  [<ffffffff89f8dede>] system_call_fastpath+0x25/0x2a
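For context, the VERIFY above fires because dmu_objset_from_ds() asserts that the pool configuration lock is held. The sketch below (in-tree kernel style, hypothetical helper name, not the committed fix) shows the shape of adjustment such an assertion points at: take the config lock around the objset lookup and the activity check, and drop it again before sleeping, since holding it across the wait would block spa_sync().

#include <sys/dsl_dataset.h>
#include <sys/dsl_dir.h>
#include <sys/dsl_pool.h>
#include <sys/dmu_objset.h>

/*
 * Hypothetical helper, sketch only: perform the "is the activity still
 * in progress?" check with the pool config lock held, then release the
 * lock before the caller goes back to sleep on its condvar.
 */
static boolean_t
deleteq_check_locked(dsl_dataset_t *ds)
{
	dsl_pool_t *dp = ds->ds_dir->dd_pool;
	objset_t *os;
	boolean_t in_progress = B_FALSE;

	dsl_pool_config_enter(dp, FTAG);
	VERIFY0(dmu_objset_from_ds(ds, &os));	/* assertion now satisfied */
	/* ... inspect the objset's delete queue and set in_progress ... */
	dsl_pool_config_exit(dp, FTAG);

	return (in_progress);
}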

Signed-off-by: Paul Dagnelie <[email protected]>
@behlendorf added the Status: Revision Needed (Changes are required for the PR to be accepted) label and removed the Status: Accepted (Ready to integrate: reviewed, tested) label on Mar 31, 2020
@behlendorf (Contributor):

Updated to "Revision Needed" until the VERIFY failure mentioned above is resolved.

@behlendorf added the Status: Accepted (Ready to integrate: reviewed, tested) label and removed the Status: Revision Needed (Changes are required for the PR to be accepted) label on Apr 1, 2020
@behlendorf merged commit 5a42ef0 into openzfs:master on Apr 1, 2020
@behlendorf (Contributor):

Thanks, merged.

jsai20 pushed a commit to jsai20/zfs that referenced this pull request Mar 30, 2021
Add a mechanism to wait for delete queue to drain.

When doing redacted send/recv, many workflows involve deleting files 
that contain sensitive data. Because of the way zfs handles file 
deletions, snapshots taken quickly after a rm operation can sometimes 
still contain the file in question, especially if the file is very 
large. This can result in issues for redacted send/recv users who 
expect the deleted files to be redacted in the send streams, and not 
appear in their clones.

This change duplicates much of the zpool wait related logic into a 
zfs wait command, which can be used to wait until the internal
deleteq has been drained.  Additional wait activities may be added 
in the future. 

Reviewed-by: Matthew Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: John Gallagher <[email protected]>
Signed-off-by: Paul Dagnelie <[email protected]>
Closes openzfs#9707