Long hold the dataset during upgrade #6837
module/zfs/dmu_objset.c
@@ -1320,10 +1320,19 @@ dmu_objset_upgrade_task_cb(void *data)
static void
dmu_objset_upgrade(objset_t *os, dmu_objset_upgrade_cb_t cb)
{
	boolean_t config_held = B_FALSE;
Rather than add this conditional logic to `dmu_objset_upgrade()`, it looks like it might be cleaner to ASSERT that the caller has the dsl pool config lock held. That would allow you to unconditionally take the long hold as you originally intended. Doing so should be straightforward in the `zfs_ioc_userobjspace_upgrade` and `dmu_objset_own` callers by moving the `dsl_pool_rele` call down a little bit. As for the call in `zfs_fuid_overobjquota()`, you should be able to safely take and drop the pool config lock unconditionally in the upgrade case here.
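To make the suggestion concrete, here is a small user-space sketch of the pattern. It is an illustrative model only, not the patch: `sketch_pool_t`, `sketch_ds_t`, and every `sketch_*` function are hypothetical stand-ins for the pool config lock, the dataset long hold, and the upgrade entry point, and a plain `assert` stands in for the kernel `ASSERT`.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the pool: counts holders of the config lock. */
typedef struct sketch_pool {
	int config_readers;
} sketch_pool_t;

/* Hypothetical stand-in for a dataset: counts outstanding long holds. */
typedef struct sketch_ds {
	sketch_pool_t *pool;
	int long_holds;
} sketch_ds_t;

static void
sketch_config_enter(sketch_pool_t *dp)
{
	dp->config_readers++;
}

static void
sketch_config_exit(sketch_pool_t *dp)
{
	assert(dp->config_readers > 0);
	dp->config_readers--;
}

static bool
sketch_config_held(const sketch_pool_t *dp)
{
	return (dp->config_readers > 0);
}

/*
 * Models the upgrade entry point. The caller is required to hold the
 * config lock, so the long hold is taken unconditionally with no
 * "did I take the lock myself?" bookkeeping.
 */
static void
sketch_upgrade(sketch_ds_t *ds)
{
	assert(sketch_config_held(ds->pool));
	ds->long_holds++;
	/* The real code would dispatch the upgrade task at this point. */
}

/*
 * Models a caller: the config lock is released only after the upgrade
 * has been started, i.e. the release is moved below the upgrade call.
 */
static void
sketch_caller(sketch_ds_t *ds)
{
	sketch_config_enter(ds->pool);
	sketch_upgrade(ds);
	sketch_config_exit(ds->pool);
}
```

In this model a caller that does not already hold the lock simply wraps the upgrade call in an enter/exit pair, which is the shape suggested for the quota path.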
Thank you, I'll fix this.
Codecov Report
@@ Coverage Diff @@
## master #6837 +/- ##
==========================================
- Coverage 75.19% 75.14% -0.06%
==========================================
Files 297 297
Lines 94453 94450 -3
==========================================
- Hits 71026 70970 -56
- Misses 23427 23480 +53
75a8546
to
f716f03
Thanks, the patch looks good. Just one remaining minor issue to wrap up.
@@ -1311,6 +1313,7 @@ dmu_objset_upgrade_task_cb(void *data)
	os->os_upgrade_exit = B_TRUE;
	os->os_upgrade_id = 0;
	mutex_exit(&os->os_upgrade_lock);
	dsl_dataset_long_rele(dmu_objset_ds(os), upgrade_tag);
There's one case in `dmu_objset_upgrade_stop()` where you need to drop the long hold. When `taskq_cancel_id()` returns 0, it indicates the taskqid was successfully canceled before it was executed by a thread, in which case you'll need to drop the long hold there. In both the ENOENT and EBUSY error cases the callback will either have run or be currently running, so there's no problem there.
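The three outcomes can be modelled in a few lines of user-space C. This is a sketch under stated assumptions, not the kernel code: `upg_model_t` and all `upg_*` functions are hypothetical, and `upg_cancel()` only imitates the return values described above (0, `EBUSY`, `ENOENT`) rather than calling the real `taskq_cancel_id()`.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical life cycle of one queued upgrade task. */
typedef enum {
	UPG_PENDING,	/* queued, not yet picked up by a thread */
	UPG_RUNNING,	/* callback is currently executing */
	UPG_FINISHED,	/* callback has run to completion */
	UPG_CANCELLED	/* removed from the queue before it ran */
} upg_state_t;

typedef struct upg_model {
	upg_state_t state;
	int long_holds;
} upg_model_t;

/* Dispatch: the long hold is taken before the task is queued. */
static void
upg_dispatch(upg_model_t *m)
{
	m->long_holds++;
	m->state = UPG_PENDING;
}

static void
upg_callback_start(upg_model_t *m)
{
	assert(m->state == UPG_PENDING);
	m->state = UPG_RUNNING;
}

/* The callback drops the long hold itself when it finishes. */
static void
upg_callback_finish(upg_model_t *m)
{
	assert(m->state == UPG_RUNNING);
	assert(m->long_holds > 0);
	m->long_holds--;
	m->state = UPG_FINISHED;
}

/* Imitates the cancel return values described in the review comment. */
static int
upg_cancel(upg_model_t *m)
{
	switch (m->state) {
	case UPG_PENDING:
		m->state = UPG_CANCELLED;
		return (0);
	case UPG_RUNNING:
		return (EBUSY);
	default:
		return (ENOENT);
	}
}

/*
 * Stop path: only a successful cancel means the callback will never
 * run, so only then does the stop path drop the long hold.
 */
static int
upg_stop(upg_model_t *m)
{
	int err = upg_cancel(m);

	if (err == 0) {
		assert(m->long_holds > 0);
		m->long_holds--;
	}
	return (err);
}
```

In every path of the model the hold count returns to zero exactly once: either the stop path releases it (cancel succeeded) or the callback does (task ran or is running).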
Thank you, I'll fix this.
If a receive or rollback is performed while the filesystem is upgrading, the objset may be evicted in `dsl_dataset_clone_swap_sync_impl`. This leads to a NULL pointer dereference when the upgrade tries to access the evicted objset.
This commit adds a long hold on the dataset for the whole upgrade process. Receive and rollback will return an EBUSY error until the upgrade has finished.
Signed-off-by: Arkadiusz Bubała <[email protected]>
f716f03
to
4e6ac3c
If a receive or rollback is performed while the filesystem is upgrading, the objset may be evicted in `dsl_dataset_clone_swap_sync_impl`. This leads to a NULL pointer dereference when the upgrade tries to access the evicted objset.
This commit adds a long hold on the dataset for the whole upgrade process. Receive and rollback will return an EBUSY error until the upgrade has finished.
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Arkadiusz Bubała <[email protected]>
Closes openzfs#5295
Closes openzfs#6837
Description
Added a long hold on the dataset for the whole upgrade process. Receive and rollback will return an EBUSY error until the upgrade has finished.
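The effect of the long hold on receive and rollback can be pictured with a minimal user-space model. Everything here is a hypothetical illustration: `hold_ds_t`, `hold_upgrade_begin()`, `hold_upgrade_end()`, and `hold_try_swap()` are made-up names, and the `generation` counter merely stands in for "the object set was replaced".

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical dataset model. */
typedef struct hold_ds {
	int long_holds;	/* outstanding long holds */
	int generation;	/* bumped each time the object set is replaced */
} hold_ds_t;

/* The upgrade takes a long hold when it starts ... */
static void
hold_upgrade_begin(hold_ds_t *ds)
{
	ds->long_holds++;
}

/* ... and drops it when it has finished (or was cancelled). */
static void
hold_upgrade_end(hold_ds_t *ds)
{
	assert(ds->long_holds > 0);
	ds->long_holds--;
}

/*
 * Models a receive or rollback: it refuses to replace the object set
 * while anyone has a long hold, so a running upgrade can never see
 * its object set disappear.
 */
static int
hold_try_swap(hold_ds_t *ds)
{
	if (ds->long_holds > 0)
		return (EBUSY);
	ds->generation++;
	return (0);
}
```

In this model a swap attempted between `hold_upgrade_begin()` and `hold_upgrade_end()` fails with `EBUSY` and leaves the dataset untouched; the same attempt succeeds once the hold is gone.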
Motivation and Context
If a receive or rollback is performed while the filesystem is upgrading, the object set may be evicted in `dsl_dataset_clone_swap_sync_impl`. This leads to a NULL pointer dereference when the upgrade tries to access the evicted object set. Solves #5295.
How Has This Been Tested?
Tested in the following scenario:
zfs send -I src/vol00 autosnap_2017-11-07-080400 src/vol00 autosnap_2017-11-07-080415 | zfs recv -F dst/vol00
`z_upgrade` will run during execution of `zfs recv` and `zfs rollback`.
`zfs recv` commands failed with a busy error.
`zfs rollback` command failed with a busy error.
Types of changes