Temporarily exclude known-failing tests #12849
Conversation
When restructuring the zvol_open() logic for the Linux 5.13 kernel a lock inversion was accidentally introduced. In the updated code the spa_namespace_lock is now taken before the zv_suspend_lock, allowing the following scenario to occur:

down_read                <=== waiting for zv_suspend_lock
zvol_open                <=== holds spa_namespace_lock
__blkdev_get
blkdev_get_by_dev
blkdev_open
...

mutex_lock               <=== waiting for spa_namespace_lock
spa_open_common
spa_open
dsl_pool_hold
dmu_objset_hold_flags
dmu_objset_hold
dsl_prop_get
dsl_prop_get_integer
zvol_create_minor
dmu_recv_end
zfs_ioc_recv_impl        <=== holds zv_suspend_lock via zvol_suspend()
zfs_ioc_recv
...

This commit resolves the issue by moving the acquisition of the spa_namespace_lock back to after the zv_suspend_lock, which restores the original ordering. Additionally, as part of this change the error exit paths were simplified where possible.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#12849
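For illustration only, here is a minimal Python sketch of the lock-ordering rule the commit message describes. The names spa_namespace_lock and zv_suspend_lock are borrowed from the trace above, but the real code is C inside the kernel module; this is just a sketch of why giving every path the same acquisition order (zv_suspend_lock before spa_namespace_lock) removes the inversion.

import threading

# Stand-ins for the two kernel locks from the trace above (illustrative only).
spa_namespace_lock = threading.Lock()
zv_suspend_lock = threading.Lock()

def zvol_open_fixed():
    # Fixed ordering: zv_suspend_lock first, then spa_namespace_lock,
    # matching the original code and the receive path below.
    with zv_suspend_lock:
        with spa_namespace_lock:
            pass  # ... first-open work that may need the spa ...

def zfs_recv_path():
    # zvol_suspend() takes zv_suspend_lock; the receive then reaches
    # spa_open(), which needs spa_namespace_lock.
    with zv_suspend_lock:
        with spa_namespace_lock:
            pass  # ... property lookups / minor creation work ...

# Both threads acquire the locks in the same order, so neither can end up
# holding one lock while waiting forever on the other.
t1 = threading.Thread(target=zvol_open_fixed)
t2 = threading.Thread(target=zfs_recv_path)
t1.start(); t2.start()
t1.join(); t2.join()
print("no deadlock: both paths honor zv_suspend_lock -> spa_namespace_lock")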
Force-pushed from 1071fc6 to 1dd2993.
Well, the CI being marked X for every run isn't that useful, so let's mark these as known failures until someone gets a chance to look at them...

Signed-off-by: Rich Ercolani <[email protected]>
Force-pushed from 1dd2993 to 149a16a.
@@ -294,8 +298,6 @@ elif sys.platform.startswith('linux'):
     'mmp/mmp_active_import': ['FAIL', known_reason],
     'mmp/mmp_exported_import': ['FAIL', known_reason],
     'mmp/mmp_inactive_import': ['FAIL', known_reason],
-    'refreserv/refreserv_raidz': ['FAIL', known_reason],
-    'snapshot/rollback_003_pos': ['FAIL', known_reason],
Both of these exceptions were recently removed from master (#12897, #12898), so you can drop this hunk with a rebase. In fact, looking more carefully I see all three of the remaining tests are already included in the exceptions. So it seems like we could just close out this PR as no longer needed.
That said, I seem to be able to reliably reproduce the enospc_002_pos failure, so I'm going to have a look at that.
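For context, here is a hedged sketch of how a known-failures table like the one in the hunk above is typically consulted when summarizing test results. The dict entries mirror the style of the hunk, but the summarize() helper, its results input, and the known_reason value are illustrative assumptions, not the actual zts-report.py code.

import sys

known_reason = 'Known issue'  # placeholder reason string (assumption)

# Entries mirror the style of the hunk above; adding a test here means a
# FAIL from it is treated as expected instead of failing the whole run.
known = {
    'mmp/mmp_active_import':   ['FAIL', known_reason],
    'mmp/mmp_exported_import': ['FAIL', known_reason],
    'mmp/mmp_inactive_import': ['FAIL', known_reason],
}

def summarize(results):
    """Return a nonzero exit code only for unexpected failures.

    `results` maps test name -> outcome string, e.g.
    {'mmp/mmp_active_import': 'FAIL'} (hypothetical input format).
    """
    unexpected = 0
    for test, outcome in results.items():
        if outcome != 'PASS' and known.get(test, [None])[0] != outcome:
            print(f'UNEXPECTED {outcome}: {test}')
            unexpected += 1
    return 1 if unexpected else 0

if __name__ == '__main__':
    sys.exit(summarize({'mmp/mmp_active_import': 'FAIL',
                        'some/other_test': 'PASS'}))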
Motivation and Context
Make the 20.04 runner not report failure every time until #12848 is resolved.
Description
"Yes, we know these might fail, don't scream about it."
How Has This Been Tested?
Not even slightly.
Types of changes
Checklist:
All commit messages are properly formatted and contain Signed-off-by.