looping in zfs_unlinked_drain? #2240
Could it be related to this or no? openzfs/spl#340
@prometheanfire I don't think this is related to the spl issue. I assume the only problem this is causing is the RCU stall? It would be useful to get the [...] Beyond that, if you're rebooting the server, it would be very useful just to run with the latest master code. It contains @ryao's [...]
@behlendorf I asked @prometheanfire to run [...] http://dev.gentoo.org/~ryao/zfs-issue-2240.svg The pre-processed data from http://bpaste.net/show/199721/ Interestingly, the infinite loop appears to be in [...]
This reverts commit 7973e46. That had been intended to workaround a deadlock issue involving zfs_zget(), which was fixed by 6f9548c. The workaround had the side effect of causing zfs_zinactive() to cause excessive cpu utilization in zfs_iput_taskq by queuing an iteration of all objects in a dataset on every unlink on a directory that had extended attributes. That resulted in many issue reports about iput_taskq spinning.

Since the original rationale for the change is no longer valid, we can safely revert it to resolve all of those issue reports.

Conflicts: module/zfs/zfs_dir.c
Closes: openzfs#457 openzfs#2058 openzfs#2128 openzfs#2240
This is just a guess. Let me explain the question.
So, the code iterates over object IDs in z_unlinkedobj, and it may call zfs_purgedir, which ends up adding entries to z_unlinkedobj.
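To make the guess above concrete, here is a minimal toy sketch in Python. It is not the ZFS code: `drain`, `children`, and `max_steps` are hypothetical names, and a plain queue stands in for z_unlinkedobj. It only shows the shape of the concern, a drain loop whose body adds new work to the set it is draining.

```python
from collections import deque


def drain(unlinked, children, max_steps=1000):
    """Toy drain loop: process each pending id; purging an entry may queue more.

    `children` maps an id to the ids that purging it adds back to the
    pending set (a stand-in for a purge step adding entries).
    """
    pending = deque(unlinked)
    processed = []
    steps = 0
    while pending:
        steps += 1
        if steps > max_steps:
            # Guard so the toy model fails loudly instead of spinning forever.
            raise RuntimeError("drain did not converge")
        obj = pending.popleft()
        # Processing this entry adds more work to the set being drained.
        pending.extend(children.get(obj, ()))
        processed.append(obj)
    return processed
```

In this model the loop finishes as long as every entry adds only a finite amount of new work. If processing an entry ever re-adds that same entry (for example `children = {1: [1]}`), the loop never empties the set, which is the kind of behavior being asked about here.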
I can reproduce this almost every time I crash a system or forcefully reboot it (e.g. power cycle).
@avg-I How many files do you have in the filesystem? Is the forced crash/reboot under any particular workload?
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the work done in these previous commits.

e89260a Directory xattr znodes hold a reference on their parent
26cb948 Avoid 128K kmem allocations in mzap_upgrade()
4acaaf7 Add zfs_iput_async() interface

Signed-off-by: Brian Behlendorf
Issue openzfs#2408
Issue openzfs#457
Issue openzfs#2058
Issue openzfs#2128
Issue openzfs#2240
@avg-I Could you try the patches in #2573? They're designed to address this problem. They've passed all the automated testing and held up perfectly, but I'd like to make sure they fix your test case. Specifically, these changes revert the code back so it largely matches the OpenZFS implementation. However, a few iput()'s were left asynchronous to avoid certain deadlocks which can occur under Linux but not the other platforms.
This is safe in the sense that it won't deadlock. But unless you lock the entire zap for the traversal, you might miss entries due to concurrent adds/removes. So it's a good thing to avoid if possible.
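A rough illustration of how an unlocked traversal can miss entries, as a toy Python sketch. This is an assumption-laden model, not how ZAP cursors actually work: a list index stands in for the cursor, and `traverse_with_removals` and `remove_when_seen` are hypothetical names.

```python
def traverse_with_removals(entries, remove_when_seen):
    """Walk `entries` with an index cursor while removals happen mid-walk.

    Any item in `remove_when_seen` is removed right after it is visited,
    simulating a concurrent remove during an unlocked traversal.
    """
    entries = list(entries)
    seen = []
    cursor = 0
    while cursor < len(entries):
        item = entries[cursor]
        seen.append(item)
        if item in remove_when_seen:
            # The removal shifts every later entry down by one slot.
            entries.pop(cursor)
        # The cursor still advances, so the entry that slid into this
        # slot is never visited.
        cursor += 1
    return seen
```

Walking `["a", "b", "c", "d"]` while `"a"` is removed visits `a`, `c`, `d` and skips `b`. Holding a lock over the whole traversal (or restarting after any modification) avoids the skip in this model.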
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.

e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
26cb948 Avoid 128K kmem allocations in mzap_upgrade()
4acaaf7 Add zfs_iput_async() interface

Signed-off-by: Brian Behlendorf
Issue openzfs#457
Issue openzfs#2058
Issue openzfs#2128
Issue openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.

e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
ca043ca Add zfs_iput_async() interface
dd2a794 Avoid 128K kmem allocations in mzap_upgrade()

Signed-off-by: Brian Behlendorf
Issue openzfs#457
Issue openzfs#2058
Issue openzfs#2128
Issue openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.

e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
0a50679 Add zfs_iput_async() interface
4dd1893 Avoid 128K kmem allocations in mzap_upgrade()

Signed-off-by: Brian Behlendorf
Signed-off-by: Richard Yao
Closes openzfs#457
Closes openzfs#2058
Closes openzfs#2128
Closes openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.

e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
ca043ca Add zfs_iput_async() interface
dd2a794 Avoid 128K kmem allocations in mzap_upgrade()

Signed-off-by: Brian Behlendorf
Signed-off-by: Richard Yao
Closes openzfs#457
Closes openzfs#2058
Closes openzfs#2128
Closes openzfs#2240
Here are all backtraces (and a couple of sysrq-w) thrown in.
The system is still up, so if you want any live debugging, tell me now :D
Forgot the link: https://gist.github.com/prometheanfire/9988183
And the patches I'm using: https://gist.github.com/9999082