zfs_iput_taskq spinning again #2128
Comments
Interesting. Thanks for filing this detailed debugging for us! |
Hardware
CPU: AMD E-350 Processor
I think my issue is related to, or the same as, this issue. The pool itself is only ~3 days old. Snapshots are enabled using zfs-auto-snapshot.
zpool iostat
And I once saw the following kernel log message.
Since the scrub was so slow I wanted to stop the scrub and export/import the pool.
Most of the time I see the following stack for the process zfs_iput_taskq
zpool status
So far I have seen the issue twice. |
Hi,
Similar problem here. Running rdiff-backup of another host, perhaps 100GB total. Running on a Supermicro server with a single dual-core Opteron CPU (details below) and 8GB of RAM. 16 2TB drives in two ZFS pools. Ubuntu 12.04.4 LTS (64-bit).
OUTPUT OF TOP:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
Strangely, htop does not show zfs_iput_taskq.
VERSIONS OF ZFS, et al. Installed via package manager:
CPU INFO FROM PROC
Second core would repeat above.
ZPOOL STATUS
errors: No known data errors
pool: fergus
errors: No known data errors
Is there other info I should forward to you? Thanks for all your hard work. |
This looks like a duplicate of #1469. It's a known issue which hasn't yet been resolved; using SA-based xattrs should minimize it.
|
Interesting. I'll have to wait until Monday to check out my system. I thought xattrs were not in use but I don't recall explicitly disabling them either. |
They're enabled by default. If you know they aren't needed explicitly disabling them should prevent the issue as well. |
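The two workarounds mentioned above (SA-based xattrs, or disabling xattrs outright) could be applied roughly as follows. This is a minimal sketch, not a tested recipe: the dataset name `tank/data`, the `APPLY` switch, and the helper functions `run` and `xattr_workaround` are made up for illustration. It only prints the `zfs` commands unless `APPLY=1` is set. Note that `xattr=sa` only affects xattrs written after the change; existing directory-based xattrs stay as they are.

```shell
#!/bin/sh
# Hypothetical dataset name; substitute a real one from `zfs list`.
DATASET="${DATASET:-tank/data}"

# Dry run by default: commands are printed, not executed.
# Set APPLY=1 in the environment to actually run them.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

# Show the current xattr setting, then switch it.
# $1 is the desired mode: "sa" (store xattrs as system attributes)
# or "off" (disable xattrs when nothing on the dataset needs them).
xattr_workaround() {
    mode="$1"
    case "$mode" in
        sa|off) ;;
        *) echo "usage: xattr_workaround sa|off" >&2; return 1 ;;
    esac
    run zfs get xattr "$DATASET"
    run zfs set "xattr=$mode" "$DATASET"
}

xattr_workaround sa
```

Calling `xattr_workaround off` instead prints (or, with `APPLY=1`, runs) the variant that disables xattrs entirely.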
# zpool history -i
2014-02-04.13:29:50 [txg:5] create pool version 5000; software version 5000/5; uts H264 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64
... (receive snapshot remotely)
... (make clones)
...
2014-02-04.13:37:00 [txg:385] set netlab (21) xattr=0
...
While the filesystems may have xattrs from the receive operation, they do have the xattr mount option disabled. Since setting xattr=off was the last thing to happen, maybe that makes sense? (Any existing xattrs would NOT have been 'sa' based.) (History requires -i due to the pool being affected by an earlier bug.) |
This reverts commit 7973e46. That had been intended to work around a deadlock issue involving zfs_zget(), which was fixed by 6f9548c. The workaround had the side effect of causing zfs_zinactive() to cause excessive CPU utilization in zfs_iput_taskq by queuing an iteration of all objects in a dataset on every unlink on a directory that had extended attributes. That resulted in many issue reports about iput_taskq spinning. Since the original rationale for the change is no longer valid, we can safely revert it to resolve all of those issue reports.
Conflicts: module/zfs/zfs_dir.c
Closes: openzfs#457 openzfs#2058 openzfs#2128 openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the work done in these previous commits.
e89260a Directory xattr znodes hold a reference on their parent
26cb948 Avoid 128K kmem allocations in mzap_upgrade()
4acaaf7 Add zfs_iput_async() interface
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#2408
Issue openzfs#457
Issue openzfs#2058
Issue openzfs#2128
Issue openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.
e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
26cb948 Avoid 128K kmem allocations in mzap_upgrade()
4acaaf7 Add zfs_iput_async() interface
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#457
Issue openzfs#2058
Issue openzfs#2128
Issue openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.
e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
ca043ca Add zfs_iput_async() interface
dd2a794 Avoid 128K kmem allocations in mzap_upgrade()
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#457
Issue openzfs#2058
Issue openzfs#2128
Issue openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.
e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
0a50679 Add zfs_iput_async() interface
4dd1893 Avoid 128K kmem allocations in mzap_upgrade()
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#457
Closes openzfs#2058
Closes openzfs#2128
Closes openzfs#2240
This reverts commit 7973e46 which brings the basic flow of the code back in line with the other ZFS implementations. This was possible due to the following related changes.
e89260a Directory xattr znodes hold a reference on their parent
6f9548c Fix deadlock in zfs_zget()
ca043ca Add zfs_iput_async() interface
dd2a794 Avoid 128K kmem allocations in mzap_upgrade()
Signed-off-by: Brian Behlendorf <[email protected]>
Signed-off-by: Richard Yao <[email protected]>
Closes openzfs#457
Closes openzfs#2058
Closes openzfs#2128
Closes openzfs#2240
Hardware
CPU: Intel(R) Core(TM) i5-3330 CPU @ 3.00GHz
RAM: 4 GB
OS: CentOS 6
Kernel: Linux H264 2.6.32-431.3.1.el6.x86_64 #1 SMP Fri Jan 3 21:39:27 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
SPL: spl-0.6.2-23-g4c99541
ZFS: zfs-0.6.2-171-g2e7b765
ZFS stats
LVM partition used for ZFS, total 294 filesystems (obviously I removed a lot for the sake of brevity), all mounted. Only ~10 of them are in use for LXC, the rest are sitting idle. Each container runs a very minimal CentOS install - sshd, rsyslog, 6 ttys, and a bunch of network sessions up.
Task info
Stack traces
Interesting supplemental information
... which is a lot, but IO on the system is pretty much settled.
While I was collecting the above information the thread finally went back to sleep. My ability to access the pool did not appear to be negatively impacted, but this system only uses its disk casually.