zfs_iput_taskq hits 100% with large rsync, but not if you break it up into pieces #1469

Closed
tonymaro opened this issue May 21, 2013 · 5 comments

@tonymaro

This is a somewhat difficult situation to reproduce, and it may be related to using Gluster, so I don't know how helpful this will be.

Running four 2TB drives in a raidz with a 128GB SSD cache drive on Ubuntu 12.04 with 12 GB of ECC RAM. On top of ZFS I'm running Gluster, with two identical servers replicating so the same data is mirrored on both.

Source data to copy to the servers consists of 284 directories. Each directory contains multiple subdirectories running 3 levels deep, and each final subdirectory holds up to 255 files of around 40KB each. Think of it as a hex naming path: dir1/00/00/1.file, dir1/00/00/2.file ... dir1/00/01/1.file, etc. Total storage space in use: 250 GB across 5,805 distinct directories according to du | wc -l.

If I attempt to rsync the root of the tree in one fell swoop (all 284 directories plus their subdirectories), within 10 minutes the zfs_iput_taskq process hits 80-100% of one core and stays there on BOTH servers. I allowed it to run for a full 24 hours after the transfer had finished and zfs_iput_taskq was still at 80-100%. Only unmounting and remounting the ZFS pool would stop it.

If I instead rsync each root directory individually, doing 284 separate rsync runs (roughly as sketched below), zfs_iput_taskq never even creeps high enough to show up in top.
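
Roughly, the two approaches look like this (the paths below are placeholders, not my real ones):

# single large transfer - zfs_iput_taskq pegs a core within ~10 minutes
rsync -a /source/tree/ /data/gluster/tree/

# 284 separate per-directory transfers - zfs_iput_taskq stays quiet
for d in /source/tree/*/; do
    rsync -a "$d" "/data/gluster/tree/$(basename "$d")/"
done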

Edit: I thought it might be important to mention that I do have noatime set on ZFS.

Ubuntu 12.04 64 bit server
quad-core Xeon E5606
12GB ECC RAM
Linux gfs5 3.2.0-43-generic #68-Ubuntu SMP Wed May 15 03:33:33 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

zfsutils 0.6.1-1~precise

zpool status:
pool: data
state: ONLINE
scan: none requested
config:

NAME                               STATE     READ WRITE CKSUM
data                               ONLINE       0     0     0
  raidz1-0                         ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-2:0:0:0  ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-4:0:0:0  ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-5:0:0:0  ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-3:0:0:0  ONLINE       0     0     0
cache
  pci-0000:00:1f.2-scsi-1:0:0:0    ONLINE       0     0     0

errors: No known data errors

@behlendorf
Contributor

@tonymaro This looks like a duplicate of #457, which is a known issue. My understanding is that Gluster makes heavy use of xattrs, which is what triggers this issue. You should be able to avoid the problem, and improve Gluster performance, by setting xattr=sa on your Gluster datasets. This causes the xattrs to be stored in a different on-disk format which is faster to access, but incompatible with non-Linux ZFS implementations.

Since the on-disk format is different, you will need to copy the filesystem to a new dataset with xattr=sa set to take advantage of this, along the lines sketched below.
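
Something like the following (the dataset names and paths here are only placeholders, adjust them to your layout):

# check the current setting
zfs get xattr data/gluster

# create a new dataset with SA-based xattrs and copy the data over,
# preserving the extended attributes Gluster relies on
zfs create -o xattr=sa data/gluster-sa
rsync -aHAX /data/gluster/ /data/gluster-sa/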

@akorn
Contributor

akorn commented Jun 27, 2013

I'm sorry, but I'm confused: is xattr=sa safe and fast now? I was under the impression that I hit #1176 when it was fast, and that you fixed it by making it safe but slow again. What's the current deal? Should we be using xattr=sa if we don't care about compatibility with non-Linux implementations?

@behlendorf
Contributor

@akorn Yes, xattr=sa is safe and fast now; there are no outstanding bugs regarding it that I'm aware of. The issue you referenced, #1176, mainly affects xattr=on: it makes that path even slower, but safe.

@ryao
Contributor

ryao commented Jun 20, 2014

I believe pull request #2408 will fix this.

@behlendorf
Contributor

This was resolved by #2573.

behlendorf modified the milestones: 0.6.4, 0.7.0 on Oct 16, 2014