zfs_iput_taskq hits 100% with large rsync, but not if you break it up into pieces #1469

Closed
tonymaro opened this issue May 21, 2013 · 5 comments

@tonymaro

This is a somewhat difficult situation to reproduce, and it may be related to using Gluster, so I don't know how helpful this will be.

Running four 2TB drives in a raidz with a 128GB SSD cache drive on Ubuntu 12.04 with 12 GB of ECC RAM. On top of ZFS I'm running Gluster, with two identical servers replicating so the same data is mirrored on both.

Source data to copy to the servers consists of 284 directories. Each directory contains multiple subdirectories running 3 levels deep, and each final subdirectory holds up to 255 files of around 40KB each. Think of it as a hex naming path: dir1/00/00/1.file, dir1/00/00/2.file ... dir1/00/01/1.file, etc. Total storage space in use: 250 GB across 5,805 distinct directories according to du | wc -l.

If I attempt to rsync the root of the tree in one fell swoop (all 284 directories plus their subdirectories), within 10 minutes the zfs_iput_taskq process hits 80-100% of one core and stays there on BOTH servers. I allowed it to run for a full 24 hours after the transfer had finished and zfs_iput_taskq was still at 80-100%. Only unmounting and remounting the ZFS pool would stop it.

If I instead rsync each root directory individually, doing 284 separate rsync runs (roughly as sketched below), zfs_iput_taskq never even creeps high enough to show up in top.
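
Roughly, the two approaches look like this (the paths below are placeholders, not my real ones):

# single large transfer - zfs_iput_taskq pegs a core within ~10 minutes
rsync -a /source/tree/ /data/gluster/tree/

# 284 separate per-directory transfers - zfs_iput_taskq stays quiet
for d in /source/tree/*/; do
    rsync -a "$d" "/data/gluster/tree/$(basename "$d")/"
done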

Edit: I thought it might be important to mention that I do have noatime set on ZFS.

Ubuntu 12.04 64 bit server
quad-core Xeon E5606
12GB ECC RAM
Linux gfs5 3.2.0-43-generic #68-Ubuntu SMP Wed May 15 03:33:33 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux

zfsutils 0.6.1-1~precise

zpool status:
pool: data
state: ONLINE
scan: none requested
config:

NAME                               STATE     READ WRITE CKSUM
data                               ONLINE       0     0     0
  raidz1-0                         ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-2:0:0:0  ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-4:0:0:0  ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-5:0:0:0  ONLINE       0     0     0
    pci-0000:00:1f.2-scsi-3:0:0:0  ONLINE       0     0     0
cache
  pci-0000:00:1f.2-scsi-1:0:0:0    ONLINE       0     0     0

errors: No known data errors

@behlendorf
Contributor

@tonymaro This looks like a duplicate of #457, which is a known issue. My understanding is that Gluster makes heavy use of xattrs, which is what triggers this issue. You should be able to avoid the problem, and improve Gluster performance, by setting xattr=sa on your Gluster datasets. This causes the xattrs to be stored in a different on-disk format which is faster to access, but incompatible with non-Linux ZFS implementations.

Since the on-disk format is different, you will need to copy the filesystem to a new dataset with xattr=sa set to take advantage of this, along the lines sketched below.
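
Something like the following (the dataset names and paths here are only placeholders, adjust them to your layout):

# check the current setting
zfs get xattr data/gluster

# create a new dataset with SA-based xattrs and copy the data over,
# preserving the extended attributes Gluster relies on
zfs create -o xattr=sa data/gluster-sa
rsync -aHAX /data/gluster/ /data/gluster-sa/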

@akorn
Contributor

akorn commented Jun 27, 2013

I'm sorry, but I'm confused: is xattr=sa safe and fast now? I was under the impression that I hit #1176 when it was fast, and that you fixed it by making it safe but slow again. What's the current deal? Should we be using xattr=sa if we don't care about compatibility with non-Linux implementations?

@behlendorf
Contributor

@akorn Yes, xattr=sa is safe and fast now; there are no outstanding bugs regarding it that I'm aware of. The issue you referenced, #1176, mainly affects xattr=on: it makes that path even slower, but safe.

@ryao
Contributor

ryao commented Jun 20, 2014

I believe pull request #2408 will fix this.

@behlendorf
Contributor

This was resolved by #2573.

behlendorf modified the milestones: 0.6.4, 0.7.0 on Oct 16, 2014