ZFS start to be slow after some work #723

Closed
mikhmv opened this issue May 5, 2012 · 10 comments
Labels
Component: ZVOL ZFS Volumes Type: Performance Performance improvement or performance problem

Comments

@mikhmv

mikhmv commented May 5, 2012

The main symptom of this problem is high CPU usage by the zfs_iput_taskq thread:
4085 root 0 -20 0 0 0 R 98 0.0 38:56.32 zfs_iput_taskq/
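
A minimal sketch for extracting that figure from batch-mode top output. It assumes the default procps column order seen in the line above (PID in column 1, %CPU in column 9, command name in column 12); the function name `iput_cpu` is made up for illustration.

```shell
# Print "PID %CPU" for every zfs_iput_taskq thread found in `top -b` output on stdin.
# Assumption: default procps columns
#   PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
iput_cpu() {
    awk '$12 ~ /^zfs_iput_taskq/ { print $1, $9 }'
}

# Usage (one batch-mode sample):
#   top -b -n 1 | iput_cpu
```

Running it every few seconds while the slowdown is in progress would show whether the thread stays pinned near 100% of one core.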

All writes become extremely slow:

time echo "test" >> test.txt

real 0m12.392s
user 0m0.004s
sys 0m0.000s
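
To see whether every write stalls or only some of them, the same test can be repeated in a loop. This is a rough sketch, not a benchmark: the function name `time_appends` is hypothetical, and it relies on GNU `date +%s%N` for nanosecond timestamps.

```shell
# Append one short line to FILE, COUNT times, and print how long each append took.
# Assumption: GNU date (for %N, nanoseconds since the epoch).
time_appends() {
    file=$1
    count=$2
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s%N)
        echo "test" >> "$file"
        end=$(date +%s%N)
        # Convert the nanosecond difference to whole milliseconds.
        echo "$(( (end - start) / 1000000 )) ms"
        i=$((i + 1))
    done
}

# Usage:
#   time_appends test.txt 10
```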

Unmounting the filesystem takes hours.

Standard system information:

~$ zpool status
pool: tank
state: ONLINE
scan: scrub repaired 0 in 53h10m with 0 errors on Fri Apr 6 18:49:24 2012
config:

    NAME        STATE     READ WRITE CKSUM
    tank        ONLINE       0     0     0
      raidz1-0  ONLINE       0     0     0
        d1      ONLINE       0     0     0
        d2      ONLINE       0     0     0
        d3      ONLINE       0     0     0
        d4      ONLINE       0     0     0
        d5      ONLINE       0     0     0

errors: No known data errors

$ dpkg -s zfs-dkms zfsutils libuutil1 libzfs1 libzpool1

Package: zfs-dkms
Status: install ok installed
Priority: extra
Section: kernel
Installed-Size: 9500
Maintainer: Darik Horn [email protected]
Architecture: amd64
Source: zfs-linux
Version: 0.6.0.62-0ubuntu1~precise1
Replaces: lzfs, lzfs-dkms
Provides: lustre-backend-fs, lzfs, lzfs-dkms
Depends: dkms (>> 2.1.1.2-5ubuntu1), spl-dkms (>= 0.6.0.62)
Conflicts: lzfs, lzfs-dkms
Description: Native ZFS filesystem kernel modules for Linux
An advanced integrated volume manager and filesystem that is designed for
performance and data integrity. Snapshots, clones, checksums, deduplication,
compression, and RAID redundancy are built-in features..
.
Includes the SPA, DMU, ZVOL, and ZPL components of ZFS.

Package: zfsutils
Status: install ok installed
Priority: extra
Section: admin
Installed-Size: 702
Maintainer: Darik Horn [email protected]
Architecture: amd64
Source: zfs-linux
Version: 0.6.0.62-0ubuntu1~precise1
Replaces: zfs
Depends: libc6 (>= 2.8), libnvpair1, libselinux1 (>= 1.32), libuuid1 (>= 2.16), libuutil1, libzfs1, libzpool1
Recommends: zfs-dkms
Suggests: nfs-kernel-server, zfs-initramfs
Conflicts: zfs, zfs-fuse
Conffiles:
/etc/default/zfs a36d4561e19974d80a7b6cec75639c24
/etc/bash_completion.d/zfs 3e1c4a29c4f7d590e6a3041f2c61d6ff
/etc/init.d/zfs-share 003fc628cf32324eefe6ad6239a0ed80
/etc/init.d/zfs-mount cb9c3d88deeef1356198e8c5ca913ed4
/etc/zfs/zdev.conf b006284e64b215ca619aeb56d2df9bf5
Description: Native ZFS management utilities for Linux
This package provides the zpool and zfs commands that are used to
manage ZFS filesystems.

Package: libuutil1
Status: install ok installed
Priority: extra
Section: libs
Installed-Size: 148
Maintainer: Darik Horn [email protected]
Architecture: amd64
Source: zfs-linux
Version: 0.6.0.62-0ubuntu1~precise1
Replaces: libuutil0
Depends: libc6 (>= 2.14), libuuid1 (>= 2.16), zlib1g (>= 1:1.1.4)
Description: Solaris userland utility library for Linux
This library provides a variety of glue functions for ZFS on Linux:

  • libspl: The Solaris Porting Layer library, which provides APIs that make it
    possible to run Solaris user code in a Linux environment with relatively
    minimal modification.
  • libavl: The Adelson-Velskii Landis balanced binary tree manipulation library.
  • libefi: The Extensible Firmware Interface library for GUID disk partitioning.
  • libshare: NFS and SMB service integration for ZFS.

Package: libzfs1
Status: install ok installed
Priority: extra
Section: libs
Installed-Size: 308
Maintainer: Darik Horn [email protected]
Architecture: amd64
Source: zfs-linux
Version: 0.6.0.62-0ubuntu1~precise1
Replaces: libzfs0
Depends: libc6 (>= 2.8), libnvpair1, libzpool1
Description: Native ZFS filesystem library for Linux
The zfs management library.

Package: libzpool1
Status: install ok installed
Priority: extra
Section: libs
Installed-Size: 1120
Maintainer: Darik Horn [email protected]
Architecture: amd64
Source: zfs-linux
Version: 0.6.0.62-0ubuntu1~precise1
Replaces: libzpool0
Depends: libc6 (>= 2.14), libuutil1, zlib1g (>= 1:1.1.4)
Description: Native ZFS pool library for Linux
The zpool management library.

@mikhmv
Author

mikhmv commented May 5, 2012

I started a reboot, which took far too long and appears to have hit problems. During boot I got the following message:
~$ dmesg | grep SPLE
[ 53.868295] SPLError: 4344:0:(spl-err.c:67:vcmn_err()) WARNING: ZFS replay transaction error 5, dataset tank/OpenNebula, seq 0xb24f5, txtype 9

@chrisrd
Contributor

chrisrd commented May 6, 2012

By any chance, did "some work" include deleting a lot of files with xattrs on a filesystem that uses xattr=dir (the default)? If so, the slowness and the long time to unmount may be due to #457.
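
A quick way to check the xattr setting on a dataset, as a sketch. The function name `xattr_is_dir` is made up, and treating `on` as directory-based storage is an assumption that matches the releases discussed in this thread, not necessarily newer ones.

```shell
# Report whether DATASET stores xattrs as hidden directories.
# `zfs get -H -o value` prints only the property value, without a header line.
# Assumption: xattr=on means directory-based xattrs, as in the releases discussed here.
xattr_is_dir() {
    value=$(zfs get -H -o value xattr "$1") || return 2
    case "$value" in
        on|dir) echo "$1: xattr=$value (directory-based)"; return 0 ;;
        *)      echo "$1: xattr=$value"; return 1 ;;
    esac
}

# Usage:
#   xattr_is_dir tank/OpenNebula
```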

@mikhmv
Author

mikhmv commented May 6, 2012

We often mv big files within ZFS. We remove some big files, but not often. We often create hard links and remove them later.
And of course many small files are created and removed.

@mikhmv
Author

mikhmv commented May 7, 2012

One thing that usually kills ZFS is 20 threads reading the same file. This is a new issue; I didn't have it two weeks ago.

@doubledr

doubledr commented Jun 5, 2012

I am actually facing this problem now. I am copying a 1.29T file to my ZFS pool. The starting speed is around 130MB/s (nice enough), but after around 10G of data the speed drops to around 6MB/s. It is a fresh block, with no deletes at all. At that point tsc_sync uses only around 0.7% CPU time, and no thread consumes a significant amount of CPU; every thread is at around 1% CPU time. It is really hard to tell where the problem is.
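
One way to pin down where such a drop starts is to write in fixed-size chunks and print the rate of each chunk. The sketch below is only illustrative: `chunked_write` is a hypothetical name, it needs GNU `dd` and GNU `date`, and because it writes zeros the numbers would be meaningless on a dataset with compression enabled.

```shell
# Write COUNT chunks of CHUNK_MB MiB each to FILE and print the throughput of every chunk,
# so a drop in write speed shows up at the chunk where it begins.
# Assumptions: GNU dd (oflag=append, conv=fsync) and GNU date (%N).
chunked_write() {
    file=$1
    chunk_mb=$2
    count=$3
    : > "$file"    # start from an empty file (truncates any existing content)
    i=0
    while [ "$i" -lt "$count" ]; do
        start=$(date +%s%N)
        # Append one chunk and flush it to disk before the timer stops.
        dd if=/dev/zero of="$file" bs=1M count="$chunk_mb" \
            oflag=append conv=notrunc,fsync 2>/dev/null
        end=$(date +%s%N)
        ms=$(( (end - start) / 1000000 ))
        [ "$ms" -gt 0 ] || ms=1    # avoid division by zero on very fast writes
        echo "chunk $i: $(( chunk_mb * 1000 / ms )) MiB/s"
        i=$((i + 1))
    done
}

# Usage (20 chunks of 1 GiB; the path is only an example):
#   chunked_write /tank/throughput.test 1024 20
```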

@jeff-dagenais

Getting this also, here's my setup:

ext4 (noatime,data=writeback,barrier=0,nobh,commit=60) on a zvol.
I build Yocto images on the ext4 filesystem. That means many, many packages building in parallel, producing many build products and updating many small log files. It gets to a point where zfs_iput_taskq/ runs one CPU at 100% and htop shows all the other CPUs (8) at almost 100% iowait.

May be relevant: I run the yocto bitbake command as: ionice -c3 nice -n 18 bitbake ...

When I cancel the build, ZFS eventually gets back on its feet after a few minutes.

config:
    NAME                                            STATE     READ WRITE CKSUM
    a                                               ONLINE       0     0     0
      mirror-0                                      ONLINE       0     0     0
        ata-WDC_WD1002FAEX-00Z3A0_WD-WCATRA589982   ONLINE       0     0     0
        ata-Hitachi_HDS721010CLA330_JP2911N0324E1V  ONLINE       0     0     0
      mirror-1                                      ONLINE       0     0     0
        ata-WDC_WD10EZEX-00RKKA0_WD-WMC1S0215048    ONLINE       0     0     0
        ata-Hitachi_HDS721010CLA330_JP2911N03214KV  ONLINE       0     0     0
    cache
      ata-WDC_WD1002FAEX-00Z3A0_WD-WCATRA589587     ONLINE       0     0     0

my package info: http://pastebin.ca/2488158

@jeff-dagenais

... if I make the yocto build straight into a regular zfs dataset, all is well. So this really looks like it's a zvol thing.

@behlendorf behlendorf removed this from the 0.7.0 milestone Oct 6, 2014
@behlendorf behlendorf added Difficulty - Medium Type: Performance Performance improvement or performance problem Component: ZVOL ZFS Volumes and removed Bug labels Oct 6, 2014
@github-ivan

zfs_iput_taskq has been using 100% of one core for the last 6.5 hours.
This is on a production server that has been up for 121 days.
Today I had to rsync some content to the server, preserving xattrs and ACLs.
After some time I noticed the high CPU usage of zfs_iput_taskq.
Currently I see about 20% write performance degradation.

I have some interesting munin graphs.
The daily graph shows disk usage starting to fall...
I rsync'd content to the server, and after 18:00 there was no change...
https://www.ivancso.net/zfs_issues/
The weekly graph shows the fall in disk usage.
The yearly graph shows linear disk usage growth after the 3rd week of April, which can't be real.

The high CPU usage of zfs_iput_taskq started at about 18:00. There is no kernel stack trace currently.
I use version 0.6.3:
ii debian-zfs 7jessie amd64 Native ZFS filesystem metapackage for Debian.
ii libzfs2 0.6.3-1jessie amd64 Native ZFS filesystem library for Linux
ii zfs-dkms 0.6.3-1jessie all Native ZFS filesystem kernel modules for Linux
ii zfsonlinux 4 all archive.zfsonlinux.org trust package
ii zfsutils 0.6.3-1jessie amd64 command-line tools to manage ZFS filesystems

I think this version dates from between September 2014 and December 2014 (version 1.2).
I don't plan to upgrade production systems because I have seen serious stack traces and stability issues starting with 0.6.3-1.2.
(I have some stack traces related to xattr=sa and to open issues.)

@behlendorf
Contributor

@github-ivan thanks for drawing my attention to this old issue. The root cause of this was addressed in the 0.6.4.1 tag, which is available through the ppa. I'd encourage you to give the latest version another shot, and please report any stability issues you encounter so they can be addressed.

@github-ivan

@behlendorf thanks for your kind reply. I don't plan to upgrade, because I have had really bad experiences with 0.6.4. I tried it on two test systems and two backup systems.
Wherever I have to use xattrs, the 0.6.4 series is mortal combat. :( Sorry, I really appreciate your work on ZFS, but my experience with the new releases, as far as xattr and rsync usage is concerned, has been really awful.
I try to follow the GitHub issues, and I see open issues similar to mine. Is there another forum where current problems are followed or discussed?
