
Very slow I/O on Kernel 3.11 #3618

Closed

angstymeat opened this issue Jul 21, 2015 · 5 comments

@angstymeat

I moved one of our production systems from ZFS 0.6.3 to 0.6.4-139_g1cd7773 about two weeks ago, and since then I've been seeing big slowdowns when running my nightly rsync backup (this time the slowdown is on the client being backed up, not on the host the data is being backed up to).

The backup starts OK, but after 30 minutes or so it slows down. Doing an ls on a ZFS directory with 150 files takes between 1 and 2 seconds, and just doing a zfs list takes upwards of 40 seconds with only 5 filesystems. Each filesystem has about 67 snapshots (we keep 36 hourly snapshots, along with some weeklies and monthlies).
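Roughly how I'm measuring this (the dataset path below is a placeholder for one of ours):

```sh
time ls /tank/collect/data       # ~1-2 seconds under load; effectively instant otherwise
time zfs list                    # upwards of 40 seconds with only 5 filesystems
zfs list -H -t snapshot | wc -l  # ~67 snapshots per filesystem across the 5 filesystems
```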

When the rsync finishes everything goes back to normal.

I don't see anything in the logs concerning CPU stalls, or any kind of dumps.

It doesn't seem to be a memory issue.

When I run a scrub, it starts out fast: for the first 2 hours it estimates it will finish in 3 hours, but then there's a massive slowdown that causes it to take 12 hours in total. I don't know whether this could be because of file fragmentation, but I don't see a lot of disk thrashing while it's happening. zpool list reports 20% fragmentation on the pool (5.44TB in size with 3.46TB allocated).
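Those figures come from zpool list; something like this shows just the relevant columns (the pool name is a placeholder):

```sh
zpool list -o name,size,allocated,fragmentation tank
```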

I'm not seeing this on any other systems running recent nightly versions of ZFS, but most of our other ZFS systems are now running kernel 3.17 or newer. Also, this system's workload consists of almost constant small writes and reads.

The OS is currently Fedora 18 with the 3.11.10-100.fc18.x86_64 kernel, 24GB of RAM and 6 CPUs under vSphere 5.1.

I can't reboot this machine too often because it is used for data collection, but I was planning on upgrading it to Fedora 22 this Thursday and installing whatever the newest version of ZFS in Git is at that time. I don't know if that will necessarily fix the problem, but I've been trying to schedule an update for the last month, even before this problem appeared.

I've been looking through the commits and recent issues to see if this is something that has been solved in the last two weeks, but nothing seems similar to what I'm seeing.

In the meantime, is there anything I can post to diagnose what is going on? I still have the stats-dumping scripts I put together while trying to debug #3303.
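(Those scripts amount to something like the sketch below; the interval and log path are arbitrary choices.)

```sh
#!/bin/sh
# Periodically dump ARC statistics so slow and normal periods can be compared
while sleep 60; do
    date >> /var/tmp/arcstats.log
    cat /proc/spl/kstat/zfs/arcstats >> /var/tmp/arcstats.log
done
```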

@angstymeat
Author

I don't know, this could be related to #3616.

@siebenmann
Contributor

If this is tied to #3616, actual disk I/O levels should be low and 'echo w > /proc/sysrq-trigger' should show processes with stack traces that look like:

```
schedule+0x37/0x90
cv_wait_common+0x105/0x140 [spl]
? wake_atomic_t_function+0x70/0x70
__cv_wait+0x15/0x20 [spl]
arc_get_data_buf.isra.21+0x3d7/0x3f0 [zfs]
[....]
```

(the arc_get_data_buf to __cv_wait bit is the important sign.)
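If you haven't used sysrq before, the capture goes roughly like this (requires root; the tail length is arbitrary):

```sh
# Enable sysrq if it isn't already, then dump stacks of all blocked
# (uninterruptible) tasks into the kernel log
echo 1 > /proc/sys/kernel/sysrq
echo w > /proc/sysrq-trigger
# Read the traces back out of the kernel ring buffer
dmesg | tail -n 200
```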

Also, 'strace -T' on slow processes will show that system calls that do I/O (stat* calls, getdents(), and so on) take anywhere from just under a second to several seconds.
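For example, something like this picks out the slow calls (attaching to rsync and the one-second threshold are just illustrative):

```sh
# -T appends the time spent in each syscall as <seconds>; the awk filter
# keeps only calls that took longer than one second
strace -T -p "$(pgrep -o rsync)" 2>&1 | awk -F'<' '$NF+0 > 1'
```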

(I believe that /proc/spl/kstat/zfs/arcstats should also show 'size' the same as 'c' or just above it, but I haven't verified this on my system.)
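A quick way to check that, assuming the usual kstat 'name type value' layout:

```sh
# Print the current ARC size alongside its target "c"
awk '$1 == "size" || $1 == "c" {print $1, $3}' /proc/spl/kstat/zfs/arcstats
```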

@angstymeat
Author

Thanks, I'll try it the next time it's under load tonight.

@angstymeat
Author

@siebenmann That's exactly what I'm seeing.

I'll compare it again after I've done the upgrade and applied the patch on Thursday.

@angstymeat
Author

Since this is the same thing as #3616, I'm going to close this and comment over there if necessary.

behlendorf pushed a commit to behlendorf/zfs that referenced this issue Sep 29, 2015
* Fix regression - "OVERLAY_MOUNTS" should have been "DO_OVERLAY_MOUNTS".
* Fix update-rc.d commands in postinst. Thanks to subzero79@GitHub.
* Fix: make sure a filesystem exists before trying to mount in mount_fs().
* Fix local variable usage.
* Fix to read_mtab():
  * Strip control characters (space - \040) from /proc/mounts GLOBALLY,
    not just the first occurrence.
  * Don't replace unprintable characters ([/-. ]) with underscores for use
    in the variable name. No need; just remove them altogether.
* Add check_boolean() to check if a user configure option is
  set ('yes', 'Yes', 'YES', or any combination thereof) OR '1'.
  Anything else is considered 'unset'. (A sketch of this behavior
  follows after the commit message.)
* Add a ZFS_POOL_IMPORT to the default config.
  * This is a semicolon-separated list of pools to import ONLY.
  * This is intended for systems which have _a lot_ of pools (from
    a SAN, for example) where it would be too many to put in the
    ZFS_POOL_EXCEPTIONS variable.
* Add a config option "ZPOOL_IMPORT_OPTS" for adding additional options
  to "zpool import".
* Add documentation and the ability to override the ZPOOL_CACHE
  variable in the config file.
* Remove "sort" from find_pools() and setup_snapshot_booting().
  It is sometimes not available, and not really necessary.

Signed-off-by: Turbo Fredriksson <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs#3618
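The check_boolean() behavior described above is easy to sketch. This is a hypothetical reconstruction under the rules stated in the commit message ('yes' in any case mix, or '1', means set), not the actual commit code:

```sh
# Hypothetical reconstruction, not the commit's code: report whether a
# user-configurable option is "set" per the rules in the commit message
check_boolean() {
    case "$1" in
        [Yy][Ee][Ss]|1) return 0 ;;  # any case mix of "yes", or "1" => set
        *)              return 1 ;;  # anything else => unset
    esac
}

# Illustrative usage; SOME_OPTION is a placeholder config variable
if check_boolean "$SOME_OPTION"; then
    echo "option is set"
fi
```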