scrubbing negatively impacts pool IO performance #944
There are tunables for this, however we haven't gone to any great lengths to tune each to the exact right value. The current settings were brought over from OpenSolaris and may not be exactly right for Linux. Any feedback you can provide on what the defaults should be would be helpful.

int zfs_top_maxinflight = 32;          /* maximum I/Os per top-level */
int zfs_resilver_delay = 2;            /* number of ticks to delay resilver */
int zfs_scrub_delay = 4;               /* number of ticks to delay scrub */
int zfs_scan_idle = 50;                /* idle window in clock ticks */
int zfs_scan_min_time_ms = 1000;       /* min millisecs to scrub per txg */
int zfs_free_min_time_ms = 1000;       /* min millisecs to free per txg */
int zfs_resilver_min_time_ms = 3000;   /* min millisecs to resilver per txg */
int zfs_no_scrub_io = B_FALSE;         /* set to disable scrub i/o */
int zfs_no_scrub_prefetch = B_FALSE;   /* set to disable scrub prefetching */
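As a rough sketch of how these can be inspected and adjusted at runtime (assuming a ZFS on Linux build that exposes them as module parameters under /sys/module/zfs/parameters/; which parameters exist varies by version):

# Show the current scrub delay (in clock ticks)
cat /sys/module/zfs/parameters/zfs_scrub_delay

# Example: throttle scrub harder while the pool is busy (hypothetical values)
echo 8 > /sys/module/zfs/parameters/zfs_scrub_delay
echo 16 > /sys/module/zfs/parameters/zfs_top_maxinflight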
Are clock ticks comparable?
Nope. This code should probably be adjusted to express the tunables in milliseconds instead of clock ticks, but as I said this is the Solaris code and it hasn't been significantly reworked yet. Every Solaris kernel I've ever seen uses an internal HZ of 100. Under Linux, HZ tends to range from 100 to 1000 and is configurable via a kernel option, so any timing needs to be scaled appropriately. Most modern kernels these days ship with HZ set to 1000, so the defaults may be off by a factor of 10 or so.
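To see how large the mismatch is on a given machine, you can check the compiled-in timer frequency and scale the tick-based delays from there (a sketch, assuming the kernel config is available under /boot or as /proc/config.gz):

# Compiled-in timer frequency of the running kernel
grep 'CONFIG_HZ=' /boot/config-$(uname -r)
# or, if the running config is exposed:
zgrep 'CONFIG_HZ=' /proc/config.gz

# At HZ=100 a 4-tick zfs_scrub_delay is 40 ms; at HZ=1000 it is only 4 ms.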
Actually, since I have NOHZ in most places, HZ=100 for me, so likely that isn't the easy/simple fix I hoped it would be.
It still may be worth tuning even if you're running with HZ=100. There was a comment about this from @byteharmony in #566 (comment). He suggested setting
This is our current optimization for small machines:

[root@nas123 ~]# cat /etc/modprobe.d/zfs.conf

Figuring out exactly how to configure the options was a bit of a pain, so above you can see exactly what to put in your system. A reboot will make it live. You can set some of these parameters live, others you can not. Even with these parameters the zfs scrub does create load, it's just not nearly as bad. BK
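The actual contents of that zfs.conf did not survive the quoting above. As a purely illustrative sketch built from the tunables listed earlier (hypothetical values, not byteharmony's actual settings), such a file might look like:

# /etc/modprobe.d/zfs.conf -- example scrub throttling, values are hypothetical
options zfs zfs_top_maxinflight=16
options zfs zfs_scrub_delay=8
options zfs zfs_scan_idle=100
options zfs zfs_scan_min_time_ms=500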
@byteharmony could you define what small means in this case? RAM, pool size, setup, anything else relevant?
The IO elevator rework from Illumos in #1775 should fix this. |
What we really need for this issue is some hard data. How much does scrubbing impact normal IO. |
I'll get info to you as I get back on those projects… busy work push right now. BK
@byteharmony @behlendorf |
This is going to be stale. The write throttle changes altered all of these dynamics. Let's close it out and start again.
When a scrub is active, it negatively affects pool IO to a much greater extent than it ideally should.
A simple test of this: drop caches, then run

find tankfoo/ -type f -ls

whilst watching zpool iostat -v, with and without a scrub active (making sure to flush any cache devices as well). A sketch of the exact commands follows.
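A minimal sketch of that procedure, assuming the pool and path from the example above (tankfoo) and root access for dropping the Linux caches:

# In one terminal, watch per-vdev IO:
zpool iostat -v tankfoo 5

# In another terminal, drop caches and walk the pool with no scrub running:
echo 3 > /proc/sys/vm/drop_caches
find tankfoo/ -type f -ls > /dev/null

# Then start a scrub and repeat the same walk:
zpool scrub tankfoo
echo 3 > /proc/sys/vm/drop_caches
find tankfoo/ -type f -ls > /dev/null

# Stop the scrub when finished:
zpool scrub -s tankfoo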