
Performance regression on ZVOL between 0.6.4.2 and 0.7.0rc4 #6278

Closed
trenb opened this issue Jun 27, 2017 · 21 comments
Labels
Type: Performance (Performance improvement or performance problem)

Comments


trenb commented Jun 27, 2017

System information

Type Version/Name
Distribution Name Virtuozzo Linux (RHEL clone)
Distribution Version 7.2
Linux Kernel 3.10.0-327.18.2.vz7.15.2
Architecture x86_64
ZFS Version 0.7.0rc4 and 0.6.4.2
SPL Version 0.7.0rc4 and 0.6.4.2

Describe the problem you're observing

When running ZoL 0.7.0rc4 and using fio to test random write performance, I get 3x fewer random write IOPS on ZVOLs than on 0.6.4.2.

Describe how to reproduce the problem

Test server: Sun x4140 mark 2, with 2x 6-core AMD 2435 CPUs, 64GB DDR2 ECC RAM, an LSI 1068e HBA and 6x 600GB SAS disks in mirrored pairs.

I ran 5 FIO tests for each test case, destroying and re-creating the pool on each run:

  1. ZFS 0.6.4.2 ZVOL 4kb block size (average 44459 write IOPS)
    https://gist.github.com/trenb/c772e68498fbaa1369d680b3f12898dd
  2. ZFS 0.7.0rc4 ZVOL 4kb block size (average 12517 write IOPS)
    https://gist.github.com/trenb/19ecd049a20190fa8714240c85eed311
  3. ZFS 0.7.0rc4 ZVOL 4kb block size with zvol_request_sync=1, from #6127 ("zvol write performance issue 0.7.0-rc4"); set as sketched below (average 23612 write IOPS)
    https://gist.github.com/trenb/8ef32b2205807d8071b4b664a4f76549
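For reference, zvol_request_sync is a zfs module parameter; a minimal sketch of how it can be toggled (assuming the standard sysfs path, and noting it can also be passed at module load time):

# runtime toggle
echo 1 > /sys/module/zfs/parameters/zvol_request_sync

# or when loading the module
modprobe zfs zvol_request_sync=1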

And for "fun" I also ran the tests using a file on ZFS.

Here's the gist for 0.6.4.2 (average 64370 write IOPS):
https://gist.github.com/trenb/87429658968fabe7b1c3aa560b75d39e

and the gist for 0.7.0rc4 (average 55677 write IOPS):
https://gist.github.com/trenb/1a4c1e2ffcf12351db890cfcbf64410e

I realize there are already a number of tickets open for this issue. I didn't want to muck up the existing tickets, but I'm happy to move this content to whichever ticket you prefer.

The FIO config I'm using for ZVOL is:

[Measure_W_4KB_QD31]
ioengine=libaio
direct=0
rw=randwrite
norandommap
randrepeat=0
iodepth=31
size=25%
numjobs=1
bs=4k
overwrite=1
filename=/dev/tank/zvol1
runtime=5m
time_based
group_reporting
stonewall

and the fio config for testing the ZFS file system:

[Measure_W_4KB_QD31]
ioengine=libaio
direct=0
rw=randwrite
norandommap
randrepeat=0
iodepth=31
size=25%
numjobs=1
bs=4k
overwrite=1
filename=/tank/500G
runtime=5m
size=2g
time_based
group_reporting
stonewall

zpool creation command:

zpool create -f -o ashift=9 -o cachefile=none tank mirror sda sdb mirror sdc sdd mirror sde sdf

zvol creation command:

zfs create -V 500G -b 4k tank/zvol1
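Tying those together, one iteration of a test run looks roughly like this (a sketch; zvol-randwrite.fio is a placeholder name for the fio job file above, and each case was repeated 5 times with the pool destroyed and re-created in between):

zpool create -f -o ashift=9 -o cachefile=none tank mirror sda sdb mirror sdc sdd mirror sde sdf
zfs create -V 500G -b 4k tank/zvol1
fio zvol-randwrite.fio      # the [Measure_W_4KB_QD31] job, filename=/dev/tank/zvol1
zpool destroy tank          # wipe state before the next run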

I am able to reproduce these results reliably on my test system here. If there is anything you would like me to test please let me know.

@behlendorf added the Type: Performance label Jun 28, 2017
@dinatale2
Contributor

@trenb I opened PR #6294 in hopes that it would improve performance on ZVOLs. Would you mind running your tests again with the patch included and reporting back some results? I'm also going to try and run the tests you posted in here to see what kind of results I get.


trenb commented Jul 3, 2017

@dinatale2 - woot! Awesome! It's a holiday today where I am, so I'll get this built tomorrow. Is this against the rc4 revision, or do I need to pull master for this PR?

Thanks!

@dinatale2
Contributor

@trenb - At the bottom of the PR, there is a link to my fork of ZFS. You can add it as a remote to your git repository and cherry pick the patch on top of rc4 if that is the revision you're interested in testing. Then you can directly measure the performance gain/loss with my patch and minimize noise from other patches that have landed in master.
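Roughly, that workflow looks like this (a sketch; the remote name, fork URL and commit hash are placeholders, the real ones are at the bottom of PR #6294):

git remote add dinatale2 https://github.com/<fork-owner>/zfs.git
git fetch dinatale2
git checkout -b rc4-test zfs-0.7.0-rc4
git cherry-pick <commit-from-PR-6294>
# then rebuild the RPMs from that tree as usual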

@richardelling
Contributor

FYI, on Linux fio, using libaio with direct=0 might be misleading when comparing a filesystem against volumes. See the fio HOWTO and make sure the test matches your workload.


trenb commented Jul 4, 2017

Thanks for that @richardelling - The file system tests were just for interest, not meant to be directly compared against the zvol results.


trenb commented Jul 5, 2017

@dinatale2 - Sorry for the delay in getting these numbers to you. I did 4 tests, with 5 runs each. Same hardware as before, same OS.

For SPL I used tag: spl-0.7.0-rc4-5-g7a35f2b
For ZFS I used tag: zfs-0.7.0-rc4-92-g688c94c

I built 2 versions of RPMs for ZFS, with and without #6294

  1. fio - zfs-0.7.0-rc4-92-g688c94c WITHOUT PR6294 - 4kb zvol - (Average Write IOPS: 12615)
    https://gist.github.com/trenb/e3080954570d998b6270ab46d2dd68c9
  2. fio - zfs-0.7.0-rc4-92-g688c94c WITHOUT PR6294 - 4kb zvol WITH zvol_request_sync=1 - (Average Write IOPS: 24154)
    https://gist.github.com/trenb/40f45ed609b923377fc859835c4f8c7f
  3. fio - zfs-0.7.0-rc4-92-g688c94c WITH PR6294 - 4kb zvol (Average Write IOPS: 23208)
    https://gist.github.com/trenb/d5825d84a816caab94426a01ea0fc139
  4. fio - zfs-0.7.0-rc4-92-g688c94c WITH PR6294 - 4kb zvol WITH zvol_request_sync=1 (Average Write IOPS: 22830)
    https://gist.github.com/trenb/c0f908ab1a86be2c8d5a8229a9d47bbf

It seems that your patch and setting zvol_request_sync=1 both perform about the same, and your patch combined with zvol_request_sync=1 doesn't seem to make any additional difference.

Let me know if you have any questions or require additional information, and thanks again for your help and work!


kernelOfTruth commented Jul 5, 2017

So essentially, combining it with #4804, the IOPS would be at least on par with 0.6.4*?
(not production safe!)

It would be interesting to know whether that level of lock contention (or those cases) was there with 0.6.4*, and whether it even applies to zvol usage cases (i.e. whether it got worse on zvols or similar) ...


trenb commented Jul 5, 2017

@kernelOfTruth - Do you have a pull request or patch that I can apply to test this on my setup? This is purely a test system so I can beat up on it as much as is needed.

@kernelOfTruth
Contributor

@trenb the commit is referenced in that issue: tonyhutter@7481237


trenb commented Jul 5, 2017

@kernelOfTruth - Hmm. I cannot get that to apply to the master I pulled today. I can go and do it by hand, but maybe you have a better idea?

@kernelOfTruth
Contributor

@trenb it applied rather flawlessly with git cherry-pick. I've opened a PR to let the buildbots run tests on it; I'm curious how it'll end up with all these changes: #6316


trenb commented Jul 6, 2017

@kernelOfTruth - Thank you for the PR, that did the trick! I'll have updated results tomorrow morning.


trenb commented Jul 6, 2017

@kernelOfTruth I was watching iostat while running fio with this new PR and noticed that every once in a while all disks go to 100% utilization and IOPS drop by about 3x; when IOPS return to previous levels, for some reason one mirror stays at 100% while doing about 3x fewer IOPS. zpool iostat -v 1 shows the same thing. Here's a gist: https://gist.github.com/trenb/2e7134a3ddc74e16b86c575f180cd41d

Also, I'm ending up with pretty much the same IOPS as my previous tests, mid 20k's. Things start great, but slowly spin down.

I'll dig into this more tomorrow :)


trenb commented Jul 6, 2017

Here's the gist of my testing with PR#6316:
https://gist.github.com/trenb/b8fe272abc0254974e89da892ff438f9

Average write IOPS is 24477, which is more or less the same.

When I was watching the run last night, I was seeing strange behaviour on the mirrored pairs. For some reason one pair of disks was going much slower for periods of time, even though all the disks are identical and running the same firmware.

What information can I provide to help with this? Would iostat and zpool iostat -r/-w help? How much data would be helpful?
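Something like the following is what I had in mind for capturing it during a run (a sketch; the log file names are placeholders, and -r/-w are the request-size and latency histograms available in 0.7):

zpool iostat -v 1 > zpool-iostat-v.log &
zpool iostat -r 1 > zpool-iostat-r.log &
zpool iostat -w 1 > zpool-iostat-w.log &
iostat -xm 1      > iostat-x.log &
fio zvol-randwrite.fio
kill %1 %2 %3 %4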

Thanks!


redtex commented Jul 22, 2017

@trenb It's the same thing I've observed since the 0.6.5 release; see #3871


trenb commented Jul 22, 2017

@redtex Thank you! Yeah, I see the same unbalanced IO between the 3 pairs of mirrored disks on my hardware. No SLOG. It's easy to run into with fio over a 5-minute run. I'll see IO come to a crawl several times as well, usually around 3 minutes into the 5-minute run.


ironMann commented Aug 6, 2017

@trenb I'm curious whether setting the zvol_threads parameter to something smaller, like 4, would yield a similar benefit on master or the 0.7 branch. The parameter must be provided when loading the zfs module.
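For reference, setting that at load time looks something like this (a sketch; 4 is just the value suggested above, and the pool should be exported and the module unloaded first):

# one-off
modprobe zfs zvol_threads=4

# or persistently, picked up on the next module load
echo "options zfs zvol_threads=4" >> /etc/modprobe.d/zfs.conf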


trenb commented Aug 7, 2017

@ironMann I will give that a shot this week and update with the results. Thanks!


trenb commented Sep 14, 2017

@ironMann Sorry for the long delay in getting back to you. I'm currently testing 0.7.1 on my platform, and so far I'm not seeing any difference between the default of 32 for zvol_threads and 4. I'm testing with 6, 8, 10 and 12 to see what those numbers look like. I'll update once done, but so far this isn't looking promising: I'm ending up with about 12k write IOPS on a 5-minute average, and I'm still seeing fio drop down to hundreds of IOPS while zpool iostat shows thousands of writes hitting the pool.


trenb commented Sep 14, 2017

Just to quickly show, here's the output from fio while it's running:

Jobs: 1 (f=1): [w(1)] [86.0% done] [0KB/396KB/0KB /s] [0/99/0 iops] [eta 00m:42s]

and here's the zpool iostat 1 running while I'm seeing the above:

# zpool iostat 1
              capacity     operations     bandwidth
pool        alloc   free   read  write   read  write
----------  -----  -----  -----  -----  -----  -----
tank        10.4G  1.62T      0  4.61K  11.2K   112M
tank        10.4G  1.62T      0  2.93K      0  11.7M
tank        10.4G  1.62T      0  2.85K      0  11.4M
tank        10.4G  1.62T      0  2.87K      0  11.5M
tank        10.4G  1.62T      0  2.95K      0  11.8M
tank        10.4G  1.62T      0  3.20K      0  19.8M
tank        10.4G  1.62T      0  2.99K      0  12.3M
tank        10.4G  1.62T      0  2.94K      0  12.3M
tank        10.4G  1.62T      0  3.02K      0  12.6M
tank        10.4G  1.62T      0  2.90K      0  12.0M
tank        10.4G  1.62T      0  3.01K      0  12.6M
tank        10.4G  1.62T      0  2.91K      0  12.1M

I don't understand the disparity between what fio is reporting and what zpool iostat is reporting.


trenb commented Sep 15, 2017

So I've managed to work around the variable zvol performance by switching to O_DIRECT. I don't get as many IOPS this way, but I also don't get the wildly variable performance. I think this is good enough for me, especially considering the age of the hardware.
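In fio terms that switch is just the direct flag; a minimal sketch of the relevant change to the zvol job file above:

[Measure_W_4KB_QD31]
ioengine=libaio
; direct=1 instead of direct=0: O_DIRECT bypasses the page cache
direct=1
rw=randwrite
iodepth=31
bs=4k
filename=/dev/tank/zvol1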

Thank you everyone for your help on this. I've got things going well enough with 0.7.1 that I'm content.

@trenb trenb closed this as completed Sep 15, 2017