Performance regression on ZVOL between 0.6.4.2 and 0.7.0rc4 #6278
@dinatale2 - woot! Awesome! It's a holiday today where I am, so I'll get this built tomorrow. Is this against the rc4 revision, or do I need to pull master for this PR? Thanks!
@trenb - At the bottom of the PR there is a link to my fork of ZFS. You can add it as a remote in your git repository and cherry-pick the patch on top of rc4 if that is the revision you're interested in testing. That way you can directly measure the performance gain/loss from my patch and minimize noise from other patches that have landed in master.
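The cherry-pick workflow described above might be sketched roughly like this; `<fork-url>` and `<patch-commit>` are placeholders standing in for the fork and commit actually linked in the PR, not values from this thread:

```shell
# Rough sketch of the workflow above; <fork-url> and <patch-commit>
# are placeholders for the fork and commit linked in the PR.
git clone https://github.com/zfsonlinux/zfs.git
cd zfs
git checkout zfs-0.7.0-rc4              # the rc4 revision under test
git remote add patch-fork <fork-url>    # the fork linked in the PR
git fetch patch-fork
git cherry-pick <patch-commit>          # apply just this one patch on top of rc4
```

Building from this branch keeps the comparison against stock rc4 down to the single patch under test.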
FYI, on Linux, fio with libaio and direct=0 can be misleading when comparing filesystems against volumes. See the fio HOWTO and make sure the test matches your workload.
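As a minimal illustration of that caveat, here is a sketch of a fio job with direct=1 so libaio bypasses the page cache; the job name, target path, and sizes are made-up examples, not the reporter's actual config:

```shell
# Write a minimal fio job file; all values here are illustrative.
cat > randwrite.fio <<'EOF'
[randwrite-test]
; direct=1 bypasses the page cache so libaio results on a zvol
; and on a filesystem are more comparable
ioengine=libaio
direct=1
rw=randwrite
bs=4k
iodepth=32
numjobs=4
runtime=300
time_based=1
; placeholder target; point this at the zvol under test
filename=/dev/zvol/tank/vol0
EOF
grep -c '^direct=1' randwrite.fio   # prints 1
```

With direct=0 a filesystem target can absorb writes into the page cache while a zvol sees very different behaviour, so direct=1 levels the comparison.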
Thanks for that @richardelling - the filesystem tests were just for interest, not meant to be directly compared against the zvol results.
@dinatale2 - Sorry for the delay in getting these numbers to you. I ran 4 test cases, with 5 runs each. Same hardware and OS as before. For SPL I used tag spl-0.7.0-rc4-5-g7a35f2b, and I built two sets of ZFS RPMs, with and without #6294.
It seems that your patch and setting zvol_request_sync=1 both perform about the same, and combining your patch with zvol_request_sync=1 doesn't seem to make a difference. Let me know if you have any questions or need additional information, and thanks again for your help and work!
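For reference, one way the zvol_request_sync=1 setting could be made persistent is a modprobe options fragment; this is a sketch that assumes a ZoL 0.7.x build where the parameter exists (at runtime the same knob lives under /sys/module/zfs/parameters/zvol_request_sync):

```
# /etc/modprobe.d/zfs.conf
# Takes effect the next time the zfs module loads.
options zfs zvol_request_sync=1
```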
So essentially, combined with #4804, the IOPS would be at least on par with 0.6.4*? It would be interesting to know whether that level of lock contention (or those cases) existed with 0.6.4*, and whether it even applies to zvol use cases (i.e. whether it got worse on zvols or similar)...
@kernelOfTruth - Do you have a pull request or patch that I can apply to test this on my setup? This is purely a test system, so I can beat up on it as much as is needed.
@trenb the commit is referenced in that issue: tonyhutter@7481237
@kernelOfTruth - Hmm. I cannot get that to apply to the master I pulled today. I can go and do it by hand, but maybe you have a better idea?
@kernelOfTruth - Thank you for the PR, that did the trick! I'll have updated results tomorrow morning.
@kernelOfTruth I was paying attention to iostat while running fio on this new PR and noticed that every once in a while all disks go to 100% and IOPS drops about 3x, but when IOPS returns to previous levels, for some reason one mirror is still at 100% and doing 3x fewer IOPS. zpool iostat -v 1 shows the same thing. Here's a gist: https://gist.github.com/trenb/2e7134a3ddc74e16b86c575f180cd41d
Also, I'm ending up with pretty much the same IOPS as in my previous tests, mid-20Ks. Things start great, but slowly spin down. I'll dig into this more tomorrow :)
Here's the gist of my testing with PR #6316: the average write IOPS is 24477, which is more or less the same. While watching the run last night, I saw strange behaviour on the mirrored pairs: for some reason one pair of disks went much slower for periods of time, even though all disks are identical and on the same firmware. What information can I provide to help with this? Would iostat and zpool iostat -r/-w help? How much data would be useful? Thanks!
@redtex Thank you! Yeah, I see the same unbalanced IO between the 3 pairs of mirrored disks on hardware. No SLOG. It's easy to reproduce with fio over a 5-minute run; I'll also see IO come to a crawl several times, usually around 3 minutes into the run.
@trenb I'm curious if setting the zvol_threads module parameter to a lower value makes a difference.
@ironMann I will give that a shot this week and update with the results. Thanks!
@ironMann Sorry for the long delay in getting back to you. I'm currently testing 0.7.1 on my platform, and so far I'm not seeing any difference between the default of 32 for zvol_threads and 4. I'm testing with 6, 8, 10 and 12 to see what those numbers look like. I'll update once done, but so far this isn't looking promising: I'm ending up with about 12k write IOPS on a 5-minute average, and I'm still seeing fio drop down to hundreds of IOPS while zpool iostat shows thousands of writes hitting the pool.
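The parameter sweep described above might be scripted roughly as follows. This is only a sketch: "tank" and randwrite.fio are placeholder names, and the module is reloaded per value on the assumption that zvol_threads is read when the zfs module loads:

```shell
# Sketch of the zvol_threads sweep; "tank" and randwrite.fio are
# placeholders for the actual pool name and fio job file.
for n in 4 6 8 10 12 32; do
    zpool export tank
    modprobe -r zfs
    modprobe zfs zvol_threads=$n     # assumed to be read at module load
    zpool import tank
    fio randwrite.fio --output="fio-zvol_threads-$n.log"
done
```

Keeping one log per value makes it easy to compare the 5-run averages afterwards.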
Just to quickly show, here's the output from fio while it's running:
and here's the zpool iostat 1 running while I'm seeing the above:
I don't understand the disparity between what fio is reporting and what zpool iostat is reporting. |
So I've managed to work around the variable zvol performance by switching to O_DIRECT. I don't get as many IOPS this way, but I also don't get the wildly variable performance. I think this is good enough for me, especially considering the age of the hardware. Thank you everyone for your help on this; I've got things going well enough with 0.7.1 that I'm content.
System information
Describe the problem you're observing
When running ZoL 0.7.0rc4 and using fio to test random write performance, I get 3x fewer random write IOPS on ZVOLs than on 0.6.4.2.
Describe how to reproduce the problem
Test server: Sun x4140 mark 2, with 2x 6-core AMD 2435 CPUs, 64GB DDR2 ECC RAM, an LSI 1068e HBA, and 6x 600GB SAS disks in mirrored pairs.
I ran 5 FIO tests for each test case, destroying and re-creating the pool on each run:
https://gist.github.com/trenb/c772e68498fbaa1369d680b3f12898dd
https://gist.github.com/trenb/19ecd049a20190fa8714240c85eed311
https://gist.github.com/trenb/8ef32b2205807d8071b4b664a4f76549
And for "fun" I also ran the tests using a file on ZFS.
Here's the gist for 0.6.4.2 (average 64370 write IOPS):
https://gist.github.com/trenb/87429658968fabe7b1c3aa560b75d39e
and the gist for 0.7.0rc4 (average 55677 write IOPS):
https://gist.github.com/trenb/1a4c1e2ffcf12351db890cfcbf64410e
I realize there are already a number of tickets open for this issue. I didn't want to muck up existing tickets, but if you prefer, I can move this content to whichever ticket you choose.
The FIO config I'm using for ZVOL is:
and the fio for testing zfs file system
zpool creation command:
zvol creation command:
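The creation commands themselves aren't shown above; for 6 disks in mirrored pairs, the general shape might look like the following, where the pool name, device names, volume size, and volblocksize are all placeholders rather than the reporter's actual values:

```shell
# Placeholder names and sizes throughout; adjust to the actual setup.
zpool create tank \
    mirror sda sdb \
    mirror sdc sdd \
    mirror sde sdf
zfs create -V 100G -o volblocksize=8k tank/vol0
```

The resulting zvol would then appear as /dev/zvol/tank/vol0 for fio to target directly.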
I am able to reproduce these results reliably on my test system here. If there is anything you would like me to test please let me know.