Prefetch doesn't kick in when reading a file sequentially through samba #9712

Closed
JulietDeltaGolf opened this issue Dec 11, 2019 · 7 comments
Labels: Status: Stale (No recent activity for issue), Type: Performance (Performance improvement or performance problem)

Comments

@JulietDeltaGolf

System information

Type                  Version/Name
Distribution Name     Ubuntu
Distribution Version  18.04.3
Linux Kernel          5.0.0-31-generic
Architecture          x86_64
ZFS Version           0.8.1-1ubuntu11
SPL Version           0.8.1-1ubuntu11

Describe the problem you're observing

The ZFS prefetcher doesn't seem to detect a sequential read when it is made from a Windows client through Samba, resulting in limited performance of ~300-400 MiB/s, while a simple dd can achieve 2.6 GiB/s locally.

My first thought was to blame Samba and move on, but reading a file that is already cached gives me NIC-limited traffic of ~1.1 GiB/s.

I also see a lot of increments in the zfetchstats misses and max_streams counters when reading the file over the network, while I see none during a local read.
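
The prefetcher counters can be watched while the read is running; the kstat path below is the standard ZFS-on-Linux location and the 2-second interval is arbitrary:

    # Sample the ZFS prefetcher statistics every 2 seconds during the read
    watch -n 2 cat /proc/spl/kstat/zfs/zfetchstats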

Describe how to reproduce the problem

Read a file sequentially from a Windows client through Samba.
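
For the local baseline, a plain sequential read with dd is enough; the file path and block size below are only illustrative:

    # Local sequential read of a large file on the pool (path and block size are examples)
    dd if=/pool/bigfile of=/dev/null bs=1M status=progress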

Include any warning/errors/backtraces from the system logs

zpool iostat -q 1 (during network read):

              capacity     operations     bandwidth    syncq_read    syncq_write   asyncq_read  asyncq_write   scrubq_read   trimq_write
pool        alloc   free   read  write   read  write   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ
----------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
pool      96.0T  1.83P  1.84K      0   236M      0      0     42      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  3.30K      0   431M      0      0     34      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.62K      0   339M      0      0      0      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  1.81K      0   231M      0      0     51      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.79K      0   394M      0      0     42      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.45K      0   323M      0      0     50      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.73K      0   353M      0      0      0      0      0      0     55      0      0      0      0      0      0
pool      96.0T  1.83P  2.26K      0   291M      0      0     48      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.37K      0   311M      0      0     49      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.63K      0   342M      0      0      0      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.34K      0   300M      0      0     17      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.64K      0   347M      0      0     13      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  2.73K      0   375M      0      0     13      0      0      0      0      0      0      0      0      0      0
pool      96.0T  1.83P  3.02K      0   387M      0      0     18      0      0      0      0      0      0      0      0      0      0

zpool iostat -q 1 (during local read):

              capacity     operations     bandwidth    syncq_read    syncq_write   asyncq_read  asyncq_write   scrubq_read   trimq_write
pool        alloc   free   read  write   read  write   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ   pend  activ
----------  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----  -----
pool      96.0T  1.83P  4.41K  1.14K   637M   305M      0      0      0      0    145    107      0      0      0      0      0      0
pool      96.0T  1.83P  6.87K      0  2.30G      0      0      0      0      0    269     92      0      0      0      0      0      0
pool      96.0T  1.83P  6.89K      0  2.33G      0      0      0      0      0     65    131      0      0      0      0      0      0
pool      96.0T  1.83P  6.57K      0  2.25G      0      0      0      0      0     63    125      0      0      0      0      0      0
pool      96.0T  1.83P  6.93K      0  2.36G      0      0      0      0      0    139    122      0      0      0      0      0      0
pool      96.0T  1.83P  6.93K      0  2.38G      0      0      0      0      0    211    129      0      0      0      0      0      0
pool      96.0T  1.83P  6.77K      0  2.28G      0      0      0      0      0    132    121      0      0      0      0      0      0
pool      96.0T  1.83P  7.00K      0  2.31G      0      0      0      0      0    107    110      0      0      0      0      0      0
pool      96.0T  1.83P  6.51K      0  2.27G      0      0      0      0      0    217    108      0      0      0      0      0      0
pool      96.0T  1.83P  6.89K      0  2.28G      0      0      0      0      0    110    117      0      0      0      0      0      0
pool      96.0T  1.83P  6.56K      0  2.26G      0      0      0      0      0    209     97      0      0      0      0      0      0
pool      96.0T  1.83P  6.76K      0  2.34G      0      0      0      0      0     87    123      0      0      0      0      0      0
pool      96.0T  1.83P  6.84K      0  2.29G      0      0      0      0      0    113    115      0      0      0      0      0      0
pool      96.0T  1.83P  6.73K      0  2.32G      0      0      0      0      0    228    110      0      0      0      0      0      0
pool      96.0T  1.83P  6.74K      0  2.34G      0      0      0      0      0    126    105      0      0      0      0      0      0
@JulietDeltaGolf
Author

JulietDeltaGolf commented Dec 11, 2019

OK... so this doesn't make sense to me and could just be an artefact, but while I was trying to understand the difference in behavior between Samba and dd, I noticed that attaching strace to the smbd process gave me better performance.

zfetchstats confirms the effect, with almost no further increments to the misses and max_streams counters.

I've been able to reproduce the effect every single time, so I thought it was worth mentioning.
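
For reference, the attach can be done along these lines (the exact flags I used may have differed slightly; the trace output is simply discarded to keep overhead low):

    # Attach strace to the running smbd processes and throw the trace away
    strace -f -p "$(pidof smbd)" -o /dev/null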

edit: Added two flame graphs to illustrate the difference in behavior.
ZFS.zip

@beren12
Contributor

beren12 commented Dec 11, 2019

Performance is only better while you have strace actively attached, and it goes back to normal when you detach it?

@JulietDeltaGolf
Author

@beren12 Yes.

@behlendorf behlendorf added the Type: Performance Performance improvement or performance problem label Dec 12, 2019
@h1z1

h1z1 commented Dec 13, 2019

Not sure how this is a ZFS bug, given those are entirely different I/O patterns; Samba is using sync for everything. The difference possibly comes from Samba's own read-ahead settings (aio read size, aio max threads, etc.). It's also not clear whether and how you purged caches between your local and remote tests.
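
For example, clearing both caches between runs would rule that out (the pool name here is taken from the iostat output above):

    # Drop the Linux page cache, then export/import the pool so its ARC contents are evicted
    echo 3 > /proc/sys/vm/drop_caches
    zpool export pool && zpool import pool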

@JulietDeltaGolf
Author

JulietDeltaGolf commented Feb 3, 2020

@h1z1 Correct: by setting aio read size = 0, Samba no longer delegates I/Os to a thread pool and instead executes them synchronously in the process handling the connection. I then do get the expected prefetch behavior.

However, this would degrade performance when reading a block that is not in the ARC, and since ZFS prefetch works at the file level, it shouldn't matter which process requested the blocks as long as the requests are sequential. They are; I just checked the ZFS read history.
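
For reference, the change amounts to the following smb.conf excerpt (shown in [global] here, but it can also be set per share), followed by reloading or restarting smbd:

    # smb.conf excerpt: make smbd issue reads synchronously instead of through its AIO thread pool
    [global]
        aio read size = 0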

FWIW, reverting to the stock kernel, 4.15.0-74-generic, seems to solve the problem without the need to cripple Samba.

One of the obvious changes is the use of blk-mq for SATA/SAS drives, but I haven't yet had a chance to run the HWE kernel with blk-mq disabled. Is it worth trying, or could it not be related at all?
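
Whether a given disk is actually on blk-mq can be checked from sysfs (sda below is just an example device name):

    # blk-mq devices expose an mq/ directory and list the multi-queue schedulers
    ls /sys/block/sda/mq
    cat /sys/block/sda/queue/scheduler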

@stale

stale bot commented Feb 2, 2021

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Feb 2, 2021
@stale stale bot closed this as completed May 3, 2021
@amotin
Member

amotin commented May 3, 2021

PR #11652 may help if the problem arises only with aio enabled.
