dbuf_hold_impl() cleanup to improve cached read performance #9351
Conversation
Codecov Report
@@ Coverage Diff @@
## master #9351 +/- ##
==========================================
+ Coverage 69.62% 71.25% +1.63%
==========================================
Files 374 372 -2
Lines 119740 119879 +139
==========================================
+ Hits 83363 85415 +2052
+ Misses 36377 34464 -1913
Continue to review full report at Codecov.
Looks like there are 2 new test failures which appear to be related to this change:
Test: /usr/share/zfs/zfs-tests/tests/functional/arc/dbufstats_001_pos (run as root) [00:00] [FAIL]
00:40:43.31 ASSERTION: dbufstats produces correct statistics
00:40:43.35 SUCCESS: file_write -o create -f /mnt/testdir/file -b 1048576 -c 20 -d R
00:40:43.49 SUCCESS: zpool sync
00:40:43.76 SUCCESS: eval cat /proc/spl/kstat/zfs/dbufs > /mnt/dbufs.out.DD4gCe
00:40:43.76 SUCCESS: eval cat /proc/spl/kstat/zfs/dbufstats > /mnt/dbufstats.out.37w2eH
00:40:43.83 NOTE: Checking if 4126 is within +/-15 of 193 (diff: 3933)
00:40:43.83 Stat cache_level_0 exceeded tolerance
Test: /usr/share/zfs/zfs-tests/tests/functional/arc/dbufstats_002_pos (run as root) [00:00] [FAIL]
00:40:43.88 ASSERTION: dbufs move from mru to mfu list
00:40:43.89 SUCCESS: file_write -o create -f /mnt/testdir/file -b 1048576 -c 1 -d R
00:40:43.92 SUCCESS: zpool sync
00:40:43.92 NOTE: Object ID for /mnt/testdir/file is 3
00:40:44.19 SUCCESS: eval cat /proc/spl/kstat/zfs/dbufs > /mnt/dbufs.out.aYZTIA
00:40:44.33 NOTE: dbuf count is 9, mru count is 9, mfu count is 0
00:40:44.33 SUCCESS: eval cat /mnt/testdir/file > /dev/null
00:40:44.59 SUCCESS: eval cat /proc/spl/kstat/zfs/dbufs > /mnt/dbufs.out.aYZTIA
00:40:44.73 NOTE: dbuf count is 9, mru count is 9, mfu count is 0
00:40:44.73 Compared mfu count should be not equal: 0 == 0
Hmm. I'll look into those failures. Thanks!
Thanks for revisiting this. Since the default kernel stack size has doubled since this change was originally made, we should now have enough stack space to avoid needing to do this.
I made the suggested changes and reran the tests. The performance gain is smaller, ~3%.
Looks good. I just saw one tiny nit.
Currently every dbuf_hold_impl() incurs kmem_alloc() and kmem_free(),
which can be costly for cached read performance.

This change reverts the dbuf_hold_impl() fix stack commit, i.e.
fc5bb51, to eliminate the extra kmem_alloc() and kmem_free()
operations and improve cached read performance. With the change, each
dbuf_hold_impl() frame uses 40 bytes more, a total of 800 bytes for
20 recursive levels. Linux kernel stack sizes are 8K and 16K for
32-bit and 64-bit kernels, respectively, so the stack overrun risk
is limited.

Sample stack output comparison with a 50 PB file and recordsize=512

Current code:
11) 2240 64 arc_alloc_buf+0x4a/0xd0 [zfs]
12) 2176 264 dbuf_read_impl.constprop.16+0x2e3/0x7f0 [zfs]
13) 1912 120 dbuf_read+0xe5/0x520 [zfs]
14) 1792 56 dbuf_hold_impl_arg+0x572/0x630 [zfs]
15) 1736 64 dbuf_hold_impl_arg+0x508/0x630 [zfs]
16) 1672 64 dbuf_hold_impl_arg+0x508/0x630 [zfs]
17) 1608 40 dbuf_hold_impl+0x23/0x40 [zfs]
18) 1568 40 dbuf_hold_level+0x32/0x60 [zfs]
19) 1528 16 dbuf_hold+0x16/0x20 [zfs]

dbuf_hold_impl() cleanup:
11) 2320 64 arc_alloc_buf+0x4a/0xd0 [zfs]
12) 2256 264 dbuf_read_impl.constprop.17+0x2e3/0x7f0 [zfs]
13) 1992 120 dbuf_read+0xe5/0x520 [zfs]
14) 1872 96 dbuf_hold_impl+0x50f/0x5e0 [zfs]
15) 1776 104 dbuf_hold_impl+0x4df/0x5e0 [zfs]
16) 1672 104 dbuf_hold_impl+0x4df/0x5e0 [zfs]
17) 1568 40 dbuf_hold_level+0x32/0x60 [zfs]
18) 1528 16 dbuf_hold+0x16/0x20 [zfs]

Performance observations on 8K recordsize filesystem:
- 8/128/1024K at 1-128 sequential cached read, ~3% improvement

Testing done on Ubuntu 18.04 with 4.15 kernel, 8vCPUs and SSD storage
on VMware ESX.

Reviewed-by: Matt Ahrens <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Signed-off-by: Tony Nguyen <[email protected]>
Closes openzfs#9351
Signed-off-by: Bryant G. Ly <[email protected]>

Conflicts:
	module/zfs/dbuf.c
Motivation and Context
Improve cached read performance
Description
Currently every dbuf_hold_impl() incurs kmem_alloc() and kmem_free(), which can be costly for cached read performance.
This change reverts the dbuf_hold_impl() fix stack commit, i.e. fc5bb51, to eliminate the extra kmem_alloc() and kmem_free() operations and improve cached read performance. With the change, each dbuf_hold_impl() frame uses 40 bytes more, a total of 800 bytes for 20 recursive levels. Linux kernel stack sizes are 8K and 16K for 32-bit and 64-bit kernels, respectively, so the stack overrun risk is limited.
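For illustration, here is a compilable userspace sketch of the trade-off described above. It is not the OpenZFS code: hold_arg_t, MAX_HOLD_DEPTH, and both functions are hypothetical stand-ins, and calloc()/free() stand in for the kernel's kmem_alloc()/kmem_free().

```c
#include <stdlib.h>

#define	MAX_HOLD_DEPTH	20	/* deepest recursion provisioned for */

typedef struct hold_arg {
	int	level;		/* placeholder for the per-level hold state */
} hold_arg_t;

/*
 * Before: one heap allocation per hold, sized for the maximum depth,
 * so the recursion state stays off the kernel stack.
 */
static int
hold_with_heap_args(int level)
{
	hold_arg_t *dh = calloc(MAX_HOLD_DEPTH, sizeof (hold_arg_t));
	if (dh == NULL)
		return (-1);
	for (int i = level; i >= 0; i--)
		dh[i].level = i;	/* walk parents iteratively via dh[] */
	free(dh);			/* the alloc/free pair this PR removes */
	return (0);
}

/*
 * After: arguments live in each stack frame (~40 extra bytes per frame,
 * ~800 bytes at 20 levels) and the parent is held by direct recursion.
 */
static int
hold_recursive(int level)
{
	hold_arg_t arg = { .level = level };	/* stack-resident state */
	if (arg.level > 0)
		return (hold_recursive(arg.level - 1));	/* hold parent first */
	return (0);
}

int
main(void)
{
	return (hold_with_heap_args(3) | hold_recursive(3));
}
```

The design point is that an allocator round trip on every hold costs more than a few dozen extra stack bytes per recursion level, especially on cache-hit reads where no I/O amortizes the overhead.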
Sample stack output comparisons with a 50 PB file and recordsize=512

Current code:
11) 2240 64 arc_alloc_buf+0x4a/0xd0 [zfs]
12) 2176 264 dbuf_read_impl.constprop.16+0x2e3/0x7f0 [zfs]
13) 1912 120 dbuf_read+0xe5/0x520 [zfs]
14) 1792 56 dbuf_hold_impl_arg+0x572/0x630 [zfs]
15) 1736 64 dbuf_hold_impl_arg+0x508/0x630 [zfs]
16) 1672 64 dbuf_hold_impl_arg+0x508/0x630 [zfs]
17) 1608 40 dbuf_hold_impl+0x23/0x40 [zfs]
18) 1568 40 dbuf_hold_level+0x32/0x60 [zfs]
19) 1528 16 dbuf_hold+0x16/0x20 [zfs]

dbuf_hold_impl() cleanup:
11) 2320 64 arc_alloc_buf+0x4a/0xd0 [zfs]
12) 2256 264 dbuf_read_impl.constprop.17+0x2e3/0x7f0 [zfs]
13) 1992 120 dbuf_read+0xe5/0x520 [zfs]
14) 1872 96 dbuf_hold_impl+0x50f/0x5e0 [zfs]
15) 1776 104 dbuf_hold_impl+0x4df/0x5e0 [zfs]
16) 1672 104 dbuf_hold_impl+0x4df/0x5e0 [zfs]
17) 1568 40 dbuf_hold_level+0x32/0x60 [zfs]
18) 1528 16 dbuf_hold+0x16/0x20 [zfs]
How Has This Been Tested?
Performance observations on 8K recordsize filesystem:
- 8/128/1024K at 1-128 sequential cached read, ~3% improvement
Testing done on Ubuntu 18.04 with 4.15 kernel, 8vCPUs and SSD storage on
VMware ESX.
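The PR does not include its benchmark source, so purely as an illustration, here is a minimal single-threaded sketch of a cached sequential-read timing loop. The file path, block size, and the assumption that the file was pre-read into the ARC are all placeholders; the reported numbers used 8K/128K/1024K block sizes at 1-128 concurrent streams.

```c
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int
main(int argc, char **argv)
{
	/* Hypothetical defaults; pass a path and block size to override. */
	const char *path = argc > 1 ? argv[1] : "/mnt/testdir/file";
	size_t bs = argc > 2 ? strtoul(argv[2], NULL, 0) : 8192;
	char *buf = malloc(bs);
	int fd = open(path, O_RDONLY);
	if (fd < 0 || buf == NULL)
		return (1);

	struct timespec t0, t1;
	clock_gettime(CLOCK_MONOTONIC, &t0);
	long long bytes = 0;
	ssize_t n;
	while ((n = read(fd, buf, bs)) > 0)	/* sequential, ARC-warm reads */
		bytes += n;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	double secs = (t1.tv_sec - t0.tv_sec) +
	    (t1.tv_nsec - t0.tv_nsec) / 1e9;
	printf("%lld bytes in %.3f s (%.1f MB/s)\n",
	    bytes, secs, bytes / secs / 1e6);
	close(fd);
	free(buf);
	return (0);
}
```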