
Abysmal performance with default per_txg_dirty_frees_percent during parallel file removal #13932

Closed
mjguzik opened this issue Sep 21, 2022 · 1 comment · Fixed by #13938
Labels
Type: Performance Performance improvement or performance problem Type: Regression Indicates a functional regression

Comments

mjguzik (Contributor) commented Sep 21, 2022

Issue reproduced on FreeBSD and Linux.

I'm running FreeBSD with zfs as of:

commit a582d52
Author: Umer Saleem [email protected]
Date: Thu Aug 25 02:20:43 2022 +0500

Updates for snapshots_changed property

I found -j 104 buildkernel performance to be incredibly bad: about twice the time needed by tmpfs and about 1.5× the time needed by UFS. I tracked it down to enormous off-CPU time spent waiting on transactions to finish, almost all of which comes from waiting for file nodes to be removed.

Check the flamegraph: https://people.freebsd.org/~mjg/zfs-buildkernel-offcpu.svg

@allanjude pointed me at vfs.zfs.per_txg_dirty_frees_percent, and it is indeed the culprit.

The parameter currently sits at 5, changed from 30 here:
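For reference, the knob can be inspected and changed at runtime: on FreeBSD it is the sysctl vfs.zfs.per_txg_dirty_frees_percent, on Linux the module parameter /sys/module/zfs/parameters/zfs_per_txg_dirty_frees_percent. A minimal Python sketch of reading and writing the Linux parameter file; the helper names are mine (not part of any ZFS tooling), and the base path is parameterized so the sketch works without a loaded ZFS module:

```python
from pathlib import Path

# Default location of ZFS module parameters on Linux; on FreeBSD the
# equivalent knob is the sysctl vfs.zfs.per_txg_dirty_frees_percent.
ZFS_PARAM_DIR = Path("/sys/module/zfs/parameters")
PARAM = "zfs_per_txg_dirty_frees_percent"

def read_dirty_frees_percent(base=ZFS_PARAM_DIR):
    """Return the current throttle percentage (0 disables throttling)."""
    return int((base / PARAM).read_text().strip())

def write_dirty_frees_percent(value, base=ZFS_PARAM_DIR):
    """Set the throttle percentage; requires root on a real system."""
    (base / PARAM).write_text(f"{int(value)}\n")
```

On a real system the change takes effect immediately and does not persist across reboots; persisting it means sysctl.conf on FreeBSD or a modprobe option on Linux.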

commit 65282ee
Author: Alek P [email protected]
Date: Tue Feb 12 12:01:08 2019 -0800

Freeing throttle should account for holes

Perhaps its significance has changed since then(?).

Anyhow, this is what I get when running -j 104 buildkernel (total real time varies by a few seconds between runs for the longer cases; I don't think it matters for this test):

tunable  time
0        2564.69s user 440.50s system 8546% cpu 35.163 total
1        2314.02s user 935.17s system 2738% cpu 1:58.65 total
5        2538.62s user 528.88s system 4478% cpu 1:08.49 total
10       2568.74s user 479.62s system 6869% cpu 44.374 total
15       2567.89s user 488.04s system 7946% cpu 38.458 total
20       2554.29s user 539.25s system 8321% cpu 37.176 total
25       2558.83s user 495.73s system 8595% cpu 35.536 total

As you can see, 25 gives roughly the same performance as 0 (no throttling).
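One way to see why small values hurt: the throttle caps dirty frees per txg at roughly percent/100 of zfs_dirty_data_max, so a mass removal must spread across proportionally more transaction groups, each of which is a synchronous wait for the removers. A back-of-the-envelope model (my own simplification of the accounting, not the actual kernel logic):

```python
def txgs_needed(bytes_to_free, dirty_data_max, percent):
    """Estimate how many txgs a removal of bytes_to_free must span when
    each txg accepts at most dirty_data_max * percent / 100 of frees.
    percent == 0 means the throttle is disabled (everything in one txg)."""
    if percent == 0:
        return 1
    budget = dirty_data_max * percent // 100
    # Ceiling division: one txg per full budget, plus one for any remainder.
    return -(-bytes_to_free // budget)

# Hypothetical numbers: 4 GiB dirty_data_max, 20 GiB of queued frees.
GIB = 1 << 30
for pct in (0, 5, 30):
    print(pct, txgs_needed(20 * GIB, 4 * GIB, pct))
```

Under these made-up numbers the 5% setting spreads the work over roughly six times as many txg waits as 30%, which lines up with the shape of the timings above.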

I don't know what value makes the most sense as the default, but the current one is definitely bad.

The drive is a Micron M600 MTFDDAK1T0MBF MU03; the pool has default parameters plus noatime.

The Linux test ran on a bare-metal EC2 box, doing a -j 96 clean after an allyesconfig modules build. I could not be bothered to check all the above cases, but:

tunable  time
5        4.14s user 6.79s system 48% cpu 22.631 total
30       4.17s user 6.44s system 153% cpu 6.927 total

May I suggest other people bench their favorite workload(tm) with different parameters?
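For anyone who wants a quick self-contained reproducer rather than a full kernel build, something along these lines should do (all parameters are arbitrary; scale the file count and size up until the effect shows on your pool):

```python
import os
import tempfile
import time
from concurrent.futures import ThreadPoolExecutor

def bench_parallel_remove(nfiles=1000, size=4096, jobs=8, root=None):
    """Create nfiles files of `size` bytes each, unlink them in parallel,
    and return the wall-clock time of the removal phase in seconds.
    Point `root` at a directory on the pool under test."""
    d = tempfile.mkdtemp(dir=root)
    paths = []
    for i in range(nfiles):
        p = os.path.join(d, f"f{i}")
        with open(p, "wb") as f:
            f.write(b"\0" * size)
        paths.append(p)
    start = time.monotonic()
    with ThreadPoolExecutor(max_workers=jobs) as ex:
        list(ex.map(os.unlink, paths))  # parallel removal, like a -j clean
    elapsed = time.monotonic() - start
    os.rmdir(d)
    return elapsed
```

Running this twice with different per_txg_dirty_frees_percent settings (see the tunable paths above) should reproduce the gap without needing a kernel tree.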

As is, I think the parameter should be reverted to 30.

@mjguzik mjguzik added the Type: Defect Incorrect behavior (e.g. crash, hang) label Sep 21, 2022
@ryao ryao added Type: Performance Performance improvement or performance problem Type: Regression Indicates a functional regression and removed Type: Defect Incorrect behavior (e.g. crash, hang) labels Sep 21, 2022
@mjguzik mjguzik changed the title Abysmal performance with default per_txg_dirty_frees_percent value Abysmal performance with default per_txg_dirty_frees_percent value when doing -j 104 buildkernel on FreeBSD Sep 21, 2022
@mjguzik mjguzik changed the title Abysmal performance with default per_txg_dirty_frees_percent value when doing -j 104 buildkernel on FreeBSD Abysmal performance with default per_txg_dirty_frees_percent during mass file removal Sep 22, 2022
@mjguzik mjguzik changed the title Abysmal performance with default per_txg_dirty_frees_percent during mass file removal Abysmal performance with default per_txg_dirty_frees_percent during parallel file removal Sep 22, 2022
mjguzik added a commit to mjguzik/zfs that referenced this issue Sep 22, 2022
The current value causes significant artificial slowdown during mass
parallel file removal, which can be observed both on FreeBSD and Linux
when running real workloads.

Sample results from Linux doing make -j 96 clean after an allyesconfig
modules build:

before: 4.14s user 6.79s system 48% cpu 22.631 total
after:  4.17s user 6.44s system 153% cpu 6.927 total

FreeBSD results in the ticket.

Signed-off-by:	Mateusz Guzik <[email protected]>
Closes openzfs#13932
behlendorf pushed a commit that referenced this issue Sep 28, 2022
The current value causes significant artificial slowdown during mass
parallel file removal, which can be observed both on FreeBSD and Linux
when running real workloads.

Sample results from Linux doing make -j 96 clean after an allyesconfig
modules build:

before: 4.14s user 6.79s system 48% cpu 22.631 total
after:  4.17s user 6.44s system 153% cpu 6.927 total

FreeBSD results in the ticket.

Reviewed-by: Alexander Motin <[email protected]>
Reviewed-by: Brian Behlendorf <[email protected]>
Reviewed-by: Richard Yao <[email protected]>
Reviewed-by: George Melikov <[email protected]>
Signed-off-by:	Mateusz Guzik <[email protected]>
Closes #13932
Closes #13938
freebsd-git pushed a commit to freebsd/freebsd-src that referenced this issue Sep 28, 2022
andrewc12 pushed a commit to andrewc12/openzfs that referenced this issue Oct 2, 2022
zxombie pushed a commit to CTSRD-CHERI/freebsd-morello that referenced this issue Oct 10, 2022
snajpa pushed a commit to vpsfreecz/zfs that referenced this issue Oct 22, 2022
behlendorf pushed a commit that referenced this issue Nov 1, 2022
bsdjhb pushed a commit to bsdjhb/cheribsd that referenced this issue Jan 1, 2023
markmi commented May 2, 2023

Context: FreeBSD main on a Windows Dev Kit 2023 using USB3 media; 4 Cortex-A78C cores plus 4 Cortex-X1C cores; kernel and world tailored/tuned for Cortex-A78C (noted in case it matters).

Activity: building ports with poudriere-devel with USE_TMPFS=data. This mostly goes to the media, unlike USE_TMPFS=all. (My normal configuration is structured for heavy parallel activity that frequently leads to load averages much higher than the hardware-thread count, here the core count.)

An interesting related note: while getting poudriere bulk builds set up for parallel activity, I saw long stretches with load averages below 1 (for example) and lots of processes mostly waiting. (UFS did not have the problem; using tmpfs to avoid most ZFS activity also avoided it.) Part of what I was seeing for ZFS use was the likes of (from ^T use with kern.tty_info_kstacks=1):

load: 0.34  cmd: sh 93914 [tx->tx_quiesce_done_cv] 7566.69r 11.06u 22.66s 0% 3800k
#0 0xffff0000004fd564 at mi_switch+0x104
#1 0xffff000000463f40 at _cv_wait+0x120
#2 0xffff00000153fa34 at txg_wait_open+0xf4
#3 0xffff0000014a40bc at dmu_free_long_range+0x17c
#4 0xffff000001448254 at zfs_rmnode+0x64
#5 0xffff0000014557c4 at zfs_freebsd_reclaim+0x34
#6 0xffff000000a1340c at VOP_RECLAIM_APV+0x2c
#7 0xffff0000005fd6c0 at vgonel+0x450
#8 0xffff0000005fde7c at vrecycle+0x9c
#9 0xffff000001455678 at zfs_freebsd_inactive+0x48
#10 0xffff0000005fc430 at vinactivef+0x180
#11 0xffff0000005fba50 at vput_final+0x200
#12 0xffff00000060c4d0 at kern_funlinkat+0x320
#13 0xffff00015d6cbbf4 at filemon_wrapper_unlink+0x14
#14 0xffff0000008f8514 at do_el0_sync+0x594
#15 0xffff0000008d4904 at handle_el0_sync+0x40

or, an example not involving filemon but involving a null mount:

load: 1.20  cmd: sh 95560 [tx->tx_quiesce_done_cv] 2022.55r 3.21u 9.24s 0% 3852k
#0 0xffff0000004fd564 at mi_switch+0x104
#1 0xffff000000463f40 at _cv_wait+0x120
#2 0xffff000001518a34 at txg_wait_open+0xf4
#3 0xffff00000147d0bc at dmu_free_long_range+0x17c
#4 0xffff000001421254 at zfs_rmnode+0x64
#5 0xffff00000142e7c4 at zfs_freebsd_reclaim+0x34
#6 0xffff000000a1340c at VOP_RECLAIM_APV+0x2c
#7 0xffff0000005fd6c0 at vgonel+0x450
#8 0xffff0000005fde7c at vrecycle+0x9c
#9 0xffff00000142e678 at zfs_freebsd_inactive+0x48
#10 0xffff0000005fc430 at vinactivef+0x180
#11 0xffff0000005fba50 at vput_final+0x200
#12 0xffff00015d8ceab4 at null_reclaim+0x154
#13 0xffff000000a1340c at VOP_RECLAIM_APV+0x2c
#14 0xffff0000005fd6c0 at vgonel+0x450
#15 0xffff0000005fde7c at vrecycle+0x9c
#16 0xffff00015d8ce8e8 at null_inactive+0x38
#17 0xffff0000005fc430 at vinactivef+0x180
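Both stacks end the same way: dmu_free_long_range has exhausted the per-txg free budget and txg_wait_open parks the thread until the next transaction group opens (the tx_quiesce_done_cv wait shown by ^T). A toy user-space model of that behaviour, assuming a fixed per-txg budget and a "syncer" that opens the next txg; none of these names map to real ZFS internals beyond the analogy:

```python
import threading

class TxgThrottle:
    """Toy model: frees consume a per-txg budget; when it is exhausted,
    callers block until the 'syncer' opens the next txg."""
    def __init__(self, budget_per_txg):
        self.budget_per_txg = budget_per_txg
        self.remaining = budget_per_txg
        self.waits = 0          # how many times a remover had to block
        self.cv = threading.Condition()

    def free(self, nbytes):
        with self.cv:
            while self.remaining < nbytes:
                self.waits += 1
                self.cv.wait()  # analogous to txg_wait_open()
            self.remaining -= nbytes

    def open_next_txg(self):
        with self.cv:
            self.remaining = self.budget_per_txg
            self.cv.notify_all()
```

A small budget (low percent) means removers hit the wait path constantly, which is exactly the off-CPU time visible in the flamegraph and the ^T output.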

mjguzik asked me to try setting vfs.zfs.per_txg_dirty_frees_percent=5 for ZFS, and it vastly improved the overall results for this context. (So this amounts to locally reverting the 5-to-30 change; not that such a revert would necessarily be universally appropriate.) No claim here that 5 is near-optimal, just vastly better, and I have adopted the value for normal use for now.

It leaves me wondering whether the setting needs at least some published guidance about picking (ballpark?) values for various types of contexts. Automatically picking a reasonable default would be nice if it were sufficiently reliable across a wide enough range of contexts. There is also the question of which setting would leave the fewest contexts needing to adjust away from the default, or perhaps: which setting has the least nasty worst-case consequences. (I have no clue of the answers.)
