Very slow recursive deletion of many snapshots #12987

Closed
shodanshok opened this issue Jan 19, 2022 · 4 comments
Labels
Component: Memory Management (kernel memory management) · Type: Defect (incorrect behavior, e.g. crash, hang) · Type: Performance (performance improvement or performance problem)

Comments

shodanshok (Contributor) commented Jan 19, 2022

System information

Type Version/Name
Distribution Name Rocky Linux
Distribution Version 8.5
Kernel Version kernel-4.18.0-348.7.1.el8_5.x86_64
Architecture x86_64
OpenZFS Version 2.0.7

Describe the problem you're observing

Deleting a large number of snapshots (i.e., recursively destroying a heavily snapshotted dataset) is very slow due to extreme ARC contention.

Describe how to reproduce the problem

  • Spin up a test VM with 1 CPU, 2 GB RAM, and a single 8 GB pool.
  • Create a test dataset and populate it with many snapshots via the following commands: zfs create tank/test; for i in `seq 1 10000`; do zfs snapshot tank/test@$i; /bin/cp -f /etc/services /tank/test/; echo $i; done
  • Reboot.
  • Try to recursively delete the test dataset via zfs destroy -r tank/test.
  • After a while, disk activity halts, but the CPU is kept busy by the arc_evict and arc_prune threads.
  • In the meantime, the kernel reports hung tasks.
  • After many minutes (>20 in my case) the deletion succeeds.

Adding more RAM (8 GB) resolves the issue, with the deletion completing in only 3 minutes.
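The steps above can be sketched as a small script. This is a minimal sketch, not part of the original report: the RUN dry-run prefix and the N snapshot-count variable are illustrative additions (the report used 10000 snapshots on a pool named tank).

```shell
#!/bin/sh
# Dry-run sketch of the reproducer above; RUN and N are illustrative.
# Set RUN= (empty) and N=10000 to actually run it against a real pool.
RUN="${RUN:-echo}"
N="${N:-3}"

$RUN zfs create tank/test
i=1
while [ "$i" -le "$N" ]; do
    $RUN zfs snapshot "tank/test@$i"      # one snapshot per iteration
    $RUN cp -f /etc/services /tank/test/  # dirty the dataset between snapshots
    i=$((i + 1))
done
# After a reboot (cold ARC), time the recursive destroy:
$RUN time zfs destroy -r tank/test
```

With the default dry-run prefix this only prints the commands, which makes it safe to review before pointing it at a real pool.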

Include any warning/errors/backtraces from the system logs

The following are the stack traces of the blocked processes and a brief summary of ARC status:

# arc_evict
[root@localhost 804]# cat stack
[<0>] aggsum_add+0x38/0x190 [zfs]
[<0>] arc_space_return+0x78/0x110 [zfs]
[<0>] spl_kmem_cache_free+0x28/0x1e0 [spl]
[<0>] arc_evict_state+0x851/0x890 [zfs]
[<0>] arc_evict_cb+0x504/0x630 [zfs]
[<0>] zthr_procedure+0x13c/0x160 [zfs]
[<0>] thread_generic_wrapper+0x6f/0x80 [spl]
[<0>] kthread+0x116/0x130
[<0>] ret_from_fork+0x35/0x40

# arc_prune
[root@localhost 802]# cat stack
[<0>] taskq_thread+0x449/0x530 [spl]
[<0>] kthread+0x116/0x130
[<0>] ret_from_fork+0x35/0x40

# txg_sync
[root@localhost 921]# cat stack
[<0>] cv_wait_common+0xfb/0x130 [spl]
[<0>] arc_wait_for_eviction+0x196/0x210 [zfs]
[<0>] arc_get_data_impl.isra.39+0x114/0x190 [zfs]
[<0>] arc_get_data_buf.isra.42+0x2d/0x50 [zfs]
[<0>] arc_buf_alloc_impl.isra.43+0x1a0/0x320 [zfs]
[<0>] arc_read+0xf1b/0x1230 [zfs]
[<0>] dbuf_read_impl.constprop.31+0x29f/0x6b0 [zfs]
[<0>] dbuf_read+0x1b2/0x520 [zfs]
[<0>] dmu_buf_hold_by_dnode+0x88/0x100 [zfs]
[<0>] zap_get_leaf_byblk.isra.13+0x62/0x250 [zfs]
[<0>] zap_deref_leaf+0xa1/0xf0 [zfs]
[<0>] fzap_cursor_retrieve+0x115/0x290 [zfs]
[<0>] zap_cursor_retrieve+0x156/0x2e0 [zfs]
[<0>] dsl_deadlist_load_cache+0x110/0x220 [zfs]
[<0>] dsl_deadlist_space_range+0x80/0x190 [zfs]
[<0>] dsl_destroy_snapshot_sync_impl+0x8fa/0xc00 [zfs]
[<0>] dsl_destroy_snapshot_sync+0x5d/0xa0 [zfs]
[<0>] zcp_sync_task+0x60/0xc0 [zfs]
[<0>] zcp_synctask_destroy+0x89/0x110 [zfs]
[<0>] zcp_synctask_wrapper+0x9e/0x160 [zfs]
[<0>] luaD_precall+0x26d/0x3f0 [zlua]
[<0>] luaV_execute+0x700/0x1570 [zlua]
[<0>] luaD_call+0x119/0x140 [zlua]
[<0>] luaD_rawrunprotected+0x6e/0xb0 [zlua]
[<0>] luaD_pcall+0x34/0x90 [zlua]
[<0>] lua_pcallk+0x83/0x110 [zlua]
[<0>] zcp_eval_impl+0xca/0x490 [zfs]
[<0>] dsl_sync_task_sync+0xa1/0xe0 [zfs]
[<0>] dsl_pool_sync+0x3f6/0x510 [zfs]
[<0>] spa_sync+0x56d/0xfc0 [zfs]
[<0>] txg_sync_thread+0x29f/0x480 [zfs]
[<0>] thread_generic_wrapper+0x6f/0x80 [spl]
[<0>] kthread+0x116/0x130
[<0>] ret_from_fork+0x35/0x40
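For reference, kernel stacks like the ones above can be captured from /proc. A sketch, assuming the thread names from this report; reading /proc/PID/stack requires root, and on a machine without ZFS loaded the threads are simply absent:

```shell
# Print the kernel stack of each ZFS thread named in this report.
checked=0
for name in arc_evict arc_prune txg_sync; do
    pid="$(pgrep -x "$name" | head -n 1)"
    if [ -n "$pid" ]; then
        echo "# $name (pid $pid)"
        cat "/proc/$pid/stack"     # needs root
    else
        echo "# $name: no such thread"
    fi
    checked=$((checked + 1))
done
```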

# arc_summary
ARC size (current):                                   114.2 %    1.1 GiB
        Target size (adaptive):                       100.0 %  988.6 MiB
        Min size (hard limit):                          6.2 %   61.8 MiB
        Max size (high water):                           16:1  988.6 MiB
        Most Frequently Used (MFU) cache size:         46.2 %  202.2 MiB
        Most Recently Used (MRU) cache size:           53.8 %  235.0 MiB
        Metadata cache size (hard limit):              75.0 %  741.4 MiB
        Metadata cache size (current):                152.3 %    1.1 GiB
        Dnode cache size (hard limit):                 10.0 %   74.1 MiB
        Dnode cache size (current):                    19.0 %   14.1 MiB
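The arc_summary figures above come from /proc/spl/kstat/zfs/arcstats; the overshoot can be checked directly with a one-liner like the following. The sample values below are made up to roughly match this report (size ≈ 1.1 GiB vs. target c ≈ 988.6 MiB) — on a live system, feed /proc/spl/kstat/zfs/arcstats to the same awk program instead:

```shell
# Compute ARC size as a percentage of the target size (the "c" kstat),
# here against a canned, hypothetical arcstats-style sample.
sample='size 4 1197687808
c 4 1036623872
c_max 4 1036623872'

pct="$(printf '%s\n' "$sample" | awk '
    $1 == "size" { size = $3 }
    $1 == "c"    { target = $3 }
    END { printf "%.1f", 100 * size / target }')"
echo "ARC size is ${pct}% of target"
```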
shodanshok added the Type: Defect label Jan 19, 2022
gamanakis (Contributor) commented Jan 20, 2022

Perhaps PR #12985 helps with this.

On my VMs it does make a noticeable difference, and I don't see the contention between arc_prune and arc_evict anymore.

behlendorf added the Component: Memory Management and Type: Performance labels Jan 21, 2022
shodanshok (Contributor, Author) commented Jan 21, 2022

@gamanakis Good catch, it certainly seems related! I'll try the patch once it lands in the main branch. Thank you for reporting it.

stale bot commented Jan 21, 2023

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

stale bot added the Status: Stale label Jan 21, 2023
amotin (Member) commented Jan 21, 2023

This may also help with the issue: #14402 .

stale bot removed the Status: Stale label Jan 21, 2023