
OpenZFS on Linux 32-bit practically hangs on recv #13241

Open
rincebrain opened this issue Mar 21, 2022 · 1 comment
Labels
- Bot: Not Stale (Override for the stale bot)
- Type: Architecture (Indicates an issue is specific to a single processor architecture)
- Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

@rincebrain
Contributor

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Debian |
| Distribution Version | 11.2 |
| Kernel Version | 5.10.0-12-686-pae |
| Architecture | i686 |
| OpenZFS Version | 2feba9a |

Describe the problem you're observing

When trying to benchmark some things on 32-bit OpenZFS, I discovered that a zfs recv of a 1M-recordsize dataset on a 32-bit VM very quickly nearly hangs (it makes about 100 MB of progress in a second or two, every minute or so), and the call stack looked like:

```
[<0>] spl_kmem_cache_alloc+0x5c7/0x6f0 [spl]
[<0>] zio_data_buf_alloc+0x29/0x60 [zfs]
[<0>] abd_alloc_linear+0xaa/0xf0 [zfs]
[<0>] dmu_recv_stream+0x640/0xb80 [zfs]
[<0>] zfs_ioc_recv_impl+0x2b2/0x1090 [zfs]
[<0>] zfs_ioc_recv+0x194/0x300 [zfs]
[<0>] zfsdev_ioctl_common+0x5dc/0x7c0 [zfs]
[<0>] zfsdev_ioctl+0x4c/0xb0 [zfs]
[<0>] __ia32_sys_ioctl+0x145/0x760
[<0>] __do_fast_syscall_32+0x45/0x80
[<0>] do_fast_syscall_32+0x29/0x60
[<0>] do_SYSENTER_32+0x15/0x20
[<0>] entry_SYSENTER_32+0x9f/0xf2
```

Which maps to line 1078 here:

```c
	/*
	 * The goal here is to only detect the rare case where a virtual slab
	 * allocation has deadlocked. We must be careful to minimize the use
	 * of emergency objects which are more expensive to track. Therefore,
	 * we set a very long timeout for the asynchronous allocation and if
	 * the timeout is reached the cache is flagged as deadlocked. From
	 * this point only new emergency objects will be allocated until the
	 * asynchronous allocation completes and clears the deadlocked flag.
	 */
	if (test_bit(KMC_BIT_DEADLOCKED, &skc->skc_flags)) {
		rc = spl_emergency_alloc(skc, flags, obj);
	} else {
		remaining = wait_event_timeout(skc->skc_waitq,
		    spl_cache_grow_wait(skc), HZ / 10);

		if (!remaining) {
			spin_lock(&skc->skc_lock);
```
Being a 32-bit VM with the default lowmem split, we get 1 GB of kernel RAM, of which ZFS calculates it can use up to 372922368 bytes (half of the 745844736 bytes it computes for allmem).

Meanwhile, arcstat reports the ARC using 5.6 MB of RAM, with a target of 50 and 599M available.

I'm wildly guessing from the lines above that we're requesting 1M of contiguous RAM, finding it hard to come by, and then waiting indefinitely in the "emergency" allocation path, with progress happening only whenever the emergency-path timeout deigns to allow it.

Describe how to reproduce the problem

- recv a recordsize=1M dataset on 32-bit x86 OpenZFS, no flags needed
- bang

Include any warning/errors/backtraces from the system logs

As above.

I hacked it up to grossly shorten the timeout on i386 just to make progress, but then it hangs semi-indefinitely when attempting to receive with zstd enabled, with a /proc/foo/stack on the recv of:

```
[<0>] cv_wait_common+0xd5/0x110 [spl]
[<0>] __cv_wait_sig+0x14/0x30 [spl]
[<0>] bqueue_enqueue_impl+0x69/0x140 [zfs]
[<0>] bqueue_enqueue+0x15/0x20 [zfs]
[<0>] dmu_recv_stream+0x1f0/0xb80 [zfs]
[<0>] zfs_ioc_recv_impl+0x2b2/0x1090 [zfs]
[<0>] zfs_ioc_recv+0x194/0x300 [zfs]
[<0>] zfsdev_ioctl_common+0x5dc/0x7c0 [zfs]
[<0>] zfsdev_ioctl+0x4c/0xb0 [zfs]
[<0>] __ia32_sys_ioctl+0x145/0x760
[<0>] __do_fast_syscall_32+0x45/0x80
[<0>] do_fast_syscall_32+0x29/0x60
[<0>] do_SYSENTER_32+0x15/0x20
[<0>] entry_SYSENTER_32+0x9f/0xf2
```

and complaints about receive_writer blocking forever in dmesg with:

```
[ 4471.886541] INFO: task receive_writer:31715 blocked for more than 966 seconds.
[ 4471.887626]       Tainted: P           OE     5.10.0-12-686-pae #1 Debian 5.10.103-1
[ 4471.888145] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 4471.888662] task:receive_writer  state:D stack:    0 pid:31715 ppid:     2 flags:0x00004000
[ 4471.888664] Call Trace:
[ 4471.888669]  __schedule+0x24c/0x8f0
[ 4471.888671]  ? __raw_callee_save___pv_queued_spin_unlock+0x9/0x10
[ 4471.888673]  ? prepare_to_wait_exclusive+0x51/0xa0
[ 4471.888674]  schedule+0x42/0xb0
[ 4471.888677]  cv_wait_common+0xd5/0x110 [spl]
[ 4471.888678]  ? add_wait_queue_exclusive+0x60/0x60
[ 4471.888679]  __cv_wait+0x14/0x20 [spl]
[ 4471.888691]  dmu_tx_wait+0xec/0x480 [zfs]
[ 4471.888698]  dmu_tx_assign+0x160/0x550 [zfs]
[ 4471.888705]  ? dmu_tx_hold_write_by_dnode+0x4a/0x60 [zfs]
[ 4471.888712]  flush_write_batch+0x202/0x4d0 [zfs]
[ 4471.888713]  ? __wake_up_common_lock+0x72/0xa0
[ 4471.888714]  ? __wake_up+0x12/0x20
[ 4471.888720]  receive_process_write_record+0xe2/0x1e0 [zfs]
[ 4471.888726]  receive_writer_thread+0xe5/0xa10 [zfs]
[ 4471.888728]  ? kfree+0x329/0x3c0
[ 4471.888729]  ? kfree+0x329/0x3c0
[ 4471.888730]  ? spl_kmem_free+0x25/0x30 [spl]
[ 4471.888731]  ? spl_kmem_free+0x25/0x30 [spl]
[ 4471.888737]  ? receive_process_write_record+0x1e0/0x1e0 [zfs]
[ 4471.888739]  thread_generic_wrapper+0x5f/0x70 [spl]
[ 4471.888740]  kthread+0xf6/0x110
[ 4471.888741]  ? __thread_exit+0x20/0x20 [spl]
[ 4471.888742]  ? kthread_associate_blkcg+0xb0/0xb0
[ 4471.888744]  ret_from_fork+0x1c/0x28
```

So more work than just that hack job is required, I suppose.

@rincebrain rincebrain added the Type: Defect Incorrect behavior (e.g. crash, hang) label Mar 21, 2022
@stale

stale bot commented Mar 23, 2023

This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.

@stale stale bot added the Status: Stale No recent activity for issue label Mar 23, 2023
@behlendorf behlendorf added Type: Architecture Indicates an issue is specific to a single processor architecture Bot: Not Stale Override for the stale bot and removed Status: Stale No recent activity for issue labels Mar 23, 2023