ZFS Kernel Panic during encrypted raw send #14709

Open
Cadair opened this issue Apr 3, 2023 · 3 comments
Labels
Component: Encryption ("native encryption" feature)
Component: Send/Recv ("zfs send/recv" feature)
Type: Defect (Incorrect behavior, e.g. crash, hang)

Comments

@Cadair
Cadair commented Apr 3, 2023

Hello,

I am not sure whether this error has already been reported elsewhere. I see a few issues that look similar, but I can't read my kernel log well enough to tell whether they are the same problem.

System information

| Type | Version/Name |
| --- | --- |
| Distribution Name | Proxmox |
| Distribution Version | 7.1 |
| Kernel Version | 5.15 |
| Architecture | x86 |
| OpenZFS Version | 2.1.9-pve1 |

zfs_error.log

@Cadair Cadair added the Type: Defect Incorrect behavior (e.g. crash, hang) label Apr 3, 2023
@rincebrain
Contributor

(For ease of searching:

[551658.766325] VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
[551658.775069] PANIC at dmu_recv.c:1806:receive_object()
[551658.780656] Showing stack for process 3811325
[551658.785546] CPU: 3 PID: 3811325 Comm: receive_writer Tainted: P           O      5.15.102-1-pve #1
[551658.785549] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X570D4I-2T, BIOS P1.00 05/19/2020
[551658.785550] Call Trace:
[551658.785552]  <TASK>
[551658.811071]  dump_stack_lvl+0x4a/0x63
[551658.815279]  dump_stack+0x10/0x16
[551658.819175]  spl_dumpstack+0x29/0x2f [spl]
[551658.823846]  spl_panic+0xd1/0xe9 [spl]
[551658.823857]  ? arc_space_consume+0x54/0x120 [zfs]
[551658.833568]  ? dbuf_read+0x11b/0x5c0 [zfs]
[551658.838251]  ? dmu_bonus_hold_by_dnode+0x14d/0x1b0 [zfs]
[551658.844134]  receive_object+0xb11/0xdb0 [zfs]
[551658.849063]  ? dmu_object_next+0xd9/0x130 [zfs]
[551658.854165]  ? kfree+0x21a/0x250
[551658.857917]  ? __cond_resched+0x1a/0x50
[551658.862276]  ? mutex_lock+0x13/0x50
[551658.866297]  receive_writer_thread+0x1cc/0xb50 [zfs]
[551658.871838]  ? spl_kmem_free+0x2e/0x40 [spl]
[551658.876712]  ? receive_read_payload_and_next_header+0x300/0x300 [zfs]
[551658.883762]  thread_generic_wrapper+0x64/0x80 [spl]
[551658.889197]  ? receive_read_payload_and_next_header+0x300/0x300 [zfs]
[551658.896234]  ? thread_generic_wrapper+0x64/0x80 [spl]
[551658.901841]  ? __thread_exit+0x20/0x20 [spl]
[551658.906669]  kthread+0x12a/0x150
[551658.910457]  ? set_kthread_struct+0x50/0x50
[551658.910460]  ret_from_fork+0x22/0x30
[551658.919428]  </TASK>

)

and sure looks like #12732 and #12270 to me.
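
When comparing reports like this one, it can help to pull the assertion line out of a log automatically. The following is a minimal sketch, not part of OpenZFS: the helper name `decode_verify3` and the regular expression are hypothetical, and it assumes the right-hand value in `failed (0 == 5)` is a Linux errno returned by the asserted call.

```python
import errno
import re

# Matches an SPL assertion line such as:
#   VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
# The pattern is an assumption based only on the log lines quoted above.
VERIFY3_RE = re.compile(
    r"VERIFY3\((?P<expr>.+)\) failed \((?P<lhs>-?\d+) == (?P<rhs>-?\d+)\)"
)


def decode_verify3(line):
    """Return the asserted expression and both compared values, or None.

    Assumption: the right-hand value is an errno, so it is also mapped
    to its symbolic name via the standard errno table.
    """
    match = VERIFY3_RE.search(line)
    if match is None:
        return None
    rhs = int(match.group("rhs"))
    return {
        "expr": match.group("expr"),
        "lhs": int(match.group("lhs")),
        "rhs": rhs,
        "errno_name": errno.errorcode.get(rhs, "unknown"),
    }


if __name__ == "__main__":
    sample = (
        "[551658.766325] VERIFY3(0 == dmu_bonus_hold_by_dnode"
        "(dn, FTAG, &db, flags)) failed (0 == 5)"
    )
    print(decode_verify3(sample))
```

Under that assumption, the value 5 corresponds to `EIO`, meaning `dmu_bonus_hold_by_dnode()` reported an I/O error where `receive_object()` expected success.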

@rincebrain rincebrain added Component: Send/Recv "zfs send/recv" feature Component: Encryption "native encryption" feature labels Apr 3, 2023
@vaclavskala
Contributor

Same issue with 2.1.11 and kernel 5.10

[Fri Oct 27 13:45:36 2023] VERIFY3(0 == dmu_bonus_hold_by_dnode(dn, FTAG, &db, flags)) failed (0 == 5)
[Fri Oct 27 13:45:36 2023] PANIC at dmu_recv.c:1806:receive_object()
[Fri Oct 27 13:45:36 2023] Showing stack for process 1944920
[Fri Oct 27 13:45:36 2023] CPU: 37 PID: 1944920 Comm: receive_writer Tainted: P                  5.10.178-zfs2111 #2
[Fri Oct 27 13:45:36 2023] Hardware name: Supermicro X10DRi/X10DRi, BIOS 3.4a 08/16/2021
[Fri Oct 27 13:45:36 2023] Call Trace:
[Fri Oct 27 13:45:36 2023]  dump_stack+0x70/0x8f
[Fri Oct 27 13:45:36 2023]  spl_dumpstack+0x29/0x2f [spl]
[Fri Oct 27 13:45:36 2023]  spl_panic+0xd4/0xfc [spl]
[Fri Oct 27 13:45:36 2023]  ? arc_space_consume+0x54/0x120 [zfs]
[Fri Oct 27 13:45:36 2023]  ? dbuf_read+0x10b/0x5e0 [zfs]
[Fri Oct 27 13:45:36 2023]  ? dmu_bonus_hold_by_dnode+0x14c/0x1a0 [zfs]
[Fri Oct 27 13:45:36 2023]  receive_object+0xaec/0xd30 [zfs]
[Fri Oct 27 13:45:36 2023]  ? kfree+0x3f5/0x470
[Fri Oct 27 13:45:36 2023]  ? _cond_resched+0x1a/0x50
[Fri Oct 27 13:45:36 2023]  ? mutex_lock+0x13/0x50
[Fri Oct 27 13:45:36 2023]  receive_writer_thread+0x1c9/0xaf0 [zfs]
[Fri Oct 27 13:45:36 2023]  ? spl_kmem_free+0x2e/0x40 [spl]
[Fri Oct 27 13:45:36 2023]  ? kfree+0x3f5/0x470
[Fri Oct 27 13:45:36 2023]  ? receive_process_write_record+0x180/0x180 [zfs]
[Fri Oct 27 13:45:36 2023]  thread_generic_wrapper+0x76/0x90 [spl]
[Fri Oct 27 13:45:36 2023]  ? thread_generic_wrapper+0x76/0x90 [spl]
[Fri Oct 27 13:45:36 2023]  kthread+0x12d/0x150
[Fri Oct 27 13:45:36 2023]  ? __thread_exit+0x20/0x20 [spl]
[Fri Oct 27 13:45:36 2023]  ? kthread_associate_blkcg+0xc0/0xc0
[Fri Oct 27 13:45:36 2023]  ret_from_fork+0x1f/0x30

@borgmanJeremy

I've also had repeated kernel panics during encrypted sends using syncoid, on Ubuntu 22.04.

kernel log:

Nov 07 19:36:17.727791 falcon kernel: BUG: kernel NULL pointer dereference, address: 0000000000000000
Nov 07 19:36:17.727972 falcon kernel: #PF: supervisor read access in kernel mode
Nov 07 19:36:17.744077 falcon kernel: #PF: error_code(0x0000) - not-present page
Nov 07 19:36:17.747910 falcon kernel: PGD 0 P4D 0
Nov 07 19:36:17.751851 falcon kernel: Oops: 0000 [#1] SMP NOPTI
Nov 07 19:36:17.751894 falcon kernel: CPU: 14 PID: 2617220 Comm: zfs Tainted: P           OE     5.15.0-87-generic #97-Ubuntu
Nov 07 19:36:17.751917 falcon kernel: Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS ULTRA/X570 AORUS ULTRA, BIOS F35 01/04/2022
Nov 07 19:36:17.751938 falcon kernel: RIP: 0010:mzap_open+0x37/0x330 [zfs]
Nov 07 19:36:17.751957 falcon kernel: Code: e5 41 57 49 89 f7 31 f6 41 56 41 55 41 54 53 48 89 d3 48 83 ec 10 48 8b 42 18 48 89 7d d0 48 c7 c2 a8 4b b4 c4 bf 28 01 00 00 <4c> 8b 30 48 8b 40 08 48>
Nov 07 19:36:17.751976 falcon kernel: RSP: 0018:ffffbb02054a37c0 EFLAGS: 00010286
Nov 07 19:36:17.752000 falcon kernel: RAX: 0000000000000000 RBX: ffff9cdaa8e92100 RCX: 00000000000001a1
Nov 07 19:36:17.752019 falcon kernel: RDX: ffffffffc4b44ba8 RSI: 0000000000000000 RDI: 0000000000000128
Nov 07 19:36:17.752040 falcon kernel: RBP: ffffbb02054a37f8 R08: 0000000000000000 R09: 0020000000000000
Nov 07 19:36:17.752058 falcon kernel: R10: ffff9cdb72ae84e0 R11: 0000000000000000 R12: ffff9cd41d180800
Nov 07 19:36:17.752076 falcon kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
Nov 07 19:36:17.752094 falcon kernel: FS:  00007f54ed53c7c0(0000) GS:ffff9cf1fed80000(0000) knlGS:0000000000000000
Nov 07 19:36:17.752112 falcon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 07 19:36:17.752128 falcon kernel: CR2: 0000000000000000 CR3: 000000092a818000 CR4: 0000000000750ee0
Nov 07 19:36:17.752146 falcon kernel: PKRU: 55555554
Nov 07 19:36:17.752168 falcon kernel: Call Trace:
Nov 07 19:36:17.752187 falcon kernel:  <TASK>
Nov 07 19:36:17.752208 falcon kernel:  ? show_trace_log_lvl+0x1d6/0x2ea
Nov 07 19:36:17.752227 falcon kernel:  ? show_trace_log_lvl+0x1d6/0x2ea
Nov 07 19:36:17.752244 falcon kernel:  ? zap_lockdir_impl+0x2cd/0x3a0 [zfs]
Nov 07 19:36:17.752260 falcon kernel:  ? show_regs.part.0+0x23/0x29
Nov 07 19:36:17.752278 falcon kernel:  ? __die_body.cold+0x8/0xd
Nov 07 19:36:17.752296 falcon kernel:  ? __die+0x2b/0x37
Nov 07 19:36:17.752314 falcon kernel:  ? page_fault_oops+0x13b/0x170
Nov 07 19:36:17.752332 falcon kernel:  ? dbuf_rele_and_unlock+0x540/0x540 [zfs]
Nov 07 19:36:17.752353 falcon kernel:  ? do_user_addr_fault+0x321/0x670
Nov 07 19:36:17.752371 falcon kernel:  ? exc_page_fault+0x77/0x170
Nov 07 19:36:17.752389 falcon kernel:  ? asm_exc_page_fault+0x27/0x30
Nov 07 19:36:17.752407 falcon kernel:  ? mzap_open+0x37/0x330 [zfs]
Nov 07 19:36:17.752424 falcon kernel:  zap_lockdir_impl+0x2cd/0x3a0 [zfs]
Nov 07 19:36:17.752445 falcon kernel:  zap_lockdir+0x92/0xb0 [zfs]
Nov 07 19:36:17.752463 falcon kernel:  zap_lookup_norm+0x5c/0xd0 [zfs]
Nov 07 19:36:17.752481 falcon kernel:  zap_lookup+0x16/0x20 [zfs]
Nov 07 19:36:17.752499 falcon kernel:  zfs_get_zplprop+0xb7/0x1b0 [zfs]
Nov 07 19:36:17.752517 falcon kernel:  setup_featureflags+0x21b/0x260 [zfs]
Nov 07 19:36:17.752542 falcon kernel:  dmu_send_impl+0xdd/0xbf0 [zfs]
Nov 07 19:36:17.752559 falcon kernel:  ? dnode_rele+0x39/0x50 [zfs]
Nov 07 19:36:17.752577 falcon kernel:  ? dmu_buf_rele+0xe/0x20 [zfs]
Nov 07 19:36:17.752598 falcon kernel:  ? zap_unlockdir+0x46/0x60 [zfs]
Nov 07 19:36:17.752617 falcon kernel:  dmu_send_obj+0x265/0x340 [zfs]
Nov 07 19:36:17.752635 falcon kernel:  zfs_ioc_send+0xe8/0x2c0 [zfs]
Nov 07 19:36:17.752655 falcon kernel:  ? dump_bytes_cb+0x30/0x30 [zfs]
Nov 07 19:36:17.752673 falcon kernel:  zfsdev_ioctl_common+0x686/0x740 [zfs]
Nov 07 19:36:17.752689 falcon kernel:  ? __check_object_size.part.0+0x4a/0x150
Nov 07 19:36:17.752708 falcon kernel:  ? _copy_from_user+0x31/0x70
Nov 07 19:36:17.752724 falcon kernel:  zfsdev_ioctl+0x57/0xf0 [zfs]
Nov 07 19:36:17.752744 falcon kernel:  __x64_sys_ioctl+0x95/0xd0
Nov 07 19:36:17.752762 falcon kernel:  do_syscall_64+0x5c/0xc0
Nov 07 19:36:17.752783 falcon kernel:  ? do_syscall_64+0x69/0xc0
Nov 07 19:36:17.752802 falcon kernel:  entry_SYSCALL_64_after_hwframe+0x62/0xcc
Nov 07 19:36:17.752820 falcon kernel: RIP: 0033:0x7f54eddbcb3f
Nov 07 19:36:17.752838 falcon kernel: Code: 00 48 89 44 24 18 31 c0 48 8d 44 24 60 c7 04 24 10 00 00 00 48 89 44 24 08 48 8d 44 24 20 48 89 44 24 10 b8 10 00 00 00 0f 05 <41> 89 c0 3d 00 f0 ff ff>
Nov 07 19:36:17.752857 falcon kernel: RSP: 002b:00007fff21b90b50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
Nov 07 19:36:17.752875 falcon kernel: RAX: ffffffffffffffda RBX: 0000558721f69640 RCX: 00007f54eddbcb3f
Nov 07 19:36:17.752894 falcon kernel: RDX: 00007fff21b90fe0 RSI: 0000000000005a1c RDI: 0000000000000003
Nov 07 19:36:17.752912 falcon kernel: RBP: 00007fff21b945d0 R08: 0000558721f70f50 R09: 0000000000000000
Nov 07 19:36:17.752930 falcon kernel: R10: 0000000000000009 R11: 0000000000000246 R12: 0000558721f61f20
Nov 07 19:36:17.752951 falcon kernel: R13: 00007fff21b90fe0 R14: 0000558721f69650 R15: 0000000000000000
Nov 07 19:36:17.752969 falcon kernel:  </TASK>
Nov 07 19:36:17.752986 falcon kernel: Modules linked in: cpuid tls rpcsec_gss_krb5 xt_nat xt_tcpudp veth xt_mark xt_conntrack nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack_netlink nf_conntrack >
Nov 07 19:36:17.753051 falcon kernel:  sunrpc ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq>
Nov 07 19:36:17.753103 falcon kernel: CR2: 0000000000000000
Nov 07 19:36:17.753140 falcon kernel: ---[ end trace 3d726021e92020e9 ]---
Nov 07 19:36:17.753174 falcon kernel: RIP: 0010:mzap_open+0x37/0x330 [zfs]
Nov 07 19:36:17.753209 falcon kernel: Code: e5 41 57 49 89 f7 31 f6 41 56 41 55 41 54 53 48 89 d3 48 83 ec 10 48 8b 42 18 48 89 7d d0 48 c7 c2 a8 4b b4 c4 bf 28 01 00 00 <4c> 8b 30 48 8b 40 08 48>
Nov 07 19:36:17.753239 falcon kernel: RSP: 0018:ffffbb02054a37c0 EFLAGS: 00010286
Nov 07 19:36:17.753257 falcon kernel: RAX: 0000000000000000 RBX: ffff9cdaa8e92100 RCX: 00000000000001a1
Nov 07 19:36:17.753274 falcon kernel: RDX: ffffffffc4b44ba8 RSI: 0000000000000000 RDI: 0000000000000128
Nov 07 19:36:17.753291 falcon kernel: RBP: ffffbb02054a37f8 R08: 0000000000000000 R09: 0020000000000000
Nov 07 19:36:17.753313 falcon kernel: R10: ffff9cdb72ae84e0 R11: 0000000000000000 R12: ffff9cd41d180800
Nov 07 19:36:17.753332 falcon kernel: R13: 0000000000000000 R14: 0000000000000001 R15: 0000000000000001
Nov 07 19:36:17.753349 falcon kernel: FS:  00007f54ed53c7c0(0000) GS:ffff9cf1fed80000(0000) knlGS:0000000000000000
Nov 07 19:36:17.753365 falcon kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 07 19:36:17.753383 falcon kernel: CR2: 0000000000000000 CR3: 000000092a818000 CR4: 0000000000750ee0
Nov 07 19:36:17.753403 falcon kernel: PKRU: 55555554
