Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Segmentation fault while trying to destroy a pool #505

Closed
uclageek opened this issue Dec 24, 2011 · 1 comment
Closed

Segmentation fault while trying to destroy a pool #505

uclageek opened this issue Dec 24, 2011 · 1 comment
Milestone

Comments

@uclageek
Copy link

I got this error when I was trying to destroy a pool.

I'm on Ubuntu 10.04 64bit

[76636.757200] ld[23210]: segfault at 403 ip 00007f5d958e035e sp 00007fff5bc0d6f0 error 4 in libc-2.11.1.so[7f5d95897000+17a000]
[76724.551932] ld[29115]: segfault at 403 ip 00007f0142b0f35e sp 00007fff11909190 error 4 in libc-2.11.1.so[7f0142ac6000+17a000]
[76855.355442] ld[7081]: segfault at 403 ip 00007f1beffcb35e sp 00007fffa95ee5f0 error 4 in libc-2.11.1.so[7f1beff82000+17a000]
[76988.576990] ld[17169]: segfault at 403 ip 00007ffcac4ea35e sp 00007fff9ddf7d20 error 4 in libc-2.11.1.so[7ffcac4a1000+17a000]
[77393.842663] ld[26723]: segfault at 403 ip 00007f2ca2cc135e sp 00007fff9b32e050 error 4 in libc-2.11.1.so[7f2ca2c78000+17a000]
[127277.599111] general protection fault: 0000 [#1] SMP
[127277.599121] last sysfs file: /sys/devices/pci0000:00/0000:00:07.0/0000:02:00.0/local_cpus
[127277.599126] CPU 2
[127277.599130] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat binfmt_misc ppdev snd_hda_codec_atihdmi fbcon tileblit font bitblit softcursor vga16fb vgastate snd_hda_codec_via snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi radeon snd_seq_midi_event ttm drm_kms_helper snd_seq snd_timer snd_seq_device dell_wmi psmouse dcdbas snd drm i2c_algo_bit edac_core edac_mce_amd serio_raw soundcore snd_page_alloc i2c_piix4 shpchp lp parport zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate usbhid hid usb_storage ahci r8169 mii pata_atiixp
[127277.599209] Pid: 28637, comm: zpool Tainted: P 2.6.32-37-server #81-Ubuntu Inspiron 546S
[127277.599215] RIP: 0010:[] [] __taskq_destroy+0xa9/0xf0 [spl]
[127277.599242] RSP: 0018:ffff880131855d78 EFLAGS: 00010006
[127277.599246] RAX: ffff8801f6faab38 RBX: ffff88020500de00 RCX: 9300e3aa01ffcf82
[127277.599251] RDX: 4880231977c0cdf3 RSI: 0000000000000038 RDI: ffff8801f6faab30
[127277.599255] RBP: ffff880131855d98 R08: ffff880131854000 R09: 0000000000000000
[127277.599259] R10: dead000000200200 R11: dead000000100100 R12: ffff88020500de68
[127277.599264] R13: ffff8801eb4b1700 R14: 0000000000000000 R15: 0000000000000000
[127277.599269] FS: 00007fa919a53b40(0000) GS:ffff880008d00000(0000) knlGS:0000000000000000
[127277.599274] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[127277.599278] CR2: 00007fa9174ab000 CR3: 000000016e350000 CR4: 00000000000006e0
[127277.599282] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[127277.599287] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[127277.599292] Process zpool (pid: 28637, threadinfo ffff880131854000, task ffff88020375c500)
[127277.599296] Stack:
[127277.599299] ffff880131855da8 ffff8801f51f2000 ffff8801f51f2160 0000000000000000
[127277.599306] <0> ffff880131855dc8 ffffffffa0178230 0000000000000000 0000000000000002
[127277.599315] <0> ffff8801f51f2000 0000000000000000 ffff880131855e28 ffffffffa017f7dd
[127277.599325] Call Trace:
[127277.599408] [] spa_deactivate+0xb0/0x1e0 [zfs]
[127277.599483] [] spa_export_common+0xfd/0x310 [zfs]
[127277.599549] [] spa_destroy+0x1a/0x20 [zfs]
[127277.599623] [] zfs_ioc_pool_destroy+0x1e/0x40 [zfs]
[127277.599686] [] zfsdev_ioctl+0xfd/0x1c0 [zfs]
[127277.599701] [] vfs_ioctl+0x22/0xa0
[127277.599711] [] ? alloc_fd+0x10a/0x150
[127277.599718] [] do_vfs_ioctl+0x81/0x410
[127277.599724] [] sys_ioctl+0x81/0xa0
[127277.599733] [] system_call_fastpath+0x16/0x1b
[127277.599737] Code: 08 4c 39 e2 75 d4 48 8b 43 68 4c 8d 63 68 49 39 c4 74 39 0f 1f 80 00 00 00 00 48 8d 78 f8 be 38 00 00 00 48 8b 4f 08 48 8b 57 10 <48> 89 51 08 48 89 0a 48 89 47 08 48 89 47 10 e8 23 ee ff ff 48
[127277.599798] RIP [] __taskq_destroy+0xa9/0xf0 [spl]
[127277.599817] RSP
[127277.599825] ---[ end trace 4131885a1db306b1 ]---

@behlendorf
Copy link
Contributor

This was fixed by the following commits:

openzfs/spl@8f2503e
openzfs/spl@e7e5f78

ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`. As a consequence, we see
the number of parents by 1 each time the recursion returns to
`vdev_raidz_io_start()`.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`. As a consequence, we see
the number of parents by 1 each time the recursion returns to
`vdev_raidz_io_start()`.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 9, 2014
The below excerpt of a backtrace is from a ztest failure when running
ZoL's ztest.

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

The backtrace was an infinite loop of `vdev_queue_io_to_issue()` invoking
`zio_execute()` until it overran the stack. vdev_queue_io_to_issue() will ony
invoke `zio_execute()` on raidz vdevs when aggregation I/Os are generated to
improve aggregation continuity. These I/Os do not trigger any writes. However,
it appears that they can be generated in such a way that they recurse
infinitely upon return to `vdev_queue_io_to_issue()`.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 10, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 11, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
ryao added a commit to ryao/zfs that referenced this issue Oct 11, 2014
The below excerpt of a backtrace is from a ztest failure when running ZoL's
ztest:

/openzfs#453 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#454 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350003de0) at ../../module/zfs/vdev_queue.c:747
/openzfs#455 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350003de0) at ../../module/zfs/zio.c:2659
/openzfs#456 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1399
/openzfs#457 zio_nowait (zio=0x7f0350003de0) at ../../module/zfs/zio.c:1456
/openzfs#458 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350003a10) at ../../module/zfs/vdev_mirror.c:374
/openzfs#459 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1399
/openzfs#460 zio_nowait (zio=0x7f0350003a10) at ../../module/zfs/zio.c:1456
/openzfs#461 0x00007f03c806464c in vdev_raidz_io_start (zio=0x7f0350003380) at ../../module/zfs/vdev_raidz.c:1607
/openzfs#462 0x00007f03c807f243 in __zio_execute (zio=0x7f0350003380) at ../../module/zfs/zio.c:1399
/openzfs#463 zio_nowait (zio=0x7f0350003380) at ../../module/zfs/zio.c:1456
/openzfs#464 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002fb0) at ../../module/zfs/vdev_mirror.c:374
/openzfs#465 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1399
/openzfs#466 zio_nowait (zio=0x7f0350002fb0) at ../../module/zfs/zio.c:1456
/openzfs#467 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033957ebf0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#468 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:2707
/openzfs#469 0x00007f03c808285b in __zio_execute (zio=0x7f033957ebf0) at ../../module/zfs/zio.c:1399
/openzfs#470 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f0390001330, pio=0x7f033957ebf0) at ../../module/zfs/zio.c:547
/openzfs#471 zio_done (zio=0x7f0390001330) at ../../module/zfs/zio.c:3278
/openzfs#472 0x00007f03c808285b in __zio_execute (zio=0x7f0390001330) at ../../module/zfs/zio.c:1399
/openzfs#473 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4013a00, pio=0x7f0390001330) at ../../module/zfs/zio.c:547
/openzfs#474 zio_done (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:3278
/openzfs#475 0x00007f03c808285b in __zio_execute (zio=0x7f03b4013a00) at ../../module/zfs/zio.c:1399
/openzfs#476 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014210, pio=0x7f03b4013a00) at ../../module/zfs/zio.c:547
/openzfs#477 zio_done (zio=0x7f03b4014210) at ../../module/zfs/zio.c:3278
/openzfs#478 0x00007f03c808285b in __zio_execute (zio=0x7f03b4014210) at ../../module/zfs/zio.c:1399
/openzfs#479 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03b4014620, pio=0x7f03b4014210) at ../../module/zfs/zio.c:547
/openzfs#480 zio_done (zio=0x7f03b4014620) at ../../module/zfs/zio.c:3278
/openzfs#481 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03b4014620) at ../../module/zfs/zio.c:1399
/openzfs#482 zio_execute (zio=zio@entry=0x7f03b4014620) at ../../module/zfs/zio.c:1337
/openzfs#483 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#484 0x00007f03c806106e in vdev_queue_io (zio=zio@entry=0x7f0350002be0) at ../../module/zfs/vdev_queue.c:747
/openzfs#485 0x00007f03c80818c1 in zio_vdev_io_start (zio=0x7f0350002be0) at ../../module/zfs/zio.c:2659
/openzfs#486 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1399
/openzfs#487 zio_nowait (zio=0x7f0350002be0) at ../../module/zfs/zio.c:1456
/openzfs#488 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350002810) at ../../module/zfs/vdev_mirror.c:374
/openzfs#489 0x00007f03c807f243 in __zio_execute (zio=0x7f0350002810) at ../../module/zfs/zio.c:1399
/openzfs#490 zio_nowait (zio=0x7f0350002810) at ../../module/zfs/zio.c:1456
/openzfs#491 0x00007f03c8064593 in vdev_raidz_io_start (zio=0x7f0350001270) at ../../module/zfs/vdev_raidz.c:1591
/openzfs#492 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001270) at ../../module/zfs/zio.c:1399
/openzfs#493 zio_nowait (zio=0x7f0350001270) at ../../module/zfs/zio.c:1456
/openzfs#494 0x00007f03c805f71b in vdev_mirror_io_start (zio=0x7f0350001e60) at ../../module/zfs/vdev_mirror.c:374
/openzfs#495 0x00007f03c807f243 in __zio_execute (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1399
/openzfs#496 zio_nowait (zio=0x7f0350001e60) at ../../module/zfs/zio.c:1456
/openzfs#497 0x00007f03c805ed43 in vdev_mirror_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/vdev_mirror.c:499
/openzfs#498 0x00007f03c807a0c0 in zio_vdev_io_done (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:2707
/openzfs#499 0x00007f03c808285b in __zio_execute (zio=0x7f033a0c39c0) at ../../module/zfs/zio.c:1399
/openzfs#500 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f03a8003c00, pio=0x7f033a0c39c0) at ../../module/zfs/zio.c:547
/openzfs#501 zio_done (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:3278
/openzfs#502 0x00007f03c808285b in __zio_execute (zio=0x7f03a8003c00) at ../../module/zfs/zio.c:1399
/openzfs#503 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800c400, pio=0x7f03a8003c00) at ../../module/zfs/zio.c:547
/openzfs#504 zio_done (zio=0x7f038800c400) at ../../module/zfs/zio.c:3278
/openzfs#505 0x00007f03c808285b in __zio_execute (zio=0x7f038800c400) at ../../module/zfs/zio.c:1399
/openzfs#506 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800da00, pio=0x7f038800c400) at ../../module/zfs/zio.c:547
/openzfs#507 zio_done (zio=0x7f038800da00) at ../../module/zfs/zio.c:3278
/openzfs#508 0x00007f03c808285b in __zio_execute (zio=0x7f038800da00) at ../../module/zfs/zio.c:1399
/openzfs#509 zio_notify_parent (wait=ZIO_WAIT_DONE, zio=0x7f038800fd80, pio=0x7f038800da00) at ../../module/zfs/zio.c:547
/openzfs#510 zio_done (zio=0x7f038800fd80) at ../../module/zfs/zio.c:3278
/openzfs#511 0x00007f03c807a6d3 in __zio_execute (zio=0x7f038800fd80) at ../../module/zfs/zio.c:1399
/openzfs#512 zio_execute (zio=zio@entry=0x7f038800fd80) at ../../module/zfs/zio.c:1337
/openzfs#513 0x00007f03c8060b35 in vdev_queue_io_to_issue (vq=vq@entry=0x99f8a8) at ../../module/zfs/vdev_queue.c:706
/openzfs#514 0x00007f03c806119d in vdev_queue_io_done (zio=zio@entry=0x7f03a0010950) at ../../module/zfs/vdev_queue.c:775
/openzfs#515 0x00007f03c807a0e8 in zio_vdev_io_done (zio=0x7f03a0010950) at ../../module/zfs/zio.c:2686
/openzfs#516 0x00007f03c807a6d3 in __zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1399
/openzfs#517 zio_execute (zio=0x7f03a0010950) at ../../module/zfs/zio.c:1337
/openzfs#518 0x00007f03c7fcd0c4 in taskq_thread (arg=0x966d50) at ../../lib/libzpool/taskq.c:215
/openzfs#519 0x00007f03c7fc7937 in zk_thread_helper (arg=0x967e90) at ../../lib/libzpool/kernel.c:135
/openzfs#520 0x00007f03c78890a3 in start_thread (arg=0x7f03c2703700) at pthread_create.c:309
/openzfs#521 0x00007f03c75c50fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111

This occurred when ztest was simulating a scrub under heavy I/O load. Under
those circumstances, it was possible for a mix of noop I/Os for aggregation
continuity and the I/O elevator to generate arbitrarily deep recursion.

This patch modifies ZFS to propapage a recursion counter inside the zio_t
objects such that IOs will be redispatched upon reaching a given recursion
depth.  We can detect long call chains and dispatch to another ZIO taskq. We
cut in-line when we do this to minimize the potential for taskq exhaustion that
can prevent a zio from notifying its parent.

Signed-off-by: Richard Yao <[email protected]>
behlendorf pushed a commit to behlendorf/zfs that referenced this issue May 21, 2018
To prevent taskq_member holding tq_lock and doing linear search, thus causing
contention. We store the taskq pointer to which the thread belongs in tsd.
This way taskq_member will not need to touch tq_lock, and tsd has per slot
spinlock. So the contention should be reduced greatly.

Signed-off-by: Chunwei Chen <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#500
Closes openzfs#504
Closes openzfs#505
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants