Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

crash during zfs testsuite #279

Closed
ggruber opened this issue Jun 16, 2011 · 34 comments
Closed

crash during zfs testsuite #279

ggruber opened this issue Jun 16, 2011 · 34 comments
Milestone

Comments

@ggruber
Copy link

ggruber commented Jun 16, 2011

testing 0.6.0-rc4 on debian squeeze, distro kernel (amd64)

I got several kernel crashes during different tests of the zfs testsuite, not always on the same place.
details will follow

@ggruber
Copy link
Author

ggruber commented Jun 16, 2011

during

/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive

I got

[  223.852043] BUG: unable to handle kernel NULL pointer dereference at (null)
[  223.856004] IP: [] get_next_timer_interrupt+0x145/0x223
[  223.856004] PGD b08e4067 PUD b08f4067 PMD 0
[  223.856004] Thread overran stack, or stack corrupted
[  223.856004] Oops: 0000 [#1] SMP
[  223.856004] last sysfs file: /sys/devices/virtual/block/zd0/removable
[  223.856004] CPU 0
[  223.856004] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl ext2 ext3 jbd zlib_deflate loo]
[  223.856004] Pid: 0, comm: swapper Tainted: P           2.6.32-5-amd64 #1
[  223.856004] RIP: 0010:[]  [] get_next_timer_interrupt+0x145/0x223
[  223.856004] RSP: 0018:ffffffff8142fe48  EFLAGS: 00010097
[  223.856004] RAX: 0000000000000000 RBX: ffffffff815f97c0 RCX: 0000000000000036
[  223.856004] RDX: 0000000000000000 RSI: 0000000000000036 RDI: 0000000000000000
[  223.856004] RBP: ffffffff815fab40 R08: ffffffff8142fe58 R09: 0000000000ffffb6
[  223.856004] R10: 0000000000000000 R11: 0000000000000001 R12: 00000000ffffb5a3
[  223.856004] R13: ffffffff815fa7e0 R14: 0000000000000040 R15: 0000000000000000
[  223.856004] FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
[  223.856004] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  223.856004] CR2: 0000000000000000 CR3: 00000000b5742000 CR4: 00000000000006f0
[  223.856004] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  223.856004] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  223.856004] Process swapper (pid: 0, threadinfo ffffffff8142e000, task ffffffff814631f0)
[  223.856004] Stack:
[  223.856004]  ffff8800018100c0 0000000000000000 ffffffff815fa7e0 ffffffff815fabe0
[  223.856004] <0> ffffffff815fafe0 ffffffff815fb3e0 ffff8800018101b0 000000000000bd42
[  223.856004] <0> 0000000000000000 0000014b87380000 00000000ffffb5a3 ffffffff81070cd1
[  223.856004] Call Trace:
[  223.856004]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[  223.856004]  [] ? cpu_idle+0x73/0xda
[  223.856004]  [] ? early_idt_handler+0x0/0x71
[  223.856004]  [] ? start_kernel+0x3dc/0x3e8
[  223.856004]  [] ? x86_64_start_kernel+0xf9/0x106
[  223.856004] Code: 4a 8b 14 28 49 8d 6c 05 00 eb 20 f6 42 28 01 4c 89 d0 75 11 48 8b 42 10 41 bb 01 00 00 00 4c 39 d0 49
[  223.856004] RIP  [] get_next_timer_interrupt+0x145/0x223
[  223.856004]  RSP 
[  223.856004] CR2: 0000000000000000
[  223.856004] ---[ end trace 2f379f335549008e ]---
[  223.856004] Kernel panic - not syncing: Attempted to kill the idle task!
[  223.856004] Pid: 0, comm: swapper Tainted: P      D    2.6.32-5-amd64 #1
[  223.856004] Call Trace:
[  223.856004]  [] ? panic+0x86/0x143
[  223.856004]  [] ? printk+0x4e/0x5a
[  223.856004]  [] ? printk+0x4e/0x5a
[  223.856004]  [] ? do_exit+0x72/0x6c6
[  223.856004]  [] ? release_console_sem+0x17e/0x1af
[  223.856004]  [] ? oops_end+0xaf/0xb4
[  223.856004]  [] ? no_context+0x1e9/0x1f8
[  223.856004]  [] ? __bad_area_nosemaphore+0x1a6/0x1ca
[  223.856004]  [] ? page_fault+0x25/0x30
[  223.856004]  [] ? get_next_timer_interrupt+0x145/0x223
[  223.856004]  [] ? get_next_timer_interrupt+0x4f/0x223
[  223.856004]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[  223.856004]  [] ? cpu_idle+0x73/0xda
[  223.856004]  [] ? early_idt_handler+0x0/0x71
[  223.856004]  [] ? start_kernel+0x3dc/0x3e8
[  223.856004]  [] ? x86_64_start_kernel+0xf9/0x106

@ggruber ggruber closed this as completed Jun 16, 2011
@ggruber ggruber reopened this Jun 16, 2011
@ggruber
Copy link
Author

ggruber commented Jun 16, 2011

after one successfull pass of the test suite (at least this first part) the next try gave the next crash:

/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive

and now:

[  343.270039] kjournald starting.  Commit interval 5 seconds
[  343.316909] EXT3 FS on zd0p1, internal journal
[  343.330386] EXT3-fs: mounted filesystem with ordered data mode.
[  348.096041] ZFS: Unloaded module v0.6.0
[  348.155962] SPLAT: Unloaded module v0.6.0
[  348.177687] SPL: Unloaded module v0.6.0
[  348.335376] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[  348.371726] SPLAT: Loaded module v0.6.0
[  348.556851] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[  349.866077]  zd0: unknown partition table
[  349.895807]  zd0: unknown partition table
[  350.324572]  zd0: p1 p2
[  358.575597]  zd16: p1 p2
[  365.571241] ZFS: Unloaded module v0.6.0
[  365.630854] SPLAT: Unloaded module v0.6.0
[  365.652564] SPL: Unloaded module v0.6.0
[  365.810126] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[  365.846533] SPLAT: Loaded module v0.6.0
[  366.030592] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[  367.555046]  zd0: unknown partition table
[  367.586026]  zd0: unknown partition table
[  368.100575]  zd0: p1
[  380.441026]  zd16: p1
[  382.516332]  zd32: p1
[  382.885877] EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
[  396.050114] ZFS: Unloaded module v0.6.0
[  396.112848] SPLAT: Unloaded module v0.6.0
[  396.134769] SPL: Unloaded module v0.6.0
[  396.293051] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[  396.329436] SPLAT: Loaded module v0.6.0
[  396.511287] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[  398.893624]  zd0: unknown partition table
[  398.923536]  zd0: unknown partition table
[  399.444574]  zd0: p1
[  400.316065] BUG: unable to handle kernel NULL pointer dereference at (null)
[  400.320032] IP: [] get_next_timer_interrupt+0x145/0x223
[  400.353245] PGD 0
[  400.353245] Oops: 0000 [#1] SMP
[  400.367796] last sysfs file: /sys/devices/virtual/block/zd0/removable
[  400.367796] CPU 1
[  400.367796] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl ext2 ext3 jbd zlib_deflate loo]
[  400.367796] Pid: 0, comm: swapper Tainted: P           2.6.32-5-amd64 #1
[  400.367796] RIP: 0010:[]  [] get_next_timer_interrupt+0x145/0x223
[  400.367796] RSP: 0018:ffff880001883ed8  EFLAGS: 00010093
[  400.367796] RAX: 0000000000000000 RBX: ffff8800bf67c000 RCX: 0000000000000022
[  400.367796] RDX: 0000000000000000 RSI: 0000000000000022 RDI: 0000000000000000
[  400.367796] RBP: ffff8800bf67d240 R08: ffff880001883ee8 R09: 0000000001000062
[  400.367796] R10: 0000000000000000 R11: 0000000000000001 R12: 00000001000061f7
[  400.367796] R13: ffff8800bf67d020 R14: 0000000000000040 R15: 0000000000000000
[  400.367796] FS:  0000000000000000(0000) GS:ffff880001880000(0000) knlGS:0000000000000000
[  400.367796] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[  400.367796] CR2: 0000000000000000 CR3: 00000000ba5c1000 CR4: 00000000000006e0
[  400.367796] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  400.367796] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  400.367796] Process swapper (pid: 0, threadinfo ffff8800bf684000, task ffff8800bf663880)
[  400.367796] Stack:
[  400.367796]  ffffffff8109845e 0000000000000100 ffff8800bf67d020 ffff8800bf67d420
[  400.367796] <0> ffff8800bf67d820 ffff8800bf67dc20 ffff8800018901b0 000000000001754e
[  400.367796] <0> 0000000000000001 0000014b7f970000 00000001000061f7 ffffffff81070cd1
[  400.367796] Call Trace:
[  400.367796]  
[  400.367796]  [] ? rcu_process_callbacks+0x45/0x4a
[  400.367796]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[  400.367796]  [] ? smp_apic_timer_interrupt+0x87/0x95
[  400.367796]  [] ? apic_timer_interrupt+0x13/0x20
[  400.367796]  
[  400.367796]  [] ? mwait_idle+0x72/0x7d
[  400.367796]  [] ? mwait_idle+0x22/0x7d
[  400.367796]  [] ? cpu_idle+0xa2/0xda
[  400.367796] Code: 4a 8b 14 28 49 8d 6c 05 00 eb 20 f6 42 28 01 4c 89 d0 75 11 48 8b 42 10 41 bb 01 00 00 00 4c 39 d0 49
[  400.367796] RIP  [] get_next_timer_interrupt+0x145/0x223
[  400.367796]  RSP 
[  400.367796] CR2: 0000000000000000
[  400.367796] ---[ end trace 72ca58eb598af0d9 ]---
[  400.367796] Kernel panic - not syncing: Attempted to kill the idle task!
[  400.367796] Pid: 0, comm: swapper Tainted: P      D    2.6.32-5-amd64 #1
[  400.367796] Call Trace:
[  400.367796]    [] ? panic+0x86/0x143
[  400.367796]  [] ? printk+0x4e/0x5a
[  400.367796]  [] ? printk+0x4e/0x5a
[  400.367796]  [] ? do_exit+0x72/0x6c6
[  400.367796]  [] ? release_console_sem+0x17e/0x1af
[  400.367796]  [] ? oops_end+0xaf/0xb4
[  400.367796]  [] ? no_context+0x1e9/0x1f8
[  400.367796]  [] ? __bad_area_nosemaphore+0x1a6/0x1ca
[  400.367796]  [] ? page_fault+0x25/0x30
[  400.367796]  [] ? get_next_timer_interrupt+0x145/0x223
[  400.367796]  [] ? get_next_timer_interrupt+0x4f/0x223
[  400.367796]  [] ? rcu_process_callbacks+0x45/0x4a
[  400.367796]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[  400.367796]  [] ? smp_apic_timer_interrupt+0x87/0x95
[  400.367796]  [] ? apic_timer_interrupt+0x13/0x20
[  400.367796]    [] ? mwait_idle+0x72/0x7d
[  400.367796]  [] ? mwait_idle+0x22/0x7d
[  400.367796]  [] ? cpu_idle+0xa2/0xda

@ggruber
Copy link
Author

ggruber commented Jun 16, 2011

the next run gives a success, the following a failure?

/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive                   Pass
9    zpool events                       Pass
10   zpool add/remove vdev              Pass
root@jena:/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive                   Pass
9    zpool events                       Pass
10   zpool add/remove vdev              Fail (4)
root@jena:/tmp#

@kohlschuetter
Copy link
Contributor

Have you also tried the latest version from behlendorf/zfs master? There seem to be a few changes post rc4 that might fix that. At least it works fine here on my system (EL6, though).

@ggruber
Copy link
Author

ggruber commented Jun 16, 2011

just fetched the git stuff (spl+zfs), compiled it. spl test again fine.
zfs again with problems:

root@jena:/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive

and again kernel crash:

[23869.705891] EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
[23883.051452] ZFS: Unloaded module v0.6.0
[23883.112126] SPLAT: Unloaded module v0.6.0
[23883.133912] SPL: Unloaded module v0.6.0
[23883.293702] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[23883.330280] SPLAT: Loaded module v0.6.0
[23883.515168] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[23886.016070]  zd0: unknown partition table
[23886.046164]  zd0: unknown partition table
[23886.548573]  zd0: p1
[23887.804545] BUG: unable to handle kernel NULL pointer dereference at 0000000000000020
[23887.810231] IP: [] get_next_timer_interrupt+0x145/0x223
[23887.810231] PGD 9ee8c067 PUD 8ba78067 PMD 0
[23887.810231] Oops: 0000 [#1] SMP
[23887.810231] last sysfs file: /sys/devices/virtual/block/zd0/removable
[23887.810231] CPU 1
[23887.810231] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl scsi_debug zlib_deflate crc32c]
[23887.810231] Pid: 0, comm: swapper Tainted: P           2.6.32-5-amd64 #1
[23887.810231] RIP: 0010:[]  [] get_next_timer_interrupt+0x145/0x223
[23887.810231] RSP: 0018:ffff8800bf685e68  EFLAGS: 00010093
[23887.810231] RAX: 000000000000e400 RBX: ffff8800bf67c000 RCX: 000000000000003b
[23887.810231] RDX: 0000000000000020 RSI: 000000000000003b RDI: 0000000000000020
[23887.810231] RBP: ffff8800bf67d3d0 R08: ffff8800bf685e78 R09: 00000000010059fb
[23887.810231] R10: 000000000000e400 R11: 0000000000000001 R12: 000000010059faf7
[23887.810231] R13: ffff8800bf67d020 R14: 0000000000000040 R15: 0000000000000000
[23887.810231] FS:  0000000000000000(0000) GS:ffff880001880000(0000) knlGS:0000000000000000
[23887.810231] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[23887.810231] CR2: 0000000000000020 CR3: 0000000065e3c000 CR4: 00000000000006e0
[23887.810231] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[23887.810231] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[23887.810231] Process swapper (pid: 0, threadinfo ffff8800bf684000, task ffff8800bf663880)
[23887.810231] Stack:
[23887.810231]  0000000000000001 0000000000000001 ffff8800bf67d020 ffff8800bf67d420
[23887.810231] <0> ffff8800bf67d820 ffff8800bf67dc20 ffff8800018901b0 00000000003209d2
[23887.810231] <0> 0000000000000001 0000014b7a430000 000000010059faf7 ffffffff81070cd1
[23887.810231] Call Trace:
[23887.810231]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[23887.810231]  [] ? cpu_idle+0x73/0xda
[23887.810231] Code: 4a 8b 14 28 49 8d 6c 05 00 eb 20 f6 42 28 01 4c 89 d0 75 11 48 8b 42 10 41 bb 01 00 00 00 4c 39 d0 49
[23887.810231] RIP  [] get_next_timer_interrupt+0x145/0x223
[23887.810231]  RSP 
[23887.810231] CR2: 0000000000000020

@kohlschuetter
Copy link
Contributor

What do
cat /sys/devices/system/clocksource/clocksource0/current_clocksource
cat /sys/devices/system/clocksource/clocksource0/available_clocksource
say?

@ggruber
Copy link
Author

ggruber commented Jun 16, 2011

root@jena:~# cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
root@jena:~# cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
root@jena:~#

@behlendorf
Copy link
Contributor

I occasionally see these failures as well for zconfig.sh. Sometimes they are related to overrunning the kernel stack, usually you'll see a message like this in the console logs when this the case.

Thread overran stack, or stack corrupted

I believe there is also a rare related to unloading the kernel modules which is occasionally hit. We're working to run both of these issue down because we need the tests to run reliably. Luckily however, both of these issue are very rarely encounter during real filesystem usage. The test suite just happens to be good at hitting them.

@kohlschuetter
Copy link
Contributor

May this be a somewhat related bug? https://bugzilla.kernel.org/show_bug.cgi?id=25832

@ggruber How are your devices connected? (SATA, USB, etc.?) What CPU/disks do you have?

As it seems that you can reliably reproduce the failure, you should try to disable (one at a time) hyperthreading ("noht" kernel boot option), smp completely ("maxcpus=1 nosmp"), or dynamic ticks ("nohz=off") to see any change.
Maybe changing the clocksource could also help, I found some discussions about that in a somewhat unrelated discussion (echo "hpet" > /sys/.../current_clocksource etc.)

@ggruber
Copy link
Author

ggruber commented Jun 18, 2011

I cannot see anything more after those crashes because of the system completely stands. I have to push the reset button to go on. The crashdumps I caught with a serial console.
This is an elder Intel server board. The disks I had liked the bring into the network as a bunch of iSCSI Volumes are connected to a PCI-X LSI 8 port SATA RAID controller and a onboard 4 port LSI RAID controller. But those are not in the game so far. The system is installed on a classical 40G PATA disk, with ext4 on top of LVM, no partitions elsewhere so far.
The proc is a elder P4 with HT and 64bit extensions.
So I can turn of HT easily.
I had started with a first test on a VMware VM on a Phenom host, and moved after the first similar crash to this physical system.
Because I cannot reset the system remotely I will continue the tests on Monday.
Maybe I will have to move to another hardware 'cause the boss urges me to bring the storage in use, and this zfs test was a suggestion of mine (after some bad experience with unnoticed data corruption during the last year in a non company related context).

@kohlschuetter
Copy link
Contributor

If you can reproduce the crash on the VM, you could also try the proposed changes with that setting and see if it makes a difference.

Right now, I am experimenting with an AMD Fusion E350 (Zacate) board (from Zotac), 8 GB RAM and 5x2TB WD20EARS disks, running RHEL6. I am very satisfied with the initial results (fanless, 32W standby, 45W under load, 60W max), so you might give that configuration a try.

@ggruber
Copy link
Author

ggruber commented Jun 20, 2011

hyperthreading disabled:

root@jena:~# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive                   Pass
9    zpool events
[ 7508.065245] SPLAT: Loaded module v0.6.0
[ 7508.242278] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[ 7510.811259]  zd0: unknown partition table
[ 7510.838777]  zd0: unknown partition table
[ 7511.368047]  zd0: p1
[ 7526.404160]  zd16: p1
[ 7530.134678]  zd32: p1
[ 7530.155745]  zd48: p1
[ 7530.562471] EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
[ 7553.255996] ZFS: Unloaded module v0.6.0
[ 7553.313277] SPLAT: Unloaded module v0.6.0
[ 7553.334385] SPL: Unloaded module v0.6.0
[ 7553.486762] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[ 7553.522403] SPLAT: Loaded module v0.6.0
[ 7553.698481] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[ 7555.358917]  zd0: unknown partition table
[ 7558.020027] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 7558.024002] IP: [] get_next_timer_interrupt+0x145/0x223
[ 7558.024002] PGD 529b4067 PUD 119ca067 PMD 0
[ 7558.024002] Thread overran stack, or stack corrupted
[ 7558.024002] Oops: 0000 [#1] SMP
[ 7558.024002] last sysfs file: /sys/devices/virtual/bdi/230:0/uevent
[ 7558.024002] CPU 0
[ 7558.024002] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl ext2 ext]
[ 7558.024002] Pid: 0, comm: swapper Tainted: P           2.6.32-5-amd64 #1
[ 7558.024002] RIP: 0010:[]  [] get_next_timer_interrupt+0x145/0x3
[ 7558.024002] RSP: 0018:ffff880001803ed8  EFLAGS: 00010093
[ 7558.024002] RAX: 0000000000000000 RBX: ffffffff815f97c0 RCX: 0000000000000030
[ 7558.024002] RDX: 0000000000000000 RSI: 0000000000000030 RDI: 0000000000000000
[ 7558.024002] RBP: ffffffff815faae0 R08: ffff880001803ee8 R09: 0000000001001bb0
[ 7558.024002] R10: 0000000000000000 R11: 0000000000000001 R12: 00000001001bafe9
[ 7558.024002] R13: ffffffff815fa7e0 R14: 0000000000000040 R15: 0000000000000000
[ 7558.024002] FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
[ 7558.024002] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 7558.024002] CR2: 0000000000000000 CR3: 000000009c51f000 CR4: 00000000000006f0
[ 7558.024002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 7558.024002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 7558.024002] Process swapper (pid: 0, threadinfo ffffffff8142e000, task ffffffff814631f0)
[ 7558.024002] Stack:
[ 7558.024002]  ffffffff8109845e 0000000000000100 ffffffff815fa7e0 ffffffff815fabe0
[ 7558.024002] <0> ffffffff815fafe0 ffffffff815fb3e0 ffff8800018101b0 00000000000a6386
[ 7558.024002] <0> 0000000000000000 0000014b78720000 00000001001bafe9 ffffffff81070cd1
[ 7558.024002] Call Trace:
[ 7558.024002]  
[ 7558.024002]  [] ? rcu_process_callbacks+0x45/0x4a
[ 7558.024002]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[ 7558.024002]  [] ? smp_apic_timer_interrupt+0x87/0x95
[ 7558.024002]  [] ? apic_timer_interrupt+0x13/0x20
[ 7558.024002]  
[ 7558.024002]  [] ? mwait_idle+0x72/0x7d
[ 7558.024002]  [] ? mwait_idle+0x22/0x7d
[ 7558.024002]  [] ? cpu_idle+0xa2/0xda
[ 7558.024002]  [] ? early_idt_handler+0x0/0x71
[ 7558.024002]  [] ? start_kernel+0x3dc/0x3e8
[ 7558.024002]  [] ? x86_64_start_kernel+0xf9/0x106
[ 7558.024002] Code: 4a 8b 14 28 49 8d 6c 05 00 eb 20 f6 42 28 01 4c 89 d0 75 11 48 8b 42 10 41 bb 01
[ 7558.024002] RIP  [] get_next_timer_interrupt+0x145/0x223
[ 7558.024002]  RSP 
[ 7558.024002] CR2: 0000000000000000
[ 7558.024002] ---[ end trace 538adae46eb83794 ]---
[ 7558.024002] Kernel panic - not syncing: Attempted to kill the idle task!
[ 7558.024002] Pid: 0, comm: swapper Tainted: P      D    2.6.32-5-amd64 #1
[ 7558.024002] Call Trace:
[ 7558.024002]    [] ? panic+0x86/0x143
[ 7558.024002]  [] ? printk+0x4e/0x5a
[ 7558.024002]  [] ? printk+0x4e/0x5a
[ 7558.024002]  [] ? do_exit+0x72/0x6c6
[ 7558.024002]  [] ? release_console_sem+0x17e/0x1af
[ 7558.024002]  [] ? oops_end+0xaf/0xb4
[ 7558.024002]  [] ? no_context+0x1e9/0x1f8
[ 7558.024002]  [] ? lock_timer_base+0x26/0x4b
[ 7558.024002]  [] ? __bad_area_nosemaphore+0x1a6/0x1ca
[ 7558.024002]  [] ? ipmi_timeout+0x0/0x424 [ipmi_msghandler]
[ 7558.024002]  [] ? ipmi_timeout+0x415/0x424 [ipmi_msghandler]
[ 7558.024002]  [] ? e1000_clean+0x45e/0x49f [e1000]
[ 7558.024002]  [] ? page_fault+0x25/0x30
[ 7558.024002]  [] ? get_next_timer_interrupt+0x145/0x223
[ 7558.024002]  [] ? get_next_timer_interrupt+0x4f/0x223
[ 7558.024002]  [] ? rcu_process_callbacks+0x45/0x4a
[ 7558.024002]  [] ? tick_nohz_stop_sched_tick+0x327/0x335
[ 7558.024002]  [] ? smp_apic_timer_interrupt+0x87/0x95
[ 7558.024002]  [] ? apic_timer_interrupt+0x13/0x20
[ 7558.024002]    [] ? mwait_idle+0x72/0x7d
[ 7558.024002]  [] ? mwait_idle+0x22/0x7d
[ 7558.024002]  [] ? cpu_idle+0xa2/0xda
[ 7558.024002]  [] ? early_idt_handler+0x0/0x71
[ 7558.024002]  [] ? start_kernel+0x3dc/0x3e8
[ 7558.024002]  [] ? x86_64_start_kernel+0xf9/0x106

@ggruber
Copy link
Author

ggruber commented Jun 20, 2011

with maxcpus=1 nosmp, hyperthreading in BIOS still off

[  209.478480] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[  209.514164] SPLAT: Loaded module v0.6.0
[  210.070731]  zd0: p1 p2
[  210.081289]  zd16: p1 p2
[  210.092158]  zd32: p1 p2
[  210.101823] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[  212.960044] ------------[ cut here ]------------
[  212.964012] kernel BUG at /build/buildd-linux-2.6_2.6.32-34squeeze1-amd64-ZyRy0W/linux-2.6-2.6.32/!
[  212.964012] invalid opcode: 0000 [#1] SMP
[  212.964012] last sysfs file: /sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0/0000:03:02.0/0000:0t
[  212.964012] CPU 0
[  212.964012] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl zlib_def]
[  212.964012] Pid: 11012, comm: path_id Tainted: P           2.6.32-5-amd64 #1
[  212.964012] RIP: 0010:[]  [] cascade+0x53/0x77
[  212.964012] RSP: 0000:ffff880001803e10  EFLAGS: 00010083
[  212.964012] RAX: ffffffffa07472e0 RBX: ffffffff81063743 RCX: ffffffff815faa90
[  212.964012] RDX: 00000000000002b0 RSI: ffff88009e4da038 RDI: ffffffff815f97c0
[  212.964012] RBP: 000000000000002b R08: 0000000000000071 R09: ffffffff814654c0
[  212.964012] R10: 0000000000000001 R11: ffffffff8102b97e R12: ffffffff815f97c0
[  212.964012] R13: ffff880001803e10 R14: 0000000000000101 R15: ffff8800be21ffd8
[  212.964012] FS:  00007f05561f1700(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
[  212.964012] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  212.964012] CR2: 00007f0555c8a03c CR3: 00000000bab5a000 CR4: 00000000000006f0
[  212.964012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  212.964012] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  212.964012] Process path_id (pid: 11012, threadinfo ffff8800be21e000, task ffff88008dde9530)
[  212.964012] Stack:
[  212.964012]  ffff88009e4da038 ffff8800be3e26d0 ffff8800018101b0 0000000000000000
[  212.964012] <0> ffffffff815f97c0 ffffffff81451088 0000000000000001 ffffffff8105a4da
[  212.964012] <0> 0000000000000000 ffff880001803e90 ffff8800be21ffd8 ffffffff81468700
[  212.964012] Call Trace:
[  212.964012]  
[  212.964012]  [] ? run_timer_softirq+0xb0/0x268
[  212.964012]  [] ? hrtimer_interrupt+0xe4/0x18d
[  212.964012]  [] ? __do_softirq+0xdd/0x1a6
[  212.964012]  [] ? call_softirq+0x1c/0x30
[  212.964012]  [] ? do_softirq+0x3f/0x7c
[  212.964012]  [] ? irq_exit+0x36/0x76
[  212.964012]  [] ? do_IRQ+0xa0/0xb6
[  212.964012]  [] ? ret_from_intr+0x0/0x11
[  212.964012]  
[  212.964012] Code: 08 48 8b 41 08 48 89 44 24 08 48 89 20 48 89 0c 32 48 89 49 08 48 8b 34 24 48 8b
[  212.964012] RIP  [] cascade+0x53/0x77
[  212.964012]  RSP 
[  212.964012] ---[ end trace e0eaa0ff2c198091 ]---
[  212.964012] Kernel panic - not syncing: Fatal exception in interrupt
[  212.964012] Pid: 11012, comm: path_id Tainted: P      D    2.6.32-5-amd64 #1
[  212.964012] Call Trace:
[  212.964012]    [] ? panic+0x86/0x143
[  212.964012]  [] ? show_registers+0x1e9/0x1f8
[  212.964012]  [] ? up+0xe/0x37
[  212.964012]  [] ? release_console_sem+0x17e/0x1af
[  212.964012]  [] ? oops_end+0xa7/0xb4
[  212.964012]  [] ? do_invalid_op+0x8b/0x95
[  212.964012]  [] ? cascade+0x53/0x77
[  212.964012]  [] ? autoremove_wake_function+0x9/0x2e
[  212.964012]  [] ? __wake_up_common+0x44/0x72
[  212.964012]  [] ? mix_pool_bytes_extract+0x57/0x14a
[  212.964012]  [] ? invalid_op+0x1b/0x20
[  212.964012]  [] ? param_attr_show+0x0/0x3e
[  212.964012]  [] ? hpet_legacy_next_event+0x0/0x7
[  212.964012]  [] ? cascade+0x53/0x77
[  212.964012]  [] ? run_timer_softirq+0xb0/0x268
[  212.964012]  [] ? hrtimer_interrupt+0xe4/0x18d
[  212.964012]  [] ? __do_softirq+0xdd/0x1a6
[  212.964012]  [] ? call_softirq+0x1c/0x30
[  212.964012]  [] ? do_softirq+0x3f/0x7c
[  212.964012]  [] ? irq_exit+0x36/0x76
[  212.964012]  [] ? do_IRQ+0xa0/0xb6
[  212.964012]  [] ? ret_from_intr+0x0/0x11
[  212.964012]  

@ggruber
Copy link
Author

ggruber commented Jun 20, 2011

nohz=off

root@jena:/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive                   Pass
9    zpool events
[ 1431.718838] SPL: Loaded module v0.6.0, using hostid 0xc00a622c
[ 1431.754386] SPLAT: Loaded module v0.6.0
[ 1431.930787] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[ 1433.575698]  zd0: unknown partition table
[ 1435.616032] ------------[ cut here ]------------
[ 1435.620002] kernel BUG at /build/buildd-linux-2.6_2.6.32-34squeeze1-amd64-ZyRy0W/linux-2.6-2.6.32/!
[ 1435.620002] invalid opcode: 0000 [#1] SMP
[ 1435.620002] last sysfs file: /sys/devices/virtual/bdi/230:0/uevent
[ 1435.620002] CPU 0
[ 1435.620002] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl ext2 ext]
[ 1435.620002] Pid: 0, comm: swapper Tainted: P           2.6.32-5-amd64 #1
[ 1435.620002] RIP: 0010:[]  [] cascade+0x53/0x77
[ 1435.620002] RSP: 0018:ffff880001803e20  EFLAGS: 00010096
[ 1435.620002] RAX: 480002baabe8007c RBX: 0000017a850fc389 RCX: 00000001000455d6
[ 1435.620002] RDX: ffffffff815fa540 RSI: ffff8800b6a83088 RDI: ffffffff815f97c0
[ 1435.620002] RBP: 0000000000000015 R08: 00000000000000d6 R09: ffffffff814654c0
[ 1435.620002] R10: 0000000000000000 R11: ffffffff813cf21c R12: ffffffff815f97c0
[ 1435.620002] R13: ffff880001803e20 R14: 0000000000000101 R15: ffffffff8142ffd8
[ 1435.620002] FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
[ 1435.620002] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 1435.620002] CR2: 00007f6dc74f9760 CR3: 00000000a04f8000 CR4: 00000000000006f0
[ 1435.620002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 1435.620002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 1435.620002] Process swapper (pid: 0, threadinfo ffffffff8142e000, task ffffffff814631f0)
[ 1435.620002] Stack:
[ 1435.620002]  ffff8800b0ae1e10 ffff8800be2d3f98 ffffffff8101650e 0000000000000000
[ 1435.620002] <0> ffffffff815f97c0 ffffffff81451088 0000000000000001 ffffffff8105a4da
[ 1435.620002] <0> ffffffff81069257 ffff880001803ea0 ffffffff8142ffd8 ffff880001815780
[ 1435.620002] Call Trace:
[ 1435.620002]  
[ 1435.620002]  [] ? native_sched_clock+0x2e/0x66
[ 1435.620002]  [] ? run_timer_softirq+0xb0/0x268
[ 1435.620002]  [] ? sched_clock_local+0x13/0x74
[ 1435.620002]  [] ? __do_softirq+0xdd/0x1a6
[ 1435.620002]  [] ? lapic_next_event+0x18/0x1d
[ 1435.620002]  [] ? call_softirq+0x1c/0x30
[ 1435.620002]  [] ? do_softirq+0x3f/0x7c
[ 1435.620002]  [] ? irq_exit+0x36/0x76
[ 1435.620002]  [] ? smp_apic_timer_interrupt+0x87/0x95
[ 1435.620002]  [] ? apic_timer_interrupt+0x13/0x20
[ 1435.620002]  
[ 1435.620002]  [] ? mwait_idle+0x72/0x7d
[ 1435.620002]  [] ? mwait_idle+0x22/0x7d
[ 1435.620002]  [] ? cpu_idle+0xa2/0xda
[ 1435.620002]  [] ? early_idt_handler+0x0/0x71
[ 1435.620002]  [] ? start_kernel+0x3dc/0x3e8
[ 1435.620002]  [] ? x86_64_start_kernel+0xf9/0x106
[ 1435.620002] Code: 08 48 8b 41 08 48 89 44 24 08 48 89 20 48 89 0c 32 48 89 49 08 48 8b 34 24 48 8b
[ 1435.620002] RIP  [] cascade+0x53/0x77
[ 1435.620002]  RSP 
[ 1435.620002] ---[ end trace b830fe4da3d5990a ]---
[ 1435.620002] Kernel panic - not syncing: Fatal exception in interrupt
[ 1435.620002] Pid: 0, comm: swapper Tainted: P      D    2.6.32-5-amd64 #1
[ 1435.620002] Call Trace:
[ 1435.620002]    [] ? panic+0x86/0x143
[ 1435.620002]  [] ? show_registers+0x1e9/0x1f8
[ 1435.620002]  [] ? up+0xe/0x37
[ 1435.620002]  [] ? release_console_sem+0x17e/0x1af
[ 1435.620002]  [] ? oops_end+0xa7/0xb4
[ 1435.620002]  [] ? do_invalid_op+0x8b/0x95
[ 1435.620002]  [] ? cascade+0x53/0x77
[ 1435.620002]  [] ? e1000_clean_rx_irq+0x3bf/0x3fb [e1000]
[ 1435.620002]  [] ? lock_timer_base+0x26/0x4b
[ 1435.620002]  [] ? invalid_op+0x1b/0x20
[ 1435.620002]  [] ? cascade+0x53/0x77
[ 1435.620002]  [] ? native_sched_clock+0x2e/0x66
[ 1435.620002]  [] ? run_timer_softirq+0xb0/0x268
[ 1435.620002]  [] ? sched_clock_local+0x13/0x74
[ 1435.620002]  [] ? __do_softirq+0xdd/0x1a6
[ 1435.620002]  [] ? lapic_next_event+0x18/0x1d
[ 1435.620002]  [] ? call_softirq+0x1c/0x30
[ 1435.620002]  [] ? do_softirq+0x3f/0x7c
[ 1435.620002]  [] ? irq_exit+0x36/0x76
[ 1435.620002]  [] ? smp_apic_timer_interrupt+0x87/0x95
[ 1435.620002]  [] ? apic_timer_interrupt+0x13/0x20
[ 1435.620002]    [] ? mwait_idle+0x72/0x7d
[ 1435.620002]  [] ? mwait_idle+0x22/0x7d
[ 1435.620002]  [] ? cpu_idle+0xa2/0xda
[ 1435.620002]  [] ? early_idt_handler+0x0/0x71
[ 1435.620002]  [] ? start_kernel+0x3dc/0x3e8
[ 1435.620002]  [] ? x86_64_start_kernel+0xf9/0x106

@ggruber
Copy link
Author

ggruber commented Jun 20, 2011

what do you think about compiling a vanilla latest 2.6.32 with the same config as distro kernel and testing with this?
Or using a backport 2.6.38?

@kohlschuetter
Copy link
Contributor

I'm pretty satisfied with the RHEL6 kernel (Scientific Linux 6.0, 2.6.32-131.2.1.el6.x86_64). Curious if that one still triggers your bug.

@kohlschuetter
Copy link
Contributor

Regarding the hyper-threading setting, did you use the "noht" kernel boot option or just changed the BIOS option?

@ggruber
Copy link
Author

ggruber commented Jun 20, 2011

changed bios and observed that there was only 1 CPU in /proc/cpuinfo

@kohlschuetter
Copy link
Contributor

Could you please try "noht" as a boot option? Just as a precaution.

@behlendorf
Copy link
Contributor

Several of your crashes can be attributed to stack overflow. If you rebuild your kernel with the following small patch it should improve the stability. As mentioned above we've improved our stack issues but it's not quite 100% yet. However, in real usage these issues seem fairly rare (but they of course need to be impossible).

diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_t
index 7639dbf..ec86f1b 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -1,7 +1,7 @@
 #ifndef _ASM_X86_PAGE_64_DEFS_H
 #define _ASM_X86_PAGE_64_DEFS_H
 
-#define THREAD_ORDER   1
+#define THREAD_ORDER   2
 #define THREAD_SIZE  (PAGE_SIZE << THREAD_ORDER)
 #define CURRENT_MASK (~(THREAD_SIZE - 1))
 

I suppose I should really update the FAQ and place this in there.

@kohlschuetter
Copy link
Contributor

@behlendorf
So the proposed change in fact doubles the THREAD_SIZE, which still is dependent on PAGE_SIZE.

I wonder why I don't get these crashes then, having THREAD_ORDER=1 and a page size of 4k:

#include <stdio.h>
#include <unistd.h>

int main()
{
printf("%i\n",getpagesize());
return 0;
}

@behlendorf
Copy link
Contributor

Linux memory reclamation isn't deterministic. It's depends on a lot of factors how much memory you have, your workload, your kernel version, what other applications are running, how you compiled your kernel, etc. The RHEL kernels tend to be a little lighter on the stack in my experience which helps, particularly because we don't overrun by much. We're still working to identify the remaining offending call paths and get them fixed.

Increasing the THREAD_ORDER should allow you to avoid the issue entirely, it will cost you a little memory and perhaps minutely impact your fork/exec speed but that's it. That's a price most folks are willing to pay. However, I don't want people to have to rebuild their kernels so I do want to get this fixed.

@ggruber
Copy link
Author

ggruber commented Jun 20, 2011

@behlendorf
should I apply your suggested patch against the debian squeeze distro sources or against vanilla 2.6.32.41?
probably against the distro sources ?

@behlendorf
Copy link
Contributor

It should apply cleanly to either, so pick which ever kernel you prefer. I'd opt for the distro sources personally.

@ggruber
Copy link
Author

ggruber commented Jun 21, 2011

built distro kernel with +#define THREAD_ORDER 2, rebuilt spl + zfs stuff.
during ./configure for spl I got a

[  557.064065] conftest[3389]: segfault at 2aecd48bfe78 ip 000000000040078e sp 00002aecd48c3e80 error 6 in conftest[400000+1000]

and during ./configure for zfs

[ 2466.740047] conftest[19221]: segfault at 7fb2615d3e78 ip 000000000040078e sp 00007fb2615d7e80 error 6 in conftest[400000+1000]
.

make deb ran fine.
another

[ 3231.400299] conftest[7862]: segfault at 2b48181dde78 ip 000000000040078e sp 00002b48181e1e80 error 6 in conftest[400000+1000]

during make deb for zfs (git from Jun 16th)
HT still off in BIOS, no additional kernel params.
spl tests fine (3 times)

first /usr/libexec/zfs/zconfig.sh -c

root@jena:/tmp# /usr/libexec/zfs/zconfig.sh -c
1    persistent zpool.cache             Pass
2    scan disks for pools to import     Pass
3    zpool import/export device         Pass
4    zpool insmod/rmmod device          Pass
5    zvol+ext3 volume                   Pass
6    zvol+ext2 snapshot                 Pass
7    zvol+ext2 clone                    Pass
8    zfs send/receive                   Pass
9    zpool events

gives:

[ 3803.906804] SPLAT: Loaded module v0.6.0
[ 3804.087191] ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5
[ 3805.711231]  zd0: unknown partition table
[ 3808.216031] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 3808.220002] IP: [] get_next_timer_interrupt+0x155/0x230
[ 3808.220002] PGD ae0bd067 PUD 930d6067 PMD 0
[ 3808.220002] Thread overran stack, or stack corrupted
[ 3808.220002] Oops: 0000 [#1] SMP
[ 3808.220002] last sysfs file: /sys/devices/virtual/bdi/230:0/uevent
[ 3808.220002] CPU 0
[ 3808.220002] Modules linked in: zfs(P) zcommon(P) zunicode(P) znvpair(P) zavl(P) splat spl ext2 ext3 jbd zlib_deflate loop megaraid_mbox megaraid_mm radeon snd_pcm ttm snd_timer sg snd drm_kms_helper soundcore uhci_hcd snd_page_alloc drm ehci_hcd i2c_i801 shpchp i3000_edac i2c_algo_bit sr_mod igb ipmi_msghandler usbcore pcspkr floppy processor i2c_core pci_hotplug edac_core ahci e1000e e1000 cdrom evdev dca button rng_core psmouse serio_raw nls_base ext4 mbcache jbd2 crc16 dm_mod raid10 raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx raid1 raid0 multipath linear md_mod sd_mod crc_t10dif ata_generic ata_piix thermal libata scsi_mod thermal_sys [last unloaded: spl]
[ 3808.220002] Pid: 0, comm: swapper Tainted: P           2.6.32-1-amd64 #1
[ 3808.220002] RIP: 0010:[]  [] get_next_timer_interrupt+0x155/0x230
[ 3808.220002] RSP: 0018:ffffffff81423e68  EFLAGS: 00010093
[ 3808.220002] RAX: 0000000000000022 RBX: 00000001000d61fe RCX: 0000000000000022
[ 3808.220002] RDX: 0000000000000000 RSI: 0000000001000d62 RDI: 0000000000000000
[ 3808.220002] RBP: 0000000000000000 R08: ffffffff81423e78 R09: 0000000000000001
[ 3808.220002] R10: ffffffff815ed7e0 R11: 0000000000000000 R12: ffffffff815ec7c0
[ 3808.220002] R13: 0000000000000040 R14: ffffffff81423e98 R15: ffffffff815eda00
[ 3808.220002] FS:  0000000000000000(0000) GS:ffff880001800000(0000) knlGS:0000000000000000
[ 3808.220002] CS:  0010 DS: 0018 ES: 0018 CR0: 000000008005003b
[ 3808.220002] CR2: 0000000000000000 CR3: 00000000243d0000 CR4: 00000000000006f0
[ 3808.220002] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 3808.220002] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 3808.220002] Process swapper (pid: 0, threadinfo ffffffff81420000, task ffffffff814571f0)
[ 3808.220002] Stack:
[ 3808.220002]  ffff880001815780 ffffffff812f5d75 ffffffff815ed7e0 ffffffff815edbe0
[ 3808.220002] <0> ffffffff815edfe0 ffffffff815ee3e0 ffffffff81016426 ffff8800018101b0
[ 3808.220002] <0> 0000000000000000 00000000000ca21c 00000001000d61fe 00000376ab5cc772
[ 3808.220002] Call Trace:
[ 3808.220002]  [] ? thread_return+0x77/0xdc
[ 3808.220002]  [] ? read_tsc+0x5/0x16
[ 3808.220002]  [] ? tick_nohz_stop_sched_tick+0x314/0x329
[ 3808.220002]  [] ? cpu_idle+0x74/0xdd
[ 3808.220002]  [] ? start_kernel+0x3da/0x3e5
[ 3808.220002]  [] ? x86_64_start_kernel+0xf9/0x106
[ 3808.220002] Code: 4a 8b 7c 15 00 4d 8d 3c 2a eb 20 f6 47 28 01 48 89 d5 75 11 48 8b 6f 10 41 b9 01 00 00 00 48 39 d5 48 0f 49 ea 4c 89 df 48 89 ea <4c> 8b 1f 4c 39 ff 41 0f 18 0b 75 d4 45 85 c9 74 10 39 c8 7c 04
[ 3808.220002] RIP  [] get_next_timer_interrupt+0x155/0x230
[ 3808.220002]  RSP 
[ 3808.220002] CR2: 0000000000000000
[ 3808.220002] ---[ end trace 86ccfb0a83ef5ce6 ]---
[ 3808.220002] Kernel panic - not syncing: Attempted to kill the idle task!
[ 3808.220002] Pid: 0, comm: swapper Tainted: P      D    2.6.32-1-amd64 #1
[ 3808.220002] Call Trace:
[ 3808.220002]  [] ? panic+0x78/0x133
[ 3808.220002]  [] ? do_exit+0x73/0x6c6
[ 3808.220002]  [] ? release_console_sem+0x17e/0x1ae
[ 3808.220002]  [] ? oops_end+0xb1/0xb6
[ 3808.220002]  [] ? no_context+0x1e7/0x1f4
[ 3808.220002]  [] ? page_fault+0x25/0x30
[ 3808.220002]  [] ? get_next_timer_interrupt+0x155/0x230
[ 3808.220002]  [] ? get_next_timer_interrupt+0x51/0x230
[ 3808.220002]  [] ? thread_return+0x77/0xdc
[ 3808.220002]  [] ? read_tsc+0x5/0x16
[ 3808.220002]  [] ? tick_nohz_stop_sched_tick+0x314/0x329
[ 3808.220002]  [] ? cpu_idle+0x74/0xdd
[ 3808.220002]  [] ? start_kernel+0x3da/0x3e5
[ 3808.220002]  [] ? x86_64_start_kernel+0xf9/0x106

@ggruber ggruber closed this as completed Jun 21, 2011
@kohlschuetter
Copy link
Contributor

@ggruber

I also encountered the conftest segfault. This is something different, and I guess it can be ignored for now.

So THREAD_ORDER 2 did not help?

@ggruber ggruber reopened this Jun 21, 2011
@behlendorf
Copy link
Contributor

Gerold, thanks for giving the project a try. I understand your situation completely, everything really needs to be rock solid before using this is a real production environment. You might want to try to project again was we get past putting out release candidates. At that point we should be past these sorts of issues.

@ggruber
Copy link
Author

ggruber commented Jul 8, 2011

ok, I'll stay tuned :-)

@behlendorf
Copy link
Contributor

This issue has been fixed by spl commit openzfs/spl@64c075c .

dajhorn referenced this issue in zfsonlinux/pkg-zfs Oct 12, 2011
In a non-debug build the ASSERT() would be optimized away
which could cause pending work items to not be cancelled.

We must also use cancel_delayed_work_sync() rather than just
cancel_delayed_work() to actually wait until work items have
completed.  Otherwise they might accidentally access free'd
memory.

Signed-off-by: Brian Behlendorf <[email protected]>
Closes ZFS bugs #279, #62, #363, #418
@Rudd-O
Copy link
Contributor

Rudd-O commented Oct 17, 2011

kernelOfTruth pushed a commit to kernelOfTruth/zfs that referenced this issue Mar 1, 2015
current_kernel_time() is used by the SPLAT, but it is not meant for
performance measurement. We modify the SPLAT to use getnstimeofday(),
which is equivalent to the gethrestime() function on Solaris.
Additionally, we update gethrestime() to invoke getnstimeofday().

Signed-off-by: Richard Yao <[email protected]>
Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs#279
lundman referenced this issue in openzfsonosx/openzfs Jul 17, 2020
macOS: change update_pages

OsX traditionally used uio_copy to re-issue the write in update_pages() to
mapped memory. Based on Linux, it is now changed to using dmu_read() from
ARC, to save the uio allocation, and clean up zfs_write().

macOS: enhance getxattr/setxattr code to use txg

macOS: xattr writes to use txg, and correct xattr=sa

macOS: rollback to notify finder of change

macOS: setattr(uid/gid) incorrect

macOS: replace obj-C function with C

macOS: move AT_ to ATTR_

Appears Linux already defines them in headers, but doesn't use them.
Since we just match to XNU values, it doesn't matter what they are called.

macOS: zfs_writes were truncated

macOS: zed changes

macOS: const required after C++ fix commit

macOS: Add primitive auto unmount of idle snapshots.

macOS: uio changes after uio PR

macOS: rebase master fixes

macOS: Add all tunables from upstream

macOS: compile fixes after vdev_disk_t change

macOS: remove semaphore.h

macOS: ASM changes after revert

macOS: correct cv_wait_sig() return values

macOS: Ensure to use -O2 on dsl_scan.c

macOS: Disambiguate condvar API contract

This is the macOS changes required for the upstream commit:
8056a75

Also correct cstyle.

remember cv_timedwait() and cv_timedwait_hires() take different
timeout types!

macOS: Add abd_cache_reap_now for abd_chunk_cache users

macOS changes needed for commit 7564073

macOS: fix zfstester get_time

macOS: can not pass componentname to ZFS

as ZFS updates the name with the real name (case insensitive) and
will cause panic, as it is often protected. We need to pass a buffer
to use, and use the returned name as cachename.

macOS: remove more tests to fit in 60mins

macOS: zvol need to verify and lock zv before use.

When called from, or calling to C++, we need to treat the stored 'zv'
as a unique ID, and verify it is still in the list of active zvols.
If found, lock in the correct order.

SPL: correct cstyle.pl

Fixed kmem_bufctl_cache memory leak (#273)

macOS: cstyle fixes, remove spl-kobj.c

macOS: cstyle.pl-ify SPL

macOS: cstyle.pl all kernel header files.

macOS: cstyle kernel C files

macOS: cstyle kernel CPP files

macOS: libspl cstyle fixes.

Fixed memory leak during zfs driver unload. (#279)

macOS: fix file based pools

zfs_dev_is_whole_disk() was incorrectly ported from FreeBSD. Use
Linux version.

macOS: undo previous decmpfs handling

We used to handle decmpfs by uncompressing it after the fact, but it
added a lot of dependencies.

Now, if decmpfs XATTR is set, AND it sets UF_COMPRESSED, AND is
followed by a setsize(0), we will ignore the truncate, and delete
the XATTR.

Only concern is if we were to stop a truncate, which was supposed to happen
(ie, not decmpfs related).

macOS: make codecheck pass

macOS: change send/recv pipe to fifo

Simplify kernel dependencies by only do IO on vnodes, and userland
to wrap send to pipes with fifo (opened to get vnode) and relay
child processes.
fuhrmannb pushed a commit to fuhrmannb/cstor that referenced this issue Nov 3, 2020
Related to: openebs/openebs#1295

Builds on top of openebs-archive/libcstor#35

- modify build_image.sh to decide which arch is used
- add libkqueue deb which is not found in apt source.

Signed-off-by: wangzihao <[email protected]>
lundman referenced this issue in openzfsonosx/openzfs Nov 11, 2020
macOS: change update_pages

OsX traditionally used uio_copy to re-issue the write in update_pages() to
mapped memory. Based on Linux, it is now changed to using dmu_read() from
ARC, to save the uio allocation, and clean up zfs_write().

macOS: enhance getxattr/setxattr code to use txg

macOS: xattr writes to use txg, and correct xattr=sa

macOS: rollback to notify finder of change

macOS: setattr(uid/gid) incorrect

macOS: replace obj-C function with C

macOS: move AT_ to ATTR_

Appears Linux already defines them in headers, but doesn't use them.
Since we just match to XNU values, it doesn't matter what they are called.

macOS: zfs_writes were truncated

macOS: zed changes

macOS: const required after C++ fix commit

macOS: Add primitive auto unmount of idle snapshots.

macOS: uio changes after uio PR

macOS: rebase master fixes

macOS: Add all tunables from upstream

macOS: compile fixes after vdev_disk_t change

macOS: remove semaphore.h

macOS: ASM changes after revert

macOS: correct cv_wait_sig() return values

macOS: Ensure to use -O2 on dsl_scan.c

macOS: Disambiguate condvar API contract

This is the macOS changes required for the upstream commit:
8056a75

Also correct cstyle.

remember cv_timedwait() and cv_timedwait_hires() take different
timeout types!

macOS: Add abd_cache_reap_now for abd_chunk_cache users

macOS changes needed for commit 7564073

macOS: fix zfstester get_time

macOS: can not pass componentname to ZFS

as ZFS updates the name with the real name (case insensitive) and
will cause panic, as it is often protected. We need to pass a buffer
to use, and use the returned name as cachename.

macOS: remove more tests to fit in 60mins

macOS: zvol need to verify and lock zv before use.

When called from, or calling to C++, we need to treat the stored 'zv'
as a unique ID, and verify it is still in the list of active zvols.
If found, lock in the correct order.

SPL: correct cstyle.pl

Fixed kmem_bufctl_cache memory leak (#273)

macOS: cstyle fixes, remove spl-kobj.c

macOS: cstyle.pl-ify SPL

macOS: cstyle.pl all kernel header files.

macOS: cstyle kernel C files

macOS: cstyle kernel CPP files

macOS: libspl cstyle fixes.

Fixed memory leak during zfs driver unload. (#279)

macOS: fix file based pools

zfs_dev_is_whole_disk() was incorrectly ported from FreeBSD. Use
Linux version.

macOS: undo previous decmpfs handling

We used to handle decmpfs by uncompressing it after the fact, but it
added a lot of dependencies.

Now, if decmpfs XATTR is set, AND it sets UF_COMPRESSED, AND is
followed by a setsize(0), we will ignore the truncate, and delete
the XATTR.

Only concern is if we were to stop a truncate, which was supposed to happen
(ie, not decmpfs related).

macOS: make codecheck pass

macOS: change send/recv pipe to fifo

Simplify kernel dependencies by only do IO on vnodes, and userland
to wrap send to pipes with fifo (opened to get vnode) and relay
child processes.

macOS: additional work
lundman referenced this issue in openzfsonosx/openzfs Dec 1, 2020
macOS: change update_pages

OsX traditionally used uio_copy to re-issue the write in update_pages() to
mapped memory. Based on Linux, it is now changed to using dmu_read() from
ARC, to save the uio allocation, and clean up zfs_write().

macOS: enhance getxattr/setxattr code to use txg

macOS: xattr writes to use txg, and correct xattr=sa

macOS: rollback to notify finder of change

macOS: setattr(uid/gid) incorrect

macOS: replace obj-C function with C

macOS: move AT_ to ATTR_

Appears Linux already defines them in headers, but doesn't use them.
Since we just match to XNU values, it doesn't matter what they are called.

macOS: zfs_writes were truncated

macOS: zed changes

macOS: const required after C++ fix commit

macOS: Add primitive auto unmount of idle snapshots.

macOS: uio changes after uio PR

macOS: rebase master fixes

macOS: Add all tunables from upstream

macOS: compile fixes after vdev_disk_t change

macOS: remove semaphore.h

macOS: ASM changes after revert

macOS: correct cv_wait_sig() return values

macOS: Ensure to use -O2 on dsl_scan.c

macOS: Disambiguate condvar API contract

This is the macOS changes required for the upstream commit:
8056a75

Also correct cstyle.

remember cv_timedwait() and cv_timedwait_hires() take different
timeout types!

macOS: Add abd_cache_reap_now for abd_chunk_cache users

macOS changes needed for commit 7564073

macOS: fix zfstester get_time

macOS: can not pass componentname to ZFS

as ZFS updates the name with the real name (case insensitive) and
will cause panic, as it is often protected. We need to pass a buffer
to use, and use the returned name as cachename.

macOS: remove more tests to fit in 60mins

macOS: zvol need to verify and lock zv before use.

When called from, or calling to C++, we need to treat the stored 'zv'
as a unique ID, and verify it is still in the list of active zvols.
If found, lock in the correct order.

SPL: correct cstyle.pl

Fixed kmem_bufctl_cache memory leak (#273)

macOS: cstyle fixes, remove spl-kobj.c

macOS: cstyle.pl-ify SPL

macOS: cstyle.pl all kernel header files.

macOS: cstyle kernel C files

macOS: cstyle kernel CPP files

macOS: libspl cstyle fixes.

Fixed memory leak during zfs driver unload. (#279)

macOS: fix file based pools

zfs_dev_is_whole_disk() was incorrectly ported from FreeBSD. Use
Linux version.

macOS: undo previous decmpfs handling

We used to handle decmpfs by uncompressing it after the fact, but it
added a lot of dependencies.

Now, if decmpfs XATTR is set, AND it sets UF_COMPRESSED, AND is
followed by a setsize(0), we will ignore the truncate, and delete
the XATTR.

Only concern is if we were to stop a truncate, which was supposed to happen
(ie, not decmpfs related).

macOS: make codecheck pass

macOS: change send/recv pipe to fifo

Simplify kernel dependencies by only do IO on vnodes, and userland
to wrap send to pipes with fifo (opened to get vnode) and relay
child processes.

macOS: additional work
mmaybee pushed a commit to mmaybee/openzfs that referenced this issue Apr 6, 2022
…penzfs#279)

We've seen an unexplained hang where it seems like
`Responder::response_task()` is not sending responses that it should.
To diagnose this, we would like to be able to look at its local
variables if we hit this hang again.  Specifically, the
`rx: mpsc::UnboundedReceiver<ResponseMessage>` and
`output: BufWriter<unix::OwnedWriteHalf>`.

This commit adds a new debugging mechanism, `DebugPointerSet` which is
based on `lazy_static_ptr!`.  It allows us to save several pointers (of
the same type) in a way that will let us find them from the debugger.

This additional mechanism is needed because (unlike existing users of
`lazy_static_ptr!`) there are multiple objects that we need to track
(the number not known at compile time, since it depends on the number of
active connections), and we want to hold references to the objects (rx,
output) without holding any locks.

`response_task()` uses `DebugPointerSet` to save its `rx` and `output`
in a way that we can find them in the debugger if it hangs or crashes.

Example use:
```
(gdb) p zettaobject::server::Responder::response_task::RESPOND_RECEIVERS_PTR.p.value.set.ptr.pointer.data.data.value
$5 = HashSet(size=1) = {
  util::lazy_static_ptr::DebugPointer<(tokio::sync::mpsc::unbounded::UnboundedReceiver<zettaobject::server::ResponseMessage>, tokio::io::util::buf_writer::BufWriter<tokio::net::unix::split_owned::OwnedWriteHalf>)> (
    0x7fc3247b6850
  )}

(gdb) p *(0x7fc3247b6850 as *const (tokio::sync::mpsc::unbounded::UnboundedReceiver<zettaobject::server::ResponseMessage>,tokio::io::util::buf_writer::BufWriter<tokio::net::unix::split_owned::OwnedWriteHalf>))
...
                  head: core::ptr::non_null::NonNull<tokio::sync::mpsc::block::Block<zettaobject::server::ResponseMessage>> {
                    pointer: 0x7fc011140bb0
                  },
...
                          fd: 15
...
    buf: Vec(size=1176) = {2, 0, 0, 0, 16, 0, 0, 0, ...
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants