
SPLError: 18228:0:(spa_config.c:427:spa_config_generate()) SPL PANIC #249

Closed
inkdot7 opened this issue Jun 13, 2013 · 1 comment

inkdot7 commented Jun 13, 2013

I just got the following messages. As far as I know, no major ZFS activity was occurring at the time, except that 'zpool iostat -v 1' (or 5) was running.

Message from syslogd@org-pc51 at Jun 13 01:11:29 ...
kernel:[22682.776846] VERIFY(nvlist_add_uint64_array(config, ZPOOL_CONFIG_DDT_HISTOGRAM, (uint64_t *)ddh, sizeof (*ddh) / sizeof (uint64_t)) == 0) failed

Message from syslogd@org-pc51 at Jun 13 01:11:29 ...
kernel:[22682.776892] SPLError: 18228:0:(spa_config.c:427:spa_config_generate()) SPL PANIC

The pool is:

    NAME        STATE     READ WRITE CKSUM
    upper5plus3  ONLINE       0     0     0
      raidz2-0  ONLINE       0     0     0
        sda     ONLINE       0     0     0
        sdc     ONLINE       0     0     0
        sde     ONLINE       0     0     0
        sdf     ONLINE       0     0     0
        sdd     ONLINE       0     0     0
        sdg     ONLINE       0     0     0
        sdh     ONLINE       0     0     0
        sdi     ONLINE       0     0     0
    logs
      sdb3      ONLINE       0     0     0
    cache
      sdb4      ONLINE       0     0     0

The machine runs Debian wheezy, with ZFS installed as described at:
http://zfsonlinux.org/debian.html

ii  spl          0.6.1-1~wheezy    amd64  Solaris Porting Layer utilities for Linux
ii  spl-dkms     0.6.1-1~wheezy    all    Solaris Porting Layer kernel modules for Linux
ii  debian-zfs   7~wheezy          amd64  Native ZFS filesystem metapackage for Debian
ii  dkms         2.2.0.3-1.2+zfs6  all    Dynamic Kernel Module Support Framework
ii  libzfs1      0.6.1-1~wheezy    amd64  Native ZFS filesystem library for Linux
ii  zfs-dkms     0.6.1-1~wheezy    amd64  Native ZFS filesystem kernel modules for Linux
ii  zfsonlinux   1~wheezy          all    archive.zfsonlinux.org trust package
ii  zfsutils     0.6.1-1~wheezy    amd64  Native ZFS management utilities for Linux

The hopefully relevant parts of dmesg follow (since GitHub does not allow attachments :-( )

[ 0.000000] Linux version 3.2.0-4-amd64 ([email protected]) (gcc version 4.6.3 (Debian 4.6.3-14) ) #1 SMP Debian 3.2.41-2+deb7u2
[ 0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-3.2.0-4-amd64 root=UUID=4f425b17-51b2-419d-8862-572b00c58925 ro quiet
...
[ 0.000000] Memory: 16362564k/17825792k available (3420k kernel code, 1082788k absent, 380440k reserved, 3318k data, 576k init)
...
[ 6.264217] ZFS: Loaded module v0.6.1-rc14, ZFS pool version 5000, ZFS filesystem version 5
...
[ 138.561351] SPL: using hostid 0xa8c00401
...
[22682.642623] zpool: page allocation failure: order:1, mode:0x20
[22682.642630] Pid: 18228, comm: zpool Tainted: P O 3.2.0-4-amd64 #1 Debian 3.2.41-2+deb7u2
[22682.642633] Call Trace:
[22682.642644] [] ? warn_alloc_failed+0x11a/0x12d
[22682.642649] [] ? __alloc_pages_nodemask+0x704/0x7aa
[22682.642656] [] ? kmem_getpages+0x4c/0xd9
[22682.642661] [] ? fallback_alloc+0x13e/0x1e2
[22682.642679] [] ? kmalloc_nofail+0x2b/0x3d [spl]
[22682.642683] [] ? __kmalloc+0xba/0x112
[22682.642691] [] ? kmalloc_nofail+0x2b/0x3d [spl]
[22682.642698] [] ? kmem_alloc_debug+0x164/0x2d0 [spl]
[22682.642704] [] ? kmem_alloc_debug+0x145/0x2d0 [spl]
[22682.642708] [] ? __kmalloc+0x100/0x112
[22682.642718] [] ? nv_mem_zalloc.isra.12+0xa/0x21 [znvpair]
[22682.642725] [] ? nvlist_add_common+0x113/0x2f9 [znvpair]
[22682.642763] [] ? ddt_histogram_add+0x25/0x30 [zfs]
[22682.642790] [] ? spa_config_generate+0x59d/0x701 [zfs]
[22682.642815] [] ? spa_open_common+0x228/0x2be [zfs]
[22682.642820] [] ? __kmalloc+0x100/0x112
[22682.642845] [] ? spa_get_stats+0x35/0x23c [zfs]
[22682.642869] [] ? zfs_ioc_pool_stats+0x1f/0x56 [zfs]
[22682.642876] [] ? pool_namecheck+0xc4/0x101 [zcommon]
[22682.642900] [] ? zfsdev_ioctl+0x103/0x159 [zfs]
[22682.642905] [] ? do_vfs_ioctl+0x459/0x49a
[22682.642910] [] ? update_rmtp+0x62/0x62
[22682.642913] [] ? hrtimer_start_expires+0x13/0x15
[22682.642917] [] ? sys_ioctl+0x4b/0x72
[22682.642923] [] ? math_state_restore+0x4b/0x55
[22682.642929] [] ? system_call_fastpath+0x16/0x1b
[22682.642931] Mem-Info:
[22682.642934] Node 0 DMA per-cpu:
[22682.642937] CPU 0: hi: 0, btch: 1 usd: 0
[22682.642939] CPU 1: hi: 0, btch: 1 usd: 0
[22682.642942] CPU 2: hi: 0, btch: 1 usd: 0
[22682.642944] CPU 3: hi: 0, btch: 1 usd: 0
[22682.642946] Node 0 DMA32 per-cpu:
[22682.642950] CPU 0: hi: 186, btch: 31 usd: 163
[22682.642952] CPU 1: hi: 186, btch: 31 usd: 34
[22682.642955] CPU 2: hi: 186, btch: 31 usd: 0
[22682.642957] CPU 3: hi: 186, btch: 31 usd: 30
[22682.642959] Node 0 Normal per-cpu:
[22682.642962] CPU 0: hi: 186, btch: 31 usd: 157
[22682.642964] CPU 1: hi: 186, btch: 31 usd: 36
[22682.642967] CPU 2: hi: 186, btch: 31 usd: 0
[22682.642970] CPU 3: hi: 186, btch: 31 usd: 0
[22682.642972] Node 1 Normal per-cpu:
[22682.642975] CPU 0: hi: 186, btch: 31 usd: 69
[22682.642977] CPU 1: hi: 186, btch: 31 usd: 0
[22682.642980] CPU 2: hi: 186, btch: 31 usd: 166
[22682.642982] CPU 3: hi: 186, btch: 31 usd: 169
[22682.642988] active_anon:12968 inactive_anon:60 isolated_anon:0
[22682.642989] active_file:14048 inactive_file:848983 isolated_file:0
[22682.642991] unevictable:0 dirty:10 writeback:0 unstable:0
[22682.642992] free:57148 slab_reclaimable:44157 slab_unreclaimable:83590
[22682.642993] mapped:3259 shmem:77 pagetables:1098 bounce:0
[22682.642996] Node 0 DMA free:15868kB min:84kB low:104kB high:124kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15656kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
[22682.643009] lowmem_reserve[]: 0 2967 8017 8017
[22682.643013] Node 0 DMA32 free:120300kB min:16592kB low:20740kB high:24888kB active_anon:1828kB inactive_anon:0kB active_file:5384kB inactive_file:428652kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3038496kB mlocked:0kB dirty:0kB writeback:0kB mapped:180kB shmem:0kB slab_reclaimable:34044kB slab_unreclaimable:85620kB kernel_stack:128kB pagetables:88kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[22682.643026] lowmem_reserve[]: 0 0 5050 5050
[22682.643030] Node 0 Normal free:35308kB min:28240kB low:35300kB high:42360kB active_anon:9844kB inactive_anon:148kB active_file:22724kB inactive_file:529936kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:5171200kB mlocked:0kB dirty:12kB writeback:0kB mapped:7116kB shmem:192kB slab_reclaimable:12444kB slab_unreclaimable:122800kB kernel_stack:1848kB pagetables:1756kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[22682.643043] lowmem_reserve[]: 0 0 0 0
[22682.643047] Node 1 Normal free:57116kB min:45188kB low:56484kB high:67780kB active_anon:40200kB inactive_anon:92kB active_file:28084kB inactive_file:2437344kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:8273920kB mlocked:0kB dirty:28kB writeback:0kB mapped:5740kB shmem:116kB slab_reclaimable:130140kB slab_unreclaimable:125940kB kernel_stack:304kB pagetables:2548kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
[22682.643060] lowmem_reserve[]: 0 0 0 0
[22682.643064] Node 0 DMA: 1*4kB 1*8kB 1*16kB 1*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15868kB
[22682.643075] Node 0 DMA32: 29388*4kB 0*8kB 0*16kB 13*32kB 0*64kB 1*128kB 1*256kB 0*512kB 2*1024kB 0*2048kB 0*4096kB = 120400kB
[22682.643085] Node 0 Normal: 7759*4kB 26*8kB 0*16kB 1*32kB 1*64kB 1*128kB 1*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 35308kB
[22682.643096] Node 1 Normal: 13174*4kB 10*8kB 6*16kB 9*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 57256kB
[22682.643107] 863112 total pagecache pages
[22682.643109] 0 pages in swap cache
[22682.643111] Swap cache stats: add 0, delete 0, find 0/0
[22682.643113] Free swap = 0kB
[22682.643115] Total swap = 0kB
[22682.646108] 4194288 pages RAM
[22682.646108] 84099 pages reserved
[22682.646108] 25191 pages shared
[22682.646108] 3524742 pages non-shared
[22682.776839] SPL: kmem_alloc(4144, 0x20) at nv_alloc_nosleep_spl:39 failed (159793638/217369747)
[22682.776846] VERIFY(nvlist_add_uint64_array(config, ZPOOL_CONFIG_DDT_HISTOGRAM, (uint64_t *)ddh, sizeof (*ddh) / sizeof (uint64_t)) == 0) failed
[22682.776892] SPLError: 18228:0:(spa_config.c:427:spa_config_generate()) SPL PANIC
[22682.776909] SPL: Showing stack for process 18228
[22682.776916] Pid: 18228, comm: zpool Tainted: P O 3.2.0-4-amd64 #1 Debian 3.2.41-2+deb7u2
[22682.776923] Call Trace:
[22682.776942] [] ? spl_debug_dumpstack+0x24/0x2a [spl]
[22682.776952] [] ? spl_debug_bug+0x7f/0xc8 [spl]
[22682.776995] [] ? spa_config_generate+0x5c9/0x701 [zfs]
[22682.777025] [] ? spa_open_common+0x228/0x2be [zfs]
[22682.777036] [] ? __kmalloc+0x100/0x112
[22682.777066] [] ? spa_get_stats+0x35/0x23c [zfs]
[22682.777094] [] ? zfs_ioc_pool_stats+0x1f/0x56 [zfs]
[22682.777106] [] ? pool_namecheck+0xc4/0x101 [zcommon]
[22682.777134] [] ? zfsdev_ioctl+0x103/0x159 [zfs]
[22682.777143] [] ? do_vfs_ioctl+0x459/0x49a
[22682.777152] [] ? update_rmtp+0x62/0x62
[22682.777159] [] ? hrtimer_start_expires+0x13/0x15
[22682.777168] [] ? sys_ioctl+0x4b/0x72
[22682.777178] [] ? math_state_restore+0x4b/0x55
[22682.777189] [] ? system_call_fastpath+0x16/0x1b
[22920.976045] INFO: task l2arc_feed:743 blocked for more than 120 seconds.
[22920.976062] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[22920.976078] l2arc_feed D ffff88023fd13780 0 743 2 0x00000000
[22920.976084] ffff880235db8e20 0000000000000046 0000000100000000 ffff8802374980c0
[22920.976090] 0000000000013780 ffff880230ed9fd8 ffff880230ed9fd8 ffff880235db8e20
[22920.976095] ffffffffa0531148 0000000181070c55 0000000000000202 ffffffffa0533410
[22920.976099] Call Trace:
[22920.976120] [] ? __mutex_lock_common.isra.5+0xff/0x164
[22920.976125] [] ? mutex_lock+0x1a/0x2d
[22920.976163] [] ? l2arc_feed_thread+0xa7/0x698 [zfs]
[22920.976180] [] ? l2arc_evict+0x227/0x227 [zfs]
[22920.976190] [] ? thread_generic_wrapper+0x6a/0x75 [spl]
[22920.976197] [] ? __thread_create+0x2be/0x2be [spl]
[22920.976203] [] ? kthread+0x76/0x7e
[22920.976208] [] ? kernel_thread_helper+0x4/0x10
[22920.976212] [] ? kthread_worker_fn+0x139/0x139
[22920.976216] [] ? gs_change+0x13/0x13
[22920.976256] INFO: task zpool:18228 blocked for more than 120 seconds.
[22920.976268] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[22920.976282] zpool D ffff88023fd13780 0 18228 3266 0x00000000
[22920.976287] ffff880248f66a70 0000000000000082 ffff880200000000 ffff8802374980c0
[22920.976292] 0000000000013780 ffff88026758dfd8 ffff88026758dfd8 ffff880248f66a70
[22920.976296] 000000000000001b 0000000100000001 ffff88043ea87000 0000000000000000
[22920.976301] Call Trace:
[22920.976308] [] ? spl_debug_bug+0xc6/0xc8 [spl]
[22920.976336] [] ? spa_config_generate+0x5c9/0x701 [zfs]
[22920.976361] [] ? spa_open_common+0x228/0x2be [zfs]
[22920.976368] [] ? __kmalloc+0x100/0x112
[22920.976393] [] ? spa_get_stats+0x35/0x23c [zfs]
[22920.976417] [] ? zfs_ioc_pool_stats+0x1f/0x56 [zfs]
[22920.976423] [] ? pool_namecheck+0xc4/0x101 [zcommon]
[22920.976446] [] ? zfsdev_ioctl+0x103/0x159 [zfs]
[22920.976451] [] ? do_vfs_ioctl+0x459/0x49a
[22920.976456] [] ? update_rmtp+0x62/0x62
[22920.976459] [] ? hrtimer_start_expires+0x13/0x15
[22920.976463] [] ? sys_ioctl+0x4b/0x72
[22920.976469] [] ? math_state_restore+0x4b/0x55
[22920.976474] [] ? system_call_fastpath+0x16/0x1b

The full dmesg is here:

http://pastebin.ca/2397206

behlendorf (Contributor) commented

@inkdot7 Interesting! The failure was the result of a memory allocation failure; however, that allocation should never have failed. It appears the atomic allocation handlers were accidentally registered for the nvlist instead of the blocking versions. Patch to follow.
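
For illustration, here is a minimal C sketch of the kind of bug being described, assuming a helper that picks the nvlist allocator from the KM_* flag (nv_alloc_nosleep_spl appears in the log above; the dispatch helper and its name are hypothetical, not the verbatim SPL source):

    static void *
    nv_alloc_nosleep_spl(nv_alloc_t *nva, size_t size)
    {
        /*
         * Atomic (non-blocking) allocation; may fail under memory
         * pressure, exactly as in the kmem_alloc() failure above.
         */
        return (kmem_alloc(size, KM_NOSLEEP));
    }

    /* Hypothetical dispatch helper illustrating the bug. */
    static nv_alloc_t *
    nv_alloc_for_kmflag(int kmflag)
    {
        /*
         * Bug: every flag other than KM_SLEEP, including KM_PUSHPAGE
         * (whose allocations must never fail), falls through to the
         * atomic allocator.  Under memory pressure the nvlist buffer
         * then comes back NULL and the VERIFY() in
         * spa_config_generate() panics.
         */
        return ((kmflag == KM_SLEEP) ? nv_alloc_sleep : nv_alloc_nosleep);
    }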

behlendorf added a commit to openzfs/zfs that referenced this issue Jun 18, 2013
The non-blocking allocation handlers in nvlist_alloc() would be
mistakenly assigned if any flags other than KM_SLEEP were passed.
This meant that nvlists allocated with KM_PUSHPAGE or other KM_*
debug flags were effectively always using atomic allocations.

While these failures were unlikely, they could lead to assertions
because KM_PUSHPAGE allocations in particular are guaranteed to
succeed or block.  They must never fail.

Since this code is already wrapped in a _KERNEL define, the
cleanest fix is to check the __GFP_HIGH bit.  When it is set,
the caller is signaling that it is safe for the allocation to
block; when it is clear, atomic allocations must be used.

Signed-off-by: Brian Behlendorf <[email protected]>
Issue openzfs/spl#249
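
As a rough sketch of that approach (the helper name and exact flag mapping are assumptions based on the commit text, not the verbatim patch):

    #ifdef _KERNEL
    /* Hypothetical dispatch keyed on the __GFP_HIGH bit. */
    static nv_alloc_t *
    nv_alloc_for_kmflag(int kmflag)
    {
        /*
         * __GFP_HIGH set: the caller says it is safe to block, so
         * the sleeping allocator can be used.  Clear: the
         * allocation must be atomic.
         */
        if (kmflag & __GFP_HIGH)
            return (nv_alloc_sleep);
        return (nv_alloc_nosleep);
    }
    #endif
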
behlendorf added a commit to behlendorf/zfs that referenced this issue Jun 18, 2013
The non-blocking allocation handlers in nvlist_alloc() would be
mistakenly assigned if any flags other than KM_SLEEP were passed.
This meant that nvlists allocated with KM_PUSHPAGE or other KM_*
debug flags were effectively always using atomic allocations.

While these failures were unlikely, they could lead to assertions
because KM_PUSHPAGE allocations in particular are guaranteed to
succeed or block.  They must never fail.

Since the existing API does not allow us to pass allocation
flags to the private allocators, the cleanest thing to do is to
add a KM_PUSHPAGE allocator.

Signed-off-by: Brian Behlendorf <[email protected]>
Closes openzfs/spl#249
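
A sketch of what such an allocator might look like, following the nv_alloc_nosleep_spl naming from the log; the ops table mirrors the public nv_alloc_ops_t layout, but treat the details as an assumption rather than the exact patch:

    static void *
    nv_alloc_pushpage_spl(nv_alloc_t *nva, size_t size)
    {
        /*
         * May block (and may dip into memory reserves), but is
         * expected never to fail.
         */
        return (kmem_alloc(size, KM_PUSHPAGE));
    }

    static void
    nv_free_spl(nv_alloc_t *nva, void *buf, size_t size)
    {
        kmem_free(buf, size);
    }

    static const nv_alloc_ops_t spl_pushpage_ops = {
        NULL,                   /* nv_ao_init() */
        NULL,                   /* nv_ao_fini() */
        nv_alloc_pushpage_spl,  /* nv_ao_alloc() */
        nv_free_spl,            /* nv_ao_free() */
        NULL                    /* nv_ao_reset() */
    };

    static nv_alloc_t nv_alloc_pushpage_def = {
        &spl_pushpage_ops,      /* nva_ops */
        NULL                    /* nva_arg */
    };

    nv_alloc_t *nv_alloc_pushpage = &nv_alloc_pushpage_def;
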
unya pushed a commit to unya/zfs that referenced this issue Dec 13, 2013