Kernel panic in dmu_write() on i686 #1284

amospalla · 2013-02-10T17:50:55Z

Screenshot

Rebooted the machine with sysrq b

amospalla · 2013-02-10T17:51:46Z

Crash happened while browsing.

behlendorf · 2013-02-12T00:59:09Z

Thanks, we'll want to dig into this when we have the time. Was this a one-time event?

amospalla · 2013-02-12T09:54:53Z

I moved my data to ZFS, and after a couple of days this happened. After that I moved back to ext4.

behlendorf · 2013-02-12T18:17:34Z

OK, well we'll leave the issue open for reference in case someone else hits something similar. This however is the first report I've heard of something like this.

baryluk · 2013-02-26T09:09:31Z

Looks similar:

[    0.000000] Initializing cgroup subsys cpuset
[    0.000000] Initializing cgroup subsys cpu
[    0.000000] Linux version 3.8.0-rc7-t43-prod-dirty (baryluk@sredniczarny) (gcc version 4.7.2 (Debian 4.7.2-5) ) #34 SMP Sun Feb 10 01:07:48 CET 2013
...
...
[  474.080157] BUG: unable to handle kernel paging request at ffca1000
[  474.081611] IP: [<cc67ef71>] dmu_write+0x1a1/0x260 [zfs]
[  474.083058] *pdpt = 0000000001b00001 *pde = 000000000a50e067 *pte = 0000000000000000 
[  474.084009] Oops: 0000 [#1] SMP 
[  474.084009] Modules linked in: pktcdvd cdrom ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM dummy ppdev decnet lp bnep rfcomm bluetooth libipw lib80211 uinput nfsd zfs(PO) zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) hdaps pcmcia acpi_cpufreq mperf yenta_socket pcmcia_rsrc i2c_i801 pcmcia_core radeon gpio_ich video parport_pc drm_kms_helper floppy parport ttm drm cfbfillrect cfbimgblt cfbcopyarea intel_agp i2c_algo_bit intel_gtt agpgart xhci_hcd [last unloaded: ipw2200]
[  474.084009] Pid: 9035, comm: flush-zfs-27 Tainted: P           O 3.8.0-rc7-t43-prod-dirty #34 IBM 2669UYD/2669UYD
[  474.084009] EIP: 0060:[<cc67ef71>] EFLAGS: 00010202 CPU: 0
[  474.084009] EIP is at dmu_write+0x1a1/0x260 [zfs]
[  474.084009] EAX: 00000000 EBX: c30face8 ECX: 0000082a EDX: d0113800
[  474.084009] ESI: ffca1000 EDI: d0114800 EBP: c56a3c70 ESP: c56a3c28
[  474.084009]  DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
[  474.084009] CR0: 8005003b CR2: ffca1000 CR3: 073b7000 CR4: 000007f0
[  474.084009] DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000
[  474.084009] DR6: ffff0ff0 DR7: 00000400
[  474.084009] Process flush-zfs-27 (pid: 9035, ti=c56a2000 task=c576f480 task.ti=c56a2000)
[  474.084009] Stack:
[  474.084009]  00000000 00000000 000030a8 00000000 00000000 cc7224d6 c56a3c60 c56a3c5c
[  474.084009]  00000000 00000000 000030a8 00000000 000030a8 c7352650 00000001 00000010
[  474.084009]  c2d7c3c0 c5776000 c56a3d30 cc709e90 00000000 00000000 000030a8 00000000
[  474.084009] Call Trace:
[  474.084009]  [<cc709e90>] zfs_putpage+0x330/0x4e0 [zfs]
[  474.084009]  [<c1103a6c>] ? find_get_pages_tag+0xbc/0x160
[  474.084009]  [<cc71c13c>] zpl_putpage+0x2c/0x40 [zfs]
[  474.084009]  [<cc71c110>] ? zpl_readpage+0x60/0x60 [zfs]
[  474.084009]  [<c110bb61>] write_cache_pages+0x1b1/0x3c0
[  474.084009]  [<cc71c110>] ? zpl_readpage+0x60/0x60 [zfs]
[  474.084009]  [<c1089196>] ? dequeue_entity+0x116/0x580
[  474.084009]  [<cc71c0a8>] zpl_writepages+0x18/0x20 [zfs]
[  474.084009]  [<c110d1ba>] do_writepages+0x1a/0x40
[  474.084009]  [<c117ef4f>] __writeback_single_inode+0x2f/0x140
[  474.084009]  [<c1072f13>] ? wake_up_bit+0x23/0x30
[  474.084009]  [<c117fb62>] writeback_sb_inodes+0x162/0x330
[  474.084009]  [<c117fdac>] __writeback_inodes_wb+0x7c/0xb0
[  474.084009]  [<c117ffea>] wb_writeback+0x20a/0x290
[  474.084009]  [<c110c1ba>] ? global_dirty_limits+0x2a/0x110
[  474.084009]  [<c117da63>] ? over_bground_thresh+0x23/0xa0
[  474.084009]  [<c1181234>] wb_do_writeback+0x1f4/0x200
[  474.084009]  [<c11812b1>] bdi_writeback_thread+0x71/0x200
[  474.084009]  [<c1181240>] ? wb_do_writeback+0x200/0x200
[  474.084009]  [<c1072824>] kthread+0x94/0xa0
[  474.084009]  [<c1010000>] ? perf_trace_xen_mc_flush_reason+0x30/0xc0
[  474.084009]  [<c176d977>] ret_from_kernel_thread+0x1b/0x28
[  474.084009]  [<c1072790>] ? kthread_create_on_node+0xc0/0xc0
[  474.084009] Code: e8 e9 16 ff ff ff 8d 74 26 00 f6 c2 01 75 63 f7 c7 02 00 00 00 75 73 f7 c7 04 00 00 00 0f 85 87 00 00 00 89 c1 83 e0 03 c1 e9 02 <f3> a5 e9 2c ff ff ff 90 8d b4 26 00 00 00 00 e8 7b 95 ff ff 8b
[  474.084009] EIP: [<cc67ef71>] dmu_write+0x1a1/0x260 [zfs] SS:ESP 0068:c56a3c28
[  474.084009] CR2: 00000000ffca1000
[  474.084009] ---[ end trace a86b28d53f179a9f ]---
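A few fields in the trace above can be cross-checked mechanically. The sketch below is only a reading aid, not a diagnosis: it assumes the standard x86 page-fault error-code layout (bit 0 = page present, bit 1 = write, bit 2 = user mode) and that the bytes marked `<f3> a5` in the `Code:` line are a `rep movs` copy, which reads from the address in ESI. The helper names `field` and `decode_error_code` are made up for this example.

```python
import re

# Excerpt of the oops lines quoted above (only the fields this sketch needs).
OOPS = """\
[  474.080157] BUG: unable to handle kernel paging request at ffca1000
[  474.084009] Oops: 0000 [#1] SMP
[  474.084009] ESI: ffca1000 EDI: d0114800 EBP: c56a3c70 ESP: c56a3c28
[  474.084009] CR0: 8005003b CR2: ffca1000 CR3: 073b7000 CR4: 000007f0
"""

def field(name, text):
    """Return the hex value that follows 'NAME: ' as an integer."""
    match = re.search(rf"\b{name}: ([0-9a-fA-F]+)", text)
    if match is None:
        raise ValueError(f"{name} not found in log")
    return int(match.group(1), 16)

def decode_error_code(code):
    """Decode the low three bits of an x86 page-fault error code."""
    return {
        "page_present": bool(code & 0x1),  # False: page was not mapped
        "write": bool(code & 0x2),         # False: the access was a read
        "user_mode": bool(code & 0x4),     # False: fault taken in kernel mode
    }

cr2 = field("CR2", OOPS)   # faulting virtual address
esi = field("ESI", OOPS)   # source pointer of the string copy
err = decode_error_code(field("Oops", OOPS))

print(f"fault address: {cr2:#x}")
print(f"fault address equals ESI: {cr2 == esi}")
print(f"error code bits: {err}")
```

Under those assumptions, the fault is a kernel-mode read of an unmapped page at the copy's source address, which matches the `*pte = 0000000000000000` line in the log.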

I was compiling a kernel at the time. The system kept working and remained operational on other file systems, but sync blocked, and unmounting or killing the blocked processes was impossible. After a while the system froze. I was still able to perform SysRq-b (I think -s/-u/-i/-e were not operational).

$ cat /proc/cmdline
BOOT_IMAGE=/vmlinuz-3.8.0-rc7-t43-prod-dirty root=/dev/mapper/sredniczarny-root ro vmalloc=840M resume=/dev/mapper/sredniczarny-swap_1 thinkpad_acpi.fan_control=1

$ cat /sys/module/zfs/parameters/zfs_arc_max
536870912

The machine has 2 GB of RAM.

I can repeat this error quite reliably.
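For reference, the memory settings quoted in this comment can be put on a common scale. This is a small arithmetic sketch only; it assumes "2GB" means 2 GiB and that the `M` suffix in `vmalloc=840M` means MiB. It does not claim any particular relationship between these values and the crash.

```python
MIB = 1024 * 1024

zfs_arc_max_bytes = 536870912   # from /sys/module/zfs/parameters/zfs_arc_max
vmalloc_mib = 840               # from vmalloc=840M on the kernel command line
ram_mib = 2 * 1024              # assumption: "2GB" read as 2 GiB

arc_mib = zfs_arc_max_bytes // MIB
arc_fraction = arc_mib / ram_mib
fits = arc_mib <= vmalloc_mib

print(f"zfs_arc_max = {arc_mib} MiB")
print(f"share of RAM = {arc_fraction:.0%}")
print(f"ARC limit below vmalloc reservation: {fits}")
```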

baryluk · 2013-02-26T09:18:36Z

I will build a debug kernel (not sure how, since I keep the sources on ZFS itself... I will need to clone/download them again) and see if it shows any more useful information.

ryao · 2013-03-03T02:24:05Z

This looks like an issue in the current mmap() code. I am doing a rewrite that will replace all of the code involved in this backtrace. It should fix this when it is done.

ryao · 2013-03-03T02:27:12Z

@baryluk You might be able to catch the cause of this issue if you rebuild the spl and zfs with --enable-debug and you encounter a build failure.

ryao · 2013-04-01T18:23:44Z

@baryluk Your issue could be related to #1342.

behlendorf · 2014-10-03T21:41:45Z

This was almost certainly addressed by the various mmap improvements and bug fixes over the last 18 months. Since there are no recent similar reports I'm closing this as stale.

behlendorf closed this as completed Oct 3, 2014
