NULL deref in balance_pgdat(), dup #287 #368

Closed
pazak opened this issue Aug 20, 2011 · 2 comments

pazak commented Aug 20, 2011

Hi there,

While rsyncing file systems from XFS to ZFS (which may be irrelevant), I got the following messages, and the system hung afterwards:

---cut---
2011-08-20T12:06:24.175998+02:00 (none) kernel: BUG: unable to handle kernel NULL pointer dereference at (null)
2011-08-20T12:06:24.176101+02:00 (none) kernel: IP: [] balance_pgdat+0x488/0x750
2011-08-20T12:06:24.176126+02:00 (none) kernel: PGD 221d59067 PUD 220f1e067 PMD 0
2011-08-20T12:06:24.176147+02:00 (none) kernel: Oops: 0002 [#1] SMP
2011-08-20T12:06:24.176265+02:00 (none) kernel: CPU 2
2011-08-20T12:06:24.176280+02:00 (none) kernel: Modules linked in: zfs(P) zcommon(P) znvpair(P) zavl(P) zunicode(P) spl zlib_deflate bridge stp llc iptable_filter nvidia(P) rtc_cmos usblp i2c_piix4 asus_atk0110 sky2 iscsi_tcp libiscsi_tcp libiscsi raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor xor async_tx raid6_pq raid0 dm_snapshot dm_crypt dm_mirror dm_region_hash dm_log scsi_wait_scan hid_sunplus hid_sony hid_samsung hid_pl hid_petalynx hid_monterey hid_microsoft hid_logitech ff_memless hid_gyration hid_ezkey hid_cypress hid_chicony hid_cherry hid_belkin hid_apple hid_a4tech sl811_hcd scsi_transport_fc scsi_tgt
2011-08-20T12:06:24.176283+02:00 (none) kernel:
2011-08-20T12:06:24.176287+02:00 (none) kernel: Pid: 700, comm: kswapd0 Tainted: P 3.0.3-gentoo #1 System manufacturer System Product Name/M3A79-T DELUXE
2011-08-20T12:06:24.176291+02:00 (none) kernel: RIP: 0010:[] [] balance_pgdat+0x488/0x750
2011-08-20T12:06:24.176294+02:00 (none) kernel: RSP: 0018:ffff880221b21d10 EFLAGS: 00010286
2011-08-20T12:06:24.176296+02:00 (none) kernel: RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000011df6
2011-08-20T12:06:24.176299+02:00 (none) kernel: RDX: 0000000000000000 RSI: 28f5c28f5c28f5c3 RDI: ffff880221b21de0
2011-08-20T12:06:24.176302+02:00 (none) kernel: RBP: ffff880221b21e30 R08: 0000000000000001 R09: 0000000000000000
2011-08-20T12:06:24.176304+02:00 (none) kernel: R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
2011-08-20T12:06:24.176307+02:00 (none) kernel: R13: ffff88022fffb000 R14: ffff88022fffb000 R15: 0000000000000000
2011-08-20T12:06:24.176309+02:00 (none) kernel: FS: 00007fa5ab0628a0(0000) GS:ffff88022fd00000(0000) knlGS:0000000000000000
2011-08-20T12:06:24.176312+02:00 (none) kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2011-08-20T12:06:24.176314+02:00 (none) kernel: CR2: 0000000000000000 CR3: 0000000221f73000 CR4: 00000000000006e0
2011-08-20T12:06:24.176317+02:00 (none) kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
2011-08-20T12:06:24.176320+02:00 (none) kernel: DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
2011-08-20T12:06:24.176322+02:00 (none) kernel: Process kswapd0 (pid: 700, threadinfo ffff880221b20000, task ffff880221b6f330)
2011-08-20T12:06:24.176324+02:00 (none) kernel: Stack:
2011-08-20T12:06:24.176327+02:00 (none) kernel: 0000000000010780 ffff880221b21d90 ffff880221b20000 0000000000000002
2011-08-20T12:06:24.176330+02:00 (none) kernel: ffff880221b6f330 0000000000010780 ffff880221b21dc4 0000000000000000
2011-08-20T12:06:24.176332+02:00 (none) kernel: ffff880221b21eac 000000000009f78f 0000000000000000 0000000000000286
2011-08-20T12:06:24.176334+02:00 (none) kernel: Call Trace:
2011-08-20T12:06:24.176336+02:00 (none) kernel: [] kswapd+0x1c7/0x2d0
2011-08-20T12:06:24.176339+02:00 (none) kernel: [] ? wake_up_bit+0x40/0x40
2011-08-20T12:06:24.176342+02:00 (none) kernel: [] ? balance_pgdat+0x750/0x750
2011-08-20T12:06:24.176345+02:00 (none) kernel: [] kthread+0x96/0xa0
2011-08-20T12:06:24.176348+02:00 (none) kernel: [] kernel_thread_helper+0x4/0x10
2011-08-20T12:06:24.176351+02:00 (none) kernel: [] ? kthread_worker_fn+0x180/0x180
2011-08-20T12:06:24.176354+02:00 (none) kernel: [] ? gs_change+0xb/0xb
2011-08-20T12:06:24.176357+02:00 (none) kernel: Code: ff e9 21 fd ff ff 0f 1f 44 00 00 8b bd 5c ff ff ff 48 8d 95 60 ff ff ff 4c 89 f6 e8 23 ee ff ff 48 8b 95 18 ff ff ff 48 8d 7d b0
2011-08-20T12:06:24.176362+02:00 (none) kernel: RIP [] balance_pgdat+0x488/0x750
2011-08-20T12:06:24.176364+02:00 (none) kernel: RSP
2011-08-20T12:06:24.176366+02:00 (none) kernel: CR2: 0000000000000000
2011-08-20T12:06:24.176368+02:00 (none) kernel: ---[ end trace 0881f2e4675155cf ]---
2011-08-20T12:06:24.672080+02:00 (none) kernel: rsync: page allocation failure: order:0, mode:0x20
2011-08-20T12:06:24.672105+02:00 (none) kernel: Pid: 7670, comm: rsync Tainted: P D 3.0.3-gentoo #1
2011-08-20T12:06:24.672108+02:00 (none) kernel: Call Trace:
2011-08-20T12:06:24.672113+02:00 (none) kernel: [] warn_alloc_failed+0xe3/0x130
2011-08-20T12:06:24.672120+02:00 (none) kernel: [] __alloc_pages_nodemask+0x57d/0x700
2011-08-20T12:06:24.672141+02:00 (none) kernel: [] alloc_pages_current+0xa0/0x100
2011-08-20T12:06:24.672144+02:00 (none) kernel: [] new_slab+0x25c/0x270
2011-08-20T12:06:24.672147+02:00 (none) kernel: [] __slab_alloc.clone.51+0x1be/0x350
2011-08-20T12:06:24.672150+02:00 (none) kernel: [] kmem_cache_alloc+0x9f/0xb0
2011-08-20T12:06:24.672153+02:00 (none) kernel: [] scsi_pool_alloc_command+0x45/0x80
2011-08-20T12:06:24.672157+02:00 (none) kernel: [] scsi_host_alloc_command.clone.9+0x2e/0x90
2011-08-20T12:06:24.672160+02:00 (none) kernel: [] __scsi_get_command+0x29/0xc0
2011-08-20T12:06:24.672162+02:00 (none) kernel: [] scsi_get_command+0x43/0xc0
2011-08-20T12:06:24.672165+02:00 (none) kernel: [] scsi_setup_fs_cmnd+0x8d/0xe0
2011-08-20T12:06:24.672168+02:00 (none) kernel: [] sd_prep_fn+0x14f/0xdc0
2011-08-20T12:06:24.672171+02:00 (none) kernel: [] blk_peek_request+0xae/0x1f0
2011-08-20T12:06:24.672174+02:00 (none) kernel: [] scsi_request_fn+0x4f/0x470
2011-08-20T12:06:24.672177+02:00 (none) kernel: [] __make_request+0x2a8/0x2d0
2011-08-20T12:06:24.672179+02:00 (none) kernel: [] generic_make_request+0x246/0x360
2011-08-20T12:06:24.672182+02:00 (none) kernel: [] ? mempool_alloc+0x58/0x130
2011-08-20T12:06:24.672185+02:00 (none) kernel: [] submit_bio+0x61/0xd0
2011-08-20T12:06:24.672188+02:00 (none) kernel: [] __vdev_disk_physio+0x36f/0x3f0 [zfs]
2011-08-20T12:06:24.672191+02:00 (none) kernel: [] vdev_disk_io_start+0x64/0x110 [zfs]
2011-08-20T12:06:24.672195+02:00 (none) kernel: [] zio_vdev_io_start+0xa4/0x2d0 [zfs]
2011-08-20T12:06:24.672198+02:00 (none) kernel: [] zio_nowait+0xa7/0x170 [zfs]
2011-08-20T12:06:24.672201+02:00 (none) kernel: [] vdev_mirror_io_start+0x18c/0x3f0 [zfs]
2011-08-20T12:06:24.672204+02:00 (none) kernel: [] ? vdev_config_sync+0x160/0x160 [zfs]
2011-08-20T12:06:24.672207+02:00 (none) kernel: [] zio_vdev_io_start+0x20f/0x2d0 [zfs]
2011-08-20T12:06:24.672210+02:00 (none) kernel: [] zio_nowait+0xa7/0x170 [zfs]
2011-08-20T12:06:24.672213+02:00 (none) kernel: [] ? arc_buf_clone.clone.14+0xa0/0xa0 [zfs]
2011-08-20T12:06:24.672233+02:00 (none) kernel: [] arc_read_nolock+0x493/0x740 [zfs]
2011-08-20T12:06:24.672237+02:00 (none) kernel: [] arc_read+0x7f/0x130 [zfs]
2011-08-20T12:06:24.672241+02:00 (none) kernel: [] ? kmem_alloc_debug+0xab/0x120 [spl]
2011-08-20T12:06:24.672244+02:00 (none) kernel: [] dsl_read+0x2c/0x30 [zfs]
2011-08-20T12:06:24.672247+02:00 (none) kernel: [] dbuf_prefetch+0x1c1/0x2a0 [zfs]
2011-08-20T12:06:24.672250+02:00 (none) kernel: [] dmu_zfetch_dofetch.clone.6+0xe8/0x160 [zfs]
2011-08-20T12:06:24.672253+02:00 (none) kernel: [] ? dbuf_sync_leaf.clone.6+0x310/0x310 [zfs]
2011-08-20T12:06:24.672256+02:00 (none) kernel: [] dmu_zfetch+0x6f0/0xd60 [zfs]
2011-08-20T12:06:24.672259+02:00 (none) kernel: [] dbuf_read+0x41b/0x770 [zfs]
2011-08-20T12:06:24.672262+02:00 (none) kernel: [] dmu_buf_hold_array_by_dnode+0x15a/0x480 [zfs]
2011-08-20T12:06:24.672266+02:00 (none) kernel: [] dmu_buf_hold_array+0x60/0x90 [zfs]
2011-08-20T12:06:24.672269+02:00 (none) kernel: [] ? avl_add+0x33/0x50 [zavl]
2011-08-20T12:06:24.672272+02:00 (none) kernel: [] dmu_read_uio+0x3c/0xd0 [zfs]
2011-08-20T12:06:24.672275+02:00 (none) kernel: [] zfs_read+0x3b7/0x490 [zfs]
2011-08-20T12:06:24.672278+02:00 (none) kernel: [] zpl_read_common+0x4d/0x80 [zfs]
2011-08-20T12:06:24.672305+02:00 (none) kernel: [] zpl_read+0x5f/0x90 [zfs]
2011-08-20T12:06:24.672308+02:00 (none) kernel: [] vfs_read+0xc3/0x170
2011-08-20T12:06:24.672311+02:00 (none) kernel: [] sys_read+0x4c/0x90
2011-08-20T12:06:24.672314+02:00 (none) kernel: [] system_call_fastpath+0x16/0x1b
2011-08-20T12:06:24.672316+02:00 (none) kernel: Mem-Info:
2011-08-20T12:06:24.672318+02:00 (none) kernel: Node 0 DMA per-cpu:
2011-08-20T12:06:24.672320+02:00 (none) kernel: CPU 0: hi: 0, btch: 1 usd: 0
2011-08-20T12:06:24.672322+02:00 (none) kernel: CPU 1: hi: 0, btch: 1 usd: 0
2011-08-20T12:06:24.672325+02:00 (none) kernel: CPU 2: hi: 0, btch: 1 usd: 0
2011-08-20T12:06:24.672327+02:00 (none) kernel: CPU 3: hi: 0, btch: 1 usd: 0
2011-08-20T12:06:24.672329+02:00 (none) kernel: Node 0 DMA32 per-cpu:
2011-08-20T12:06:24.672331+02:00 (none) kernel: CPU 0: hi: 186, btch: 31 usd: 26
2011-08-20T12:06:24.672333+02:00 (none) kernel: CPU 1: hi: 186, btch: 31 usd: 84
2011-08-20T12:06:24.672335+02:00 (none) kernel: CPU 2: hi: 186, btch: 31 usd: 26
2011-08-20T12:06:24.672337+02:00 (none) kernel: CPU 3: hi: 186, btch: 31 usd: 23
2011-08-20T12:06:24.672358+02:00 (none) kernel: Node 0 Normal per-cpu:
2011-08-20T12:06:24.672362+02:00 (none) kernel: CPU 0: hi: 186, btch: 31 usd: 0
2011-08-20T12:06:24.672364+02:00 (none) kernel: CPU 1: hi: 186, btch: 31 usd: 123
2011-08-20T12:06:24.672366+02:00 (none) kernel: CPU 2: hi: 186, btch: 31 usd: 0
2011-08-20T12:06:24.672369+02:00 (none) kernel: CPU 3: hi: 186, btch: 31 usd: 27
2011-08-20T12:06:24.672371+02:00 (none) kernel: active_anon:67036 inactive_anon:11535 isolated_anon:0
2011-08-20T12:06:24.672374+02:00 (none) kernel: active_file:188613 inactive_file:385584 isolated_file:0
2011-08-20T12:06:24.672376+02:00 (none) kernel: unevictable:1 dirty:13 writeback:0 unstable:0
2011-08-20T12:06:24.672379+02:00 (none) kernel: free:7057 slab_reclaimable:48558 slab_unreclaimable:27455
2011-08-20T12:06:24.672382+02:00 (none) kernel: mapped:15586 shmem:826 pagetables:3854 bounce:0
2011-08-20T12:06:24.672393+02:00 (none) kernel: Node 0 DMA free:15904kB min:20kB low:24kB high:28kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15688kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:8kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
2011-08-20T12:06:24.672411+02:00 (none) kernel: lowmem_reserve[]: 0 3255 8053 8053
2011-08-20T12:06:24.672484+02:00 (none) kernel: Node 0 DMA32 free:12324kB min:4636kB low:5792kB high:6952kB active_anon:11872kB inactive_anon:3224kB active_file:2824kB inactive_file:705308kB unevictable:4kB isolated(anon):0kB isolated(file):0kB present:3334048kB mlocked:4kB dirty:0kB writeback:0kB mapped:172kB shmem:164kB slab_reclaimable:52796kB slab_unreclaimable:22356kB kernel_stack:8kB pagetables:60kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:84 all_unreclaimable? no
2011-08-20T12:06:24.672492+02:00 (none) kernel: lowmem_reserve[]: 0 0 4797 4797
2011-08-20T12:06:24.672502+02:00 (none) kernel: Node 0 Normal free:0kB min:6832kB low:8540kB high:10248kB active_anon:256272kB inactive_anon:42916kB active_file:751628kB inactive_file:837028kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:4912640kB mlocked:0kB dirty:52kB writeback:0kB mapped:62172kB shmem:3140kB slab_reclaimable:141436kB slab_unreclaimable:87456kB kernel_stack:3408kB pagetables:15356kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
2011-08-20T12:06:24.672506+02:00 (none) kernel: lowmem_reserve[]: 0 0 0 0
2011-08-20T12:06:24.672510+02:00 (none) kernel: Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15904kB
2011-08-20T12:06:24.672540+02:00 (none) kernel: Node 0 DMA32: 1033*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 2*4096kB = 12324kB
2011-08-20T12:06:24.672544+02:00 (none) kernel: Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB
2011-08-20T12:06:24.672547+02:00 (none) kernel: 575024 total pagecache pages
2011-08-20T12:06:24.672549+02:00 (none) kernel: 0 pages in swap cache
2011-08-20T12:06:24.672552+02:00 (none) kernel: Swap cache stats: add 0, delete 0, find 0/0
2011-08-20T12:06:24.672565+02:00 (none) kernel: Free swap = 4192252kB
2011-08-20T12:06:24.672567+02:00 (none) kernel: Total swap = 4192252kB
2011-08-20T12:06:24.672962+02:00 (none) kernel: 2097136 pages RAM
2011-08-20T12:06:24.672980+02:00 (none) kernel: 48233 pages reserved
2011-08-20T12:06:24.672984+02:00 (none) kernel: 112367 pages shared
2011-08-20T12:06:24.672987+02:00 (none) kernel: 1980933 pages non-shared
2011-08-20T12:06:24.672989+02:00 (none) kernel: SLUB: Unable to allocate memory on node -1 (gfp=0x20)
2011-08-20T12:06:24.672993+02:00 (none) kernel: cache: kmalloc-128, object size: 128, buffer size: 128, default order: 0, min order: 0
2011-08-20T12:06:24.672996+02:00 (none) kernel: node 0: slabs: 0, objs: 0, free: 0
---cut---
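
For anyone triaging the memory dump in the log, the per-zone free-list lines (`Node 0 DMA: ...`, `Node 0 DMA32: ...`, `Node 0 Normal: ...`) can be sanity-checked by summing count times block size and comparing against the reported total. The following is only a rough sketch: the helper name `parse_buddy_line` is made up for illustration, and the input strings are the three lines from the log with the timestamp prefix dropped.

```python
import re

# Free-list summary lines from the log (timestamp prefix removed).
# The kernel prints each entry as count*size; the regex below also
# tolerates '_' as a separator in case a page rendering replaced the '*'.
BUDDY_LINES = [
    "Node 0 DMA: 0*4kB 0*8kB 0*16kB 1*32kB 2*64kB 1*128kB 1*256kB "
    "0*512kB 1*1024kB 1*2048kB 3*4096kB = 15904kB",
    "Node 0 DMA32: 1033*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
    "0*512kB 0*1024kB 0*2048kB 2*4096kB = 12324kB",
    "Node 0 Normal: 0*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB "
    "0*512kB 0*1024kB 0*2048kB 0*4096kB = 0kB",
]


def parse_buddy_line(line):
    """Return (zone, computed_kb, reported_kb) for one free-list line."""
    header, _, rest = line.partition(": ")
    zone = header.split()[-1]                 # e.g. "DMA32"
    blocks, _, total = rest.rpartition(" = ")
    pairs = re.findall(r"(\d+)[*_](\d+)kB", blocks)
    computed = sum(int(count) * int(size) for count, size in pairs)
    reported = int(re.match(r"(\d+)kB", total).group(1))
    return zone, computed, reported


for line in BUDDY_LINES:
    zone, computed, reported = parse_buddy_line(line)
    status = "ok" if computed == reported else "MISMATCH"
    print(f"{zone}: computed {computed}kB, reported {reported}kB ({status})")
```

All three zones add up, and the Normal zone really is at 0kB free, below its `min:6832kB` watermark, which matches the order-0 page allocation failure reported for rsync.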

This may be a kernel bug though.
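
As a reading aid for the `Oops: 0002` line in the trace: on x86 that number is the page-fault error code, printed in hex. Below is a minimal sketch that decodes its three low bits. The function name `decode_oops_code` is hypothetical, and the bit meanings are the architectural ones (bit 0 present/protection, bit 1 write, bit 2 user mode), not something taken from this report.

```python
# Low three bits of the x86 page-fault error code:
# (mask, meaning when set, meaning when clear)
PF_FLAGS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
]


def decode_oops_code(code_text):
    """Decode the hex error code shown after 'Oops:' into readable flags."""
    code = int(code_text, 16)
    return [
        set_name if code & mask else clear_name
        for mask, set_name, clear_name in PF_FLAGS
    ]


print(decode_oops_code("0002"))
# ['page not present', 'write access', 'kernel mode']
```

Read that way, the oops is a kernel-mode write to an unmapped page, which agrees with `CR2: 0000000000000000` and `PMD 0` in the dump.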

This has been 100% reproducible for me since I upgraded to kernel 2.6.38-r7. The latest kernel, 3.0.3, is affected too.

If this helps further:

System: Linux censored 3.0.3-gentoo #1 SMP Sat Aug 20 10:25:45 CEST 2011 x86_64 AMD Phenom(tm) 9950 Quad-Core Processor AuthenticAMD GNU/Linux

GLIBC: 2.13-r4

GCC: 4.5.3

SPL: 0.6.0-rc5
ZFS: 0.6.0-rc5

Thanks for your great work; I hope this helps improve the ZFS port on Linux.

Regards,
pazak.

@behlendorf
Contributor

This looks like a duplicate of issue #287. It has been observed a few times but has not been resolved yet.

@behlendorf
Contributor

Closing this issue; it should be fixed by the following two commits for #287.

6a95d0b
openzfs/spl@b8b6e4c
