This repository has been archived by the owner on Aug 29, 2023. It is now read-only.

BUG: Access NULL pointer without checking the return value of alloc_page #18

Closed
colo-ft opened this issue May 16, 2017 · 5 comments

Comments


colo-ft commented May 16, 2017

Hi @naixia @dolohow,

Sorry to bother you again. :)

We caught a new bug in our test, and its backtrace is:
[494558.706128] BUG: unable to handle kernel NULL pointer dereference at (null)
[494558.706135] IP: [] scan_vma_one_page+0x1a0/0x1430
[494558.706140] PGD 11b5f34067 PUD 11269f4067 PMD 0
[494558.706141] Thread overran stack, or stack corrupted
[494558.706144] Oops: 0000 [#1] SMP
[494558.706149] event processing in diff cpu! last event=0x10, last cpu=2, cur event=0x20, cur cpu=22.
[494558.706150] event maps(bit5-bit0): die-oom-intermit-reboot-emerge-panic
[494558.706153] the g_stDumpState = 0x10
this event = 0x20
record_cpu = 2
record_pid = 26769
cur_cpu = 22
cur_pid = 430
[494558.706154] subsequent! type = 1!
[494558.706160] 0000000000000092 00000000a6eb3604 ffff8803bd6539f8 ffffffffa0d70fcd
[494558.706164] ffff8803000001ae 0000000000000005 ffff8803bd653af8 0000000000000001
[494558.706169] ffff8803bd653a18 ffffffffa0d714e6 0000000000000001 ffff8803bd653af8
[494558.706174] ffff8803bd653ab0 ffffffffa0d76066 0000000000000000 ffff8803bd653b20
[494558.706179] ffffffff81656a31 ffffffff8164bfd5 000000000000000b 0000000000000000
[494558.706183] ffffffffa0d84680 0000000000000000 ffff8803bd653af8 0000000000000001
[494558.706188] ffffffffa0d84680 00000000a6eb3604 ffffffffa0d75ff5 00000000a6eb3604
[494558.706192] 00000000fffffffb 0000000000000000 0000000000000001 ffff8803bd653ae8
[494558.706197] ffffffff816507bc ffffffff818654d8 0000000000000000 ffff8803bd653c88
[494558.706202] ffff8803bd534500 0000000000000000 ffff8803bd653b20 ffffffff81650865
[494558.706207] ffff8803bd653c88 ffffffff818654d8 0000000000000000 0000000b0000000e
[494558.706211] 00000000a6eb3604 ffff8803bd653b48 ffffffff8164d6bf ffff8803bd653c88
[494558.706216] 0000000000000000 0000000000000046 ffff8803bd653b98 ffffffff8163da37
[494558.706220] 0003000100000000 ffff8803bd653c88 00000000a6eb3604 0000000000000000
[494558.706225] ffff8803bd653c88 0000000000000000 ffff8803bd534500 0000000000030001
[494558.706230] ffff8803bd653be0 ffffffff8163daf6 ffff8803bd653c88 0000000000000000
[494558.706234] 0000000000000000 ffff8803bd653c88 0000000000000000 0000000000000000
[494558.706239] 0000000000000aa9 ffff8803bd653bf0 ffffffff8163dc60 ffff8803bd653c50
[494558.706244] ffffffff81650466 ffffffff8164c56c 0000000000000000 ffff8803bd534500
[494558.706248] ffff8803bd653c88 0000000000000078 ffff8803bd653c88 0000000000000000
[494558.706253] 0000000000aa91e0 ffff880394945ba8 0000000000000aa9 ffff8803bd653c78
[494558.706258] ffffffff81650603 ffff88207efd7008 0000000000000001 ffff88135614c000
[494558.706262] ffff8803bd653db0 ffffffff8164c808 0000000000000aa9 ffff880394945ba8
[494558.706267] 0000000000aa91e0 ffff88135614c000 ffff8803bd653db0 0000000000000001
[494558.706268] Call Trace:
[494558.706279] [] kbox_event_printk+0xed/0x110 [kbox]
[494558.706287] [] kbox_event_pre_process+0x146/0x270 [kbox]
[494558.706296] [] kbox_die_callback+0x76/0x280 [kbox]
[494558.706302] [] ? ftrace_call+0x5/0x2f
[494558.706306] [] ? _raw_spin_unlock+0x5/0x30
[494558.706314] [] ? kbox_die_callback+0x5/0x280 [kbox]
[494558.706318] [] notifier_call_chain+0x4c/0x70
[494558.706323] [] notify_die+0x45/0x60
[494558.706328] [] __die+0x7f/0xf0
[494558.706334] [] no_context+0x266/0x2b2
[494558.706340] [] __bad_area_nosemaphore+0x73/0x1ca
[494558.706346] [] bad_area_nosemaphore+0x13/0x15
[494558.706349] [] __do_page_fault+0x2f6/0x470
[494558.706353] [] ? restore_args+0x30/0x30
[494558.706358] [] do_page_fault+0x23/0x80
[494558.706362] [] page_fault+0x28/0x30
[494558.706369] [] ? scan_vma_one_page+0x1a0/0x1430
[494558.706374] [] ? scan_vma_one_page+0xbec/0x1430
[494558.706380] [] uksm_do_scan+0x129/0x1910
[494558.706388] [] uksm_scan_thread+0x1b5/0x1e0
[494558.706394] [] ? wake_up_atomic_t+0x30/0x30
[494558.706398] [] ? uksm_do_scan+0x1910/0x1910
[494558.706402] [] kthread+0xcf/0xe0
[494558.706408] [] ? kthread_create_on_node+0x140/0x140
[494558.706412] [] ret_from_fork+0x58/0x90
[494558.706416] [] ? kthread_create_on_node+0x140/0x140
[494558.706420] die info:Oops:0000
[494558.706424] CPU: 22 PID: 430 Comm: uksmd Tainted: G OE ---- ------- 3.10.0-327.49.58.45_12.x86_64 #1
[494558.706425] Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 1.51 06/11/2015
[494558.706427] task: ffff8803bd534500 ti: ffff8803bd650000 task.ti: ffff8803bd650000
[494558.706431] RIP: 0010:[] [] scan_vma_one_page+0x1a0/0x1430
[494558.706433] RSP: 0000:ffff8803bd653d38 EFLAGS: 00010246
[494558.706434] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
[494558.706436] RDX: 0000000000000000 RSI: ffff8803bd534500 RDI: 00000000000082d0
[494558.706437] RBP: ffff8803bd653db0 R08: 0000000000000000 R09: 0000000000000040
[494558.706439] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88135614c000
[494558.706440] R13: 0000000000aa91e0 R14: ffff880394945ba8 R15: 0000000000000aa9
[494558.706442] FS: 0000000000000000(0000) GS:ffff88203e280000(0000) knlGS:0000000000000000
[494558.706444] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[494558.706445] CR2: 0000000000000000 CR3: 000000110e600000 CR4: 00000000001427e0
[494558.706447] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[494558.706448] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[494558.706449] Stack:
[494558.706454] 00000000b4900000 0000000000000001 ffff88012ca77ae0 000000000015523c
[494558.706458] 0000000000000375 ffff8803bd653fd8 000000000000000b 0000000000000000
[494558.706463] ffff88135614c000 00000000a6eb3604 0000000000000064 0000000000000005
[494558.706467] ffff8803ba0dabc0 ffffffff81efe980 ffff88135614c000 ffff8803bd653e60
[494558.706472] ffffffff811c6409 ffff8803bd534500 ffff8803bd534500 0000172edfb67198
[494558.706477] 0000000000000014 0000000000000043 0000000000000000 0000000000000000
[494558.706481] 0000000000000000 0000000000000286 ffff8803bd534500 ffff8803bd653fd8
[494558.706486] 0000000000000000 0000000000000286 ffffffff819dbde0 00000000a6eb3604
[494558.706490] ffff8803bd534500 ffff8803bd534500 ffff8803bd653e88 ffff8803bd534500
[494558.706495] 0000000000000000 ffff8803bd653ec0 ffffffff811c7da5 0000000000000000
[494558.706500] ffff8803bd534500 ffffffff810a7110 ffff8803bd653e88 ffff8803bd653e88
[494558.706504] 00000000a6eb3604 ffff8810294b7d78 0000000000000000 ffffffff811c7bf0
[494558.706509] 0000000000000000 ffff8803bd653f48 ffffffff810a60ef 0000000000000000
[494558.706513] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[494558.706518] ffff8803bd653ef8 ffff8803bd653ef8 ffff882000000000 ffff882000000000
[494558.706523] ffff8803bd653f18 ffff8803bd653f18 00000000a6eb3604 ffffffff810a6020
[494558.706527] 0000000000000000 0000000000000000 ffff8810294b7d78 ffffffff81654d58
[494558.706532] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[494558.706536] ffff8810294b7d78 ffffffff810a6020 0000000000000000 0000000000000000
[494558.706541] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[494558.706545] 0000000000000000 0000000000000000 0000000000000000 ffffffffffffffff
[494558.706550] 0000000000000000 0000000000000010 0000000000000202 ffff8803bd653f58
[494558.706552] 0000000000000018
[494558.706553] Call Trace:
[494558.706560] [] uksm_do_scan+0x129/0x1910
[494558.706568] [] uksm_scan_thread+0x1b5/0x1e0
[494558.706571] [] ? wake_up_atomic_t+0x30/0x30
[494558.706576] [] ? uksm_do_scan+0x1910/0x1910
[494558.706579] [] kthread+0xcf/0xe0
[494558.706585] [] ? kthread_create_on_node+0x140/0x140
[494558.706589] [] ret_from_fork+0x58/0x90
[494558.706593] [] ? kthread_create_on_node+0x140/0x140
[494558.706650] Code: 00 00 00 00 00 16 00 00 48 01 d0 48 ba 00 00 00 00 00 88 ff ff 48 c1 f8 06 48 c1 e0 0c 48 01 d0 4c 01 e8 48 89 45 c0 48 8b 45 c0 <48> 83 38 00 0f 84 44 0d 00 00 49 83 3e 00 0f 84 04 12 00 00 48
[494558.784626] (o2hb_write-E5B3,29127,27):o2hb_write_disk_heartbeat_async:2512 ERROR: status = -12
[494558.856053] (o2hb_write-E5B3,29127,27):o2hb_write_disk_heartbeat_async:2512 ERROR: status = -12
[494558.879184] device-mapper: multipath: Failing path 68:0.
[494558.907627] device-mapper: multipath: Failing path 68:96.
[494559.707188] I(0x20) am waiting for the last events(0x10) done..2....4....6....8....10....12....14....16....18....20..
[494559.707268] Modules linked in: kboxdriver(O) kbox(O) bridge ip6table_filter ip6_tables iptable_filter dm_service_time dm_multipath iscsi_tcp libiscsi_tcp libiscsi scsi_transport_iscsi 8021q garp stp mrp llc vfat fat isofs ext4 xfs ocfs2_lockfs(OE) ocfs2(OE) ocfs2_adl(OE) jbd2 ocfs2_stack_o2cb(OE) ocfs2_dlm(OE) ocfs2_nodemanager(OE) ocfs2_stackglue(OE) dev_connlimit(O) vhba(OE) bum(O) ip_set nfnetlink prio(O) nat(O) vport_vxlan(O) openvswitch(O) nf_defrag_ipv6 gre hotpatch(OE) signo_catch(O) pmcint(O) ipmi_devintf ipmi_si ipmi_msghandler coretemp intel_rapl crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel be2net lrw gf128mul vxlan glue_helper ablk_helper iTCO_wdt cryptd sg ip6_udp_tunnel iTCO_vendor_support pcspkr i2c_i801 udp_tunnel kvm_intel(O) i2c_core kvm(O) lpc_ich sb_edac mfd_core edac_core
[494559.707316] mei_me mei shpchp acpi_power_meter remote_trigger(O) nf_conntrack_ipv4 nf_defrag_ipv4 vhost_net(O) tun(O) vhost(O) macvtap macvlan vfio_pci irqbypass vfio_iommu_type1 vfio xt_sctp nf_conntrack_proto_sctp nf_nat_proto_sctp nf_nat nf_conntrack sctp libcrc32c ip_tables ext3 mbcache jbd dm_mod sd_mod lpfc crc_t10dif crct10dif_generic ahci mpt2sas crct10dif_pclmul scsi_transport_fc libahci raid_class scsi_tgt libata scsi_transport_sas crct10dif_common nbd(OE) [last unloaded: kbox]
[494559.707320] CPU: 22 PID: 430 Comm: uksmd Tainted: G OE ---- ------- 3.10.0-327.49.58.45_12.x86_64 #1
[494559.707321] Hardware name: Huawei CH121 V3/IT11SGCA1, BIOS 1.51 06/11/2015
[494559.707323] task: ffff8803bd534500 ti: ffff8803bd650000 task.ti: ffff8803bd650000
[494559.707328] RIP: 0010:[] [] scan_vma_one_page+0x1a0/0x1430
[494559.707330] RSP: 0000:ffff8803bd653d38 EFLAGS: 00010246
[494559.707331] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000002
[494559.707333] RDX: 0000000000000000 RSI: ffff8803bd534500 RDI: 00000000000082d0
[494559.707334] RBP: ffff8803bd653db0 R08: 0000000000000000 R09: 0000000000000040
[494559.707336] R10: 0000000000000001 R11: 0000000000000001 R12: ffff88135614c000
[494559.707337] R13: 0000000000aa91e0 R14: ffff880394945ba8 R15: 0000000000000aa9
[494559.707339] FS: 0000000000000000(0000) GS:ffff88203e280000(0000) knlGS:0000000000000000
[494559.707341] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[494559.707342] CR2: 0000000000000000 CR3: 000000110e600000 CR4: 00000000001427e0
[494559.707343] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[494559.707345] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[494559.707346] Stack:
[494559.707351] 00000000b4900000 0000000000000001 ffff88012ca77ae0 000000000015523c
[494559.707355] 0000000000000375 ffff8803bd653fd8 000000000000000b 0000000000000000
[494559.707360] ffff88135614c000 00000000a6eb3604 0000000000000064 0000000000000005
[494559.707364] ffff8803ba0dabc0 ffffffff81efe980 ffff88135614c000 ffff8803bd653e60
[494559.707369] ffffffff811c6409 ffff8803bd534500 ffff8803bd534500 0000172edfb67198
[494559.707373] 0000000000000014 0000000000000043 0000000000000000 0000000000000000
[494559.707377] 0000000000000000 0000000000000286 ffff8803bd534500 ffff8803bd653fd8
[494559.707382] 0000000000000000 0000000000000286 ffffffff819dbde0 00000000a6eb3604
[494559.707387] ffff8803bd534500 ffff8803bd534500 ffff8803bd653e88 ffff8803bd534500
[494559.707392] 0000000000000000 ffff8803bd653ec0 ffffffff811c7da5 0000000000000000
[494559.707396] ffff8803bd534500 ffffffff810a7110 ffff8803bd653e88 ffff8803bd653e88
[494559.707401] 00000000a6eb3604 ffff8810294b7d78 0000000000000000 ffffffff811c7bf0
[494559.707406] 0000000000000000 ffff8803bd653f48 ffffffff810a60ef 0000000000000000
[494559.707410] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[494559.707415] ffff8803bd653ef8 ffff8803bd653ef8 ffff882000000000 ffff882000000000
[494559.707419] ffff8803bd653f18 ffff8803bd653f18 00000000a6eb3604 ffffffff810a6020
[494559.707424] 0000000000000000 0000000000000000 ffff8810294b7d78 ffffffff81654d58
[494559.707428] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[494559.707432] ffff8810294b7d78 ffffffff810a6020 0000000000000000 0000000000000000
[494559.707437] 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[494559.707441] 0000000000000000 0000000000000000 0000000000000000 ffffffffffffffff
[494559.707446] 0000000000000000 0000000000000010 0000000000000202 ffff8803bd653f58
[494559.707447] 0000000000000018
[494559.707448] Call Trace:
[494559.707455] [] uksm_do_scan+0x129/0x1910
[494559.707463] [] uksm_scan_thread+0x1b5/0x1e0
[494559.707467] [] ? wake_up_atomic_t+0x30/0x30
[494559.707471] [] ? uksm_do_scan+0x1910/0x1910
[494559.707475] [] kthread+0xcf/0xe0
[494559.707480] [] ? kthread_create_on_node+0x140/0x140
[494559.707485] [] ret_from_fork+0x58/0x90
[494559.707489] [] ? kthread_create_on_node+0x140/0x140
[494559.707545] Code: 00 00 00 00 00 16 00 00 48 01 d0 48 ba 00 00 00 00 00 88 ff ff 48 c1 f8 06 48 c1 e0 0c 48 01 d0 4c 01 e8 48 89 45 c0 48 8b 45 c0 <48> 83 38 00 0f 84 44 0d 00 00 49 83 3e 00 0f 84 04 12 00 00 48
[494559.707548] RIP [] scan_vma_one_page+0x1a0/0x1430
[494559.707549] RSP
[494559.707550] CR2: 0000000000000000
[494559.708372] ---[ end trace 79ebb2813afb31f5 ]---
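
For what it's worth, the faulting instruction can be read straight from the `Code:` line: the bytes marked `<48> 83 38 00` decode to `cmpq $0x0,(%rax)`, and the register dump shows RAX = 0 with CR2 = 0, which is consistent with a read through a NULL pointer. Below is a minimal sketch in C of that decoding. The helper name `decode_group1_imm8` and the `struct decoded` layout are made up for illustration, and it only handles this one encoding (optional REX prefix, opcode 0x83, register-indirect operand without SIB or displacement, 8-bit immediate).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decoded fields of an "opcode 0x83 /r ib" instruction (hypothetical layout). */
struct decoded {
	int rex_w;       /* 1 if a REX.W prefix selects 64-bit operands */
	const char *op;  /* mnemonic selected by the ModRM reg field */
	int mod;         /* ModRM mod field (0 = register-indirect) */
	int rm;          /* ModRM rm field (0 = RAX) */
	int8_t imm;      /* sign-extended 8-bit immediate */
};

/*
 * Hypothetical helper: decodes only [REX] 0x83 ModRM imm8 with a plain
 * register-indirect operand. Returns 0 on success, -1 for anything else.
 */
int decode_group1_imm8(const uint8_t *b, size_t len, struct decoded *out)
{
	static const char *const ops[8] = {
		"add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"
	};
	size_t i = 0;

	assert(b && out);
	memset(out, 0, sizeof(*out));
	if (len < 3)
		return -1;

	if ((b[i] & 0xf0) == 0x40) {      /* REX prefix has the form 0100WRXB */
		out->rex_w = (b[i] >> 3) & 1;
		i++;
	}
	if (len - i < 3 || b[i] != 0x83)  /* need opcode, ModRM and imm8 */
		return -1;
	i++;

	out->mod = b[i] >> 6;
	out->op = ops[(b[i] >> 3) & 7];
	out->rm = b[i] & 7;
	/* SIB, RIP-relative and displacement forms are out of scope here. */
	if (out->mod != 0 || out->rm == 4 || out->rm == 5)
		return -1;
	i++;

	out->imm = (int8_t)b[i];
	return 0;
}
```

Feeding it the four bytes `48 83 38 00` yields REX.W set, mnemonic `cmp`, mod 0, rm 0 (RAX) and immediate 0, i.e. a 64-bit compare of the memory at the address in RAX against zero.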

The corresponding code is:
(gdb) b *(scan_vma_one_page+0x1a0)
Breakpoint 1 at 0xffffffff811c5050: file mm/uksm.c, line 3278.

3240 static struct rmap_item *get_next_rmap_item(struct vma_slot *slot, u32 *hash)
3276 if (swap_index != scan_index) {
3277 swap_entry = get_rmap_list_entry(slot, swap_index, 1);
3278 if (entry_is_new(swap_entry)) {
3279 swap_entry->addr = get_index_orig_addr(slot,
3280 swap_index);
3281 set_is_addr(swap_entry->addr);

In the code above, swap_entry is dereferenced without checking whether get_rmap_list_entry() returned NULL. Under extreme memory pressure alloc_page() can fail, in which case get_rmap_list_entry() returns NULL. The related code is shown below.
2964 struct rmap_list_entry *get_rmap_list_entry(struct vma_slot *slot,
2965 unsigned long index, int need_alloc)
2966 {
2967 unsigned long pool_index;
2968 struct page *page;
2969 void *addr;
2970
2971
2972 pool_index = get_pool_index(slot, index);
2973 if (!slot->rmap_list_pool[pool_index]) {
2974 if (!need_alloc)
2975 return NULL;
2976
2977 page = alloc_page(GFP_KERNEL | __GFP_ZERO | __GFP_NOWARN);
2978 if (!page)
2979 return NULL;
2980
2981 slot->rmap_list_pool[pool_index] = page;
2982 }
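
To make the failure path concrete, here is a small userspace sketch in C of the same allocate-on-demand pattern. It is only a model under stated assumptions: `fake_alloc_page`, `fail_alloc`, `POOL_SLOTS` and `ENTRIES_PER_PAGE` are hypothetical names and values, the index arithmetic is a guess (the rest of the real function is not quoted above), and `calloc` stands in for `alloc_page()` with `__GFP_ZERO`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SLOTS 4        /* assumption: tiny pool, for illustration only */
#define ENTRIES_PER_PAGE 8  /* assumption: not the kernel's real value */

struct rmap_list_entry {
	unsigned long addr;
};

/* Simplified slot: each pool element models one lazily allocated page. */
struct vma_slot {
	struct rmap_list_entry *rmap_list_pool[POOL_SLOTS];
};

/* Fault injection switch: when nonzero, the stand-in allocator fails. */
int fail_alloc;

/* Stand-in for alloc_page(): returns zeroed memory, or NULL on failure. */
void *fake_alloc_page(void)
{
	if (fail_alloc)
		return NULL;
	return calloc(ENTRIES_PER_PAGE, sizeof(struct rmap_list_entry));
}

/* Model of get_rmap_list_entry(): may return NULL even if need_alloc is set. */
struct rmap_list_entry *get_rmap_list_entry(struct vma_slot *slot,
					    unsigned long index, int need_alloc)
{
	unsigned long pool_index = index / ENTRIES_PER_PAGE;

	assert(slot);
	if (pool_index >= POOL_SLOTS)
		return NULL;

	if (!slot->rmap_list_pool[pool_index]) {
		if (!need_alloc)
			return NULL;

		slot->rmap_list_pool[pool_index] = fake_alloc_page();
		if (!slot->rmap_list_pool[pool_index])
			return NULL;  /* allocation failed: caller must check */
	}
	return &slot->rmap_list_pool[pool_index][index % ENTRIES_PER_PAGE];
}
```

With `fail_alloc` set, a call with `need_alloc = 1` on an empty slot returns NULL, which is the case the caller at line 3278 does not handle.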

IMHO there are two ways to fix this. The first is to check for NULL in get_next_rmap_item() before dereferencing swap_entry, and return NULL if get_rmap_list_entry() returns NULL. The other is to make sure alloc_page() cannot fail here, by using __GFP_HIGH | __GFP_NOFAIL.
What do you think?

Thanks

Collaborator

naixia commented May 16, 2017

@colo-ft
Good catch. I think it's safer to check the NULL returned by get_rmap_list_entry() and abort this call to scan_vma_one_page().
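
The check-and-abort approach could look roughly like the userspace sketch below. It is a model, not the real uksm code: `lookup_entry_stub`, `next_entry`, the `_model` function names, the placeholder address `0x1000` and the `-ENOMEM` return value are all assumptions made for illustration, and the real return convention of scan_vma_one_page() may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct rmap_list_entry {
	unsigned long addr;
};

struct rmap_item {
	unsigned long address;
};

/* Test hook (hypothetical): what the stubbed lookup hands back next. */
struct rmap_list_entry *next_entry;

/* Stub for get_rmap_list_entry(); NULL models a failed page allocation. */
struct rmap_list_entry *lookup_entry_stub(void)
{
	return next_entry;
}

struct rmap_item item_storage;

/* Model of get_next_rmap_item(): bail out before touching a NULL entry. */
struct rmap_item *get_next_rmap_item_model(void)
{
	struct rmap_list_entry *swap_entry = lookup_entry_stub();

	if (!swap_entry)
		return NULL;                /* the added NULL check */

	if (swap_entry->addr == 0)          /* stand-in for entry_is_new() */
		swap_entry->addr = 0x1000;  /* stand-in for get_index_orig_addr() */

	item_storage.address = swap_entry->addr;
	return &item_storage;
}

/* Model of scan_vma_one_page(): give up on this call if no item came back. */
int scan_vma_one_page_model(void)
{
	struct rmap_item *item = get_next_rmap_item_model();

	if (!item)
		return -ENOMEM;             /* assumed error code, see note above */

	assert(item->address != 0);         /* a returned item has a resolved address */
	return 0;
}
```

The point of the design is that the failure is reported upward and the scan can be retried on a later pass, instead of forcing the allocation to succeed while the system is already short on memory.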

Author

colo-ft commented May 16, 2017

@naixia Got it, thanks for your quick reply.

Collaborator

naixia commented Jun 15, 2017

@colo-ft
Have you fixed this bug already? Will you share your patch with us?

Author

colo-ft commented Jul 6, 2017

@naixia
Hi Mr. Xia,

Sorry for the late reply.

The fix is quite simple. Following your suggestion, it looks like this:

diff --git a/mm/uksm.c b/mm/uksm.c
--- a/mm/uksm.c
+++ b/mm/uksm.c
@@ -3275,6 +3275,9 @@ static struct rmap_item *get_next_rmap_item(struct vma_slot *slot, u32 *hash)
 
 	if (swap_index != scan_index) {
 		swap_entry = get_rmap_list_entry(slot, swap_index, 1);
+		if (!swap_entry)
+			return NULL;
+
 		if (entry_is_new(swap_entry)) {
 			swap_entry->addr = get_index_orig_addr(slot,
 							swap_index);

--

Thanks

Collaborator

naixia commented Jul 30, 2017

OK, I'll add the fix code to the next release.
