BUG: unable to handle kernel NULL pointer dereference at tcp_validate_incoming - tp->mptcp == NULL but tp->mpc is set #71

cpaasch · 2015-01-19T22:12:22Z

Happened at least once with the test simple_abndiff on f9edddd (v3.17 + MPTCP) with the scheduling patches (#70) applied. However, it seems to be unrelated to the scheduling changes.

Crash happens when accessing tp->mptcp in mptcp_reset_mopt().

[267465.961722] BUG: unable to handle kernel NULL pointer dereference at 0000000000000019
[267465.968188] IP: [<ffffffff8166305c>] tcp_validate_incoming+0x35c/0x3b0
[267465.974360] PGD 0
[267465.977647] Oops: 0000 [#1] SMP
[267465.981644] Modules linked in:
[267465.987349] CPU: 3 PID: 26928 Comm: apache2 Not tainted 3.17.0.mptcp #110
[267465.991906] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[267465.991906] task: ffff88002f2e2000 ti: ffff88003b5d0000 task.ti: ffff88003b5d0000
[267465.991906] RIP: 0010:[<ffffffff8166305c>]  [<ffffffff8166305c>] tcp_validate_incoming+0x35c/0x3b0
[267465.991906] RSP: 0018:ffff88003b5d3c18  EFLAGS: 00010202
[267465.991906] RAX: 0000000000000000 RBX: ffff88003c9cbd80 RCX: 0000000000000000
[267465.991906] RDX: 0000000000000000 RSI: ffff88002f2e26f0 RDI: 0000000000000286
[267465.991906] RBP: ffff88003b5d3c48 R08: 0000000000000000 R09: 0000000000000000
[267465.991906] R10: 0000000000000002 R11: 0000000000000000 R12: ffff88003c160c00
[267465.991906] R13: ffff88002f6b3462 R14: 0000000000000000 R15: ffff88003c160c28
[267465.991906] FS:  00007fd6f10c0700(0000) GS:ffff88003fd80000(0000) knlGS:0000000000000000
[267465.991906] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[267465.991906] CR2: 0000000000000019 CR3: 000000002ee47000 CR4: 00000000000006e0
[267465.991906] Stack:
[267465.991906]  0000000000000002 ffff88003c9cbd80 ffff88003c160c00 ffff88002f6b3462
[267465.991906]  0000000000000000 ffff88002ec9e988 ffff88003b5d3c98 ffffffff816641c8
[267465.991906]  ffff88003b5d3c78 ffff88003c160600 ffff88003e001700 ffff88003c160c00
[267465.991906] Call Trace:
[267465.991906]  [<ffffffff816641c8>] tcp_rcv_state_process+0x208/0x910
[267465.991906]  [<ffffffff8166ebf3>] tcp_v4_do_rcv+0xe3/0x2a0
[267465.991906]  [<ffffffff81703a10>] ? mptcp_backlog_rcv+0xa0/0xb0
[267465.991906]  [<ffffffff816f12fd>] tcp_v6_do_rcv+0x1dd/0x420
[267465.991906]  [<ffffffff817039bb>] mptcp_backlog_rcv+0x4b/0xb0
[267465.991906]  [<ffffffff815e87a2>] release_sock+0x92/0x1f0
[267465.991906]  [<ffffffff81705645>] mptcp_close+0x1f5/0x5c0
[267465.991906]  [<ffffffff816576fd>] tcp_close+0x2ed/0x4a0
[267465.991906]  [<ffffffff81685b4e>] inet_release+0xae/0xf0
[267465.991906]  [<ffffffff81685ac9>] ? inet_release+0x29/0xf0
[267465.991906]  [<ffffffff816c2d0f>] inet6_release+0x3f/0x50
[267465.991906]  [<ffffffff815e1349>] sock_release+0x29/0xa0
[267465.991906]  [<ffffffff815e1542>] sock_close+0x12/0x20
[267465.991906]  [<ffffffff811a4678>] __fput+0xc8/0x210
[267465.991906]  [<ffffffff811a486e>] ____fput+0xe/0x10
[267465.991906]  [<ffffffff8108a70d>] task_work_run+0xad/0xe0
[267465.991906]  [<ffffffff81014f45>] do_notify_resume+0x75/0x80
[267465.991906]  [<ffffffff813770de>] ? trace_hardirqs_on_thunk+0x3a/0x3f
[267465.991906]  [<ffffffff817317e2>] int_signal+0x12/0x17
[267465.991906] Code: 8b cc 07 00 00 c1 e0 0a f7 f6 39 c1 0f 87 4c ff ff ff e9 12 fe ff ff f6 83 58 09 00 00 01 0f 84 9f fd ff ff 48 8b 83 60 09 00 00 <0f> b6 50 19 80 60 18 8f 83 e2 80 88 50 19 e9 85 fd ff ff 3b b3
[267465.991906] RIP  [<ffffffff8166305c>] tcp_validate_incoming+0x35c/0x3b0
[267465.991906]  RSP <ffff88003b5d3c18>
[267465.991906] CR2: 0000000000000019
[267465.991906] ---[ end trace dca7bd246b046643 ]---
[267465.991906] Kernel panic - not syncing: Fatal exception in interrupt
[267465.991906] Kernel Offset: 0x0 from 0xffffffff81000000 (relocation range: 0xffffffff80000000-0xffffffff9fffffff)
[267465.991906] ---[ end Kernel panic - not syncing: Fatal exception in interrupt
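
Reading the Code: line of the trace: the faulting instruction (marked <0f>, bytes 0f b6 50 19) is a one-byte load from offset 0x19 off RAX, and RAX is 0, which matches CR2 = 0x19. The instructions just before it test bit 0 of a byte at RBX+0x958 and then load a pointer from RBX+0x960 into RAX. That fits the title (flag set, pointer NULL), although mapping those two offsets to tp->mpc and tp->mptcp is an inference, not something the trace states. The following is a minimal userspace sketch of that state and of a defensive check. All struct and function names here are hypothetical stand-ins, not the kernel's definitions, and this is not the actual fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-subflow MPTCP state. Only the two
 * bytes touched by the instructions in the trace are modelled. */
struct model_mptcp_cb {
    uint8_t pad[0x18];
    uint8_t flags18;   /* trace: and byte [rax+0x18], 0x8f */
    uint8_t flags19;   /* trace: load [rax+0x19], mask with 0x80, store back */
};

/* Hypothetical stand-in for the socket; layout does not match tcp_sock. */
struct model_tcp_sock {
    unsigned int mpc : 1;           /* "MPTCP in use" flag */
    struct model_mptcp_cb *mptcp;   /* may be NULL after teardown */
};

/* Returns 1 if the state was consistent (options reset or plain TCP),
 * 0 if the flag is set but the pointer is NULL. */
static int model_reset_mopt(struct model_tcp_sock *tp)
{
    if (!tp->mpc)
        return 1;                   /* plain TCP: nothing to reset */
    if (tp->mptcp == NULL)
        return 0;                   /* the state named in the issue title */

    /* Same bit masks as the instructions after the faulting load. */
    tp->mptcp->flags19 &= 0x80;
    tp->mptcp->flags18 &= 0x8f;
    return 1;
}
```

With mpc set and mptcp NULL, model_reset_mopt() returns 0 instead of dereferencing the pointer. A check like this would only hide the symptom; the open question in this issue is how the flag and the pointer got out of sync in the first place.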

andywu106 · 2015-08-10T08:49:50Z

Hi, cpaasch, any progress on this issue?

cpaasch · 2015-08-10T15:51:27Z

I haven't seen this bug happening again. Can you reproduce it?

andywu106 · 2015-08-11T02:52:33Z

I hit this once in mptcp_parse_options(), but I cannot reproduce it either :)

cpaasch · 2015-08-11T03:14:59Z

Good that you hit it as well! This confirms that I was not dreaming :)

cpaasch · 2015-08-11T22:44:39Z

Are you sure that it happened in mptcp_handle_options()?

I ask because I suspect that something goes wrong when a RST is processed (which results in the socket being removed) while segments from the backlog queue are still being processed.

Maybe you still have the crash-trace?
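
To make the suspected sequence concrete, here is a small userspace model of it, written as a sketch under assumptions: segments are queued on a backlog, then drained in order (the trace above shows release_sock() calling mptcp_backlog_rcv()), and one of them is a RST that tears down the per-subflow state while the MPTCP flag stays set. Every name below is hypothetical, and whether this is what really happens in the kernel is exactly what is unconfirmed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum seg_kind { SEG_DATA, SEG_RST };

struct model_subflow {
    bool mpc;      /* MPTCP flag; in this model it stays set after a RST */
    int *state;    /* stands in for tp->mptcp; NULL once torn down */
};

/* Process one queued segment. Returns false if the segment was dropped
 * because the per-subflow state was already gone. */
static bool model_rcv(struct model_subflow *sf, enum seg_kind kind)
{
    if (sf->mpc && sf->state == NULL)
        return false;       /* without this check, the access below faults */

    if (kind == SEG_RST) {
        sf->state = NULL;   /* teardown: pointer cleared, flag left alone */
        return true;
    }

    if (sf->state != NULL)
        *sf->state += 1;    /* "use" the state: count accepted data segments */
    return true;
}

/* Drain the backlog in queue order and return how many segments were dropped. */
static size_t model_drain_backlog(struct model_subflow *sf,
                                  const enum seg_kind *queue, size_t n)
{
    size_t dropped = 0;

    for (size_t i = 0; i < n; i++) {
        if (!model_rcv(sf, queue[i]))
            dropped++;
    }
    return dropped;
}
```

For a backlog of DATA, RST, DATA, the first segment is counted, the RST clears the state, and the third segment arrives with the flag set and the pointer NULL. That is the combination from the issue title, and in this model it is dropped rather than dereferenced.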

andywu106 · 2015-08-13T08:01:46Z

It did happen in mptcp_handle_options(), but I did not keep the crash trace.

cpaasch · 2016-11-03T15:58:59Z

We had some fixes that solved memory corruption issues. They might, and probably should, have fixed this issue as well.

cpaasch added Level: Normal bug labels Jan 20, 2015

cpaasch changed the title ~~tp->mptcp == NULL but tp->mpc is set~~ BUG: unable to handle kernel NULL pointer dereference at tcp_validate_incoming - tp->mptcp == NULL but tp->mpc is set Aug 20, 2015

cpaasch closed this as completed Nov 3, 2016
