Update ctx.h #37

nmtigor · 2013-06-19T21:45:29Z

No description provided.

…ET pending data We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]>

Running one program that continuously hotplugs and replugs a cpu concurrently with another program that continuously writes to the scaling_setspeed node eventually deadlocks with: ============================================= [ INFO: possible recursive locking detected ] 3.4.0 torvalds#37 Tainted: G W --------------------------------------------- filemonkey/122 is trying to acquire lock: (s_active#13){++++.+}, at: [<c01a3d28>] sysfs_remove_dir+0x9c/0xb4 but task is already holding lock: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(s_active#13); lock(s_active#13); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by filemonkey/122: #0: (&buffer->mutex){+.+.+.}, at: [<c01a2230>] sysfs_write_file+0x28/0x140 #1: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 stack backtrace: [<c0014fcc>] (unwind_backtrace+0x0/0x120) from [<c00ca600>] (validate_chain+0x6f8/0x1054) [<c00ca600>] (validate_chain+0x6f8/0x1054) from [<c00cb778>] (__lock_acquire+0x81c/0x8d8) [<c00cb778>] (__lock_acquire+0x81c/0x8d8) from [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) from [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) from [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) from [<c02d0e5c>] (kobject_del+0x10/0x38) [<c02d0e5c>] (kobject_del+0x10/0x38) from [<c02d0f74>] (kobject_release+0xf0/0x194) [<c02d0f74>] (kobject_release+0xf0/0x194) from [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) from [<c05683f0>] (store+0x6c/0x74) [<c05683f0>] (store+0x6c/0x74) from [<c01a2314>] (sysfs_write_file+0x10c/0x140) [<c01a2314>] (sysfs_write_file+0x10c/0x140) from [<c014af44>] (vfs_write+0xb0/0x128) [<c014af44>] (vfs_write+0xb0/0x128) from [<c014b06c>] (sys_write+0x3c/0x68) [<c014b06c>] (sys_write+0x3c/0x68) from [<c000e0e0>] (ret_fast_syscall+0x0/0x3c) This is because store() in cpufreq.c indirectly calls kobject_get() via cpufreq_cpu_get() and is the last one to call kobject_put() via cpufreq_cpu_put(). Sysfs code should not call kobject_get() or kobject_put() directly (see the comment around sysfs_schedule_callback() for more information). Fix this deadlock by introducing two new functions: struct cpufreq_policy *cpufreq_cpu_get_sysfs(unsigned int cpu) void cpufreq_cpu_put_sysfs(struct cpufreq_policy *data) which do the same thing as cpufreq_cpu_{get,put}() but don't call kobject functions. To easily trigger this deadlock you can insert an msleep() with a reasonably large value right after the fail label at the bottom of the store() function in cpufreq.c and then write scaling_setspeed in one task and offline the cpu in another. The first task will hang and be detected by the hung task detector. Signed-off-by: Stephen Boyd <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]>

…ET pending data [ Upstream commit 8822b64 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ torvalds#37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Signed-off-by: Hannes Frederic Sowa <[email protected]> Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…ET pending data [ Upstream commit 8822b64 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Signed-off-by: Hannes Frederic Sowa <[email protected]> Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

…ET pending data [ Upstream commit 8822b64 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ torvalds#37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Ben Hutchings <[email protected]>

…ET pending data [ Upstream commit 8822b64 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ #37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]>

As the new x86 CPU bootup printout format code maintainer, I am taking immediate action to improve and clean (and thus indulge my OCD) the reporting of the cores when coming up online. Fix padding to a right-hand alignment, cleanup code and bind reporting width to the max number of supported CPUs on the system, like this: [ 0.074509] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.644008] smpboot: Booting Node 1, Processors: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 OK [ 1.245006] smpboot: Booting Node 2, Processors: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 OK [ 1.864005] smpboot: Booting Node 3, Processors: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 OK [ 2.489005] smpboot: Booting Node 4, Processors: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 OK [ 3.093005] smpboot: Booting Node 5, Processors: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 OK [ 3.698005] smpboot: Booting Node 6, Processors: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 OK [ 4.304005] smpboot: Booting Node 7, Processors: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 OK [ 4.961413] Brought up 64 CPUs and this: [ 0.072367] smpboot: Booting Node 0, Processors: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 OK [ 0.686329] Brought up 8 CPUs Signed-off-by: Borislav Petkov <[email protected]> Cc: Libin <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>

Turn it into (for example): [ 0.073380] x86: Booting SMP configuration: [ 0.074005] .... node #0, CPUs: #1 #2 #3 #4 #5 torvalds#6 torvalds#7 [ 0.603005] .... node #1, CPUs: torvalds#8 torvalds#9 torvalds#10 torvalds#11 torvalds#12 torvalds#13 torvalds#14 torvalds#15 [ 1.200005] .... node #2, CPUs: torvalds#16 torvalds#17 torvalds#18 torvalds#19 torvalds#20 torvalds#21 torvalds#22 torvalds#23 [ 1.796005] .... node #3, CPUs: torvalds#24 torvalds#25 torvalds#26 torvalds#27 torvalds#28 torvalds#29 torvalds#30 torvalds#31 [ 2.393005] .... node #4, CPUs: torvalds#32 torvalds#33 torvalds#34 torvalds#35 torvalds#36 torvalds#37 torvalds#38 torvalds#39 [ 2.996005] .... node #5, CPUs: torvalds#40 torvalds#41 torvalds#42 torvalds#43 torvalds#44 torvalds#45 torvalds#46 torvalds#47 [ 3.600005] .... node torvalds#6, CPUs: torvalds#48 torvalds#49 torvalds#50 torvalds#51 #52 #53 torvalds#54 torvalds#55 [ 4.202005] .... node torvalds#7, CPUs: torvalds#56 torvalds#57 #58 torvalds#59 torvalds#60 torvalds#61 torvalds#62 torvalds#63 [ 4.811005] .... node torvalds#8, CPUs: torvalds#64 torvalds#65 torvalds#66 torvalds#67 torvalds#68 torvalds#69 #70 torvalds#71 [ 5.421006] .... node torvalds#9, CPUs: torvalds#72 torvalds#73 torvalds#74 torvalds#75 torvalds#76 torvalds#77 torvalds#78 torvalds#79 [ 6.032005] .... node torvalds#10, CPUs: torvalds#80 torvalds#81 torvalds#82 torvalds#83 torvalds#84 torvalds#85 torvalds#86 torvalds#87 [ 6.648006] .... node torvalds#11, CPUs: torvalds#88 torvalds#89 torvalds#90 torvalds#91 torvalds#92 torvalds#93 torvalds#94 torvalds#95 [ 7.262005] .... node torvalds#12, CPUs: torvalds#96 torvalds#97 torvalds#98 torvalds#99 torvalds#100 torvalds#101 torvalds#102 torvalds#103 [ 7.865005] .... node torvalds#13, CPUs: torvalds#104 torvalds#105 torvalds#106 torvalds#107 torvalds#108 torvalds#109 torvalds#110 torvalds#111 [ 8.466005] .... node torvalds#14, CPUs: torvalds#112 torvalds#113 torvalds#114 torvalds#115 torvalds#116 torvalds#117 torvalds#118 torvalds#119 [ 9.073006] .... node torvalds#15, CPUs: torvalds#120 torvalds#121 torvalds#122 torvalds#123 torvalds#124 torvalds#125 torvalds#126 torvalds#127 [ 9.679901] x86: Booted up 16 nodes, 128 CPUs and drop useless elements. Change num_digits() to hpa's division-avoiding, cell-phone-typed version which he went at great lengths and pains to submit on a Saturday evening. Signed-off-by: Borislav Petkov <[email protected]> Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: [email protected] Cc: Linus Torvalds <[email protected]> Cc: Andrew Morton <[email protected]> Cc: Peter Zijlstra <[email protected]> Cc: Thomas Gleixner <[email protected]> Link: http://lkml.kernel.org/r/[email protected] Signed-off-by: Ingo Molnar <[email protected]>

When operate harddisk and hit errors, md_set_badblocks is called after scsi_restart_operations which already disabled the irq. but md_set_badblocks will call write_sequnlock_irq and enable irq. so softirq can preempt the current thread and that may cause a deadlock. I think this situation should use write_sequnlock_irqsave/irqrestore instead. I met the situation and the call trace is below: [ 638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010 [ 638.921923] lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0 [ 638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ torvalds#37 [ 638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013 [ 638.927816] ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007 [ 638.929829] ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8 [ 638.931848] ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8 [ 638.933884] Call Trace: [ 638.935867] <IRQ> [<ffffffff8172ff85>] dump_stack+0x55/0x76 [ 638.937878] [<ffffffff81730030>] spin_dump+0x8a/0x8f [ 638.939861] [<ffffffff81730056>] spin_bug+0x21/0x26 [ 638.941836] [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0 [ 638.943801] [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80 [ 638.945747] [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0 [ 638.947672] [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50 [ 638.949595] [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0 [ 638.951504] [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0 [ 638.953388] [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140 [ 638.955248] [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90 [ 638.957116] [<ffffffff8104fddd>] __do_softirq+0xfd/0x330 [ 638.958987] [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.960861] [<ffffffff8174a5cc>] call_softirq+0x1c/0x30 [ 638.962724] [<ffffffff81004c7d>] do_softirq+0x8d/0xc0 [ 638.964565] [<ffffffff8105024e>] irq_exit+0x10e/0x150 [ 638.966390] [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60 [ 638.968223] [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80 [ 638.970079] <EOI> [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.971899] [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50 [ 638.973691] [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50 [ 638.975475] [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0 [ 638.977243] [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80 [ 638.978988] [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456] [ 638.980723] [<ffffffff811b5a1d>] bio_endio+0x1d/0x40 [ 638.982463] [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0 [ 638.984214] [<ffffffff81306b9f>] blk_update_request+0x7f/0x350 [ 638.985967] [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90 [ 638.987710] [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50 [ 638.989439] [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30 [ 638.991149] [<ffffffff81308746>] blk_peek_request+0x106/0x250 [ 638.992861] [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130 [ 638.994561] [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0 [ 638.996251] [<ffffffff813040a7>] __blk_run_queue+0x37/0x50 [ 638.997900] [<ffffffff813045af>] blk_run_queue+0x2f/0x50 [ 638.999553] [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0 [ 639.001185] [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40 [ 639.002798] [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200 [ 639.004391] [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0 [ 639.005996] [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0 [ 639.007600] [<ffffffff81072f6b>] kthread+0xdb/0xe0 [ 639.009205] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 [ 639.010821] [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0 [ 639.012437] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 This bug was introduce in commit 2e8ac30 (the first time rdev_set_badblock was call from interrupt context), so this patch is appropriate for 3.5 and subsequent kernels. Cc: <[email protected]> (3.5+) Signed-off-by: Bian Yu <[email protected]> Reviewed-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]>

…ET pending data BugLink: http://bugs.launchpad.net/bugs/1214984 [ Upstream commit 8822b64 ] We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ torvalds#37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]> Signed-off-by: Ben Hutchings <[email protected]> Signed-off-by: Brad Figg <[email protected]>

commit 905b029 upstream. When operate harddisk and hit errors, md_set_badblocks is called after scsi_restart_operations which already disabled the irq. but md_set_badblocks will call write_sequnlock_irq and enable irq. so softirq can preempt the current thread and that may cause a deadlock. I think this situation should use write_sequnlock_irqsave/irqrestore instead. I met the situation and the call trace is below: [ 638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010 [ 638.921923] lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0 [ 638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37 [ 638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013 [ 638.927816] ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007 [ 638.929829] ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8 [ 638.931848] ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8 [ 638.933884] Call Trace: [ 638.935867] <IRQ> [<ffffffff8172ff85>] dump_stack+0x55/0x76 [ 638.937878] [<ffffffff81730030>] spin_dump+0x8a/0x8f [ 638.939861] [<ffffffff81730056>] spin_bug+0x21/0x26 [ 638.941836] [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0 [ 638.943801] [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80 [ 638.945747] [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0 [ 638.947672] [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50 [ 638.949595] [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0 [ 638.951504] [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0 [ 638.953388] [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140 [ 638.955248] [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90 [ 638.957116] [<ffffffff8104fddd>] __do_softirq+0xfd/0x330 [ 638.958987] [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.960861] [<ffffffff8174a5cc>] call_softirq+0x1c/0x30 [ 638.962724] [<ffffffff81004c7d>] do_softirq+0x8d/0xc0 [ 638.964565] [<ffffffff8105024e>] irq_exit+0x10e/0x150 [ 638.966390] [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60 [ 638.968223] [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80 [ 638.970079] <EOI> [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.971899] [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50 [ 638.973691] [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50 [ 638.975475] [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0 [ 638.977243] [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80 [ 638.978988] [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456] [ 638.980723] [<ffffffff811b5a1d>] bio_endio+0x1d/0x40 [ 638.982463] [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0 [ 638.984214] [<ffffffff81306b9f>] blk_update_request+0x7f/0x350 [ 638.985967] [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90 [ 638.987710] [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50 [ 638.989439] [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30 [ 638.991149] [<ffffffff81308746>] blk_peek_request+0x106/0x250 [ 638.992861] [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130 [ 638.994561] [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0 [ 638.996251] [<ffffffff813040a7>] __blk_run_queue+0x37/0x50 [ 638.997900] [<ffffffff813045af>] blk_run_queue+0x2f/0x50 [ 638.999553] [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0 [ 639.001185] [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40 [ 639.002798] [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200 [ 639.004391] [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0 [ 639.005996] [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0 [ 639.007600] [<ffffffff81072f6b>] kthread+0xdb/0xe0 [ 639.009205] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 [ 639.010821] [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0 [ 639.012437] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 This bug was introduce in commit 2e8ac30 (the first time rdev_set_badblock was call from interrupt context), so this patch is appropriate for 3.5 and subsequent kernels. Signed-off-by: Bian Yu <[email protected]> Reviewed-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 905b029 upstream. When operate harddisk and hit errors, md_set_badblocks is called after scsi_restart_operations which already disabled the irq. but md_set_badblocks will call write_sequnlock_irq and enable irq. so softirq can preempt the current thread and that may cause a deadlock. I think this situation should use write_sequnlock_irqsave/irqrestore instead. I met the situation and the call trace is below: [ 638.919974] BUG: spinlock recursion on CPU#0, scsi_eh_13/1010 [ 638.921923] lock: 0xffff8800d4d51fc8, .magic: dead4ead, .owner: scsi_eh_13/1010, .owner_cpu: 0 [ 638.923890] CPU: 0 PID: 1010 Comm: scsi_eh_13 Not tainted 3.12.0-rc5+ #37 [ 638.925844] Hardware name: To be filled by O.E.M. To be filled by O.E.M./MAHOBAY, BIOS 4.6.5 03/05/2013 [ 638.927816] ffff880037ad4640 ffff880118c03d50 ffffffff8172ff85 0000000000000007 [ 638.929829] ffff8800d4d51fc8 ffff880118c03d70 ffffffff81730030 ffff8800d4d51fc8 [ 638.931848] ffffffff81a72eb0 ffff880118c03d90 ffffffff81730056 ffff8800d4d51fc8 [ 638.933884] Call Trace: [ 638.935867] <IRQ> [<ffffffff8172ff85>] dump_stack+0x55/0x76 [ 638.937878] [<ffffffff81730030>] spin_dump+0x8a/0x8f [ 638.939861] [<ffffffff81730056>] spin_bug+0x21/0x26 [ 638.941836] [<ffffffff81336de4>] do_raw_spin_lock+0xa4/0xc0 [ 638.943801] [<ffffffff8173f036>] _raw_spin_lock+0x66/0x80 [ 638.945747] [<ffffffff814a73ed>] ? scsi_device_unbusy+0x9d/0xd0 [ 638.947672] [<ffffffff8173fb1b>] ? _raw_spin_unlock+0x2b/0x50 [ 638.949595] [<ffffffff814a73ed>] scsi_device_unbusy+0x9d/0xd0 [ 638.951504] [<ffffffff8149ec47>] scsi_finish_command+0x37/0xe0 [ 638.953388] [<ffffffff814a75e8>] scsi_softirq_done+0xa8/0x140 [ 638.955248] [<ffffffff8130e32b>] blk_done_softirq+0x7b/0x90 [ 638.957116] [<ffffffff8104fddd>] __do_softirq+0xfd/0x330 [ 638.958987] [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.960861] [<ffffffff8174a5cc>] call_softirq+0x1c/0x30 [ 638.962724] [<ffffffff81004c7d>] do_softirq+0x8d/0xc0 [ 638.964565] [<ffffffff8105024e>] irq_exit+0x10e/0x150 [ 638.966390] [<ffffffff8174ad4a>] smp_apic_timer_interrupt+0x4a/0x60 [ 638.968223] [<ffffffff817499af>] apic_timer_interrupt+0x6f/0x80 [ 638.970079] <EOI> [<ffffffff810b964f>] ? __lock_release+0x6f/0x100 [ 638.971899] [<ffffffff8173fa6a>] ? _raw_spin_unlock_irq+0x3a/0x50 [ 638.973691] [<ffffffff8173fa60>] ? _raw_spin_unlock_irq+0x30/0x50 [ 638.975475] [<ffffffff81562393>] md_set_badblocks+0x1f3/0x4a0 [ 638.977243] [<ffffffff81566e07>] rdev_set_badblocks+0x27/0x80 [ 638.978988] [<ffffffffa00d97bb>] raid5_end_read_request+0x36b/0x4e0 [raid456] [ 638.980723] [<ffffffff811b5a1d>] bio_endio+0x1d/0x40 [ 638.982463] [<ffffffff81304ff3>] req_bio_endio.isra.65+0x83/0xa0 [ 638.984214] [<ffffffff81306b9f>] blk_update_request+0x7f/0x350 [ 638.985967] [<ffffffff81306ea1>] blk_update_bidi_request+0x31/0x90 [ 638.987710] [<ffffffff813085e0>] __blk_end_bidi_request+0x20/0x50 [ 638.989439] [<ffffffff8130862f>] __blk_end_request_all+0x1f/0x30 [ 638.991149] [<ffffffff81308746>] blk_peek_request+0x106/0x250 [ 638.992861] [<ffffffff814a62a9>] ? scsi_kill_request.isra.32+0xe9/0x130 [ 638.994561] [<ffffffff814a633a>] scsi_request_fn+0x4a/0x3d0 [ 638.996251] [<ffffffff813040a7>] __blk_run_queue+0x37/0x50 [ 638.997900] [<ffffffff813045af>] blk_run_queue+0x2f/0x50 [ 638.999553] [<ffffffff814a5750>] scsi_run_queue+0xe0/0x1c0 [ 639.001185] [<ffffffff814a7721>] scsi_run_host_queues+0x21/0x40 [ 639.002798] [<ffffffff814a2e87>] scsi_restart_operations+0x177/0x200 [ 639.004391] [<ffffffff814a4fe9>] scsi_error_handler+0xc9/0xe0 [ 639.005996] [<ffffffff814a4f20>] ? scsi_unjam_host+0xd0/0xd0 [ 639.007600] [<ffffffff81072f6b>] kthread+0xdb/0xe0 [ 639.009205] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 [ 639.010821] [<ffffffff81748cac>] ret_from_fork+0x7c/0xb0 [ 639.012437] [<ffffffff81072e90>] ? flush_kthread_worker+0x170/0x170 This bug was introduce in commit 2e8ac30 (the first time rdev_set_badblock was call from interrupt context), so this patch is appropriate for 3.5 and subsequent kernels. Signed-off-by: Bian Yu <[email protected]> Reviewed-by: Jianpeng Ma <[email protected]> Signed-off-by: NeilBrown <[email protected]> Signed-off-by: Kamal Mostafa <[email protected]>

…ET pending data commit 8822b64 upstream. We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ torvalds#37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]> [PG: The line "flowi6 *fl6 = &inet->cork.fl.u.ip6" was "flowi *fl = &inet->cork.fl" in 2.6.34 kernel.] Signed-off-by: Paul Gortmaker <[email protected]>

Running one program that continuously hotplugs and replugs a cpu concurrently with another program that continuously writes to the scaling_set_speed node eventually deadlocks with: ============================================= [ INFO: possible recursive locking detected ] 3.4.0 torvalds#37 Tainted: G W --------------------------------------------- filemonkey/122 is trying to acquire lock: (s_active#13){++++.+}, at: [<c01a3d28>] sysfs_remove_dir+0x9c/0xb4 but task is already holding lock: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(s_active#13); lock(s_active#13); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by filemonkey/122: #0: (&buffer->mutex){+.+.+.}, at: [<c01a2230>] sysfs_write_file+0x28/0x140 #1: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 stack backtrace: [<c0014fcc>] (unwind_backtrace+0x0/0x120) from [<c00ca600>] (validate_chain+0x6f8/0x1054) [<c00ca600>] (validate_chain+0x6f8/0x1054) from [<c00cb778>] (__lock_acquire+0x81c/0x8d8) [<c00cb778>] (__lock_acquire+0x81c/0x8d8) from [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) from [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) from [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) from [<c02d0e5c>] (kobject_del+0x10/0x38) [<c02d0e5c>] (kobject_del+0x10/0x38) from [<c02d0f74>] (kobject_release+0xf0/0x194) [<c02d0f74>] (kobject_release+0xf0/0x194) from [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) from [<c05683f0>] (store+0x6c/0x74) [<c05683f0>] (store+0x6c/0x74) from [<c01a2314>] (sysfs_write_file+0x10c/0x140) [<c01a2314>] (sysfs_write_file+0x10c/0x140) from [<c014af44>] (vfs_write+0xb0/0x128) [<c014af44>] (vfs_write+0xb0/0x128) from [<c014b06c>] (sys_write+0x3c/0x68) [<c014b06c>] (sys_write+0x3c/0x68) from [<c000e0e0>] (ret_fast_syscall+0x0/0x3c) This is because store() in cpufreq.c indirectly grabs a kobject with kobject_get() and is the last one to call kobject_put() indirectly via cpufreq_cpu_put(). Fix this deadlock by introducing two new functions, cpufreq_cpu_get_sysfs() and cpufreq_cpu_put_sysfs() which do the same thing as cpufreq_cpu_{get,put}() but don't grab the kobject. CRs-fixed: 366560 Change-Id: I23ea50a01793b2b2af0972cde52dba7396925fe3 Signed-off-by: Stephen Boyd <[email protected]>

…ET pending data CVE-2013-4162 BugLink: http://bugs.launchpad.net/bugs/1205070 We accidentally call down to ip6_push_pending_frames when uncorking pending AF_INET data on a ipv6 socket. This results in the following splat (from Dave Jones): skbuff: skb_under_panic: text:ffffffff816765f6 len:48 put:40 head:ffff88013deb6df0 data:ffff88013deb6dec tail:0x2c end:0xc0 dev:<NULL> ------------[ cut here ]------------ kernel BUG at net/core/skbuff.c:126! invalid opcode: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC Modules linked in: dccp_ipv4 dccp 8021q garp bridge stp dlci mpoa snd_seq_dummy sctp fuse hidp tun bnep nfnetlink scsi_transport_iscsi rfcomm can_raw can_bcm af_802154 appletalk caif_socket can caif ipt_ULOG x25 rose af_key pppoe pppox ipx phonet irda llc2 ppp_generic slhc p8023 psnap p8022 llc crc_ccitt atm bluetooth +netrom ax25 nfc rfkill rds af_rxrpc coretemp hwmon kvm_intel kvm crc32c_intel snd_hda_codec_realtek ghash_clmulni_intel microcode pcspkr snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hwdep usb_debug snd_seq snd_seq_device snd_pcm e1000e snd_page_alloc snd_timer ptp snd pps_core soundcore xfs libcrc32c CPU: 2 PID: 8095 Comm: trinity-child2 Not tainted 3.10.0-rc7+ torvalds#37 task: ffff8801f52c2520 ti: ffff8801e6430000 task.ti: ffff8801e6430000 RIP: 0010:[<ffffffff816e759c>] [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP: 0018:ffff8801e6431de8 EFLAGS: 00010282 RAX: 0000000000000086 RBX: ffff8802353d3cc0 RCX: 0000000000000006 RDX: 0000000000003b90 RSI: ffff8801f52c2ca0 RDI: ffff8801f52c2520 RBP: ffff8801e6431e08 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000001 R11: 0000000000000001 R12: ffff88022ea0c800 R13: ffff88022ea0cdf8 R14: ffff8802353ecb40 R15: ffffffff81cc7800 FS: 00007f5720a10740(0000) GS:ffff880244c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000005862000 CR3: 000000022843c000 CR4: 00000000001407e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000600 Stack: ffff88013deb6dec 000000000000002c 00000000000000c0 ffffffff81a3f6e4 ffff8801e6431e18 ffffffff8159a9aa ffff8801e6431e90 ffffffff816765f6 ffffffff810b756b 0000000700000002 ffff8801e6431e40 0000fea9292aa8c0 Call Trace: [<ffffffff8159a9aa>] skb_push+0x3a/0x40 [<ffffffff816765f6>] ip6_push_pending_frames+0x1f6/0x4d0 [<ffffffff810b756b>] ? mark_held_locks+0xbb/0x140 [<ffffffff81694919>] udp_v6_push_pending_frames+0x2b9/0x3d0 [<ffffffff81694660>] ? udplite_getfrag+0x20/0x20 [<ffffffff8162092a>] udp_lib_setsockopt+0x1aa/0x1f0 [<ffffffff811cc5e7>] ? fget_light+0x387/0x4f0 [<ffffffff816958a4>] udpv6_setsockopt+0x34/0x40 [<ffffffff815949f4>] sock_common_setsockopt+0x14/0x20 [<ffffffff81593c31>] SyS_setsockopt+0x71/0xd0 [<ffffffff816f5d54>] tracesys+0xdd/0xe2 Code: 00 00 48 89 44 24 10 8b 87 d8 00 00 00 48 89 44 24 08 48 8b 87 e8 00 00 00 48 c7 c7 c0 04 aa 81 48 89 04 24 31 c0 e8 e1 7e ff ff <0f> 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 48 89 e5 0f 0b 55 RIP [<ffffffff816e759c>] skb_panic+0x63/0x65 RSP <ffff8801e6431de8> This patch adds a check if the pending data is of address family AF_INET and directly calls udp_push_ending_frames from udp_v6_push_pending_frames if that is the case. This bug was found by Dave Jones with trinity. (Also move the initialization of fl6 below the AF_INET check, even if not strictly necessary.) Cc: Dave Jones <[email protected]> Cc: YOSHIFUJI Hideaki <[email protected]> Signed-off-by: Hannes Frederic Sowa <[email protected]> Signed-off-by: David S. Miller <[email protected]> (back ported from commit 8822b64) Signed-off-by: Luis Henriques <[email protected]> Acked-by: Brad Figg <[email protected]> Signed-off-by: Tim Gardner <[email protected]> Signed-off-by: Willy Tarreau <[email protected]>

Running one program that continuously hotplugs and replugs a cpu concurrently with another program that continuously writes to the scaling_set_speed node eventually deadlocks with: ============================================= [ INFO: possible recursive locking detected ] 3.4.0 torvalds#37 Tainted: G W --------------------------------------------- filemonkey/122 is trying to acquire lock: (s_active#13){++++.+}, at: [<c01a3d28>] sysfs_remove_dir+0x9c/0xb4 but task is already holding lock: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 other info that might help us debug this: Possible unsafe locking scenario: CPU0 ---- lock(s_active#13); lock(s_active#13); *** DEADLOCK *** May be due to missing lock nesting notation 2 locks held by filemonkey/122: #0: (&buffer->mutex){+.+.+.}, at: [<c01a2230>] sysfs_write_file+0x28/0x140 #1: (s_active#13){++++.+}, at: [<c01a22f0>] sysfs_write_file+0xe8/0x140 stack backtrace: [<c0014fcc>] (unwind_backtrace+0x0/0x120) from [<c00ca600>] (validate_chain+0x6f8/0x1054) [<c00ca600>] (validate_chain+0x6f8/0x1054) from [<c00cb778>] (__lock_acquire+0x81c/0x8d8) [<c00cb778>] (__lock_acquire+0x81c/0x8d8) from [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) [<c00cb9c0>] (lock_acquire+0x18c/0x1e8) from [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) [<c01a3ba8>] (sysfs_addrm_finish+0xd0/0x180) from [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) [<c01a3d28>] (sysfs_remove_dir+0x9c/0xb4) from [<c02d0e5c>] (kobject_del+0x10/0x38) [<c02d0e5c>] (kobject_del+0x10/0x38) from [<c02d0f74>] (kobject_release+0xf0/0x194) [<c02d0f74>] (kobject_release+0xf0/0x194) from [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) [<c0565a98>] (cpufreq_cpu_put+0xc/0x24) from [<c05683f0>] (store+0x6c/0x74) [<c05683f0>] (store+0x6c/0x74) from [<c01a2314>] (sysfs_write_file+0x10c/0x140) [<c01a2314>] (sysfs_write_file+0x10c/0x140) from [<c014af44>] (vfs_write+0xb0/0x128) [<c014af44>] (vfs_write+0xb0/0x128) from [<c014b06c>] (sys_write+0x3c/0x68) [<c014b06c>] (sys_write+0x3c/0x68) from [<c000e0e0>] (ret_fast_syscall+0x0/0x3c) This is because store() in cpufreq.c indirectly grabs a kobject with kobject_get() and is the last one to call kobject_put() indirectly via cpufreq_cpu_put(). Fix this deadlock by introducing two new functions, cpufreq_cpu_get_sysfs() and cpufreq_cpu_put_sysfs() which do the same thing as cpufreq_cpu_{get,put}() but don't grab the kobject. CRs-fixed: 366560 Signed-off-by: Ramakrishna Prasad N <[email protected]> (cherry picked from commit ddda0cf) Change-Id: I6ac89789e89099cff315d7a02de216423376b7f7 Signed-off-by: Shruthi Krishna <[email protected]>

Since crossbar is s/w configurable, the initial settings of the crossbar cannot be assumed to be sane. This implies that: a) On initialization all un-reserved crossbars must be initialized to a known 'safe' value. b) When unmapping the interrupt, the safe value must be written to ensure that the crossbar mapping matches with interrupt controller usage. So provide a safe value in the dt data to map if '0' is not safe for the platform and use it during init and unmap While at this, fix the below checkpatch warning. Fixes checkpatch warning: WARNING: Unnecessary space before function pointer arguments torvalds#37: FILE: drivers/irqchip/irq-crossbar.c:37: + void (*write) (int, int); Signed-off-by: Nishanth Menon <[email protected]> Signed-off-by: Sricharan R <[email protected]> Acked-by: Santosh Shilimkar <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Jason Cooper <[email protected]>

ERROR: code indent should use tabs where possible torvalds#37: FILE: include/linux/mmdebug.h:33: + do {^I^I^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#37: FILE: include/linux/mmdebug.h:33: + do {^I^I^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#38: FILE: include/linux/mmdebug.h:34: + if (unlikely(cond)) {^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#38: FILE: include/linux/mmdebug.h:34: + if (unlikely(cond)) {^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#39: FILE: include/linux/mmdebug.h:35: + dump_mm(mm);^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#39: FILE: include/linux/mmdebug.h:35: + dump_mm(mm);^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#40: FILE: include/linux/mmdebug.h:36: + BUG();^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#40: FILE: include/linux/mmdebug.h:36: + BUG();^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#41: FILE: include/linux/mmdebug.h:37: + }^I^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#41: FILE: include/linux/mmdebug.h:37: + }^I^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#42: FILE: include/linux/mmdebug.h:38: + } while (0)$ WARNING: please, no spaces at the start of a line torvalds#42: FILE: include/linux/mmdebug.h:38: + } while (0)$ WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(... to printk(KERN_ALERT ... torvalds#74: FILE: mm/debug.c:171: + printk(KERN_ALERT total: 6 errors, 7 warnings, 109 lines checked NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Sasha Levin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

When doing vfio passthrough a VF, the kernel will crash with following message: [ 442.656459] Unable to handle kernel paging request for data at address 0x00000060 [ 442.656593] Faulting instruction address: 0xc000000000038b88 [ 442.656706] Oops: Kernel access of bad area, sig: 11 [#1] [ 442.656798] SMP NR_CPUS=1024 NUMA PowerNV [ 442.656890] Modules linked in: vfio_pci mlx4_core nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6t_REJECT xt_conntrack bnep bluetooth rfkill ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw tg3 nfsd be2net nfs_acl ses lockd ptp enclosure pps_core kvm_hv kvm_pr shpchp binfmt_misc kvm sunrpc uinput lpfc scsi_transport_fc ipr scsi_tgt [last unloaded: mlx4_core] [ 442.658152] CPU: 40 PID: 14948 Comm: qemu-system-ppc Not tainted 3.10.42yw-pkvm+ torvalds#37 [ 442.658219] task: c000000f7e2a9a00 ti: c000000f6dc3c000 task.ti: c000000f6dc3c000 [ 442.658287] NIP: c000000000038b88 LR: c0000000004435a8 CTR: c000000000455bc0 [ 442.658352] REGS: c000000f6dc3f580 TRAP: 0300 Not tainted (3.10.42yw-pkvm+) [ 442.658419] MSR: 9000000000009032 <SF,HV,EE,ME,IR,DR,RI> CR: 28004882 XER: 20000000 [ 442.658577] CFAR: c00000000000908c DAR: 0000000000000060 DSISR: 40000000 SOFTE: 1 GPR00: c0000000004435a8 c000000f6dc3f800 c0000000012b1c10 c00000000da24000 GPR04: 0000000000000003 0000000000001004 00000000000015b3 000000000000ffff GPR08: c00000000127f5d8 0000000000000000 000000000000ffff 0000000000000000 GPR12: c000000000068078 c00000000fdd6800 000001003c320c80 000001003c3607f0 GPR16: 0000000000000001 00000000105480c8 000000001055aaa8 000001003c31ab18 GPR20: 000001003c10fb40 000001003c360ae8 000000001063bcf0 000000001063bdb0 GPR24: 000001003c15ed70 0000000010548f40 c000001fe5514c88 c000001fe5514cb0 GPR28: c00000000da24000 0000000000000000 c00000000da24000 0000000000000003 [ 442.659471] NIP [c000000000038b88] .pcibios_set_pcie_reset_state+0x28/0x130 [ 442.659530] LR [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40 [ 442.659585] Call Trace: [ 442.659610] [c000000f6dc3f800] [00000000000719e0] 0x719e0 (unreliable) [ 442.659677] [c000000f6dc3f880] [c0000000004435a8] .pci_set_pcie_reset_state+0x28/0x40 [ 442.659757] [c000000f6dc3f900] [c000000000455bf8] .reset_fundamental+0x38/0x80 [ 442.659835] [c000000f6dc3f980] [c0000000004562a8] .pci_dev_specific_reset+0xa8/0xf0 [ 442.659913] [c000000f6dc3fa00] [c0000000004448c4] .__pci_dev_reset+0x44/0x430 [ 442.659980] [c000000f6dc3fab0] [c000000000444d5c] .pci_reset_function+0x7c/0xc0 [ 442.660059] [c000000f6dc3fb30] [d00000001c141ab8] .vfio_pci_open+0xe8/0x2b0 [vfio_pci] [ 442.660139] [c000000f6dc3fbd0] [c000000000586c30] .vfio_group_fops_unl_ioctl+0x3a0/0x630 [ 442.660219] [c000000f6dc3fc90] [c000000000255fbc] .do_vfs_ioctl+0x4ec/0x7c0 [ 442.660286] [c000000f6dc3fd80] [c000000000256364] .SyS_ioctl+0xd4/0xf0 [ 442.660354] [c000000f6dc3fe30] [c000000000009e54] syscall_exit+0x0/0x98 [ 442.660420] Instruction dump: [ 442.660454] 4bfffce9 4bfffee4 7c0802a6 fbc1fff0 fbe1fff8 f8010010 f821ff81 7c7e1b78 [ 442.660566] 7c9f2378 60000000 60000000 e93e02c8 <e8690060> 2fa30000 41de00c4 2b9f0002 [ 442.660679] ---[ end trace a64ac9546bcf0328 ]--- [ 442.660724] The reason is current VF is not EEH enabled. This patch introduces a macro to convert eeh_dev to eeh_pe. By doing so, it will prevent converting with NULL pointer. Signed-off-by: Wei Yang <[email protected]> Acked-by: Gavin Shan <[email protected]> CC: Michael Ellerman <[email protected]> V3 -> V4: 1. move the macro definition from include/linux/pci.h to arch/powerpc/include/asm/eeh.h V2 -> V3: 1. rebased on 3.17-rc4 2. introduce a macro 3. use this macro in several other places V1 -> V2: 1. code style and patch subject adjustment Signed-off-by: Michael Ellerman <[email protected]>

ERROR: code indent should use tabs where possible torvalds#37: FILE: include/linux/mmdebug.h:33: + do {^I^I^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#37: FILE: include/linux/mmdebug.h:33: + do {^I^I^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#38: FILE: include/linux/mmdebug.h:34: + if (unlikely(cond)) {^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#38: FILE: include/linux/mmdebug.h:34: + if (unlikely(cond)) {^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#39: FILE: include/linux/mmdebug.h:35: + dump_mm(mm);^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#39: FILE: include/linux/mmdebug.h:35: + dump_mm(mm);^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#40: FILE: include/linux/mmdebug.h:36: + BUG();^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#40: FILE: include/linux/mmdebug.h:36: + BUG();^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#41: FILE: include/linux/mmdebug.h:37: + }^I^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#41: FILE: include/linux/mmdebug.h:37: + }^I^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#42: FILE: include/linux/mmdebug.h:38: + } while (0)$ WARNING: please, no spaces at the start of a line torvalds#42: FILE: include/linux/mmdebug.h:38: + } while (0)$ WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(... to printk(KERN_ALERT ... torvalds#74: FILE: mm/debug.c:171: + printk(KERN_ALERT total: 6 errors, 7 warnings, 109 lines checked NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Sasha Levin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

GIT 6fe676b243e5a0cb4cc4d9a4b094de8db0cdbf74 commit e500f488c27659bb6f5d313b336621f3daa67701 Author: Fabian Frederick <[email protected]> Date: Wed Oct 1 06:52:06 2014 +0200 net/dccp/ccid.c: add __init to ccid_activate ccid_activate is only called by __init ccid_initialize_builtins in same module. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 0c5b8a46294d43fc63788839d3c18de0961ec1bc Author: Fabian Frederick <[email protected]> Date: Wed Oct 1 06:48:03 2014 +0200 net/dccp/proto.c: add __init to dccp_mib_init dccp_mib_init is only called by __init dccp_init in same module. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 082f58ac4a48d3f5cb4597232cb2ac6823a96f43 Author: Quinn Tran <[email protected]> Date: Thu Sep 25 06:22:28 2014 -0400 target: Fix queue full status NULL pointer for SCF_TRANSPORT_TASK_SENSE During temporary resource starvation at lower transport layer, command is placed on queue full retry path, which expose this problem. The TCM queue full handling of SCF_TRANSPORT_TASK_SENSE currently sends the same cmd twice to lower layer. The 1st time led to cmd normal free path. The 2nd time cause Null pointer access. This regression bug was originally introduced v3.1-rc code in the following commit: commit e057f53308a5f071556ee80586b99ee755bf07f5 Author: Christoph Hellwig <[email protected]> Date: Mon Oct 17 13:56:41 2011 -0400 target: remove the transport_qf_callback se_cmd callback Signed-off-by: Quinn Tran <[email protected]> Signed-off-by: Saurav Kashyap <[email protected]> Cc: <[email protected]> # v3.1+ Signed-off-by: Nicholas Bellinger <[email protected]> commit db3a99b9921f27fe71ca8c0f218ee810e0e7fb69 Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:19 2014 -0400 qla_target: rearrange struct qla_tgt_prm On most (non-x86) 64bit platforms this will remove 8 padding bytes from the structure. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit f9b6721a9cef94908467abf7a2cacbd15a7d23cb Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:18 2014 -0400 qla_target: improve qlt_unmap_sg() Remove the inline attribute. Modern compilers ignore it and the function has grown beyond where inline made sense anyway. Remove the BUG_ON(!cmd->sg_mapped), and instead return if sg_mapped is not set. Every caller is doing this check, so we might as well have it in one place instead of four. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit 55a9066fffd2f533e7ed434b072469ef09d6c476 Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:15 2014 -0400 qla_target: make some global functions static Also removes the declarations from the header - including two declarations without function definitions or callers. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit c57010420654aca179c500f61e86315a337244ca Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:14 2014 -0400 qla_target: remove unused parameter Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit f81ccb489a7a641c1bed41b49cf8d72c199c68d5 Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:13 2014 -0400 target: simplify core_tmr_abort_task list_for_each_entry_safe is necessary if list objects are deleted from the list while traversing it. Not the case here, so we can use the base list_for_each_entry variant. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit 33940d09937276cd3c81f2874faf43e37c2db0e2 Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:12 2014 -0400 target: encapsulate smp_mb__after_atomic() The target code has a rather generous helping of smp_mb__after_atomic() throughout the code base. Most atomic operations were followed by one and none were preceded by smp_mb__before_atomic(), nor accompanied by a comment explaining the need for a barrier. Instead of trying to prove for every case whether or not it is needed, this patch introduces atomic_inc_mb() and atomic_dec_mb(), which explicitly include the memory barriers before and after the atomic operation. For now they are defined in a target header, although they could be of general use. Most of the existing atomic/mb combinations were replaced by the new helpers. In a few cases the atomic was sandwiched in spin_lock/spin_unlock and I simply removed the barrier. I suspect that in most cases the correct conversion would have been to drop the barrier. I also suspect that a few cases exist where a) the barrier was necessary and b) a second barrier before the atomic would have been necessary and got added by this patch. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit 74ed7e62289dc6d388996d7c8f89c2e7e95b9657 Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:11 2014 -0400 target: remove some smp_mb__after_atomic()s atomic_inc_return() already does an implicit memory barrier and the second case was moved from an atomic to a plain flag operation. If a barrier were needed in the second case, it would have to be smp_mb(), not a variant optimized away for x86 and other architectures. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit 8f83269048628d7b139dacbfc6cc97befcbdd2e9 Author: Joern Engel <[email protected]> Date: Tue Sep 16 16:23:10 2014 -0400 target: simplify core_tmr_release_req() And while at it, do minimal coding style fixes in the area. Signed-off-by: Joern Engel <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit 9c7d6154bc4b9dfefd580490cdca5f7c72321464 Author: Andy Grover <[email protected]> Date: Mon Jun 30 16:39:46 2014 -0700 target: Remove core_tpg_release_virtual_lun0 function Simple and just called from one place. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andy Grover <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit cd9d7cbaec8b622eee4edcd8bf481c4047f74915 Author: Andy Grover <[email protected]> Date: Mon Jun 30 16:39:44 2014 -0700 target: Change core_dev_del_lun to take a se_lun instead of unpacked_lun Remove core_tpg_pre_dellun entirely, since we don't need to get/check a pointer we already have. Nothing else can return an error, so core_dev_del_lun can return void. Rename core_tpg_post_dellun to remove_lun - a clearer name, now that pre_dellun is gone. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andy Grover <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit cc83881f2c57caaf4b14adaffa65595640a59661 Author: Andy Grover <[email protected]> Date: Mon Jun 30 16:39:43 2014 -0700 target: core_tpg_post_dellun can return void Nothing in it can raise an error. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Andy Grover <[email protected]> Signed-off-by: Nicholas Bellinger <[email protected]> commit 49be17235c0acd96f2ff0fe282867fe3a83f554c Author: hayeswang <[email protected]> Date: Wed Oct 1 13:25:11 2014 +0800 r8152: disable power cut for RTL8153 The firmware would be clear when the power cut is enabled for RTL8153. Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 204c8704128943bf3f8b605f4b40bdc2b6bd89dc Author: hayeswang <[email protected]> Date: Wed Oct 1 13:25:10 2014 +0800 r8152: remove clearing bp The xxx_clear_bp() is used to halt the firmware. It only necessary for updating the new firmware. Besides, depend on the version of the current firmware, it may have problem to halt the firmware directly. Finally, halt the firmware would let the firmware code useless, and the bugs which are fixed by the firmware would occur. Signed-off-by: Hayes Wang <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit aa55c8e2f7a395dfc9e67fc6637321e19ce9bfe1 Author: Masahiro Yamada <[email protected]> Date: Tue Sep 9 20:02:24 2014 +0900 kbuild: handle C=... and M=... after entering into build directory This commit avoids processing C=... and M=... twice when O=... is also given. Besides, we can also remove KBUILD_EXTMOD="$(KBUILD_EXTMOD)" in the sub-make target. Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Peter Foley <[email protected]> Signed-off-by: Michal Marek <[email protected]> commit 745a254322c898dadf019342cd7140f7867d2d0f Author: Masahiro Yamada <[email protected]> Date: Tue Sep 9 20:02:23 2014 +0900 kbuild: use $(Q) for sub-make target Since commit 066b7ed9558087a7957a1128f27d7a3462ff117f (kbuild: Do not print the build directory with make -s), "Q" is defined above the sub-make target. This commit takes advantage of that and replaces "$(if $(KBUILD_VERBOSE:1=),@)" with "$(Q)". Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Peter Foley <[email protected]> Signed-off-by: Michal Marek <[email protected]> commit 7ff525712acf9325e9acdb27bbc93049ea2e850c Author: Masahiro Yamada <[email protected]> Date: Tue Sep 9 20:02:22 2014 +0900 kbuild: fake the "Entering directory ..." message more simply Commit c2e28dc975ea87feed84415006ae143424912ac7 (kbuild: Print the name of the build directory) added a gimmick to show the "Entering directory ...". Instead of echoing the hard-coded message (that is, we need to know the exact message), moving --no-print-directory would be easier. Signed-off-by: Masahiro Yamada <[email protected]> Acked-by: Peter Foley <[email protected]> Signed-off-by: Michal Marek <[email protected]> commit 1b0ecb28b0cc216535ce6477d39aa610c3ff68a1 Author: Vlad Yasevich <[email protected]> Date: Tue Sep 30 19:39:37 2014 -0400 bnx2: Correctly receive full sized 802.1ad fragmes This driver, similar to tg3, has a check that will cause full sized 802.1ad frames to be dropped. The frame will be larger then the standard mtu due to the presense of vlan header that has not been stripped. The driver should not drop this frame and should process it just like it does for 802.1q. CC: Sony Chacko <[email protected]> CC: [email protected] Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 7d3083ee36b51e425b6abd76778a2046906b0fd3 Author: Vlad Yasevich <[email protected]> Date: Tue Sep 30 19:39:36 2014 -0400 tg3: Allow for recieve of full-size 8021AD frames When receiving a vlan-tagged frame that still contains a vlan header, the length of the packet will be greater then MTU+ETH_HLEN since it will account of the extra vlan header. TG3 checks this for the case for 802.1Q, but not for 802.1ad. As a result, full sized 802.1ad frames get dropped by the card. Add a check for 802.1ad protocol when receving full sized frames. Suggested-by: Prashant Sreedharan <[email protected]> CC: Prashant Sreedharan <[email protected]> CC: Michael Chan <[email protected]> Signed-off-by: Vladislav Yasevich <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 1e918876853aa85435e0f17fd8b4a92dcfff53d6 Author: Florian Westphal <[email protected]> Date: Wed Oct 1 13:38:03 2014 +0200 r8169: add support for Byte Queue Limits tested on RTL8168d/8111d model using 'super_netperf 40' with TCP/UDP_STREAM. Output of while true; do for n in inflight limit; do echo -n $n\ ; cat $n; done; sleep 1; done during netperf run, 100mbit peer: inflight 0 limit 3028 inflight 6056 limit 4542 [ trimmed output for brevity, no limit/inflight changes during test steady-state ] limit 4542 inflight 3028 limit 6122 inflight 0 limit 6122 [ changed cable to 1gbit peer, restart netperf ] inflight 37850 limit 36336 inflight 33308 limit 31794 inflight 33308 limit 31794 inflight 27252 limit 25738 [ again, no changes during test ] inflight 27252 limit 25738 inflight 0 limit 28766 [ change cable to 100mbit peer, restart netperf ] limit 28766 inflight 27370 limit 28766 inflight 4542 limit 5990 inflight 6056 limit 4542 [ .. ] inflight 6056 limit 4542 inflight 0 [end of test] Cc: Francois Romieu <[email protected]> Cc: Hayes Wang <[email protected]> Signed-off-by: Florian Westphal <[email protected]> Acked-by: Eric Dumazet <[email protected]> Acked-by: Tom Herbert <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit d0bf4a9e92b9a93ffeeacbd7b6cb83e0ee3dc2ef Author: Eric Dumazet <[email protected]> Date: Mon Sep 29 13:29:15 2014 -0700 net: cleanup and document skb fclone layout Lets use a proper structure to clearly document and implement skb fast clones. Then, we might experiment more easily alternative layouts. This patch adds a new skb_fclone_busy() helper, used by tcp and xfrm, to stop leaking of implementation details. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 0f1ca65ee50df042051e8fa3a14f73b0c71d45b9 Author: Arianna Avanzini <[email protected]> Date: Fri Aug 22 13:20:02 2014 +0200 xen, blkfront: factor out flush-related checks from do_blkif_request() This commit factors out some checks related to the request insertion path, which can be done in an function instead of by itself. Reviewed-by: David Vrabel <[email protected]> Signed-off-by: Arianna Avanzini <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> commit 61cecca865280bef4f8a9748d0a9afa5df351ac2 Author: Roger Pau Monné <[email protected]> Date: Mon Sep 15 11:55:27 2014 +0200 xen-blkback: fix leak on grant map error path Fix leaking a page when a grant mapping has failed. CC: [email protected] Signed-off-by: Roger Pau Monné <[email protected]> Reported-and-Tested-by: Tao Chen <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> commit 12ea729645ace01e08f9654df155622898d3aae6 Author: Vitaly Kuznetsov <[email protected]> Date: Mon Sep 8 15:21:33 2014 +0200 xen/blkback: unmap all persistent grants when frontend gets disconnected blkback does not unmap persistent grants when frontend goes to Closed state (e.g. when blkfront module is being removed). This leads to the following in guest's dmesg: [ 343.243825] xen:grant_table: WARNING: g.e. 0x445 still in use! [ 343.243825] xen:grant_table: WARNING: g.e. 0x42a still in use! ... When load module -> use device -> unload module sequence is performed multiple times it is possible to hit BUG() condition in blkfront module: [ 343.243825] kernel BUG at drivers/block/xen-blkfront.c:954! [ 343.243825] invalid opcode: 0000 [#1] SMP [ 343.243825] Modules linked in: xen_blkfront(-) ata_generic pata_acpi [last unloaded: xen_blkfront] ... [ 343.243825] Call Trace: [ 343.243825] [<ffffffff814111ef>] ? unregister_xenbus_watch+0x16f/0x1e0 [ 343.243825] [<ffffffffa0016fbf>] blkfront_remove+0x3f/0x140 [xen_blkfront] ... [ 343.243825] RIP [<ffffffffa0016aae>] blkif_free+0x34e/0x360 [xen_blkfront] [ 343.243825] RSP <ffff88001eb8fdc0> We don't need to keep these grants if we're disconnecting as frontend might already forgot about them. Solve the issue by moving xen_blkbk_free_caches() call from xen_blkif_free() to xen_blkif_disconnect(). Now we can see the following: [ 928.590893] xen:grant_table: WARNING: g.e. 0x587 still in use! [ 928.591861] xen:grant_table: WARNING: g.e. 0x372 still in use! ... [ 929.592146] xen:grant_table: freeing g.e. 0x587 [ 929.597174] xen:grant_table: freeing g.e. 0x372 ... Backend does not keep persistent grants any more, reconnect works fine. CC: [email protected] Signed-off-by: Vitaly Kuznetsov <[email protected]> Signed-off-by: Konrad Rzeszutek Wilk <[email protected]> commit b248230c34970a6c1c17c591d63b464e8d2cfc33 Author: Yuchung Cheng <[email protected]> Date: Mon Sep 29 13:20:38 2014 -0700 tcp: abort orphan sockets stalling on zero window probes Currently we have two different policies for orphan sockets that repeatedly stall on zero window ACKs. If a socket gets a zero window ACK when it is transmitting data, the RTO is used to probe the window. The socket is aborted after roughly tcp_orphan_retries() retries (as in tcp_write_timeout()). But if the socket was idle when it received the zero window ACK, and later wants to send more data, we use the probe timer to probe the window. If the receiver always returns zero window ACKs, icsk_probes keeps getting reset in tcp_ack() and the orphan socket can stall forever until the system reaches the orphan limit (as commented in tcp_probe_timer()). This opens up a simple attack to create lots of hanging orphan sockets to burn the memory and the CPU, as demonstrated in the recent netdev post "TCP connection will hang in FIN_WAIT1 after closing if zero window is advertised." http://www.spinics.net/lists/netdev/msg296539.html This patch follows the design in RTO-based probe: we abort an orphan socket stalling on zero window when the probe timer reaches both the maximum backoff and the maximum RTO. For example, an 100ms RTT connection will timeout after roughly 153 seconds (0.3 + 0.6 + .... + 76.8) if the receiver keeps the window shut. If the orphan socket passes this check, but the system already has too many orphans (as in tcp_out_of_resources()), we still abort it but we'll also send an RST packet as the connection may still be active. In addition, we change TCP_USER_TIMEOUT to cover (life or dead) sockets stalled on zero-window probes. This changes the semantics of TCP_USER_TIMEOUT slightly because it previously only applies when the socket has pending transmission. Signed-off-by: Yuchung Cheng <[email protected]> Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: Neal Cardwell <[email protected]> Reported-by: Andrey Dmitrov <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 3edfe0030bb7a82dab2a30a29ea6e1800e600c4b Author: Helge Deller <[email protected]> Date: Wed Oct 1 22:11:01 2014 +0200 parisc: Fix serial console for machines with serial port on superio chip Fix the serial console on machines where the serial port is located on the SuperIO chip. Signed-off-by: Helge Deller <[email protected]> Cc: Peter Hurley <[email protected]> commit baf378126b08474de2e2428b16e62a69df0339d9 Author: Michael Opdenacker <[email protected]> Date: Wed Oct 1 14:07:39 2014 -0600 rsxx: Remove deprecated IRQF_DISABLED This removes the use of the IRQF_DISABLED flag from drivers/block/rsxx/core.c It's a NOOP since 2.6.35 and it will be removed one day. Signed-off-by: Michael Opdenacker <[email protected]> Acked-by Philip Kelleher <[email protected]> Signed-off-by: Jens Axboe <[email protected]> commit cb57659a15c6c0576493cc8a10474ce7ffd44eb3 Author: Fabian Frederick <[email protected]> Date: Wed Oct 1 19:30:03 2014 +0200 cipso: add __init to cipso_v4_cache_init cipso_v4_cache_init is only called by __init cipso_v4_init Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 57a02c39c1c20ed03a86f8014c11a8c18b94cac3 Author: Fabian Frederick <[email protected]> Date: Wed Oct 1 19:18:57 2014 +0200 inet: frags: add __init to ip4_frags_ctl_register ip4_frags_ctl_register is only called by __init ipfrag_init Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 47d7a88c188f06ffaea3a539f84fe10cb4e77787 Author: Fabian Frederick <[email protected]> Date: Wed Oct 1 18:27:50 2014 +0200 tcp: add __init to tcp_init_mem tcp_init_mem is only called by __init tcp_init. Signed-off-by: Fabian Frederick <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit ee7a1beb9759c94aea67dd887faf5e447a5c6710 Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:21 2014 +0800 r8169:call "rtl8168_driver_start" "rtl8168_driver_stop" only when hardware dash function is enabled These two functions are used to inform dash firmware that driver is been brought up or brought down. So call these two functions only when hardware dash function is enabled. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 2a9b4d9670e71784896d95c41c9b0acd50db1dbb Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:20 2014 +0800 r8169:modify the behavior of function "rtl8168_oob_notify" In function "rtl8168_oob_notify", using function "rtl_eri_write" to access eri register 0xe8, instead of using MAC register "ERIDR" and "ERIAR" to access it. For using function "rtl_eri_write" in function "rtl8168_oob_notify", need to move down "rtl8168_oob_notify" related functions under the function "rtl_eri_write". Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 2f8c040ce6791ef0477e6d59768ee3d5fd0df0fd Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:19 2014 +0800 r8169:change the name of function "r8168dp_check_dash" to "r8168_check_dash" DASH function not only RTL8168DP can support, but also RTL8168EP. So change the name of function "r8168dp_check_dash" to "r8168_check_dash". Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 706123d06c18b55da5e9da21e2d138ee789bf8f4 Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:18 2014 +0800 r8169:change the name of function"rtl_w1w0_eri" Change the name of function "rtl_w1w0_eri" to "rtl_w0w1_eri". In this function, the local variable "val" is "write zeros then write ones". Please see below code. (val & ~m) | p In this patch, change the function name from "xx_w1w0_xx" to "xx_w0w1_xx". The changed function name is more suitable for it's behavior. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 7656442824f6174b56a19c664fe560972df56ad4 Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:17 2014 +0800 r8169:for function "rtl_w1w0_phy" change its name and behavior Change function name from "rtl_w1w0_phy" to "rtl_w0w1_phy". And its behavior from "write ones then write zeros" to "write zeros then write ones". In Realtek internal driver, bitwise operations are almost "write zeros then write ones". For easy to port hardware parameters from Realtek internal driver to Linux kernal driver "r8169", we would like to change this function's behavior and its name. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit ac85bcdbc0ffd3903d6db4abcd769ecacf98605b Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:16 2014 +0800 r8169:add more chips to support magic packet v2 For RTL8168F RTL8168FB RTL8168G RTL8168GU RTL8411 RTL8411B RTL8402 RTL8107E, the magic packet enable bit is changed to eri 0xde bit0. In this patch, change magic packet enable bit of these chips to eri 0xde bit0. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 89cceb2729c752e6ff9b3bc8650a70f29884f116 Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:15 2014 +0800 r8169:add support more chips to get mac address from backup mac address register RTL8168FB RTL8168G RTL8168GU RTL8411 RTL8411B RTL8106EUS RTL8402 can support get mac address from backup mac address register. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 42fde7371035144037844f41bd16950de9912bdb Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:14 2014 +0800 r8169:add disable/enable RTL8411B pll function RTL8411B can support disable/enable pll function. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit b8e5e6ad7115befef13a4493f1d2b8e438abc058 Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:13 2014 +0800 r8169:add disable/enable RTL8168G pll function RTL8168G also can disable/enable pll function. Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 05b9687bb3606190304f08c2e4cd63de8717e30b Author: Chun-Hao Lin <[email protected]> Date: Wed Oct 1 23:17:12 2014 +0800 r8169:change uppercase number to lowercase number Signed-off-by: Chun-Hao Lin <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit a29c9c43bb633a9965909cd548879fee4aa789a4 Author: David L Stevens <[email protected]> Date: Wed Oct 1 11:05:27 2014 -0400 sunvnet: fix potential NULL pointer dereference One of the error cases for vnet_start_xmit()'s "out_dropped" label is port == NULL, so only mess with port->clean_timer when port is not NULL. Signed-off-by: David L Stevens <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit e506d405ac7d34d03996c97ac68aa2ac010be64a Author: Thierry Reding <[email protected]> Date: Wed Oct 1 13:59:00 2014 +0200 net: dsa: Fix build warning for !PM_SLEEP The dsa_switch_suspend() and dsa_switch_resume() functions are only used when PM_SLEEP is enabled, so they need #ifdef CONFIG_PM_SLEEP protection to avoid a compiler warning. Signed-off-by: Thierry Reding <[email protected]> Acked-by: Florian Fainelli <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 84ac1f2ca41f5888cc995944c073a5220f3ed549 Author: Tanmay Inamdar <[email protected]> Date: Fri Sep 26 14:08:25 2014 -0700 arm64: dts: Add APM X-Gene PCIe device tree nodes Add the device tree nodes for APM X-Gene PCIe host controller and PCIe clock interface. Since X-Gene SOC supports maximum 5 ports, 5 dts nodes are added. Signed-off-by: Tanmay Inamdar <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit 2896e4418b17363f211e084471b589e3c06a7248 Author: Bjorn Helgaas <[email protected]> Date: Wed Oct 1 13:01:35 2014 -0600 PCI: xgene: Add APM X-Gene PCIe driver Add the AppliedMicro X-Gene SOC PCIe host controller driver. The X-Gene PCIe controller supports up to 8 lanes and GEN3 speed. The X-Gene SOC supports up to 5 PCIe ports. [bhelgaas: folded in MAINTAINERS and bindings updates] Tested-by: Ming Lei <[email protected]> Tested-by: Dann Frazier <[email protected]> Signed-off-by: Tanmay Inamdar <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Liviu Dudau <[email protected]> (driver) commit 3c87dcbfb36ce6d3d9087f0163c02ba5690d9a85 Author: Subbaraya Sundeep Bhatta <[email protected]> Date: Wed Oct 1 11:01:17 2014 +0200 net: ll_temac: Remove unnecessary ether_setup after alloc_etherdev Calling ether_setup is redundant since alloc_etherdev calls it. Signed-off-by: Subbaraya Sundeep Bhatta <[email protected]> Signed-off-by: Michal Simek <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 8493ecca74a7b4a66e19676de1a0f14194179941 Author: Benjamin Tissoires <[email protected]> Date: Wed Oct 1 11:59:47 2014 -0400 HID: uHID: fix excepted report type When uhid_get_report() or uhid_set_report() are called, they emit on the char device a UHID_GET_REPORT or UHID_SET_REPORT message. Then, the protocol says that the user space asnwers with UHID_GET_REPORT_REPLY or UHID_SET_REPORT_REPLY. Unfortunatelly, the current code waits for an event of type UHID_GET_REPORT or UHID_SET_REPORT instead of the reply one. Add 1 to UHID_GET_REPORT or UHID_SET_REPORT to actually wait for the reply, and validate the reply. Signed-off-by: Benjamin Tissoires <[email protected]> Reviewed-by: David Herrmann <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> commit c8df6ac9452e8f47a6f660993c526d13e858a6f3 Author: Lucas Stach <[email protected]> Date: Tue Sep 30 18:36:27 2014 +0200 PCI: designware: Remove open-coded bitmap operations Replace them by using the standard kernel bitmap ops. No functional change, but makes the code a lot cleaner. Signed-off-by: Lucas Stach <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Reviewed-by: Pratyush Anand <[email protected]> Acked-by: Jingoo Han <[email protected]> commit 2199f0608864cf4e8c93d37842a5ee50c8d79843 Author: Mikulas Patocka <[email protected]> Date: Fri Mar 28 15:51:56 2014 -0400 dm crypt: sort writes Write requests are sorted in a red-black tree structure and are submitted in the sorted order. In theory the sorting should be performed by the underlying disk scheduler, however, in practice the disk scheduler only accepts and sorts a finite number of requests. To allow the sorting of all requests, dm-crypt needs to implement its own own sorting. The overhead associated with rbtree-based sorting is considered negligible so it is not used conditionally. Even on SSD sorting can be beneficial since in-order request dispatch promotes lower latency IO completion to the upper layers. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit 648fee35be4c75667aa18bf513f7e7e65c01640b Author: Mikulas Patocka <[email protected]> Date: Fri Mar 28 15:51:56 2014 -0400 dm crypt: offload writes to thread Submitting write bios directly in the encryption thread caused serious performance degradation. On a multiprocessor machine, encryption requests finish in a different order than they were submitted. Consequently, write requests would be submitted in a different order and it could cause severe performance degradation. Move the submission of write requests to a separate thread so that the requests can be sorted before submitting. But this commit improves dm-crypt performance even without having dm-crypt perform request sorting (in particular it enables IO schedulers like CFQ to sort more effectively). Note: it is required that a previous commit ("dm crypt: don't allocate pages for a partial request") be applied before applying this patch. Otherwise, this commit could introduce a crash. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit 4a0d7e0464226eee625a5b77484c339334453882 Author: Mikulas Patocka <[email protected]> Date: Fri Mar 28 15:51:55 2014 -0400 dm crypt: use unbound workqueue for request processing Use unbound workqueue so that work is automatically balanced between available CPUs. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit 72bfc40ca3b393cb0bc6b5e2ce364e6c6ce0f390 Author: Mikulas Patocka <[email protected]> Date: Thu May 29 14:18:12 2014 -0400 dm crypt: remove io_pending refcount member from dm_crypt_io Commit "dm crypt: don't allocate pages for a partial request" changed the code to allocate all pages for one request. There is always just one pending request, so the io_pending refcount may be removed. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit 42196fec8945cc84c032b7f59deaffee82036245 Author: Mikulas Patocka <[email protected]> Date: Fri Mar 28 15:51:56 2014 -0400 dm crypt: remove unused io_pool and _crypt_io_pool The previous commits ("dm crypt: use per-bio data") and ("dm crypt: don't allocate pages for a partial request") stopped using the io_pool slab mempool and backing _crypt_io_pool kmem cache. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit ebfda24b1e1bf483accdb900f8625151d8f01383 Author: Mikulas Patocka <[email protected]> Date: Fri Mar 28 15:51:56 2014 -0400 dm crypt: avoid deadlock in mempools Fix a theoretical deadlock introduced in the previous commit ("dm crypt: don't allocate pages for a partial request"). The function crypt_alloc_buffer may be called concurrently. If we allocate from the mempool concurrently, there is a possibility of deadlock. For example, if we have mempool of 256 pages, two processes, each wanting 256, pages allocate from the mempool concurrently, it may deadlock in a situation where both processes have allocated 128 pages and the mempool is exhausted. In order to avoid such a scenario, we allocate the pages under a mutex. In order to not degrade performance with excessive locking, we try non-blocking allocations without a mutex first and if it fails, we fallback to a blocking allocation with a mutex. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit b9ea7cb3fb237078be400522880932008c630fb7 Author: Mikulas Patocka <[email protected]> Date: Fri Mar 28 15:51:56 2014 -0400 dm crypt: don't allocate pages for a partial request Change crypt_alloc_buffer so that it only ever allocates pages for a full request. This change is a prerequisite for the commit "dm crypt: offload writes to thread". Which implies this change is effectively required for the upcoming cpu parallelization changes. But this change simplifies the dm-crypt code at the expense of reduced throughput in low memory conditions (where allocation for a partial request is most useful). This change also enables the removal of the io_pending refcount. Note: the next commit ("dm-crypt: avoid deadlock in mempools") is needed to fix a theoretical deadlock. Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit 117cd3e12232afea97dd31489fbde8888ad22b3e Author: Heinz Mauelshagen <[email protected]> Date: Wed Sep 24 17:47:19 2014 +0200 dm raid: add discard support for RAID levels 4, 5 and 6 In case of RAID levels 4, 5 and 6 we have to verify each RAID members' ability to zero data on discards to avoid stripe data corruption -- if discard_zeroes_data is not set for each RAID member discard support must be disabled. Also add an 'ignore_discard' table argument to the target in order to ignore discard processing completely on a RAID array, hence not passing down discards to MD personalities. This 'ignore_discard' control provides the ability to: - prohibit discards in case of _potential_ data corruptions in RAID4/5/6 (e.g. if ability to zero data on discard is flawed in a RAID member) - avoid discard processing overhead Signed-off-by: Heinz Mauelshagen <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> commit 04c308f43a90a9b3b84c344b324d6af29288da05 Author: Mikulas Patocka <[email protected]> Date: Wed Oct 1 13:29:48 2014 -0400 dm bufio: when done scanning return from __scan immediately When __scan frees the required number of buffer entries that the shrinker requested (nr_to_scan becomes zero) it must return. Before this fix the __scan code exited only the inner loop and continued in the outer loop. Also, move dm_bufio_cond_resched to __scan's inner loop, so that iterating the bufio client's lru lists doesn't result in scheduling latency. Reported-by: Joe Thornber <[email protected]> Signed-off-by: Mikulas Patocka <[email protected]> Signed-off-by: Mike Snitzer <[email protected]> Cc: [email protected] # 3.2+ commit 5ec094057c7df5ff80f5e7fe282f47ad205fb976 Author: Bjorn Helgaas <[email protected]> Date: Tue Sep 23 14:38:28 2014 -0600 PCI/MSI: Remove unnecessary temporary variable The only use of "status" is to hold a value which is immediately returned, so just return and remove the variable directly. Signed-off-by: Bjorn Helgaas <[email protected]> commit 56b72b40957947f7c08771f030102351d4c906df Author: Yijing Wang <[email protected]> Date: Mon Sep 29 18:35:16 2014 -0600 PCI/MSI: Use __write_msi_msg() instead of write_msi_msg() default_restore_msi_irq() already has the struct msi_desc pointer required by __write_msi_msg(), so call it directly instead of having write_msi_msg() look it up from the IRQ. No functional change. [bhelgaas: split into separate patch] Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit 1e8f4cc82eded0c3c97ef6e2f119782e42deda35 Author: Yijing Wang <[email protected]> Date: Wed Sep 24 11:09:45 2014 +0800 MSI/powerpc: Use __read_msi_msg() instead of read_msi_msg() rtas_setup_msi_irqs() already has the struct msi_desc pointer required by __read_msi_msg(), so call it directly instead of having read_msi_msg() look it up from the IRQ. No functional change. [bhelgaas: changelog] Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Michael Ellerman <[email protected]> CC: Benjamin Herrenschmidt <[email protected]> CC: [email protected] commit 2b260085e466c345e78f23b1c9ad1d123d509ef8 Author: Yijing Wang <[email protected]> Date: Tue Sep 23 13:27:25 2014 +0800 PCI/MSI: Use __get_cached_msi_msg() instead of get_cached_msi_msg() Both callers of get_cached_msi_msg() start with a struct irq_data pointer, look up the corresponding IRQ number, and pass it to get_cached_msi_msg(), which then uses irq_get_irq_data() to look up the struct irq_data again to call __get_cached_msi_msg(). Since we already have the struct irq_data, call __get_cached_msi_msg() directly and skip the lookup work done by get_cached_msi_msg(). No functional change. [bhelgaas: changelog] Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> CC: Tony Luck <[email protected]> CC: [email protected] commit 468ff15a3ab98ed7153c29c68229ffb97f15a251 Author: Yijing Wang <[email protected]> Date: Tue Sep 23 13:27:24 2014 +0800 PCI/MSI: Add "msi_bus" sysfs MSI/MSI-X control for endpoints The "msi_bus" sysfs file for bridges sets a bus flag to allow or disallow future driver requests for MSI or MSI-X. Previously, the sysfs file existed for endpoints but did nothing. Add "msi_bus" support for endpoints, so an administrator can prevent the use of MSI and MSI-X for individual devices. Note that as for bridges, these changes only affect future driver requests for MSI or MSI-X, so drivers may need to be reloaded. Add documentation for the "msi_bus" sysfs file. [bhelgaas: changelog, comments, add "subordinate", add endpoint printk, rework bus_flags setting, make bus_flags printk unconditional] Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit 48c3c38f003c25d50a09d3da558667c5ecd530aa Author: Yijing Wang <[email protected]> Date: Tue Sep 23 11:02:42 2014 -0600 PCI/MSI: Remove "pos" from the struct msi_desc msi_attrib "msi_attrib.pos" is only used for MSI (not MSI-X), and we already cache the MSI capability offset in "dev->msi_cap". Remove "pos" from the struct msi_attrib and use "dev->msi_cap" directly. [bhelgaas: changelog, fix whitespace] Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit 81052769e48609525c452d8f078a5786b673e178 Author: Yijing Wang <[email protected]> Date: Tue Sep 23 13:27:22 2014 +0800 PCI/MSI: Remove unused kobject from struct msi_desc After commit 1c51b50c2995 ("PCI/MSI: Export MSI mode using attributes, not kobjects"), the kobject in struct msi_desc is unused. Remove the unused struct kobject from struct msi_desc. [bhelgaas: changelog] Fixes: 1c51b50c2995 ("PCI/MSI: Export MSI mode using attributes, not kobjects") Signed-off-by: Yijing Wang <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Greg Kroah-Hartman <[email protected]> commit a06cd74cefe754341f747ddc4cf7b0058fa9bff8 Author: Alexander Gordeev <[email protected]> Date: Tue Sep 23 12:45:58 2014 -0600 PCI/MSI: Rename pci_msi_check_device() to pci_msi_supported() Rename pci_msi_check_device() to pci_msi_supported() for clarity. Note that pci_msi_supported() returns true if MSI/MSI-X is supported, so code like: if (pci_msi_supported(...)) reads naturally. [bhelgaas: changelog, split to separate patch, reverse sense] Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit 27e20603c54ba633ed259284d006275f13c9f95b Author: Alexander Gordeev <[email protected]> Date: Tue Sep 23 14:25:11 2014 -0600 PCI/MSI: Move D0 check into pci_msi_check_device() Both callers of pci_msi_check_device() check that the device is in D0 state, so move the check from the callers into pci_msi_check_device() itself. In pci_enable_msi_range(), note that pci_msi_check_device() never returns a positive value any more, so the loop that called it until it returns zero or negative is no longer necessary. [bhelgaas: changelog, split to separate patch] Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit ad975ebad4c3ce8dcc7d0bb4db26ea5aca4cfc99 Author: Alexander Gordeev <[email protected]> Date: Tue Sep 23 12:39:54 2014 -0600 PCI/MSI: Remove arch_msi_check_device() No architectures implement arch_msi_check_device() or the struct msi_chip .check_device() method, so remove them. Remove the "type" parameter to pci_msi_check_device() because it was only used to call arch_msi_check_device() and is no longer needed. [bhelgaas: changelog, split to separate patch] Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> commit 3930115e0dd67f61b3b1882c7a34d0baeff1bb4c Author: Alexander Gordeev <[email protected]> Date: Sun Sep 7 20:57:54 2014 +0200 irqchip: armada-370-xp: Remove arch_msi_check_device() Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs(). This makes the code more compact and allows removing arch_msi_check_device() from generic MSI code. Tested-by: Thomas Petazzoni <[email protected]> Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Jason Cooper <[email protected]> CC: Thomas Gleixner <[email protected]> commit 6b2fd7efeb888fa781c1f767de6c36497ac1596b Author: Alexander Gordeev <[email protected]> Date: Sun Sep 7 20:57:53 2014 +0200 PCI/MSI/PPC: Remove arch_msi_check_device() Move MSI checks from arch_msi_check_device() to arch_setup_msi_irqs(). This makes the code more compact and allows removing arch_msi_check_device() from generic MSI code. Signed-off-by: Alexander Gordeev <[email protected]> Signed-off-by: Bjorn Helgaas <[email protected]> Acked-by: Michael Ellerman <[email protected]> commit 977104ece1568f2e2ad3f5fd8e55bd640e8ab55a Author: Mark Charlebois <[email protected]> Date: Thu Sep 4 14:16:17 2014 -0700 arm: LLVMLinux: Use global stack register variable for percpu Using global current_stack_pointer works on both clang and gcc. current_stack_pointer is an unsigned long and needs to be cast as a pointer to dereference. KernelVersion: 3.17.0-rc6 Signed-off-by: Mark Charlebois <[email protected]> Signed-off-by: Behan Webster <[email protected]> commit a35dc594542b29935cd3a92e53233ad4ba4e622f Author: Behan Webster <[email protected]> Date: Tue Sep 3 22:27:27 2013 -0400 arm: LLVMLinux: Use current_stack_pointer in unwind_backtrace Use the global current_stack_pointer to get the value of the stack pointer. This change supports being able to compile the kernel with both gcc and clang. KernelVersion: 3.17.0-rc6 Signed-off-by: Behan Webster <[email protected]> Reviewed-by: Mark Charlebois <[email protected]> Reviewed-by: Jan-Simon Möller <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Nicolas Pitre <[email protected]> commit 5c5da6724d8e1767405a3f4b611451a11ece99e2 Author: Behan Webster <[email protected]> Date: Tue Sep 3 22:27:27 2013 -0400 arm: LLVMLinux: Calculate current_thread_info from current_stack_pointer Use the global current_stack_pointer to get the value of the stack pointer. This change supports being able to compile the kernel with both gcc and clang. KernelVersion: 3.17.0-rc6 Signed-off-by: Behan Webster <[email protected]> Reviewed-by: Mark Charlebois <[email protected]> Reviewed-by: Jan-Simon Möller <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Nicolas Pitre <[email protected]> commit f2b6d8c6c56c9a164a2d885ba34a09d613c959c9 Author: Behan Webster <[email protected]> Date: Tue Sep 3 22:27:27 2013 -0400 arm: LLVMLinux: Use current_stack_pointer in save_stack_trace_tsk Use the global current_stack_pointer to get the value of the stack pointer. This change supports being able to compile the kernel with both gcc and clang. KernelVersion: 3.17.0-rc6 Signed-off-by: Behan Webster <[email protected]> Reviewed-by: Mark Charlebois <[email protected]> Reviewed-by: Jan-Simon Möller <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Nicolas Pitre <[email protected]> commit 40802b84566a3d9731a8fea43b144301d9ac450d Author: Behan Webster <[email protected]> Date: Tue Sep 3 22:27:27 2013 -0400 arm: LLVMLinux: Use current_stack_pointer for return_address Use the global current_stack_pointer to get the value of the stack pointer. This change supports being able to compile the kernel with both gcc and Clang. KernelVersion: 3.17.0-rc6 Signed-off-by: Behan Webster <[email protected]> Reviewed-by: Mark Charlebois <[email protected]> Reviewed-by: Jan-Simon Möller <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Nicolas Pitre <[email protected]> commit d80ced5236764b8c4ffda5545d5b357cf88c77c1 Author: Behan Webster <[email protected]> Date: Tue Sep 3 22:27:27 2013 -0400 arm: LLVMLinux: Use current_stack_pointer to calculate pt_regs address Use the global current_stack_pointer to calculate the end of the stack for current_pt_regs() KernelVersion: 3.17.0-rc6 Signed-off-by: Behan Webster <[email protected]> Reviewed-by: Mark Charlebois <[email protected]> Reviewed-by: Jan-Simon Möller <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Nicolas Pitre <[email protected]> commit 9d0d6994806b36891453beb1e94b6253f853af61 Author: Behan Webster <[email protected]> Date: Tue Sep 3 22:27:26 2013 -0400 arm: LLVMLinux: Add global named register current_stack_pointer for ARM Define a global named register for current_stack_pointer. The use of this new variable guarantees that both gcc and clang can access this register in C code. KernelVersion: 3.17.0-rc6 Signed-off-by: Behan Webster <[email protected]> Reviewed-by: Jan-Simon Möller <[email protected]> Reviewed-by: Mark Charlebois <[email protected]> Acked-by: Will Deacon <[email protected]> Acked-by: Nicolas Pitre <[email protected]> commit 2c804d0f8fc7799981d9fdd8c88653541b28c1a7 Author: Eric Dumazet <[email protected]> Date: Tue Sep 30 22:12:05 2014 -0700 ipv4: mentions skb_gro_postpull_rcsum() in inet_gro_receive() Proper CHECKSUM_COMPLETE support needs to adjust skb->csum when we remove one header. Its done using skb_gro_postpull_rcsum() In the case of IPv4, we know that the adjustment is not really needed, because the checksum over IPv4 header is 0. Lets add a comment to ease code comprehension and avoid copy/paste errors. Signed-off-by: Eric Dumazet <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit eb51bbaf8dedf142a54a7ff58514a29b40d515bb Author: Stephen Rothwell <[email protected]> Date: Wed Oct 1 17:00:49 2014 +1000 fm10k: using vmalloc requires including linux/vmalloc.h Signed-off-by: Stephen Rothwell <[email protected]> Acked-by: Jeff Kirsher <[email protected]> Signed-off-by: David S. Miller <[email protected]> commit 078efae00ffc76381c3248006e9cf0988163488f Author: Anish Bhatt <[email protected]> Date: Mon Sep 15 17:44:18 2014 -0700 [SCSI] cxgb4i: avoid holding mutex in interrupt context cxgbi_inet6addr_handler() can be called in interrupt context, so use rcu protected list while finding netdev. This is observed as a scheduling in atomic oops when running over ipv6. Fixes: fc8d0590d914 ("libcxgbi: Add ipv6 api to driver") Fixes: 759a0cc5a3e1 ("cxgb4i: Add ipv6 code to driver, call into libcxgbi ipv6 api") Signed-off-by: Anish Bhatt <[email protected]> Signed-off-by: Karen Xie <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: James Bottomley <[email protected]> commit 34549ab09e62db9703811c6ed4715f2ffa1fd7fb Author: Jeff Layton <[email protected]> Date: Wed Oct 1 08:05:22 2014 -0400 nfsd: eliminate "to_delegation" define We now have cb_to_delegation and to_delegation, which do the same thing and are defined separately in different .c files. Move the cb_to_delegation definition into a header file and eliminate the redundant to_delegation definition. Reviewed-by: Christoph Hellwig <[email protected]> Signed-off-by: Jeff Layton <[email protected]> commit 4a0efdc933680d908de11712a774a2c9492c3d5a Author: Hannes Reinecke <[email protected]> Date: Wed Oct 1 14:32:31 2014 +0200 block: misplaced rq_complete tracepoint The rq_complete tracepoint was never issued for empty requests, causing the resulting blktrace information to never show any completion for those request. Signed-off-by: Hannes Reinecke <[email protected]> Acked-by: Tejun Heo <[email protected]> Signed-off-by: Jens Axboe <[email protected]> commit fc2021fb9baf9ed375c8161b40b68e120e75c60e Author: Michael Opdenacker <[email protected]> Date: Wed Oct 1 12:07:07 2014 +0200 block: hd: remove deprecated IRQF_DISABLED This patch removes the use of the IRQF_DISABLED flag from drivers/block/hd.c It's a NOOP since 2.6.35 and it will be removed one day. This also removes a related comment which is obsolete too. Signed-off-by: Michael Opdenacker <[email protected]> Signed-off-by: Jens Axboe <[email protected]> commit 19aeb5a65f1a6504fc665466c188241e7393d66f Author: Bob Peterson <[email protected]> Date: Mon Sep 29 08:52:04 2014 -0400 GFS2: Make rename not save dirent location This patch fixes a regression in the patch "GFS2: Remember directory insert point", commit 2b47dad866d04f14c328f888ba5406057b8c7d33. The problem had to do with the rename function: The function found space for the new dirent, and remembered that location. But then the old dirent was removed, which often moved the eligible location for the renamed dirent. Putting the new dirent at the saved location caused file system corruption. This patch adds a new "save_loc" variable to struct gfs2_diradd. If 1, the dirent location is saved. If 0, the dirent location is not saved and the buffer_head is released as per previous behavior. Signed-off-by: Bob Peterson <[email protected]> Signed-off-by: Steven Whitehouse <[email protected]> commit 5235166fbc332c8b5dcf49e3a498a8b510a77449 Author: Oliver Neukum <[email protected]> Date: Tue Sep 30 12:54:56 2014 +0200 HID: usbhid: add another mouse that needs QUIRK_ALWAYS_POLL There is a second mouse sharing the same vendor strings but different IDs. Signed-off-by: Oliver Neukum <[email protected]> Signed-off-by: Jiri Kosina <[email protected]> commit 2013add4ce73c93ae2148969a9ec3ecc8b1e26fa Author: Gavin Shan <[email protected]> Date: Wed Oct 1 14:34:51 2014 +1000 powerpc/eeh: Show hex prefix for PE state sysfs As Michael suggested, the hex prefix for the output of EEH PE state sysfs entry (/sys/bus/pci/devices/xxx/eeh_pe_state) is always informative to users. Suggested-by: Michael Ellerman <[email protected]> Signed-off-by: Gavin Shan <[email protected]> Signed-off-by: Michael Ellerman <[email protected]> commit 24c20f10583647e30afe87b6f6d5e14bc7b1cbc6 Author: Christoph Hellwig <[email protected]> Date: Tue Sep 30 16:43:46 2014 +0200 scsi: add a CONFIG_SCSI_MQ_DEFAULT option Add a Kconfig option to enable the blk-mq path for SCSI by default to ease testing and deployment in setups that know they benefit from blk-mq. Signed-off-by: Christoph Hellwig <[email protected]> Reviewed-by: Martin K. Petersen <[email protected]> Reviewed-by: Robert Elliott <[email protected]> Tested-by: Robert Elliott <[email protected]> commit e785060ea3a1c8e37a8bc1449c79e36bff2b5b13 Author: Dolev Raviv <[email protected]> Date: Thu Sep 25 15:32:36 2014 +0300 ufs: definitions for phy interface - Adding some of the definitions missing in unipro.h, including power enumeration. - Read Modify Write Line helper function - Indication for the type of suspend Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Subhash Jadavani <[email protected]> Signed-off-by: Yaniv Gardi <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 374a246e4ebda1fc55d537877bf2412e511ecc7b Author: Subhash Jadavani <[email protected]> Date: Thu Sep 25 15:32:35 2014 +0300 ufs: tune bkops while power managment events Add capability to control the auto bkops during suspend. If host explicitly enables the auto bkops (background operation) on device then only device would perform the bkops on its own. If auto bkops is not enabled explicitly and if the device reaches to state where it must do background operation, device would raise the urgent bkops exception event to host and then host will enable the auto bkops on device. This patch adds the option to choose whether auto bkops should be enabled during runtime suspend or not. Since we don't want to keep the device active to perform the non critical bkops, host will enable urgent bkops only. Keep auto-bkops enabled after resume if urgent bkops needed. If device bkops status shows that its in critical need of executing background operations, host should allow the device to continue doing background operations. Signed-off-by: Subhash Jadavani <[email protected]> Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 856b348305c98d4e0c8e5eafa97c61443197f8d3 Author: Sahitya Tummala <[email protected]> Date: Thu Sep 25 15:32:34 2014 +0300 ufs: Add support for clock scaling using devfreq framework The clocks for UFS device will be managed by generic DVFS (Dynamic Voltage and Frequency Scaling) framework within kernel. This devfreq framework works with different governors to scale the clocks. By default, UFS devices uses simple_ondemand governor which scales the clocks up if the load is more than upthreshold and scales down if the load is less than downthreshold. Signed-off-by: Sahitya Tummala <[email protected]> Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 4cff6d991e4a291cf50fe2659da2ea9ad46620bf Author: Sahitya Tummala <[email protected]> Date: Thu Sep 25 15:32:33 2014 +0300 ufs: Add freq-table-hz property for UFS device Add freq-table-hz propery for UFS device to keep track of <min max> frequencies supported by UFS clocks. Signed-off-by: Sahitya Tummala <[email protected]> Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 1ab27c9cf8b63dd8dec9e17b5c17721c7f3b6cc7 Author: Sahitya Tummala <[email protected]> Date: Thu Sep 25 15:32:32 2014 +0300 ufs: Add support for clock gating The UFS controller clocks can be gated after certain period of inactivity, which is typically less than runtime suspend timeout. In addition to clocks the link will also be put into Hibern8 mode to save more power. The clock gating can be turned on by enabling the capability UFSHCD_CAP_CLK_GATING. To enable entering into Hibern8 mode as part of clock gating, set the capability UFSHCD_CAP_HIBERN8_WITH_CLK_GATING. The tracing events for clock gating can be enabled through debugfs as: echo 1 > /sys/kernel/debug/tracing/events/ufs/ufshcd_clk_gating/enable cat /sys/kernel/debug/tracing/trace_pipe Signed-off-by: Sahitya Tummala <[email protected]> Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 7eb584db73bebbc9852a14341431ed6935419bec Author: Dolev Raviv <[email protected]> Date: Thu Sep 25 15:32:31 2014 +0300 ufs: refactor configuring power mode Sometimes, the device shall report its maximum power and speed capabilities, but we might not wish to configure it to use those maximum capabilities. This change adds support for the vendor specific host driver to implement power change notify callback. To enable configuring different power modes (number of lanes, gear number and fast/slow modes) it is necessary to split the configuration stage from the stage that reads the device max power mode. In addition, it is not required to read the configuration more than once, thus the configuration is stored after reading it once. Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Yaniv Gardi <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 57d104c153d3d6d7bea60089e80f37501851ed2c Author: Subhash Jadavani <[email protected]> Date: Thu Sep 25 15:32:30 2014 +0300 ufs: add UFS power management support This patch adds support for UFS device and UniPro link power management during runtime/system PM. Main idea is to define multiple UFS low power levels based on UFS device and UFS link power states. This would allow any specific platform or pci driver to choose the best suited low power level during runtime and system suspend based on their power goals. bkops handlig: To put the UFS device in sleep state when bkops is disabled, first query the bkops status from the device and enable bkops on device only if device needs time to perform the bkops. START_STOP handling: Before sending START_STOP_UNIT to the device well-known logical unit (w-lun) to make sure that the device w-lun unit attention condition is cleared. Write protection: UFS device specification allows LUs to be write protected, either permanently or power on write protected. If any LU is power on write protected and if the card is power cycled (by powering off VCCQ and/or VCC rails), LU's write protect status would be lost. So this means those LUs can be written now. To ensures that UFS device is power cycled only if the power on protect is not set for any of the LUs, check if power on write protect is set and if device is in sleep/power-off state & link in inactive state (Hibern8 or OFF state). If none of the Logical Units on UFS device is power on write protected then all UFS device power rails (VCC, VCCQ & VCCQ2) can be turned off if UFS device is in power-off state and UFS link is in OFF state. But current implementation would disable all device power rails even if UFS link is not in OFF state. Low power mode: If UFS link is in OFF state then UFS host controller can be power collapsed to avoid leakage current from it. Note that if UFS host controller is power collapsed, full UFS reinitialization will be required on resume to re-establish the link between host and device. Signed-off-by: Subhash Jadavani <[email protected]> Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Sujit Reddy Thumma <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 0ce147d48a3e3352859f0c185e98e8392bee7a25 Author: Subhash Jadavani <[email protected]> Date: Thu Sep 25 15:32:29 2014 +0300 ufs: introduce well known logical unit in ufs UFS device may have standard LUs and LUN id could be from 0x00 to 0x7F. UFS device specification use "Peripheral Device Addressing Format" (SCSI SAM-5) for standard LUs. UFS device may also have the Well Known LUs (also referred as W-LU) which again could be from 0x00 to 0x7F. For W-LUs, UFS device specification only allows the "Extended Addressing Format" (SCSI SAM-5) which means the W-LUNs would start from 0xC100 onwards. This means max. LUN number reported from UFS device could be 0xC17F hence this patch advertise the "max_lun" as 0xC17F which will allow SCSI mid layer to detect the W-LUs as well. But once the W-LUs are detected, UFSHCD driver may get the commands with SCSI LUN id upto 0xC17F but UPIU LUN id field is only 8-bit wide so it requires the mapping of SCSI LUN id to UPIU LUN id. This patch also add support for this mapping. Signed-off-by: Subhash Jadavani <[email protected]> Signed-off-by: Dolev Raviv <[email protected]> Signed-off-by: Sujit Reddy Thumma <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> commit 2a8fa600445c45222632810a4811ce820279d106 Author: Subhash Jadavani <su…

ERROR: code indent should use tabs where possible torvalds#37: FILE: include/linux/mmdebug.h:33: + do {^I^I^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#37: FILE: include/linux/mmdebug.h:33: + do {^I^I^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#38: FILE: include/linux/mmdebug.h:34: + if (unlikely(cond)) {^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#38: FILE: include/linux/mmdebug.h:34: + if (unlikely(cond)) {^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#39: FILE: include/linux/mmdebug.h:35: + dump_mm(mm);^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#39: FILE: include/linux/mmdebug.h:35: + dump_mm(mm);^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#40: FILE: include/linux/mmdebug.h:36: + BUG();^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#40: FILE: include/linux/mmdebug.h:36: + BUG();^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#41: FILE: include/linux/mmdebug.h:37: + }^I^I^I^I^I^I^I\$ WARNING: please, no spaces at the start of a line torvalds#41: FILE: include/linux/mmdebug.h:37: + }^I^I^I^I^I^I^I\$ ERROR: code indent should use tabs where possible torvalds#42: FILE: include/linux/mmdebug.h:38: + } while (0)$ WARNING: please, no spaces at the start of a line torvalds#42: FILE: include/linux/mmdebug.h:38: + } while (0)$ WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(... to printk(KERN_ALERT ... torvalds#74: FILE: mm/debug.c:171: + printk(KERN_ALERT total: 6 errors, 7 warnings, 109 lines checked NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or scripts/cleanfile ./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review. If any of these errors are false positives, please report them to the maintainer, see CHECKPATCH in MAINTAINERS. Please run checkpatch prior to sending patches Cc: Sasha Levin <[email protected]> Signed-off-by: Andrew Morton <[email protected]>

The mlxbf_gige driver encounters a NULL pointer exception in mlxbf_gige_open() when kdump is enabled. The sequence to reproduce the exception is as follows: a) enable kdump b) trigger kdump via "echo c > /proc/sysrq-trigger" c) kdump kernel executes d) kdump kernel loads mlxbf_gige module e) the mlxbf_gige module runs its open() as the the "oob_net0" interface is brought up f) mlxbf_gige module will experience an exception during its open(), something like: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000086000004 EC = 0x21: IABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000086000004 [#1] SMP CPU: 0 PID: 812 Comm: NetworkManager Tainted: G OE 5.15.0-1035-bluefield torvalds#37-Ubuntu Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : 0x0 lr : __napi_poll+0x40/0x230 sp : ffff800008003e00 x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8 x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000 x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000 x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0 x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398 x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2 x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238 Call trace: 0x0 net_rx_action+0x178/0x360 __do_softirq+0x15c/0x428 __irq_exit_rcu+0xac/0xec irq_exit+0x18/0x2c handle_domain_irq+0x6c/0xa0 gic_handle_irq+0xec/0x1b0 call_on_irq_stack+0x20/0x2c do_interrupt_handler+0x5c/0x70 el1_interrupt+0x30/0x50 el1h_64_irq_handler+0x18/0x2c el1h_64_irq+0x7c/0x80 __setup_irq+0x4c0/0x950 request_threaded_irq+0xf4/0x1bc mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige] mlxbf_gige_open+0x5c/0x170 [mlxbf_gige] __dev_open+0x100/0x220 __dev_change_flags+0x16c/0x1f0 dev_change_flags+0x2c/0x70 do_setlink+0x220/0xa40 __rtnl_newlink+0x56c/0x8a0 rtnl_newlink+0x58/0x84 rtnetlink_rcv_msg+0x138/0x3c4 netlink_rcv_skb+0x64/0x130 rtnetlink_rcv+0x20/0x30 netlink_unicast+0x2ec/0x360 netlink_sendmsg+0x278/0x490 __sock_sendmsg+0x5c/0x6c ____sys_sendmsg+0x290/0x2d4 ___sys_sendmsg+0x84/0xd0 __sys_sendmsg+0x70/0xd0 __arm64_sys_sendmsg+0x2c/0x40 invoke_syscall+0x78/0x100 el0_svc_common.constprop.0+0x54/0x184 do_el0_svc+0x30/0xac el0_svc+0x48/0x160 el0t_64_sync_handler+0xa4/0x12c el0t_64_sync+0x1a4/0x1a8 Code: bad PC value ---[ end trace 7d1c3f3bf9d81885 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt Kernel Offset: 0x2870a7a00000 from 0xffff800008000000 PHYS_OFFSET: 0x80000000 CPU features: 0x0,000005c1,a3332a5a Memory Limit: none ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- The exception happens because there is a pending RX interrupt before the call to request_irq(RX IRQ) executes. Then, the RX IRQ handler fires immediately after this request_irq() completes. The RX IRQ handler runs "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()" and "napi_enable()", both which happen later in the open() logic. The logic in mlxbf_gige_open() must fully initialize NAPI before any calls to request_irq() execute. Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver") Signed-off-by: David Thompson <[email protected]> Reviewed-by: Asmaa Mnebhi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]>

[ Upstream commit f7442a6 ] The mlxbf_gige driver encounters a NULL pointer exception in mlxbf_gige_open() when kdump is enabled. The sequence to reproduce the exception is as follows: a) enable kdump b) trigger kdump via "echo c > /proc/sysrq-trigger" c) kdump kernel executes d) kdump kernel loads mlxbf_gige module e) the mlxbf_gige module runs its open() as the the "oob_net0" interface is brought up f) mlxbf_gige module will experience an exception during its open(), something like: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000086000004 EC = 0x21: IABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000086000004 [#1] SMP CPU: 0 PID: 812 Comm: NetworkManager Tainted: G OE 5.15.0-1035-bluefield torvalds#37-Ubuntu Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : 0x0 lr : __napi_poll+0x40/0x230 sp : ffff800008003e00 x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8 x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000 x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000 x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0 x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398 x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2 x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238 Call trace: 0x0 net_rx_action+0x178/0x360 __do_softirq+0x15c/0x428 __irq_exit_rcu+0xac/0xec irq_exit+0x18/0x2c handle_domain_irq+0x6c/0xa0 gic_handle_irq+0xec/0x1b0 call_on_irq_stack+0x20/0x2c do_interrupt_handler+0x5c/0x70 el1_interrupt+0x30/0x50 el1h_64_irq_handler+0x18/0x2c el1h_64_irq+0x7c/0x80 __setup_irq+0x4c0/0x950 request_threaded_irq+0xf4/0x1bc mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige] mlxbf_gige_open+0x5c/0x170 [mlxbf_gige] __dev_open+0x100/0x220 __dev_change_flags+0x16c/0x1f0 dev_change_flags+0x2c/0x70 do_setlink+0x220/0xa40 __rtnl_newlink+0x56c/0x8a0 rtnl_newlink+0x58/0x84 rtnetlink_rcv_msg+0x138/0x3c4 netlink_rcv_skb+0x64/0x130 rtnetlink_rcv+0x20/0x30 netlink_unicast+0x2ec/0x360 netlink_sendmsg+0x278/0x490 __sock_sendmsg+0x5c/0x6c ____sys_sendmsg+0x290/0x2d4 ___sys_sendmsg+0x84/0xd0 __sys_sendmsg+0x70/0xd0 __arm64_sys_sendmsg+0x2c/0x40 invoke_syscall+0x78/0x100 el0_svc_common.constprop.0+0x54/0x184 do_el0_svc+0x30/0xac el0_svc+0x48/0x160 el0t_64_sync_handler+0xa4/0x12c el0t_64_sync+0x1a4/0x1a8 Code: bad PC value ---[ end trace 7d1c3f3bf9d81885 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt Kernel Offset: 0x2870a7a00000 from 0xffff800008000000 PHYS_OFFSET: 0x80000000 CPU features: 0x0,000005c1,a3332a5a Memory Limit: none ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- The exception happens because there is a pending RX interrupt before the call to request_irq(RX IRQ) executes. Then, the RX IRQ handler fires immediately after this request_irq() completes. The RX IRQ handler runs "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()" and "napi_enable()", both which happen later in the open() logic. The logic in mlxbf_gige_open() must fully initialize NAPI before any calls to request_irq() execute. Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver") Signed-off-by: David Thompson <[email protected]> Reviewed-by: Asmaa Mnebhi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

__kernel_map_pages() is a debug function which clears the valid bit in page table entry for deallocated pages to detect illegal memory accesses to freed pages. This function set/clear the valid bit using __set_memory(). __set_memory() acquires init_mm's semaphore, and this operation may sleep. This is problematic, because __kernel_map_pages() can be called in atomic context, and thus is illegal to sleep. An example warning that this causes: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd preempt_count: 2, expected: 0 CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 torvalds#37 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff800060dc>] dump_backtrace+0x1c/0x24 [<ffffffff8091ef6e>] show_stack+0x2c/0x38 [<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72 [<ffffffff8092bb24>] dump_stack+0x14/0x1c [<ffffffff8003b7ac>] __might_resched+0x104/0x10e [<ffffffff8003b7f4>] __might_sleep+0x3e/0x62 [<ffffffff8093276a>] down_write+0x20/0x72 [<ffffffff8000cf00>] __set_memory+0x82/0x2fa [<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4 [<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a [<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba [<ffffffff80011904>] copy_process+0x72c/0x17ec [<ffffffff80012ab4>] kernel_clone+0x60/0x2fe [<ffffffff80012f62>] kernel_thread+0x82/0xa0 [<ffffffff8003552c>] kthreadd+0x14a/0x1be [<ffffffff809357de>] ret_from_fork+0xe/0x1c Rewrite this function with apply_to_existing_page_range(). It is fine to not have any locking, because __kernel_map_pages() works with pages being allocated/deallocated and those pages are not changed by anyone else in the meantime. Fixes: 5fde3db ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support") Signed-off-by: Nam Cao <[email protected]> Cc: [email protected]

__kernel_map_pages() is a debug function which clears the valid bit in page table entry for deallocated pages to detect illegal memory accesses to freed pages. This function set/clear the valid bit using __set_memory(). __set_memory() acquires init_mm's semaphore, and this operation may sleep. This is problematic, because __kernel_map_pages() can be called in atomic context, and thus is illegal to sleep. An example warning that this causes: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd preempt_count: 2, expected: 0 CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 #37 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff800060dc>] dump_backtrace+0x1c/0x24 [<ffffffff8091ef6e>] show_stack+0x2c/0x38 [<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72 [<ffffffff8092bb24>] dump_stack+0x14/0x1c [<ffffffff8003b7ac>] __might_resched+0x104/0x10e [<ffffffff8003b7f4>] __might_sleep+0x3e/0x62 [<ffffffff8093276a>] down_write+0x20/0x72 [<ffffffff8000cf00>] __set_memory+0x82/0x2fa [<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4 [<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a [<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba [<ffffffff80011904>] copy_process+0x72c/0x17ec [<ffffffff80012ab4>] kernel_clone+0x60/0x2fe [<ffffffff80012f62>] kernel_thread+0x82/0xa0 [<ffffffff8003552c>] kthreadd+0x14a/0x1be [<ffffffff809357de>] ret_from_fork+0xe/0x1c Rewrite this function with apply_to_existing_page_range(). It is fine to not have any locking, because __kernel_map_pages() works with pages being allocated/deallocated and those pages are not changed by anyone else in the meantime. Fixes: 5fde3db ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support") Signed-off-by: Nam Cao <[email protected]> Cc: [email protected] Reviewed-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <[email protected]>

While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with invalid argument outside of [0,MAX_NUMNODES-1] range leading to: BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971 CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 torvalds#37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) kasan_report (mm/kasan/report.c:603) kasan_check_range (mm/kasan/generic.c:189) variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline] arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline] _test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] node_state (include/linux/nodemask.h:423) [inline] map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>

[ Upstream commit 1ff05e7 ] While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with invalid argument outside of [0,MAX_NUMNODES-1] range leading to: BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971 CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 torvalds#37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) kasan_report (mm/kasan/report.c:603) kasan_check_range (mm/kasan/generic.c:189) variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline] arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline] _test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] node_state (include/linux/nodemask.h:423) [inline] map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with invalid argument outside of [0,MAX_NUMNODES-1] range leading to: BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971 CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 torvalds#37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) kasan_report (mm/kasan/report.c:603) kasan_check_range (mm/kasan/generic.c:189) variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline] arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline] _test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] node_state (include/linux/nodemask.h:423) [inline] map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]>

[ Upstream commit 1ff05e7 ] While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with invalid argument outside of [0,MAX_NUMNODES-1] range leading to: BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971 CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 torvalds#37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) kasan_report (mm/kasan/report.c:603) kasan_check_range (mm/kasan/generic.c:189) variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline] arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline] _test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] node_state (include/linux/nodemask.h:423) [inline] map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

BugLink: https://bugs.launchpad.net/bugs/2065400 [ Upstream commit f7442a6 ] The mlxbf_gige driver encounters a NULL pointer exception in mlxbf_gige_open() when kdump is enabled. The sequence to reproduce the exception is as follows: a) enable kdump b) trigger kdump via "echo c > /proc/sysrq-trigger" c) kdump kernel executes d) kdump kernel loads mlxbf_gige module e) the mlxbf_gige module runs its open() as the the "oob_net0" interface is brought up f) mlxbf_gige module will experience an exception during its open(), something like: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 Mem abort info: ESR = 0x0000000086000004 EC = 0x21: IABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x04: level 0 translation fault user pgtable: 4k pages, 48-bit VAs, pgdp=00000000e29a4000 [0000000000000000] pgd=0000000000000000, p4d=0000000000000000 Internal error: Oops: 0000000086000004 [#1] SMP CPU: 0 PID: 812 Comm: NetworkManager Tainted: G OE 5.15.0-1035-bluefield torvalds#37-Ubuntu Hardware name: https://www.mellanox.com BlueField-3 SmartNIC Main Card/BlueField-3 SmartNIC Main Card, BIOS 4.6.0.13024 Jan 19 2024 pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : 0x0 lr : __napi_poll+0x40/0x230 sp : ffff800008003e00 x29: ffff800008003e00 x28: 0000000000000000 x27: 00000000ffffffff x26: ffff000066027238 x25: ffff00007cedec00 x24: ffff800008003ec8 x23: 000000000000012c x22: ffff800008003eb7 x21: 0000000000000000 x20: 0000000000000001 x19: ffff000066027238 x18: 0000000000000000 x17: ffff578fcb450000 x16: ffffa870b083c7c0 x15: 0000aaab010441d0 x14: 0000000000000001 x13: 00726f7272655f65 x12: 6769675f6662786c x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa870b0842398 x8 : 0000000000000004 x7 : fe5a48b9069706ea x6 : 17fdb11fc84ae0d2 x5 : d94a82549d594f35 x4 : 0000000000000000 x3 : 0000000000400100 x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000066027238 Call trace: 0x0 net_rx_action+0x178/0x360 __do_softirq+0x15c/0x428 __irq_exit_rcu+0xac/0xec irq_exit+0x18/0x2c handle_domain_irq+0x6c/0xa0 gic_handle_irq+0xec/0x1b0 call_on_irq_stack+0x20/0x2c do_interrupt_handler+0x5c/0x70 el1_interrupt+0x30/0x50 el1h_64_irq_handler+0x18/0x2c el1h_64_irq+0x7c/0x80 __setup_irq+0x4c0/0x950 request_threaded_irq+0xf4/0x1bc mlxbf_gige_request_irqs+0x68/0x110 [mlxbf_gige] mlxbf_gige_open+0x5c/0x170 [mlxbf_gige] __dev_open+0x100/0x220 __dev_change_flags+0x16c/0x1f0 dev_change_flags+0x2c/0x70 do_setlink+0x220/0xa40 __rtnl_newlink+0x56c/0x8a0 rtnl_newlink+0x58/0x84 rtnetlink_rcv_msg+0x138/0x3c4 netlink_rcv_skb+0x64/0x130 rtnetlink_rcv+0x20/0x30 netlink_unicast+0x2ec/0x360 netlink_sendmsg+0x278/0x490 __sock_sendmsg+0x5c/0x6c ____sys_sendmsg+0x290/0x2d4 ___sys_sendmsg+0x84/0xd0 __sys_sendmsg+0x70/0xd0 __arm64_sys_sendmsg+0x2c/0x40 invoke_syscall+0x78/0x100 el0_svc_common.constprop.0+0x54/0x184 do_el0_svc+0x30/0xac el0_svc+0x48/0x160 el0t_64_sync_handler+0xa4/0x12c el0t_64_sync+0x1a4/0x1a8 Code: bad PC value ---[ end trace 7d1c3f3bf9d81885 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt Kernel Offset: 0x2870a7a00000 from 0xffff800008000000 PHYS_OFFSET: 0x80000000 CPU features: 0x0,000005c1,a3332a5a Memory Limit: none ---[ end Kernel panic - not syncing: Oops: Fatal exception in interrupt ]--- The exception happens because there is a pending RX interrupt before the call to request_irq(RX IRQ) executes. Then, the RX IRQ handler fires immediately after this request_irq() completes. The RX IRQ handler runs "napi_schedule()" before NAPI is fully initialized via "netif_napi_add()" and "napi_enable()", both which happen later in the open() logic. The logic in mlxbf_gige_open() must fully initialize NAPI before any calls to request_irq() execute. Fixes: f92e186 ("Add Mellanox BlueField Gigabit Ethernet driver") Signed-off-by: David Thompson <[email protected]> Reviewed-by: Asmaa Mnebhi <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> Signed-off-by: Sasha Levin <[email protected]> Signed-off-by: Manuel Diewald <[email protected]> Signed-off-by: Roxana Nicolescu <[email protected]>

commit fb1cf08 upstream. __kernel_map_pages() is a debug function which clears the valid bit in page table entry for deallocated pages to detect illegal memory accesses to freed pages. This function set/clear the valid bit using __set_memory(). __set_memory() acquires init_mm's semaphore, and this operation may sleep. This is problematic, because __kernel_map_pages() can be called in atomic context, and thus is illegal to sleep. An example warning that this causes: BUG: sleeping function called from invalid context at kernel/locking/rwsem.c:1578 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 2, name: kthreadd preempt_count: 2, expected: 0 CPU: 0 PID: 2 Comm: kthreadd Not tainted 6.9.0-g1d4c6d784ef6 torvalds#37 Hardware name: riscv-virtio,qemu (DT) Call Trace: [<ffffffff800060dc>] dump_backtrace+0x1c/0x24 [<ffffffff8091ef6e>] show_stack+0x2c/0x38 [<ffffffff8092baf8>] dump_stack_lvl+0x5a/0x72 [<ffffffff8092bb24>] dump_stack+0x14/0x1c [<ffffffff8003b7ac>] __might_resched+0x104/0x10e [<ffffffff8003b7f4>] __might_sleep+0x3e/0x62 [<ffffffff8093276a>] down_write+0x20/0x72 [<ffffffff8000cf00>] __set_memory+0x82/0x2fa [<ffffffff8000d324>] __kernel_map_pages+0x5a/0xd4 [<ffffffff80196cca>] __alloc_pages_bulk+0x3b2/0x43a [<ffffffff8018ee82>] __vmalloc_node_range+0x196/0x6ba [<ffffffff80011904>] copy_process+0x72c/0x17ec [<ffffffff80012ab4>] kernel_clone+0x60/0x2fe [<ffffffff80012f62>] kernel_thread+0x82/0xa0 [<ffffffff8003552c>] kthreadd+0x14a/0x1be [<ffffffff809357de>] ret_from_fork+0xe/0x1c Rewrite this function with apply_to_existing_page_range(). It is fine to not have any locking, because __kernel_map_pages() works with pages being allocated/deallocated and those pages are not changed by anyone else in the meantime. Fixes: 5fde3db ("riscv: add ARCH_SUPPORTS_DEBUG_PAGEALLOC support") Signed-off-by: Nam Cao <[email protected]> Cc: [email protected] Reviewed-by: Alexandre Ghiti <[email protected]> Link: https://lore.kernel.org/r/1289ecba9606a19917bc12b6c27da8aa23e1e5ae.1715750938.git.namcao@linutronix.de Signed-off-by: Palmer Dabbelt <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>

[ Upstream commit 1ff05e7 ] While validating node ids in map_benchmark_ioctl(), node_possible() may be provided with invalid argument outside of [0,MAX_NUMNODES-1] range leading to: BUG: KASAN: wild-memory-access in map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) Read of size 8 at addr 1fffffff8ccb6398 by task dma_map_benchma/971 CPU: 7 PID: 971 Comm: dma_map_benchma Not tainted 6.9.0-rc6 torvalds#37 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996) Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:117) kasan_report (mm/kasan/report.c:603) kasan_check_range (mm/kasan/generic.c:189) variable_test_bit (arch/x86/include/asm/bitops.h:227) [inline] arch_test_bit (arch/x86/include/asm/bitops.h:239) [inline] _test_bit at (include/asm-generic/bitops/instrumented-non-atomic.h:142) [inline] node_state (include/linux/nodemask.h:423) [inline] map_benchmark_ioctl (kernel/dma/map_benchmark.c:214) full_proxy_unlocked_ioctl (fs/debugfs/file.c:333) __x64_sys_ioctl (fs/ioctl.c:890) do_syscall_64 (arch/x86/entry/common.c:83) entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:130) Compare node ids with sane bounds first. NUMA_NO_NODE is considered a special valid case meaning that benchmarking kthreads won't be bound to a cpuset of a given node. Found by Linux Verification Center (linuxtesting.org). Fixes: 65789da ("dma-mapping: add benchmark support for streaming DMA APIs") Signed-off-by: Fedor Pchelkin <[email protected]> Reviewed-by: Robin Murphy <[email protected]> Signed-off-by: Christoph Hellwig <[email protected]> Signed-off-by: Sasha Levin <[email protected]>

Update ctx.h

236bcd3

nmtigor closed this Jun 19, 2013

nmtigor deleted the patch-1 branch June 19, 2013 21:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Update ctx.h #37

Update ctx.h #37

nmtigor commented Jun 19, 2013

Update ctx.h #37

Update ctx.h #37

Conversation

nmtigor commented Jun 19, 2013