Update resize.c #74

Open
wants to merge 1 commit into
base: master

thedj21 commented Feb 25, 2014

Update resize.c

@yahuarkuntur

Unfortunately, Linus does not merge GitHub PRs; read this: #17

skx commented Feb 26, 2014

Also see the comment starting at line 652.

That is, the code should read:

 // current power of three, init to 3^0
 unsigned three = 1;

Regardless, your patch is broken and wrong.
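For context, here is a minimal sketch of why the counter starts at 3^0. It assumes the backup group numbers are walked as ascending powers of 3, 5 and 7, so the first value handed out has to be group 1. The function name `next_backup_group` and its exact shape are illustrative assumptions, not a copy of resize.c.

```c
/*
 * Illustrative only (hypothetical name): return the smallest of the
 * three running powers and advance that one by its own base.
 */
unsigned next_backup_group(unsigned *three, unsigned *five, unsigned *seven)
{
	unsigned *min = three;
	unsigned mult = 3;
	unsigned ret;

	if (*five < *min) {
		min = five;
		mult = 5;
	}
	if (*seven < *min) {
		min = seven;
		mult = 7;
	}

	ret = *min;	/* value handed back to the caller */
	*min *= mult;	/* step the chosen sequence to its next power */
	return ret;
}
```

Starting from three = 1, five = 5, seven = 7, successive calls yield 1, 3, 5, 7, 9, 25, 27, 49. Starting three at 3 instead would skip group 1.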

tmlind pushed a commit to networkimprov/linux that referenced this pull request Feb 27, 2014
Turn it into (for example):

[    0.073380] x86: Booting SMP configuration:
[    0.074005] .... node   #0, CPUs:          #1   #2   #3   #4   #5   #6   #7
[    0.603005] .... node   #1, CPUs:     #8   #9  #10  #11  #12  #13  #14  #15
[    1.200005] .... node   #2, CPUs:    #16  #17  #18  #19  #20  #21  #22  #23
[    1.796005] .... node   #3, CPUs:    #24  #25  #26  #27  #28  #29  #30  #31
[    2.393005] .... node   #4, CPUs:    #32  #33  #34  #35  #36  #37  #38  #39
[    2.996005] .... node   #5, CPUs:    #40  #41  #42  #43  #44  #45  #46  #47
[    3.600005] .... node   #6, CPUs:    #48  #49  #50  #51  #52  #53  #54  #55
[    4.202005] .... node   #7, CPUs:    #56  #57  #58  #59  #60  #61  #62  #63
[    4.811005] .... node   #8, CPUs:    #64  #65  #66  #67  #68  #69  #70  #71
[    5.421006] .... node   #9, CPUs:    #72  #73  #74  #75  #76  #77  #78  #79
[    6.032005] .... node  #10, CPUs:    #80  #81  #82  #83  #84  #85  #86  #87
[    6.648006] .... node  #11, CPUs:    #88  #89  #90  #91  #92  #93  #94  #95
[    7.262005] .... node  #12, CPUs:    #96  #97  #98  #99 #100 #101 #102 #103
[    7.865005] .... node  #13, CPUs:   #104 #105 #106 #107 #108 #109 #110 #111
[    8.466005] .... node  #14, CPUs:   #112 #113 #114 #115 #116 #117 #118 #119
[    9.073006] .... node  #15, CPUs:   #120 #121 #122 #123 #124 #125 #126 #127
[    9.679901] x86: Booted up 16 nodes, 128 CPUs

and drop useless elements.

Change num_digits() to hpa's division-avoiding, cell-phone-typed
version which he went at great lengths and pains to submit on a
Saturday evening.

Signed-off-by: Borislav Petkov <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Linus Torvalds <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
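The division-avoiding num_digits() mentioned in the commit message above can be sketched roughly as follows. This is an illustration of the idea (compare against growing powers of ten instead of repeatedly dividing), not necessarily the exact code that was merged. The unsigned parameter and the wider type for `m` are assumptions made here so the sketch cannot overflow.

```c
/*
 * Count the decimal digits of val using only multiplication and
 * comparison. Sketch only; types are chosen to avoid overflow.
 */
int num_digits(unsigned int val)
{
	unsigned long long m = 10;	/* smallest value needing d + 1 digits */
	int d = 1;			/* every value has at least one digit */

	while (val >= m) {
		m *= 10;
		d++;
	}
	return d;
}
```

With this, CPU numbers such as 7, 63 and 127 need widths of 1, 2 and 3 characters, which is what the column alignment in the boot log relies on.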
tom3q pushed a commit to tom3q/linux that referenced this pull request Sep 22, 2014
This is meant to avoid a recursive hang caused by the underlying filesystem
trying to grab a free page and causing a write-out.

INFO: task kworker/u30:7:28375 blocked for more than 120 seconds.
      Not tainted 3.15.0-virtual #74
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u30:7   D 0000000000000000     0 28375      2 0x00000000
Workqueue: fscache_operation fscache_op_work_func [fscache]
 ffff88000b147148 0000000000000046 0000000000000000 ffff88000b1471c8
 ffff8807aa031820 0000000000014040 ffff88000b147fd8 0000000000014040
 ffff880f0c50c860 ffff8807aa031820 ffff88000b147158 ffff88007be59cd0
Call Trace:
 [<ffffffff815930e9>] schedule+0x29/0x70
 [<ffffffffa018bed5>] __fscache_wait_on_page_write+0x55/0x90 [fscache]
 [<ffffffff810a4350>] ? __wake_up_sync+0x20/0x20
 [<ffffffffa018c135>] __fscache_maybe_release_page+0x65/0x1e0 [fscache]
 [<ffffffffa02ad813>] ceph_releasepage+0x83/0x100 [ceph]
 [<ffffffff811635b0>] ? anon_vma_fork+0x130/0x130
 [<ffffffff8112cdd2>] try_to_release_page+0x32/0x50
 [<ffffffff81140096>] shrink_page_list+0x7e6/0x9d0
 [<ffffffff8113f278>] ? isolate_lru_pages.isra.73+0x78/0x1e0
 [<ffffffff81140932>] shrink_inactive_list+0x252/0x4c0
 [<ffffffff811412b1>] shrink_lruvec+0x3e1/0x670
 [<ffffffff8114157f>] shrink_zone+0x3f/0x110
 [<ffffffff81141b06>] do_try_to_free_pages+0x1d6/0x450
 [<ffffffff8114a939>] ? zone_statistics+0x99/0xc0
 [<ffffffff81141e44>] try_to_free_pages+0xc4/0x180
 [<ffffffff81136982>] __alloc_pages_nodemask+0x6b2/0xa60
 [<ffffffff811c1d4e>] ? __find_get_block+0xbe/0x250
 [<ffffffff810a405e>] ? wake_up_bit+0x2e/0x40
 [<ffffffff811740c3>] alloc_pages_current+0xb3/0x180
 [<ffffffff8112cf07>] __page_cache_alloc+0xb7/0xd0
 [<ffffffff8112da6c>] grab_cache_page_write_begin+0x7c/0xe0
 [<ffffffff81214072>] ? ext4_mark_inode_dirty+0x82/0x220
 [<ffffffff81214a89>] ext4_da_write_begin+0x89/0x2d0
 [<ffffffff8112c6ee>] generic_perform_write+0xbe/0x1d0
 [<ffffffff811a96b1>] ? update_time+0x81/0xc0
 [<ffffffff811ad4c2>] ? mnt_clone_write+0x12/0x30
 [<ffffffff8112e80e>] __generic_file_aio_write+0x1ce/0x3f0
 [<ffffffff8112ea8e>] generic_file_aio_write+0x5e/0xe0
 [<ffffffff8120b94f>] ext4_file_write+0x9f/0x410
 [<ffffffff8120af56>] ? ext4_file_open+0x66/0x180
 [<ffffffff8118f0da>] do_sync_write+0x5a/0x90
 [<ffffffffa025c6c9>] cachefiles_write_page+0x149/0x430 [cachefiles]
 [<ffffffff812cf439>] ? radix_tree_gang_lookup_tag+0x89/0xd0
 [<ffffffffa018c512>] fscache_write_op+0x222/0x3b0 [fscache]
 [<ffffffffa018b35a>] fscache_op_work_func+0x3a/0x100 [fscache]
 [<ffffffff8107bfe9>] process_one_work+0x179/0x4a0
 [<ffffffff8107d47b>] worker_thread+0x11b/0x370
 [<ffffffff8107d360>] ? manage_workers.isra.21+0x2e0/0x2e0
 [<ffffffff81083d69>] kthread+0xc9/0xe0
 [<ffffffff81010000>] ? ftrace_raw_event_xen_mmu_release_ptpage+0x70/0x90
 [<ffffffff81083ca0>] ? flush_kthread_worker+0xb0/0xb0
 [<ffffffff8159eefc>] ret_from_fork+0x7c/0xb0
 [<ffffffff81083ca0>] ? flush_kthread_worker+0xb0/0xb0

Signed-off-by: Milosz Tanski <[email protected]>
Signed-off-by: David Howells <[email protected]>
koct9i pushed a commit to koct9i/linux that referenced this pull request Sep 23, 2014
ERROR: code indent should use tabs where possible
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
koct9i referenced this pull request in koct9i/linux Sep 23, 2014
GIT 1b28f1c3d6821c20f42c22e977999fffbf0c0331

commit 78cbcabd472b197dc8ae7abd11f197efe611211a
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:10 2014 +1000

    Documentation: disable vdso_test to avoid breakage with old glibc
    
    glibc versions older than 2.16 don't include sys/auxv.h which this
    executable uses.
    Since we don't have a good way to test for specific glibc versions in
    kbuild, just disable it for now.
    
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>

commit c5a967ad6aba3adc9b61f28d799be4fdf815e6bf
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:10 2014 +1000

    Documentation: update vDSO makefile to build portable examples
    
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>

commit dee40f0c69658d15a49a3dbca4f105410f561ad4
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:09 2014 +1000

    Documentation: update .gitignore files
    
    Add some missing files to .gitignore.
    Push Documentation/.gitignore down into subdirectories.
    
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>

commit 7f73b38710908162de63e9c940e1a0c26810dd19
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:09 2014 +1000

    Documentation: support glibc versions without htole macros
    
    glibc 2.9 introduced the htole<16/32/64> macros, add them to
    tools/include to support older versions of glibc.
    
    Reported-by: Andrew Morton <[email protected]>
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>
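As an illustration of what such fallbacks have to do, here is a minimal sketch of host-to-little-endian helpers that do not depend on the C library providing htole16/32/64. The `compat_` names and the byte-wise technique are assumptions chosen for portability; the commit itself adds macros under tools/include, whose exact form is not shown here.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical fallbacks: return a value whose in-memory byte order is
 * little-endian, whatever the host byte order is.
 */
uint16_t compat_htole16(uint16_t v)
{
	unsigned char b[2];
	uint16_t out;
	size_t i;

	for (i = 0; i < sizeof(b); i++)
		b[i] = (unsigned char)(v >> (8 * i));	/* least significant byte first */
	memcpy(&out, b, sizeof(out));
	return out;
}

uint32_t compat_htole32(uint32_t v)
{
	unsigned char b[4];
	uint32_t out;
	size_t i;

	for (i = 0; i < sizeof(b); i++)
		b[i] = (unsigned char)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}

uint64_t compat_htole64(uint64_t v)
{
	unsigned char b[8];
	uint64_t out;
	size_t i;

	for (i = 0; i < sizeof(b); i++)
		b[i] = (unsigned char)(v >> (8 * i));
	memcpy(&out, b, sizeof(out));
	return out;
}
```

On a little-endian host these helpers return their argument unchanged; on a big-endian host they return the byte-swapped value.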

commit c06fccd3288d690700b0d2824485ba925d09abd4
Author: Mark Brown <[email protected]>
Date:   Mon Sep 22 09:31:08 2014 +1000

    v4l2-pci-skeleton: Only build if PCI is available
    
    Currently arm64 does not support PCI but it does support v4l2. Since the
    PCI skeleton driver is built unconditionally as a module with no dependency
    on PCI this causes build failures for arm64 allmodconfig. Fix this by
    defining a symbol VIDEO_PCI_SKELETON for the skeleton and conditionalising
    the build on that.
    
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]> [added VIDEO dependencies]

commit c735483de1a2cd5d6c6b67bf49cfb2991eae6ea6
Author: Helge Deller <[email protected]>
Date:   Sun Sep 21 22:31:08 2014 +0200

    parisc: pdc_stable.c: Avoid potential stack overflows
    
    Signed-off-by: Helge Deller <[email protected]>

commit 94c457deff2a211f8372f69a4d7b0d288183756a
Author: Rickard Strandqvist <[email protected]>
Date:   Sun Sep 14 18:02:12 2014 +0200

    parisc: pdc_stable.c: Cleaning up unnecessary use of memset in conjunction with strncpy
    
    Using memset before strncpy just to ensure a trailing null character is
    an unnecessary double writing of a string
    
    Patch modified by Helge Deller to additionally reduce stack usage.
    
    Signed-off-by: Rickard Strandqvist <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
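A short sketch of the point made in this commit message: strncpy() already zero-fills the remainder of the destination when the source is shorter than the length given, so a preceding memset() writes the buffer twice. Only the final byte needs explicit termination. The helper name `copy_bounded` is hypothetical and is not taken from pdc_stable.c.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: bounded copy that always NUL-terminates dst. */
void copy_bounded(char *dst, size_t dst_size, const char *src)
{
	if (dst_size == 0)
		return;

	/* strncpy() pads dst with NULs if src is shorter than dst_size - 1 */
	strncpy(dst, src, dst_size - 1);

	/* covers the case where src fills the whole buffer */
	dst[dst_size - 1] = '\0';
}
```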

commit fe5c873459a973e59854bd235a7e6b3eaa8e5fe0
Author: Helge Deller <[email protected]>
Date:   Sun Sep 21 21:01:15 2014 +0200

    parisc: ptrace: use secure_computing_strict()
    
    Signed-off-by: Helge Deller <[email protected]>

commit 5466112f0935f079e225514905c57d5e5285a9b6
Author: Trond Myklebust <[email protected]>
Date:   Thu Sep 18 17:03:46 2014 -0400

    pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe
    
    kbuild test robot reports:
    
       fs/built-in.o: In function `bl_map_stripe':
       >> :(.text+0x965b4): undefined reference to `__aeabi_uldivmod'
       >> :(.text+0x965cc): undefined reference to `__aeabi_uldivmod'
       >> :(.text+0x96604): undefined reference to `__aeabi_uldivmod'
    
    Fixes: 5c83746a0cf2 (pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing)
    Cc: Stephen Rothwell <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Trond Myklebust <[email protected]>

commit 9c58c79a8a76c510cd3a5012c536d4fe3c81ec3b
Author: Zhihui Zhang <[email protected]>
Date:   Sat Sep 20 21:24:36 2014 -0400

    sched: Clean up some typos and grammatical errors in code/comments
    
    Signed-off-by: Zhihui Zhang <[email protected]>
    Cc: [email protected]
    Link: http://lkml.kernel.org/r/[email protected]
    Signed-off-by: Ingo Molnar <[email protected]>

commit 6a40281ab5c1ed8ba2253857118a5d400a2d084b
Author: Chuck Ebbert <[email protected]>
Date:   Sat Sep 20 10:17:51 2014 -0500

    sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP
    
    Aaron Tomlin recently posted patches [1] to enable checking the
    stack canary on every task switch. Looking at the canary code, I
    realized that every arch (except ia64, which adds some space for
    register spill above the stack) shares a definition of
    end_of_stack() that makes it the first long after the
    threadinfo.
    
    For stacks that grow down, this low address is correct because
    the stack starts at the end of the thread area and grows toward
    lower addresses. However, for stacks that grow up, toward higher
    addresses, this is wrong. (The stack actually grows away from
    the canary.) On these archs end_of_stack() should return the
    address of the last long, at the highest possible address for the stack.
    
    [1] http://lkml.org/lkml/2014/9/12/293
    
    Signed-off-by: Chuck Ebbert <[email protected]>
    Link: http://lkml.kernel.org/r/20140920101751.6c5166b6@as
    Signed-off-by: Ingo Molnar <[email protected]>
    Tested-by: James Hogan <[email protected]> [metag]
    Acked-by: James Hogan <[email protected]>
    Acked-by: Aaron Tomlin <[email protected]>
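The reasoning in this commit message can be sketched in userspace as follows. The structure, the size constant and the function name are all hypothetical stand-ins (prefixed `sketch_`/`SKETCH_`), intended only to show which address each stack direction should treat as its end.

```c
#define SKETCH_THREAD_SIZE 8192	/* hypothetical size of the thread area */

struct sketch_thread_info {	/* hypothetical stand-in for threadinfo */
	long flags;
	long cpu;
};

/*
 * Return the last long the stack may reach. base points at the thread
 * area, which begins with the thread info structure.
 */
unsigned long *sketch_end_of_stack(void *base, int grows_up)
{
	if (grows_up)
		/* stack climbs toward higher addresses: end is the last long */
		return (unsigned long *)((char *)base + SKETCH_THREAD_SIZE) - 1;

	/* stack descends from the top: end is the first long after thread info */
	return (unsigned long *)((struct sketch_thread_info *)base + 1);
}
```

Placing the canary at the address returned for the matching direction means an overrunning stack hits it, rather than growing away from it.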

commit 0c7bf3e8cab7900e17ce7f97104c39927d835469
Author: Zefan Li <[email protected]>
Date:   Sat Sep 20 14:49:10 2014 +0800

    cgroup: remove redundant variable in cgroup_mount()
    
    Both pinned_sb and new_sb indicate if a new superblock is needed,
    so we can just remove new_sb.
    
    Note now we must check if kernfs_tryget_sb() returns NULL, because
    when it returns NULL, kernfs_mount() may still re-use an existing
    superblock, which is just allocated by another concurrent mount.
    
    Suggested-by: Tejun Heo <[email protected]>
    Signed-off-by: Zefan Li <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>

commit 3e2cd91ab92665148616a80dc0745c499d2746a7
Author: Zefan Li <[email protected]>
Date:   Sat Sep 20 14:35:43 2014 +0800

    cgroup: fix missing unlock in cgroup_release_agent()
    
    The patch 971ff4935538: "cgroup: use a per-cgroup work for release
    agent" from Sep 18, 2014, leads to the following static checker
    warning:
    
    	kernel/cgroup.c:5310 cgroup_release_agent()
    	warn: 'mutex:&cgroup_mutex' is sometimes locked here and sometimes unlocked.
    
    Reported-by: Dan Carpenter <[email protected]>
    Signed-off-by: Zefan Li <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>

commit 93b8877471796c04c16fdef755d4e5c0f521509f
Author: Alexander Shiyan <[email protected]>
Date:   Sat Sep 20 09:34:45 2014 +0400

    tty: serial_mctrl_gpio: Fix COMPILE_TEST build for architectures with custom termios.h
    
    This patch fixes COMPILE_TEST build of serial_mctrl_gpio module for
    architectures with custom termios.h header.
    
    sparc64:allmodconfig:
    
    In file included from drivers/tty/serial/serial_mctrl_gpio.c:21:0:
    include/uapi/asm-generic/termios.h:22:8: error: redefinition of 'struct termio'
    ./arch/sparc/include/uapi/asm/termbits.h:16:8: note: originally defined here
    make[3]: *** [drivers/tty/serial/serial_mctrl_gpio.o] Error 1
    
    Reported-by: Guenter Roeck <[email protected]>
    Signed-off-by: Alexander Shiyan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d07fe967189ff7c32f5a78b4f28c2ccbab850091
Author: Chen-Yu Tsai <[email protected]>
Date:   Thu Sep 18 11:24:40 2014 +0800

    ARM: dts: sun8i: Add DMA controller node
    
    Add the DMA controller node and DMA bindings to the supported devices.
    
    Signed-off-by: Chen-Yu Tsai <[email protected]>
    Signed-off-by: Maxime Ripard <[email protected]>

commit e625305b390790717cf2cccf61efb81299647028
Author: Tejun Heo <[email protected]>
Date:   Sat Sep 20 01:27:25 2014 -0400

    percpu-refcount: make percpu_ref based on longs instead of ints
    
    percpu_ref is currently based on ints and the number of refs it can
    cover is (1 << 31).  This makes it impossible to use a percpu_ref to
    count memory objects or pages on 64bit machines as it may overflow.
    This forces those users to somehow aggregate the references before
    contributing to the percpu_ref which is often cumbersome and sometimes
    challenging to get the same level of performance as using the
    percpu_ref directly.
    
    While using ints for the percpu counters makes them pack tighter on
    64bit machines, the possible gain from using ints instead of longs is
    extremely small compared to the overall gain from per-cpu operation.
    This patch makes percpu_ref based on longs so that it can be used to
    directly count memory objects or pages.
    
    Signed-off-by: Tejun Heo <[email protected]>
    Cc: Kent Overstreet <[email protected]>
    Cc: Johannes Weiner <[email protected]>
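A back-of-the-envelope sketch of the overflow concern, assuming 4 KiB pages (the page size and helper names are assumptions, not part of the patch): the number of pages in a large 64-bit machine can reach the (1 << 31) limit quoted above.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumes 4 KiB pages */

/* Number of whole pages in mem_bytes of memory. */
uint64_t sketch_pages_in(uint64_t mem_bytes)
{
	return mem_bytes >> SKETCH_PAGE_SHIFT;
}

/* Nonzero if one reference per page stays below the (1 << 31) limit. */
int sketch_fits_in_int_refs(uint64_t mem_bytes)
{
	return sketch_pages_in(mem_bytes) < (UINT64_C(1) << 31);
}
```

Under this assumption 1 TiB of memory is 2^28 pages and fits, while 8 TiB is 2^31 pages and does not. Objects smaller than a page would reach the limit sooner.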

commit 4843c3320c3d23ab4ecf520f5eaf485aff8c7252
Author: Tejun Heo <[email protected]>
Date:   Sat Sep 20 01:27:24 2014 -0400

    percpu-refcount: improve WARN messages
    
    percpu_ref's WARN messages can be a lot more helpful by indicating
    who's the culprit.  Make them report the release function that the
    offending percpu-refcount is associated with.  This should make it a
    lot easier to track down the reported invalid refcnting operations.
    
    Signed-off-by: Tejun Heo <[email protected]>
    Cc: Kent Overstreet <[email protected]>

commit 6d967f8789249628a6388a3a4314c5fef423f36a
Author: Andy Zhou <[email protected]>
Date:   Fri Sep 19 18:02:53 2014 -0700

    udp_tunnel: Only build ip6_udp_tunnel.c when IPV6 is selected
    
    Functions supplied in ip6_udp_tunnel.c are only needed when IPV6 is
    selected. When IPV6 is not selected, those functions are stubbed out
    in udp_tunnel.h.
    
    ==================================================================
     net/ipv6/ip6_udp_tunnel.c:15:5: error: redefinition of 'udp_sock_create6'
         int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
     In file included from net/ipv6/ip6_udp_tunnel.c:9:0:
          include/net/udp_tunnel.h:36:19: note: previous definition of 'udp_sock_create6' was here
           static inline int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
    ==================================================================
    
    Fixes:  fd384412e udp_tunnel: Seperate ipv6 functions into its own file
    Reported-by: kbuild test robot <[email protected]>
    Signed-off-by: Andy Zhou <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 3f76a4ea5383ba2f9e76f9625f77ff246907a134
Author: Mahati Chamarthy <[email protected]>
Date:   Thu Sep 18 19:27:09 2014 +0530

    Staging: rtl8192e: Fix __constant_htons to htons style warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: __constant_htons should be htons
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 288903f6b91e759b0a813219acd376426cbb8f14
Author: Catalina Mocanu <[email protected]>
Date:   Fri Sep 19 15:55:05 2014 -0700

    staging: iio: cdc: Don't put an else right after a return
    
    This fixes the following checkpatch.pl warning:
    WARNING: else is not generally useful after a break or return.
    
    While at it, remove new line for symmetry with the rest of the code.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0a5fcc6b2efdc86619af793e0216a508469cfaa4
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 23:32:05 2014 +0300

    staging: octeon: Fix quoted string split warning.
    
    This patch fixes "quoted string split across lines" checkpatch.pl
    warning in ethernet.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 39bc7513aa92b38c391dbe9649841f9f9dfcd0ac
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 23:27:39 2014 +0300

    staging: octeon: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    ethernet.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1ff99b312f9c94516acb38bad7421ba1d74abeb2
Author: Roberta Dobrescu <[email protected]>
Date:   Fri Sep 19 23:34:36 2014 +0300

    staging: emxx_udc: Replace __constant_cpu_to_le16 with cpu_to_le16
    
    This fixes the following checkpatch.pl warning:
    WARNING: __constant_cpu_to_le16 should be cpu_to_le16
    Additionally, it removes the space between function name and (.
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 113f5f24c6be6f7d888946320d01b51b81aa213d
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 00:31:44 2014 +0300

    Staging: rtl8821ae: Fix warnings of no space before tabs.
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: please, no space before tabs.
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a45cbb78147e8f57250f1687f5b61470b8343a20
Author: Aybuke Ozdemir <[email protected]>
Date:   Thu Sep 18 23:56:13 2014 +0300

    Staging: rtl8821ae: Fix "foo * bar" warning.
    
    This patch fixes these error messages found by checkpatch.pl:
    ERROR: "foo* bar" should be "foo *bar"
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 34c376fe07342e06f531504b01d3b953962e456c
Author: Aybuke Ozdemir <[email protected]>
Date:   Thu Sep 18 01:03:28 2014 +0300

    Staging: wlan-ng: Fix return in void function warning
    
    This fixes checkpatch.pl warning:
    WARNING: void function return statements are not generally useful
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fcf1b73d08cd15912205f3b259ea81ccfde11970
Author: Aybuke Ozdemir <[email protected]>
Date:   Thu Sep 18 00:54:04 2014 +0300

    Staging: media: cxd2099: Missing a blank line after declarations
    
    Fix checkpatch.pl issues with missing a blank
    line after declarations in cxd2099.c
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c2e91542695270452ea7b5d3266ad0e9b5dc7bdb
Author: Aybuke Ozdemir <[email protected]>
Date:   Wed Sep 17 23:43:15 2014 +0300

    Staging: octeon: Missing a blank line after declarations
    
    Fix checkpatch.pl issues with missing a blank
    line after declarations in ethernet-sgmii.c
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 05fd349b1750d456423538e69c3c1d4d8a10f1c8
Author: Aybuke Ozdemir <[email protected]>
Date:   Wed Sep 17 16:10:36 2014 +0300

    staging: gs_fpgaboot Fix trailing whitespace.
    
    Fix checkpatch.pl issues with trailing
    whitespace in README.
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit de77c125f57a308250cfaec945541fd8abe0e054
Author: Aybuke Ozdemir <[email protected]>
Date:   Wed Sep 17 15:33:25 2014 +0300

    staging: bcm: Fix line over 80 characters
    
    Fix checkpatch.pl issues with
    line over 80 characters in HandleControlPacket.c
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5ad6ae1acfd883d8f4c8998b4e5bc9d4aea7985f
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:20:44 2014 +0300

    staging: media: lirc: Fixes missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_serial.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a5613fe8967534ce626875fab4bcface70d366b4
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:26:03 2014 +0300

    staging: media: lirc: Fixes unnecessary return warning.
    
    This patch fixes "void function return statements are not generally
    useful" checkpatch.pl warning in lirc_zilog.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a17ec4c9fd07d3f4760cc6545b54f8323ea6ccb4
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:06:55 2014 +0300

    staging: media: lirc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_bt829.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3f8028023c3f6804751a920d97e9c8dffc575cc0
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:02:21 2014 +0300

    staging: media: lirc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_sasem.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a87ba73ed10266dba8278b2a6b89da597a38092a
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 00:59:11 2014 +0300

    staging: media: lirc: Fix unnecessary return warning.
    
    This patch fixes "void function return statements are not generally
    useful" checkpatch.pl warning in lirc_sasem.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fd8392f3097140a9db7b0903a63635e652b6eb45
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 22:46:57 2014 +0300

    staging: media: lirc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_zilog.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3170f3277b1809c19fe4a45914cffa0e09471973
Author: Tina Johnson <[email protected]>
Date:   Wed Sep 17 03:14:52 2014 +0530

    Staging: media: lirc: lirc_imon: Removed unnecessary variable to simplify return variable handling
    
    Variable rc was removed after merging its assignment statement with
    immediately following return statement. Variable retval is not used
    at all other than to return its initial value. Hence replaced retval
    with its initial value in the return statement and removed the variable.
    
    This patch was done using Coccinelle script and the following semantic
    patch was used:
    
    @rule1@
    identifier ret;
    expression e;
    @@
    
    -int ret = 0;
     ... when != ret
    (
    -ret = e;
    +return e;
    -return ret;
    |
    -return ret;
    +return 0;
    )
    
    Signed-off-by: Tina Johnson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
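To make the first branch of the semantic patch above concrete, here is a small before/after sketch in plain C. The function names are hypothetical and the code is not taken from lirc_imon; it only shows the shape of code the rule matches and what it is rewritten to.

```c
/* Hypothetical helper standing in for any expression that is returned. */
int sketch_do_work(int x)
{
	return x < 0 ? -1 : 0;
}

/* Before: temporary initialised to 0, assigned once, then returned. */
int sketch_before(int x)
{
	int ret = 0;

	ret = sketch_do_work(x);
	return ret;
}

/* After: assignment and return merged, temporary removed. */
int sketch_after(int x)
{
	return sketch_do_work(x);
}
```

The second branch of the rule covers the case where the variable is never assigned again, so `return ret;` becomes `return 0;`.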

commit 8ad5360ad81a32b4e9fdc956e7c453308050a97d
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 21:39:46 2014 +0300

    staging: lustre: lnet: lnet: Fixed quoted string split warning.
    
    This patch fixes "quoted string split across lines" checkpatch.pl
    warning in api-ni.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 70b694c32e405cff8e2640b3943ed9598d97f75e
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 21:48:00 2014 +0300

    staging: lustre: lnet: lnet: Fix missing line warning.
    
    This patch fixes the "Missing a blank line after declarations"
    checkpatch.pl warning in api-ni.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a446b47d5d815865c2715da8fab1a7c06f1338ca
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 21:53:31 2014 +0300

    staging: lustre: lnet: lnet: Fix quoted string split warning.
    
    This patch fixes "quoted string split across lines" checkpatch.pl
    warning in lib-eq.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3e9cc5b0450a40be3442a82a5a5293f85ca06c7d
Author: Darshana Padmadas <[email protected]>
Date:   Wed Sep 17 20:58:43 2014 +0530

    Staging: lustre: Fix return in void function warning
    
    This fixes checkpatch.pl warning:
    
    WARNING: void function return statements are not generally useful
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6606a77f92821f8bfd4b1b6ba296da662fecb640
Author: Darshana Padmadas <[email protected]>
Date:   Wed Sep 17 20:28:54 2014 +0530

    Staging: lustre: place open brace following struct on same line
    
    This patch fixes checkpatch.pl warning:
    
    WARNING: open brace following struct goes on the same line.
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4467a945fc08c0d6624b1dd64cfcc2cbd3b3dee3
Author: Darshana Padmadas <[email protected]>
Date:   Wed Sep 17 18:14:45 2014 +0530

    Staging: lustre: libcfs: fix checkpatch warning else after return statement
    
    Fix checkpatch warning by removing unnecessary else after return statement.
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f5740b2e7e74fa9ba915aa74bfba7cf849dce8a7
Author: Darshana Padmadas <[email protected]>
Date:   Tue Sep 16 13:24:13 2014 +0530

    Staging: lustre: include: libcfs: removed else before return statement in libcfs_crypto.h
    
    This patch to libcfs_crypto.h fixes the warning about an unnecessary
    else before a return statement, found by the checkpatch.pl tool.
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 225557bf274ed1519362865815da7425533191d1
Author: Roxana Blaj <[email protected]>
Date:   Mon Sep 15 14:58:44 2014 +0300

    staging: speakup: fix checkpatch warning
    
    This fixes the checkpatch warning:
    WARNING: line over 80 characters
    
    Signed-off-by: Roxana Blaj <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0a3a725adb2c421ea79089ea12004a007fb371ce
Author: Roxana Blaj <[email protected]>
Date:   Sun Sep 14 20:28:53 2014 +0300

    staging: speakup: fix checkpatch warning
    
    This fixes the checkpatch warning:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Roxana Blaj <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 472fe30efd52fde30249a04971a62151e0606c1d
Author: Nicoleta Birsan <[email protected]>
Date:   Sun Sep 14 03:38:34 2014 -0700

    Staging: speakup: fix checkpatch warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Nicoleta Birsan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 297cbdaeca2b68aaae6bbb7affa4533430e8e91a
Author: Blaj Roxana <[email protected]>
Date:   Tue Sep 16 20:13:28 2014 +0300

    staging: skein: replace spaces with tabs
    
    This fixes the error and warning:
    ERROR: code indent should use tabs where possible
    WARNING: please, no spaces at the start of a line
    
    Signed-off-by: Blaj Roxana <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fb33aa47a00edc789d17d80174cd3ed8a1c82c66
Author: Roberta Dobrescu <[email protected]>
Date:   Sat Sep 20 00:01:39 2014 +0300

    staging: dgnc: Check sscanf return value
    
    This fixes the following checkpatch.pl warnings:
    WARNING: unchecked sscanf return value
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
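As an illustration of the warning above, a minimal userspace sketch of checking the sscanf() return value might look like the following. The helper name parse_board_number() and the -1 error value are assumptions made for this example; they are not taken from the dgnc driver, where kernel code would more likely return -EINVAL.

```c
#include <stdio.h>

/*
 * Hypothetical helper (not from the dgnc driver): parse one decimal
 * integer from buf. sscanf() returns the number of successfully
 * converted fields, so anything other than 1 means the input was bad.
 */
static int parse_board_number(const char *buf, int *out)
{
	int value;

	if (sscanf(buf, "%d", &value) != 1)
		return -1;	/* kernel code would likely return -EINVAL */

	*out = value;
	return 0;
}
```

The output argument is written only on success, so a caller can keep a previous value when parsing fails.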

commit f23e875fd26a05a0850db7c5e090030c80b4f583
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 19:34:45 2014 +0300

    staging: dgnc: Fix unnecessary space warning.
    
    Fixed "Unnecessary space before function pointer argument" checkpatch.pl
    warning in dgnc_driver.h
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e8756d4a51d1246be36c5621827c288eb2d5e9b7
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 19:31:15 2014 +0300

    staging: dgnc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    dgnc_sysfs.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3dfe7557809e5867306c7a0614b9d1c6036cbe4d
Author: Vaishali Thakkar <[email protected]>
Date:   Fri Sep 19 10:30:59 2014 +0530

    Staging: dgnc: Merge lines and remove unused variable for immediate return
    
    This patch merges two lines into a single line where an immediate
    return is found. It also removes the unnecessary variable rc,
    as it is no longer needed.
    
    This is done using Coccinelle. Semantic patch used for this
    is as follows:
    
    @@
    type T;
    identifier i;
    identifier f;
    constant C;
    @@
    - T i;
      ...when != i
         when strict
    (
      return -C;
    |
    - i =
    + return
         f(...);
    - return i;
    )
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Reviewed-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
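The effect of the second branch of the semantic patch above can be sketched on a small example. The functions do_work(), wrapper_before() and wrapper_after() are hypothetical names chosen for illustration and do not appear in the dgnc driver.

```c
/* Hypothetical callee standing in for f(...) in the semantic patch. */
static int do_work(int x)
{
	return x * 2;
}

/* Before: the result is stored in a local only to be returned. */
static int wrapper_before(int x)
{
	int rc;

	rc = do_work(x);
	return rc;
}

/* After: the call is returned directly and rc is no longer needed. */
static int wrapper_after(int x)
{
	return do_work(x);
}
```

Both wrappers behave identically; the transformation only removes the intermediate variable.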

commit 10352c2a69f4aa2724f007a4922518c9ece7bf89
Author: Roberta Dobrescu <[email protected]>
Date:   Thu Sep 18 21:38:04 2014 +0300

    staging: dgnc: Move open brace on previous line
    
    This fixes the following checkpatch.pl errors:
    ERROR: that open brace { should be on the previous line
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 05a70e14035438e6866d7fcf8a79c67b8e1425e1
Author: Roberta Dobrescu <[email protected]>
Date:   Tue Sep 16 20:33:03 2014 +0300

    staging: dgnc: Do not initialise statics to 0 or NULL
    
    This fixes the following checkpatch.pl error:
    ERROR: do not initialise statics to 0 or NULL
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Acked-by: Daniel Baluta <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
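The reason behind this checkpatch error is that C already zero-initialises objects with static storage duration, so an explicit "= 0" or "= NULL" adds nothing. A minimal sketch, using the hypothetical names open_count and note_open() (not taken from the dgnc driver):

```c
/*
 * Objects with static storage duration are zero-initialised by the C
 * language, so writing "= 0" here would be redundant.
 */
static int open_count;

/* Hypothetical helper: bump the counter and return its new value. */
static int note_open(void)
{
	return ++open_count;
}
```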

commit b051017fb4e593998fc46ec9a991ad390c9114b5
Author: Roberta Dobrescu <[email protected]>
Date:   Mon Sep 15 21:32:59 2014 +0300

    staging: dgnc: Replace kzalloc with kcalloc
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer kcalloc over kzalloc with multiply
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
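The point of preferring kcalloc() is that it checks the element count times element size for overflow, whereas an open-coded kzalloc(n * size, ...) does not. The following is a userspace analogy only, assuming a hypothetical helper zalloc_array() built on calloc(); it is not the kernel implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Userspace stand-in for the kcalloc() idea (hypothetical helper name):
 * refuse the request if n * size would overflow size_t, otherwise
 * return zeroed memory for n elements.
 */
static void *zalloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;	/* n * size would wrap around */

	return calloc(n, size);
}
```

A caller still has to check for NULL, exactly as with kzalloc() or kcalloc().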

commit f3dadd29f7197d93d0441391f5e3815bf008cce1
Author: Roberta Dobrescu <[email protected]>
Date:   Sun Sep 14 23:13:20 2014 +0300

    staging: dgnc: Fix warnings relating to printk()
    
    This fixes the following checkpatch.pl warnings:
    WARNING: printk() should include KERN_ facility level
    It replaces printk() with dev_dbg() in order to avoid the warning that a more
    specific function should be used.
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2be13f7b7c63cecc439876c8c06a5b30afdf46f9
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 04:59:59 2014 +0530

    Staging: rtl8192ee: rtl8192ee: Fix missing blank line warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b9209a93edbccafb6c2f860bc0ddfe9eda1e3ccd
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 04:49:43 2014 +0530

    Staging: rtl8192ee: Fix else not useful style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
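The "else is not generally useful after a break or return" warning, which recurs in several commits in this log, asks for a change of the following shape. The function names are hypothetical and the example is not taken from the rtl8192ee driver.

```c
/* Before: the else branch is redundant because the if branch returns. */
static int direction_before(int x)
{
	if (x < 0)
		return -1;
	else
		return 1;
}

/* After: same behaviour, with one level of nesting less. */
static int direction_after(int x)
{
	if (x < 0)
		return -1;

	return 1;
}
```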

commit 1709a582e1f8977de040f02d9e9e52ec89f8603f
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 04:03:36 2014 +0530

    Staging: rtl8192ee: Fix break is not useful warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: break is not useful after a goto or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fe6dc85eaf8bb180ad3510a57bd69f3b8f9c2dbb
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 03:42:01 2014 +0530

    Staging: rtl8192ee: Fix else is not useful warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f41788b7c933127863435f72f456ec46ed5540b2
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 03:29:19 2014 +0530

    Staging: rtl8192ee: Fix missing blank line warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ad39fe743419d58f9bc29373189c93ba2251e675
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:43:26 2014 +0530

    Staging: rtl8192e: Fix printk debug style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer [subsystem eg: netdev]_dbg([subsystem]dev, ... then dev_dbg(dev,
     ... then pr_debug(...  to printk(KERN_DEBUG ...
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4344672830d8500eac97d82976b03e41580c3a04
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:40:43 2014 +0530

    Staging: rtl8192e: Fix printk style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6af197672f2330045c171aed3ea90fb93d89ecc6
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:38:39 2014 +0530

    Staging: rtl8192e: Fix space before semicolon warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: space prohibited before semicolon
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 13402f7b76223e7f50ab42c82aac4788940c8277
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:36:31 2014 +0530

    Staging: rtl8192e: Fix else is not useful warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5110e40260d03fdb2d93a94fec06a31b81d57b0b
Author: Mahati Chamarthy <[email protected]>
Date:   Fri Sep 19 23:56:02 2014 +0530

    Staging: rtl8192e: Fix void function return statements style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: void function return statements are not generally useful
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 59422a74b55c616d500c3be721077ff0d00f7fb0
Author: Mahati Chamarthy <[email protected]>
Date:   Fri Sep 19 23:12:53 2014 +0530

    Staging: rtl8192e: Fix else is not useful style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1f921b9f61b1a324366c8f6a02c5a8e89164ed52
Author: Vaishali Thakkar <[email protected]>
Date:   Fri Sep 19 22:22:19 2014 +0530

    Staging: rtl8192e: Fixed style warning relating to printk()
    
    This patch fixes the following checkpatch.pl warning in file rtl_dm.c:
    
    WARNING: Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO .
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 52e93b8ab435978bc12280aa4418ef25fd6e74f2
Author: Mahati Chamarthy <[email protected]>
Date:   Fri Sep 19 05:22:33 2014 +0530

    Staging: rtl8192e: Fix unnecessary parentheses style warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: Unnecessary parentheses - maybe == should be = ?
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fee9d3e61d04422628a3d22ed5eb8370dcef259b
Author: Chris J Arges <[email protected]>
Date:   Wed Aug 27 13:26:53 2014 -0500

    ktest: add ability to skip during BISECT_MANUAL
    
    When doing a manual bisect, a build can fail or a test can be inconclusive.
    In these cases it would be helpful to be able to skip the test entirely.
    
    Link: http://lkml.kernel.org/r/[email protected]
    
    Reviewed-by: Satoru Takeuchi <[email protected]>
    Signed-off-by: Chris J Arges <[email protected]>
    Signed-off-by: Steven Rostedt <[email protected]>

commit 4af409f6c38029e1eda0a5e7bbf15e9b1b7d7fab
Author: Benedict Boerger <[email protected]>
Date:   Thu Sep 18 17:46:23 2014 +0200

    staging: rtl8192u: delete unused function CAM_read_entry
    
    Fix the sparse warning: symbol 'CAM_read_entry' was not declared. Should it be static?
    
    The function CAM_read_entry is not used and therefore deleted.
    
    Signed-off-by: Benedict Boerger <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 77baad9e4d71e75d7ad6ee83454113d4a6a7b04d
Author: Ragnar B. Johannsson <[email protected]>
Date:   Thu Sep 18 14:33:25 2014 +0000

    staging: rtl8192u: Move ieee80211_crypto_* declarations to ieee80211/ieee80211.h
    
    Move ieee80211_crypto*_init and _exit prototype declarations from r8192U_core.c to ieee80211/ieee80211.h. This fixes the following sparse warnings:
    
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt.c:203:12: warning: symbol 'ieee80211_crypto_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt.c:223:13: warning: symbol 'ieee80211_crypto_deinit' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_tkip.c:764:12: warning: symbol 'ieee80211_crypto_tkip_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_tkip.c:769:13: warning: symbol 'ieee80211_crypto_tkip_exit' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_ccmp.c:467:12: warning: symbol 'ieee80211_crypto_ccmp_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_ccmp.c:472:13: warning: symbol 'ieee80211_crypto_ccmp_exit' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_wep.c:281:12: warning: symbol 'ieee80211_crypto_wep_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_wep.c:286:13: warning: symbol 'ieee80211_crypto_wep_exit' was not declared. Should it be static?
    
    Signed-off-by: Ragnar B. Johannsson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5635b82a553620c511dc6bc8cb0990c0a791e21e
Author: Mahati Chamarthy <[email protected]>
Date:   Thu Sep 18 15:43:53 2014 +0530

    Staging: rtl8192e: Fix style warnings relating to printk(KERN_DEBUG
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer [subsystem eg: netdev]_dbg([subsystem]dev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fe40a0b361de10ea794116160308cc7fd0b7fbeb
Author: Vaishali Thakkar <[email protected]>
Date:   Wed Sep 17 08:35:24 2014 +0530

    Staging: rtl8192e: rtl8192e: Remove unnecessary braces and space
    
    This patch removes the following checkpatch.pl warnings in rtl_core.c file:
    
    WARNING: Braces {} are not necessary for single statement blocks
    WARNING: Space prohibited before semicolon
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5c8b3961da9a55762ea5481e8f9412c0d18dc684
Author: Vaishali Thakkar <[email protected]>
Date:   Wed Sep 17 08:02:43 2014 +0530

    Staging: rtl8192e: rtl8192e: Remove unnecessary variable
    
    This patch removes an unnecessary variable in file ret_core.c
    using a Coccinelle script. The semantic patch for this is as follows:
    
    @@
    identifier ret;
    @@
    
    -int ret = 0;
     ... when != ret
         when strict
    -return ret;
    +return 0;
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 23a0e1611b880bd8d94bbebcb3577c9f78029435
Author: Steven Rostedt (Red Hat) <[email protected]>
Date:   Fri Sep 19 20:10:39 2014 -0400

    ktest: Add PATCHCHECK_CHERRY
    
    Add a way to run a patchcheck test on the commits that are in one branch
    but not in another. This uses git cherry to find a list of commits to
    test each one with.
    
    Signed-off-by: Steven Rostedt <[email protected]>

commit 4309635f692192ddcc540964189d92cad0ade249
Author: Rajbinder Brar <[email protected]>
Date:   Tue Sep 16 11:25:31 2014 +0530

    Staging: vt6655: Break 80 character long line to remove checkpatch error
    
    This removes checkpatch.pl warning
    WARNING: line over 80 characters
    
    Signed-off-by: Rajbinder Brar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b377ed4cce004d7c3dbd92cffdbf2aa21d28e2e6
Author: Rajbinder Brar <[email protected]>
Date:   Wed Sep 17 21:27:03 2014 +0530

    Staging: vt6656: Removing else after break statement to fix warning
    
    This patch fixes the checkpatch.pl warning in baseband.c file
    WARNING: else is not useful after a break or return
    
    Signed-off-by: Rajbinder Brar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit dbc6ee63d4355a51fd84ee8ebf127763180b1585
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 19:49:33 2014 +0300

    Staging: vt6655: Fix C99 style commenting.
    
    This patch fixes these error messages found by checkpatch.pl:
    ERROR: do not use C99 // comments
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a600f4589fdbb51a0ad885408f996ec0f1f90be9
Author: Abel Moyo <[email protected]>
Date:   Thu Sep 18 21:49:10 2014 +0200

    Staging: gdm724x: gdm_usb: added error checking in do_tx()
    
    Added error checking for alloc_tx_struct in do_tx()
    
    Signed-off-by: Abel Moyo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 37d963fb80d2fd944bd0124570b2adc5b826ccef
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 20:43:53 2014 +0300

    staging: gdm724x: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    gdm_mux.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 492a1e7be585c88a04ba763bb77fc865700e209d
Author: Daeseok Youn <[email protected]>
Date:   Tue Sep 16 16:19:06 2014 +0900

    staging: dgap: use schedule_timeout_interruptible() instead of dgap_ms_sleep()
    
    Using schedule_timeout_interruptible() is exactly the same as
    setting the state of the current process and calling schedule_timeout().
    
    Remove dgap_ms_sleep(), because this function is used
    only when closing a tty channel in dgap_tty_close().
    Also remove ch_close_delay, which is always set to 250
    in dgap_tty_init().
    
    Signed-off-by: Daeseok Youn <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 335d9c85be096cf492cb3eaeef160b45e1f25d8d
Author: Ankita Patil <[email protected]>
Date:   Thu Sep 18 12:31:00 2014 +0530

    Staging: dgap: Remove unnecessary variable.
    
    This patch removes unnecessary variable in file dgap.c
    using Coccinelle. Semantic patch for this is as follows:
    
    @@
    expression ret;
    identifier f;
    @@
    
    -ret =
    +return
         f(...);
    -return ret;
    
    Also removed the unneeded variable manually.
    
    Signed-off-by: Ankita Patil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 50d0a21b61f22b38f881fa21d2ada6ab4a61f93f
Author: Purnendu Kapadia <[email protected]>
Date:   Mon Sep 15 13:06:36 2014 +0100

    staging: android: sw_sync: checkpatch fixes
    
       - no space after cast
       - alignment should match open parenthesis
       - remove unnecessary new line
    
    Signed-off-by: Purnendu Kapadia <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1f0f6c9862b687db36f5e853402f76bc118ff0bf
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 01:58:29 2014 +0300

    Staging: rtl8723au: hal: Space prohibited before semicolon
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: Space prohibited before semicolon.
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8c09757d91703ccbf0da9fc67764de9714c9e615
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 02:23:53 2014 +0300

    Staging: rtl8723au: core: Fix unnecessary braces warning.
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: braces {} are not necessary for single statement blocks
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 867ce1bd68fb1eadb70b82bcda1e451b27ff824a
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 02:33:38 2014 +0300

    Staging: rtl8723au: core: Fix "foo * bar" warning.
    
    This patch fixes these error messages found by checkpatch.pl:
    ERROR: "foo* bar" should be "foo *bar"
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c895a5df852ca9bbac1dee413747303a61aa4ebd
Author: Greg Donald <[email protected]>
Date:   Tue Sep 16 18:37:41 2014 -0500

    drivers: staging: rtl8723au: Fix "space required after that ','" errors
    
    Fix checkpatch.pl "space required after that ','" errors
    
    Signed-off-by: Greg Donald <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f78c0710cd60cd108d436490955909983f309c62
Author: Kieron Browne <[email protected]>
Date:   Tue Sep 16 23:28:09 2014 +0100

    staging: rtl8723au: fix sparse incorrect type assignment warnings
    
    Use cpu_to_le16 to cast int for assignment to __le16 members
    
    Signed-off-by: Kieron Browne <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
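sparse treats __le16 as a distinct type, so a plain integer must pass through cpu_to_le16() before being stored in such a member. What that conversion guarantees can be illustrated in userspace; the helper store_le16() below is a hypothetical name for this sketch and is not the kernel macro.

```c
#include <stdint.h>

/*
 * Hypothetical userspace illustration of what cpu_to_le16() guarantees:
 * regardless of host endianness, the least significant byte is stored
 * first.
 */
static void store_le16(uint16_t value, uint8_t out[2])
{
	out[0] = (uint8_t)(value & 0xff);	/* low byte first */
	out[1] = (uint8_t)(value >> 8);		/* high byte second */
}
```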

commit adabff85c9486c09ec700d835328e18ccfc9abf0
Author: MihaelaGaman <[email protected]>
Date:   Sun Sep 14 12:56:43 2014 +0300

    staging: rtl8723au: Fix checkpatch errors
    
    Fix checkpatch.pl "spaces required around":
    >, =, =, =, =, +=, >, >, <, <, :, <  errors.
    
    Signed-off-by: MihaelaGaman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1703c17b8a02b7d1dd3080c4ce9d41a83e95a071
Author: Vaishali Thakkar <[email protected]>
Date:   Sun Sep 14 13:46:37 2014 +0530

    Staging: rtl8188eu: os_dep: Compression of lines for immediate return
    
    This patch compresses two lines into a single line in file rtw_android.c
    where an immediate return statement is found. It also removes the variable
    bytes_written, as it is no longer needed.
    
    This is done using Coccinelle, with the following semantic patch:
    
    @@
    expression ret;
    identifier f;
    @@
    
    -ret =
    +return
         f(...);
    -return ret;
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 16e614e85025d69c87e9ce80b9e1b5238f0f4479
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 00:13:29 2014 +0300

    staging: rtl8188eu: core: Fixed wrong space error.
    
    This patch fixes "foo * bar" should be "foo *bar" checkpatch.pl error in rtw_cmd.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 69869c01ff148ef22d0ea1adec27b4543789792b
Author: Catalina Mocanu <[email protected]>
Date:   Fri Sep 19 14:54:54 2014 -0700

    staging: iio: impedance-analyzer: add blank line after declaration
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 714ab9bdd350413f48ad401bd25e11b3e9f257ab
Author: Catalina Mocanu <[email protected]>
Date:   Fri Sep 19 14:32:09 2014 -0700

    staging: iio: trigger: add blank lines after declarations
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8a689c114796d8a3801c2bf3e25d3e21d6816036
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 18:48:05 2014 +0300

    Staging: iio: resolver: Missing a blank line after declarations
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4b4c727519b510ab9d9b33de51ea41fc34b9ef27
Author: Catalina Mocanu <[email protected]>
Date:   Thu Sep 18 14:55:06 2014 -0700

    staging: iio: dummy: add blank lines after declarations.
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b581c3d9a90772613e05e659b4e8defc81704212
Author: Tina Johnson <[email protected]>
Date:   Sat Sep 13 15:46:15 2014 +0530

    Staging: iio: meter: ade7753: Fixed checkpatch.pl warnings
    
    Clean-up patch to fix the following checkpatch.pl warnings:
    
    ade7753.c:325: WARNING: Missing a blank line after declarations
    ade7753.c:383: WARNING: Missing a blank line after declarations
    
    Signed-off-by: Tina Johnson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9034720a54738bbaf96b619f34f887199ac7efed
Author: Tina Johnson <[email protected]>
Date:   Sun Sep 14 16:30:05 2014 +0530

    Staging: iio: meter: ade7753: Merged assignment with immediately following return statement
    
    Save one line of code by merging the assignment and return statements
    of variable ret. This also removes the variable len, which is no longer
    needed.
    
    This patch was made using a Coccinelle script with the following
    semantic patch:
    
    @@
    expression ret;
    identifier f;
    @@
    
    -ret =
    +return
          f(...);
    -return ret;
    
    Signed-off-by: Tina Johnson <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Acked-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 18f340f90e087c078c634d5c4fed5e0d632d4fb6
Author: Paul Zimmerman <[email protected]>
Date:   Fri Sep 19 14:49:36 2014 -0700

    usb: dwc2: add T: line to MAINTAINERS showing Felipe's tree
    
    Starting with v3.18-rc, patches for dwc2 will go through Felipe's
    tree. Add a T: line to MAINTAINERS to document this.
    
    Signed-off-by: Paul Zimmerman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5dce95554a1866339de039060ecd7122056a9d71
Author: Paul Zimmerman <[email protected]>
Date:   Tue Sep 16 13:47:27 2014 -0700

    usb: dwc2: handle DMA buffer unmapping sanely
    
    The driver's handling of DMA buffers for non-aligned transfers
    was kind of nuts. For IN transfers, it left the URB DMA buffer
    mapped until the transfer completed, then synced it, copied the
    data from the bounce buffer, then synced it again.
    
    Instead of that, just call usb_hcd_unmap_urb_for_dma() to unmap
    the buffer before starting the transfer. Then no syncing is
    required when doing the copy. This should also allow handling of
    other types of mappings besides just dma_map_single() ones.
    
    Also reduce the size of the bounce buffer allocation for Isoc
    endpoints to 3K, since that's the largest possible transfer size.
    
    Tested on Raspberry Pi and Altera SOCFPGA.
    
    Signed-off-by: Paul Zimmerman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e8f8c14d9da7ab1b8a7b0f769cd7148ca2cc7d10
Author: Paul Zimmerman <[email protected]>
Date:   Tue Sep 16 13:47:26 2014 -0700

    usb: dwc2: clip max_transfer_size to 65535
    
    Clip max_transfer_size to 65535 for host. dwc2_hc_setup_align_buf()
    allocates coherent buffers with this size, and if it's too large we
    can exhaust the coherent DMA pool.
    
    Tested on Raspberry Pi and Altera SOCFPGA.
    
    Signed-off-by: Paul Zimmerman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
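The clipping described above amounts to a simple upper bound on the configured value. A minimal sketch, assuming a hypothetical helper clip_transfer_size() and constant name (neither is taken from the dwc2 driver):

```c
#include <stdint.h>

#define MAX_HOST_TRANSFER_SIZE 65535u	/* limit named in the commit message */

/* Hypothetical helper: clip a requested transfer size to the limit. */
static uint32_t clip_transfer_size(uint32_t requested)
{
	if (requested > MAX_HOST_TRANSFER_SIZE)
		return MAX_HOST_TRANSFER_SIZE;

	return requested;
}
```

Values at or below the limit pass through unchanged, so only oversized configurations are affected.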

commit d00b41428042e72d9dc2557d9147434a4e3d631f
Author: Robert Baldyga <[email protected]>
Date:   Tue Sep 9 10:44:57 2014 +0200

    usb: dwc2/gadget: disable clock when it's not needed
    
    When the device is stopped or suspended, the clock is not needed, so we
    can disable it during that time.
    
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b203d0a2e32dd28e87780078f0789322862e4da8
Author: Robert Baldyga <[email protected]>
Date:   Tue Sep 9 10:44:56 2014 +0200

    usb: dwc2/gadget: assign TX FIFO dynamically
    
    Because there is not enough memory for every TX FIFO to be at least
    3072 bytes (the maximum single packet size with 3 transactions per
    microframe), we create four FIFOs of length 1024 and four of length
    3072 bytes, and assign them to endpoints dynamically according to
    the maxpacket size of the given endpoint.
    
    Until now 16 TX FIFOs were initialized, but we use only 8 IN
    endpoints, so we can split the available memory among 8 FIFOs to have
    more memory for each one.
    
    This required some small modifications in a few places in the code,
    because there was an assumption that the TX FIFO numbers assigned to
    endpoints are the same as the endpoint numbers, which is no longer
    true with dynamic FIFO assignment.
    
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
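The dynamic assignment described above can be sketched as a first-fit search over a small table of FIFOs. This is an illustration only: struct tx_fifo, the fifos table and assign_tx_fifo() are hypothetical names, and taking the required size directly as a parameter is an assumption rather than the driver's actual interface.

```c
#include <stddef.h>

#define NUM_TX_FIFOS 8

/* Hypothetical bookkeeping for one TX FIFO; not the driver's structure. */
struct tx_fifo {
	unsigned int size;	/* capacity in bytes */
	int in_use;		/* non-zero once assigned to an endpoint */
};

/* Four small and four large FIFOs, as described in the commit message. */
static struct tx_fifo fifos[NUM_TX_FIFOS] = {
	{ 1024, 0 }, { 1024, 0 }, { 1024, 0 }, { 1024, 0 },
	{ 3072, 0 }, { 3072, 0 }, { 3072, 0 }, { 3072, 0 },
};

/*
 * Pick the first free FIFO that can hold `needed` bytes and mark it
 * used. Returns the FIFO index, or -1 if nothing suitable is free.
 * Because the index is chosen here, it need not match the endpoint
 * number.
 */
static int assign_tx_fifo(unsigned int needed)
{
	size_t i;

	for (i = 0; i < NUM_TX_FIFOS; i++) {
		if (!fifos[i].in_use && fifos[i].size >= needed) {
			fifos[i].in_use = 1;
			return (int)i;
		}
	}

	return -1;
}
```

Listing the small FIFOs first means small endpoints consume them before falling back to the large ones, which keeps the 3072-byte FIFOs available for endpoints that need them.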

commit cff9eb756e18a7763d7ab9c574c0ab191e712341
Author: Marek Szyprowski <[email protected]>
Date:   Tue Sep 9 10:44:55 2014 +0200

    usb: dwc2/gadget: ensure that all fifos have correct memory buffers
    
    Print warning if FIFOs are configured in such a way that they don't fit
    into the SPRAM available on the s3c hsotg module.
    
    Signed-off-by: Marek Szyprowski <[email protected]>
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1e01129373f757925a652ea4ea5b278f8c2b9222
Author: Marek Szyprowski <[email protected]>
Date:   Tue Sep 9 10:44:54 2014 +0200

    usb: dwc2/gadget: hide some not really needed debug messages
    
    Some DWC2/s3c-hsotg debug messages are really useless for typical user,
    so hide them behind dev_dbg().
    
    Signed-off-by: Marek Szyprowski <[email protected]>
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d784f1e50977e58db23a79181971c3c0f62452e5
Author: Andrzej Pietrasiewicz <[email protected]>
Date:   Tue Sep 9 10:44:53 2014 +0200

    usb: dwc2/gadget: Fix comment text
    
    Adjust the debug text to the name of the printed variable.
    
    Signed-off-by: Andrzej Pietrasiewicz <[email protected]>
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 496a51bd64eb15f14cee3519f5b75b28d09567e3
Author: Julia Lawall <[email protected]>
Date:   Thu Sep 18 22:24:02 2014 +0200

    staging: lustre: llite: Use kzalloc and rewrite null tests
    
    This patch removes some kzalloc-related macros and rewrites the
    associated null tests to use !x rather than x == NULL.
    
    A simplified version of the semantic patch that makes this change is as
    follows: (http://coccinelle.lip6.fr/)
    
    // <smpl>
    @@
    expression ptr;
    statement S,S1;
    @@
    
      \(OBD_ALLOC\|OBD_ALLOC_WAIT\|OBD_ALLOC_PTR\|OBD_ALLOC_PTR_WAIT\)(ptr,...);
      if (
    +     !
          ptr
    -      == NULL
         ) S else S1
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC(ptr,size)
    + ptr = kzalloc(size, GFP_NOFS)
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC_WAIT(ptr,size)
    + ptr = kzalloc(size, GFP_KERNEL)
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC_PTR(ptr)
    + ptr = kzalloc(sizeof(*ptr), GFP_NOFS)
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC_PTR_WAIT(ptr,size)
    + ptr = kzalloc(sizeof(*ptr), GFP_KERNEL)
    // </smpl>
    
    Signed-off-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cdbcd3305293d18f7ae73b2766699bddf634bb06
Author: Martin Kelly <[email protected]>
Date:   Mon Sep 15 21:16:15 2014 -0700

    Staging/bcm: Fix whitespace/comments in Ioctl.h
    
    Cleanup whitespace and comments in Ioctl.h in a few ways:
    - > 80 character cleanup
    - Comment clarification
    - More consistent vertical alignment
    
    Signed-off-by: Martin Kelly <[email protected]>
    Reviewed-by: Matthias Beyer <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 33b443e467f6c92c4cc797f5acf6a933fcfe9ec3
Author: Fabien Malfoy <[email protected]>
Date:   Mon Sep 15 09:02:36 2014 +0200

    staging: rtl8821ae: Remove space after unary operator in efuse.c
    
    Several pointer declarations have been fixed to match the coding style.
    
    Signed-off-by: Fabien Malfoy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c24cdca05edb9c5435529afa37ce8c9c25ac4c5e
Author: Merlin Chlosta <[email protected]>
Date:   Mon Sep 15 01:56:10 2014 +0200

    staging: rtl8192u: sparse warnings: declare ieee80211_TURBO_Info static
    
    Declare ieee80211_TURBO_Info static to fix a sparse "symbol was not declared" warning.
    
    Signed-off-by: Merlin Chlosta <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5b1ebbffc0b2dd47a45380ba68da36f792a2977e
Author: Vincenzo Scotti <[email protected]>
Date:   Sat Sep 13 13:39:20 2014 +0200

    staging: emxx_udc: fix compile warnings: discarding const qualifier
    
    Signed-off-by: Vincenzo Scotti <[email protected]>
    Reported-by: kbuild test robot <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f02935c575cb00f2a164282866324816a1f52fc1
Author: Masanari Iida <[email protected]>
Date:   Sat Sep 13 01:14:30 2014 +0900

    staging: exxx_udc: Convert pr_warning to pr_warn
    
    This patch converts pr_warning to pr_warn.
    
    Signed-off-by: Masanari Iida <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3aa2ec581903747d926765850212278c7c24be77
Author: Sudip Mukherjee <[email protected]>
Date:   Fri Sep 12 17:57:26 2014 +0530

    staging: unisys: uislib: uislib.c: sparse warning of context imbalance
    
    Fix the sparse warning: context imbalance in 'destroy_device' -
    unexpected unlock.
    This patch triggers checkpatch warnings for lines over 80 characters,
    but since those lines are user-visible strings they were not modified.
    
    Signed-off-by: Sudip Mukherjee <[email protected]>
    Tested-by: Benjamin Romer <[email protected]>
    Acked-by: Benjamin Romer <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 635ecc5f36438cdf8cf3b88421321ee7443eb2d1
Author: Luke Hart <[email protected]>
Date:   Fri Sep 12 10:48:33 2014 +0100

    staging: unisys: Fix sparse error - accessing __iomem directly
    
    Copy the channel type into a temporary buffer so that the code works
    on architectures that don't support MMIO. This now works in the same
    way as the other tests in the same function.
    
    Signed-off-by: Luke Hart <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cec78b98df2f87a396890c802dccbf0e604c6829
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:05 2014 +0100

    staging: et131x: logical continuations should be on the previous line
    
    Fix two occurrences of the checkpatch check:
    
    CHECK: Logical continuations should be on the previous line
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d855b8935e211b285aa6eb3d42e2ea810b03e043
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:04 2014 +0100

    staging: et131x: Fix 'else is not generally useful after a break or return'
    
    Fix this checkpatch warning:
    
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b6cb966074d6863293b774327ca5738bb27a9b3a
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:08 2014 +0100

    staging: et131x: Use variable names instead of types in sizeof
    
    A few calls to sizeof() in et131x.c give the type as a parameter
    - use the equivalent variable name instead.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ee60c8ec323167a02de357e9d9b44af850052ee3
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:07 2014 +0100

    staging: et131x: Use braces on all arms of if/else statements
    
    In some places in et131x.c, one arm of an if/else statement has braces
    and the other does not - put braces on both arms where this happens.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c13756784a6a16fb5d25585a4058dd6d284fd033
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:06 2014 +0100

    staging: et131x: Remove spaces after casts
    
    In three places in et131x.c, spaces exist after a cast. Remove them.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 48c8f78914720b39b9de27c6e58134abdf1f1a4c
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:02 2014 +0100

    staging: et131x: Add spinlock definition comments
    
    Checkpatch --strict advises that spinlocks should be described when
    defined. This seems a good idea, so this change does that.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0c55fe2018f7f84e3620e85e4b0d5d06274862da
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:01 2014 +0100

    staging: et131x: Remove useless assignment to NULL
    
    The stack variable skb is no longer used after it's set to
    NULL. Don't set it to NULL.
    
    Reported-by: Dan Carpenter <[email protected]>
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit bacb71edb48050b46244a66ec8d49c55a89eec34
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:00 2014 +0100

    staging: et131x: Remove send_hw_lock spinlock
    
    We don't need to use this lock - the tx path is protected by the
    networking subsystem xmit_lock, so we don't also need it in
    nic_send_packet().
    
    The other use of this spinlock in et1310_enable_phy_coma() t…
aryabinin pushed a commit to aryabinin/linux that referenced this pull request Sep 24, 2014
ERROR: code indent should use tabs where possible
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
aryabinin referenced this pull request in aryabinin/linux Sep 24, 2014
GIT 1b28f1c3d6821c20f42c22e977999fffbf0c0331

commit 78cbcabd472b197dc8ae7abd11f197efe611211a
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:10 2014 +1000

    Documentation: disable vdso_test to avoid breakage with old glibc
    
    glibc versions older than 2.16 don't include sys/auxv.h which this
    executable uses.
    Since we don't have a good way to test for specific glibc versions in
    kbuild, just disable it for now.
    
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>

commit c5a967ad6aba3adc9b61f28d799be4fdf815e6bf
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:10 2014 +1000

    Documentation: update vDSO makefile to build portable examples
    
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>

commit dee40f0c69658d15a49a3dbca4f105410f561ad4
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:09 2014 +1000

    Documentation: update .gitignore files
    
    Add some missing files to .gitignore.
    Push Documentation/.gitignore down into subdirectories.
    
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>

commit 7f73b38710908162de63e9c940e1a0c26810dd19
Author: Peter Foley <[email protected]>
Date:   Mon Sep 22 09:31:09 2014 +1000

    Documentation: support glibc versions without htole macros
    
    glibc 2.9 introduced the htole<16/32/64> macros, add them to
    tools/include to support older versions of glibc.
    
    Reported-by: Andrew Morton <[email protected]>
    Signed-off-by: Peter Foley <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]>
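
As a rough illustration of what a fallback for the htole macros has to do, here is a minimal sketch. It is an assumption, not the code added to tools/include: the names sketch_htole16() and sketch_htole32() are hypothetical, and the byte-by-byte approach was chosen because it works on either byte order without needing to detect endianness. A real fallback would more likely be a set of macros selected on the host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Return `host` laid out in memory as little-endian, whatever the host order. */
static uint16_t sketch_htole16(uint16_t host)
{
	unsigned char bytes[2];
	uint16_t out;

	assert(sizeof(out) == sizeof(bytes));
	bytes[0] = (unsigned char)(host & 0xff);	/* least significant byte first */
	bytes[1] = (unsigned char)(host >> 8);
	memcpy(&out, bytes, sizeof(out));
	return out;
}

/* Same idea for 32-bit values. */
static uint32_t sketch_htole32(uint32_t host)
{
	unsigned char bytes[4];
	uint32_t out;
	int i;

	assert(sizeof(out) == sizeof(bytes));
	for (i = 0; i < 4; i++)
		bytes[i] = (unsigned char)((host >> (8 * i)) & 0xff);
	memcpy(&out, bytes, sizeof(out));
	return out;
}
```

On a little-endian host both functions return their argument unchanged; on a big-endian host they swap the bytes.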

commit c06fccd3288d690700b0d2824485ba925d09abd4
Author: Mark Brown <[email protected]>
Date:   Mon Sep 22 09:31:08 2014 +1000

    v4l2-pci-skeleton: Only build if PCI is available
    
    Currently arm64 does not support PCI but it does support v4l2. Since the
    PCI skeleton driver is built unconditionally as a module with no dependency
    on PCI this causes build failures for arm64 allmodconfig. Fix this by
    defining a symbol VIDEO_PCI_SKELETON for the skeleton and conditionalising
    the build on that.
    
    Signed-off-by: Mark Brown <[email protected]>
    Signed-off-by: Randy Dunlap <[email protected]> [added VIDEO dependencies]

commit c735483de1a2cd5d6c6b67bf49cfb2991eae6ea6
Author: Helge Deller <[email protected]>
Date:   Sun Sep 21 22:31:08 2014 +0200

    parisc: pdc_stable.c: Avoid potential stack overflows
    
    Signed-off-by: Helge Deller <[email protected]>

commit 94c457deff2a211f8372f69a4d7b0d288183756a
Author: Rickard Strandqvist <[email protected]>
Date:   Sun Sep 14 18:02:12 2014 +0200

    parisc: pdc_stable.c: Cleaning up unnecessary use of memset in conjunction with strncpy
    
    Using memset before strncpy just to ensure a trailing null character is
    an unnecessary double write of the string.
    
    Patch modified by Helge Deller to additionally reduce stack usage.
    
    Signed-off-by: Rickard Strandqvist <[email protected]>
    Signed-off-by: Helge Deller <[email protected]>
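
The point about memset followed by strncpy can be shown with a small userspace sketch. The helper copy_bounded() is hypothetical and is not taken from pdc_stable.c. It relies on the standard behaviour of strncpy(), which pads the destination with null bytes when the source is shorter than the given count, so only the final byte needs to be set explicitly.

```c
#include <assert.h>
#include <string.h>

/*
 * Copy `src` into `dst` (capacity `dst_size` bytes), always leaving a
 * terminated string, without clearing the buffer first.
 */
static void copy_bounded(char *dst, size_t dst_size, const char *src)
{
	assert(dst != NULL && src != NULL && dst_size > 0);

	/* strncpy() zero-fills the rest of the first dst_size - 1 bytes itself. */
	strncpy(dst, src, dst_size - 1);

	/* Only this byte is not guaranteed by strncpy() when src is too long. */
	dst[dst_size - 1] = '\0';
}
```

A preceding memset(dst, 0, dst_size) would write every byte a second time without changing the result.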

commit fe5c873459a973e59854bd235a7e6b3eaa8e5fe0
Author: Helge Deller <[email protected]>
Date:   Sun Sep 21 21:01:15 2014 +0200

    parisc: ptrace: use secure_computing_strict()
    
    Signed-off-by: Helge Deller <[email protected]>

commit 5466112f0935f079e225514905c57d5e5285a9b6
Author: Trond Myklebust <[email protected]>
Date:   Thu Sep 18 17:03:46 2014 -0400

    pnfs/blocklayout: Fix a 64-bit division/remainder issue in bl_map_stripe
    
    kbuild test robot reports:
    
       fs/built-in.o: In function `bl_map_stripe':
       >> :(.text+0x965b4): undefined reference to `__aeabi_uldivmod'
       >> :(.text+0x965cc): undefined reference to `__aeabi_uldivmod'
       >> :(.text+0x96604): undefined reference to `__aeabi_uldivmod'
    
    Fixes: 5c83746a0cf2 (pnfs/blocklayout: in-kernel GETDEVICEINFO XDR parsing)
    Cc: Stephen Rothwell <[email protected]>
    Cc: Christoph Hellwig <[email protected]>
    Reviewed-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Trond Myklebust <[email protected]>

commit 9c58c79a8a76c510cd3a5012c536d4fe3c81ec3b
Author: Zhihui Zhang <[email protected]>
Date:   Sat Sep 20 21:24:36 2014 -0400

    sched: Clean up some typos and grammatical errors in code/comments
    
    Signed-off-by: Zhihui Zhang <[email protected]>
    Cc: [email protected]
    Link: http://lkml.kernel.org/r/[email protected]
    Signed-off-by: Ingo Molnar <[email protected]>

commit 6a40281ab5c1ed8ba2253857118a5d400a2d084b
Author: Chuck Ebbert <[email protected]>
Date:   Sat Sep 20 10:17:51 2014 -0500

    sched: Fix end_of_stack() and location of stack canary for architectures using CONFIG_STACK_GROWSUP
    
    Aaron Tomlin recently posted patches [1] to enable checking the
    stack canary on every task switch. Looking at the canary code, I
    realized that every arch (except ia64, which adds some space for
    register spill above the stack) shares a definition of
    end_of_stack() that makes it the first long after the
    threadinfo.
    
    For stacks that grow down, this low address is correct because
    the stack starts at the end of the thread area and grows toward
    lower addresses. However, for stacks that grow up, toward higher
    addresses, this is wrong. (The stack actually grows away from
    the canary.) On these archs end_of_stack() should return the
    address of the last long, at the highest possible address for the stack.
    
    [1] http://lkml.org/lkml/2014/9/12/293
    
    Signed-off-by: Chuck Ebbert <[email protected]>
    Link: http://lkml.kernel.org/r/20140920101751.6c5166b6@as
    Signed-off-by: Ingo Molnar <[email protected]>
    Tested-by: James Hogan <[email protected]> [metag]
    Acked-by: James Hogan <[email protected]>
    Acked-by: Aaron Tomlin <[email protected]>
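
The two cases described in the commit above can be sketched in userspace as follows. This is an illustration under stated assumptions, not the kernel's end_of_stack(): SKETCH_THREAD_SIZE, struct sketch_thread_info and sketch_end_of_stack() are hypothetical stand-ins, and the direction is passed as a parameter where the kernel would use CONFIG_STACK_GROWSUP at build time.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_THREAD_SIZE 8192	/* hypothetical size of the thread area */

/* Hypothetical stand-in for thread_info at the start of the thread area. */
struct sketch_thread_info {
	long fields[16];
};

/*
 * Return the address where the stack canary belongs, i.e. the last long
 * the stack may reach before overflowing.
 */
static unsigned long *sketch_end_of_stack(void *area, int stack_grows_up)
{
	assert(area != NULL);

	if (stack_grows_up)
		/* Stack grows toward higher addresses: last long of the area. */
		return (unsigned long *)((char *)area + SKETCH_THREAD_SIZE) - 1;

	/* Stack grows down: first long after the thread_info structure. */
	return (unsigned long *)((struct sketch_thread_info *)area + 1);
}
```

With a downward-growing stack the canary sits just above thread_info; with an upward-growing stack it sits at the top of the area, which is the side the stack actually approaches.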

commit 0c7bf3e8cab7900e17ce7f97104c39927d835469
Author: Zefan Li <[email protected]>
Date:   Sat Sep 20 14:49:10 2014 +0800

    cgroup: remove redundant variable in cgroup_mount()
    
    Both pinned_sb and new_sb indicate if a new superblock is needed,
    so we can just remove new_sb.
    
    Note that we must now check whether kernfs_tryget_sb() returns NULL,
    because when it does, kernfs_mount() may still re-use an existing
    superblock that was just allocated by another concurrent mount.
    
    Suggested-by: Tejun Heo <[email protected]>
    Signed-off-by: Zefan Li <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>

commit 3e2cd91ab92665148616a80dc0745c499d2746a7
Author: Zefan Li <[email protected]>
Date:   Sat Sep 20 14:35:43 2014 +0800

    cgroup: fix missing unlock in cgroup_release_agent()
    
    The patch 971ff4935538: "cgroup: use a per-cgroup work for release
    agent" from Sep 18, 2014, leads to the following static checker
    warning:
    
    	kernel/cgroup.c:5310 cgroup_release_agent()
    	warn: 'mutex:&cgroup_mutex' is sometimes locked here and sometimes unlocked.
    
    Reported-by: Dan Carpenter <[email protected]>
    Signed-off-by: Zefan Li <[email protected]>
    Signed-off-by: Tejun Heo <[email protected]>

commit 93b8877471796c04c16fdef755d4e5c0f521509f
Author: Alexander Shiyan <[email protected]>
Date:   Sat Sep 20 09:34:45 2014 +0400

    tty: serial_mctrl_gpio: Fix COMPILE_TEST build for architectures with custom termios.h
    
    This patch fixes COMPILE_TEST build of serial_mctrl_gpio module for
    architectures with custom termios.h header.
    
    sparc64:allmodconfig:
    
    In file included from drivers/tty/serial/serial_mctrl_gpio.c:21:0:
    include/uapi/asm-generic/termios.h:22:8: error: redefinition of 'struct termio'
    ./arch/sparc/include/uapi/asm/termbits.h:16:8: note: originally defined here
    make[3]: *** [drivers/tty/serial/serial_mctrl_gpio.o] Error 1
    
    Reported-by: Guenter Roeck <[email protected]>
    Signed-off-by: Alexander Shiyan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d07fe967189ff7c32f5a78b4f28c2ccbab850091
Author: Chen-Yu Tsai <[email protected]>
Date:   Thu Sep 18 11:24:40 2014 +0800

    ARM: dts: sun8i: Add DMA controller node
    
    Add the DMA controller node and DMA bindings to the supported devices.
    
    Signed-off-by: Chen-Yu Tsai <[email protected]>
    Signed-off-by: Maxime Ripard <[email protected]>

commit e625305b390790717cf2cccf61efb81299647028
Author: Tejun Heo <[email protected]>
Date:   Sat Sep 20 01:27:25 2014 -0400

    percpu-refcount: make percpu_ref based on longs instead of ints
    
    percpu_ref is currently based on ints and the number of refs it can
    cover is (1 << 31).  This makes it impossible to use a percpu_ref to
    count memory objects or pages on 64bit machines as it may overflow.
    This forces those users to somehow aggregate the references before
    contributing to the percpu_ref which is often cumbersome and sometimes
    challenging to get the same level of performance as using the
    percpu_ref directly.
    
    While using ints for the percpu counters makes them pack tighter on
    64bit machines, the possible gain from using ints instead of longs is
    extremely small compared to the overall gain from per-cpu operation.
    This patch makes percpu_ref based on longs so that it can be used to
    directly count memory objects or pages.
    
    Signed-off-by: Tejun Heo <[email protected]>
    Cc: Kent Overstreet <[email protected]>
    Cc: Johannes Weiner <[email protected]>
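
A short arithmetic sketch shows why an int-based count can be too small for pages on 64-bit machines. The helper names and the 4 KiB page size are assumptions for illustration; the (1 << 31) limit is the figure given in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Limit quoted in the commit message for the int-based percpu_ref. */
#define INT_REF_LIMIT (1ULL << 31)

/* Number of pages of size (1 << page_shift) needed to cover `bytes`. */
static uint64_t pages_in(uint64_t bytes, unsigned int page_shift)
{
	assert(page_shift < 64);
	return bytes >> page_shift;
}

/* Non-zero if `nr_refs` references stay below the int-based limit. */
static int fits_int_based_ref(uint64_t nr_refs)
{
	return nr_refs < INT_REF_LIMIT;
}
```

With 4 KiB pages (page_shift of 12), 1 TiB of memory is 2^28 pages and stays below the limit, while 16 TiB is 2^32 pages and does not.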

commit 4843c3320c3d23ab4ecf520f5eaf485aff8c7252
Author: Tejun Heo <[email protected]>
Date:   Sat Sep 20 01:27:24 2014 -0400

    percpu-refcount: improve WARN messages
    
    percpu_ref's WARN messages can be a lot more helpful by indicating
    who's the culprit.  Make them report the release function that the
    offending percpu-refcount is associated with.  This should make it a
    lot easier to track down the reported invalid refcnting operations.
    
    Signed-off-by: Tejun Heo <[email protected]>
    Cc: Kent Overstreet <[email protected]>

commit 6d967f8789249628a6388a3a4314c5fef423f36a
Author: Andy Zhou <[email protected]>
Date:   Fri Sep 19 18:02:53 2014 -0700

    udp_tunnel: Only build ip6_udp_tunnel.c when IPV6 is selected
    
    Functions supplied in ip6_udp_tunnel.c are only needed when IPV6 is
    selected. When IPV6 is not selected, those functions are stubbed out
    in udp_tunnel.h.
    
    ==================================================================
     net/ipv6/ip6_udp_tunnel.c:15:5: error: redefinition of 'udp_sock_create6'
         int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
     In file included from net/ipv6/ip6_udp_tunnel.c:9:0:
          include/net/udp_tunnel.h:36:19: note: previous definition of 'udp_sock_create6' was here
           static inline int udp_sock_create6(struct net *net, struct udp_port_cfg *cfg,
    ==================================================================
    
    Fixes:  fd384412e udp_tunnel: Seperate ipv6 functions into its own file
    Reported-by: kbuild test robot <[email protected]>
    Signed-off-by: Andy Zhou <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 3f76a4ea5383ba2f9e76f9625f77ff246907a134
Author: Mahati Chamarthy <[email protected]>
Date:   Thu Sep 18 19:27:09 2014 +0530

    Staging: rtl8192e: Fix __constant_htons to htons style warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: __constant_htons should be htons
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 288903f6b91e759b0a813219acd376426cbb8f14
Author: Catalina Mocanu <[email protected]>
Date:   Fri Sep 19 15:55:05 2014 -0700

    staging: iio: cdc: Don't put an else right after a return
    
    This fixes the following checkpatch.pl warning:
    WARNING: else is not generally useful after a break or return.
    
    While at it, remove new line for symmetry with the rest of the code.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0a5fcc6b2efdc86619af793e0216a508469cfaa4
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 23:32:05 2014 +0300

    staging: octeon: Fix quoted string split warning.
    
    This patch fixes "quoted string split across lines" checkpatch.pl
    warning in ethernet.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 39bc7513aa92b38c391dbe9649841f9f9dfcd0ac
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 23:27:39 2014 +0300

    staging: octeon: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    ethernet.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1ff99b312f9c94516acb38bad7421ba1d74abeb2
Author: Roberta Dobrescu <[email protected]>
Date:   Fri Sep 19 23:34:36 2014 +0300

    staging: emxx_udc: Replace __constant_cpu_to_le16 with cpu_to_le16
    
    This fixes the following checkpatch.pl warning:
    WARNING: __constant_cpu_to_le16 should be cpu_to_le16
    Additionally, it removes the space between function name and (.
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 113f5f24c6be6f7d888946320d01b51b81aa213d
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 00:31:44 2014 +0300

    Staging: rtl8821ae: Fix warnings of no space before tabs.
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: please, no space before tabs.
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a45cbb78147e8f57250f1687f5b61470b8343a20
Author: Aybuke Ozdemir <[email protected]>
Date:   Thu Sep 18 23:56:13 2014 +0300

    Staging: rtl8821ae: Fix "foo * bar" warning.
    
    This patch fixes these error messages found by checkpatch.pl:
    ERROR: "foo* bar" should be "foo *bar"
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 34c376fe07342e06f531504b01d3b953962e456c
Author: Aybuke Ozdemir <[email protected]>
Date:   Thu Sep 18 01:03:28 2014 +0300

    Staging: wlan-ng: Fix return in void function warning
    
    This fixes checkpatch.pl warning:
    WARNING: void function return statements are not generally useful
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fcf1b73d08cd15912205f3b259ea81ccfde11970
Author: Aybuke Ozdemir <[email protected]>
Date:   Thu Sep 18 00:54:04 2014 +0300

    Staging: media: cxd2099: Missing a blank line after declarations
    
    Fix checkpatch.pl issues with missing a blank
    line after declarations in cxd2099.c
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c2e91542695270452ea7b5d3266ad0e9b5dc7bdb
Author: Aybuke Ozdemir <[email protected]>
Date:   Wed Sep 17 23:43:15 2014 +0300

    Staging: octeon: Missing a blank line after declarations
    
    Fix checkpatch.pl issues with missing a blank
    line after declarations in ethernet-sgmii.c
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 05fd349b1750d456423538e69c3c1d4d8a10f1c8
Author: Aybuke Ozdemir <[email protected]>
Date:   Wed Sep 17 16:10:36 2014 +0300

    staging: gs_fpgaboot Fix trailing whitespace.
    
    Fix checkpatch.pl issues with trailing
    whitespace in README.
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit de77c125f57a308250cfaec945541fd8abe0e054
Author: Aybuke Ozdemir <[email protected]>
Date:   Wed Sep 17 15:33:25 2014 +0300

    staging: bcm: Fix line over 80 characters
    
    Fix checkpatch.pl issues with
    line over 80 characters in HandleControlPacket.c
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5ad6ae1acfd883d8f4c8998b4e5bc9d4aea7985f
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:20:44 2014 +0300

    staging: media: lirc: Fixes missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_serial.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a5613fe8967534ce626875fab4bcface70d366b4
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:26:03 2014 +0300

    staging: media: lirc: Fixes unnecessary return warning.
    
    This patch fixes "void function return statements are not generally
    useful" checkpatch.pl warning in lirc_zilog.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a17ec4c9fd07d3f4760cc6545b54f8323ea6ccb4
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:06:55 2014 +0300

    staging: media: lirc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_bt829.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3f8028023c3f6804751a920d97e9c8dffc575cc0
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 01:02:21 2014 +0300

    staging: media: lirc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_sasem.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a87ba73ed10266dba8278b2a6b89da597a38092a
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 00:59:11 2014 +0300

    staging: media: lirc: Fix unnecessary return warning.
    
    This patch fixes "void function return statements are not generally
    useful" checkpatch.pl warning in lirc_sasem.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fd8392f3097140a9db7b0903a63635e652b6eb45
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 22:46:57 2014 +0300

    staging: media: lirc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    lirc_zilog.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3170f3277b1809c19fe4a45914cffa0e09471973
Author: Tina Johnson <[email protected]>
Date:   Wed Sep 17 03:14:52 2014 +0530

    Staging: media: lirc: lirc_imon: Removed unnecessary variable to simplify return variable handling
    
    Variable rc was removed by merging its assignment with the
    immediately following return statement. Variable retval is not used
    at all other than to return its initial value. Hence retval was
    replaced with its initial value in the return statement and removed.
    
    This patch was done using Coccinelle script and the following semantic
    patch was used:
    
    @rule1@
    identifier ret;
    expression e;
    @@
    
    -int ret = 0;
     ... when != ret
    (
    -ret = e;
    +return e;
    -return ret;
    |
    -return ret;
    +return 0;
    )
    
    Signed-off-by: Tina Johnson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8ad5360ad81a32b4e9fdc956e7c453308050a97d
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 21:39:46 2014 +0300

    staging: lustre: lnet: lnet: Fixed quoted string split warning.
    
    This patch fixes "quoted string split across lines" checkpatch.pl
    warning in api-ni.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 70b694c32e405cff8e2640b3943ed9598d97f75e
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 21:48:00 2014 +0300

    staging: lustre: lnet: lnet: Fix missing line warning.
    
    This patch fixes the "Missing a blank line after declarations"
    checkpatch.pl warning in api-ni.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a446b47d5d815865c2715da8fab1a7c06f1338ca
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 21:53:31 2014 +0300

    staging: lustre: lnet: lnet: Fix quoted string split warning.
    
    This patch fixes "quoted string split across lines" checkpatch.pl
    warning in lib-eq.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3e9cc5b0450a40be3442a82a5a5293f85ca06c7d
Author: Darshana Padmadas <[email protected]>
Date:   Wed Sep 17 20:58:43 2014 +0530

    Staging: lustre: Fix return in void function warning
    
    This fixes checkpatch.pl warning:
    
    WARNING: void function return statements are not generally useful
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6606a77f92821f8bfd4b1b6ba296da662fecb640
Author: Darshana Padmadas <[email protected]>
Date:   Wed Sep 17 20:28:54 2014 +0530

    Staging: lustre: place open brace following struct on same line
    
    This patch fixes checkpatch.pl warning:
    
    WARNING: open brace following struct goes on the same line.
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4467a945fc08c0d6624b1dd64cfcc2cbd3b3dee3
Author: Darshana Padmadas <[email protected]>
Date:   Wed Sep 17 18:14:45 2014 +0530

    Staging: lustre: libcfs: fix checkpatch warning else after return statement
    
    Fix checkpatch warning by removing unnecessary else after return statement.
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f5740b2e7e74fa9ba915aa74bfba7cf849dce8a7
Author: Darshana Padmadas <[email protected]>
Date:   Tue Sep 16 13:24:13 2014 +0530

    Staging: lustre: include: libcfs: removed else before return statement in libcfs_crypto.h
    
    This is a patch to libcfs_crypto.h that fixes warning on unnecessary else before return statement found by checkpatch.pl tool.
    
    Signed-off-by: Darshana Padmadas <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 225557bf274ed1519362865815da7425533191d1
Author: Roxana Blaj <[email protected]>
Date:   Mon Sep 15 14:58:44 2014 +0300

    staging: speakup: fix checkpatch warning
    
    This fixes the checkpatch warning:
    WARNING: line over 80 characters
    
    Signed-off-by: Roxana Blaj <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0a3a725adb2c421ea79089ea12004a007fb371ce
Author: Roxana Blaj <[email protected]>
Date:   Sun Sep 14 20:28:53 2014 +0300

    staging: speakup: fix checkpatch warning
    
    This fixes the checkpatch warning:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Roxana Blaj <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 472fe30efd52fde30249a04971a62151e0606c1d
Author: Nicoleta Birsan <[email protected]>
Date:   Sun Sep 14 03:38:34 2014 -0700

    Staging: speakup: fix checkpatch warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Nicoleta Birsan <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 297cbdaeca2b68aaae6bbb7affa4533430e8e91a
Author: Blaj Roxana <[email protected]>
Date:   Tue Sep 16 20:13:28 2014 +0300

    staging: skein: replace spaces with tabs
    
    This fixes the error and warning:
    ERROR: code indent should use tabs where possible
    WARNING: please, no spaces at the start of a line
    
    Signed-off-by: Blaj Roxana <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fb33aa47a00edc789d17d80174cd3ed8a1c82c66
Author: Roberta Dobrescu <[email protected]>
Date:   Sat Sep 20 00:01:39 2014 +0300

    staging: dgnc: Check sscanf return value
    
    This fixes the following checkpatch.pl warnings:
    WARNING: unchecked sscanf return value
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
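
As a rough illustration of what checking the sscanf return value means, here is a userspace sketch. The function `parse_debug_level` and its error convention are assumptions made for this example and do not come from the dgnc driver.

```c
#include <stdio.h>

/*
 * Hypothetical parser: sscanf() returns the number of successfully
 * converted items, so anything other than 1 means the input was not
 * a valid integer and *level must be left untouched.
 */
int parse_debug_level(const char *buf, int *level)
{
	int value;

	if (sscanf(buf, "%d", &value) != 1)
		return -1;

	*level = value;
	return 0;
}
```

Without the check, a failed conversion would leave `value` uninitialized and the caller would silently use garbage.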

commit f23e875fd26a05a0850db7c5e090030c80b4f583
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 19:34:45 2014 +0300

    staging: dgnc: Fix unnecessary space warning.
    
    Fixed "Unnecessary space before function pointer argument" checkpatch.pl
    warning in dgnc_driver.h
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e8756d4a51d1246be36c5621827c288eb2d5e9b7
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 19:31:15 2014 +0300

    staging: dgnc: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    dgnc_sysfs.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3dfe7557809e5867306c7a0614b9d1c6036cbe4d
Author: Vaishali Thakkar <[email protected]>
Date:   Fri Sep 19 10:30:59 2014 +0530

    Staging: dgnc: Merge lines and remove unused variable for immediate return
    
    This patch merges two lines into a single line if an immediate
    return is found. It also removes the unnecessary variable rc
    as it is no longer needed.
    
    This is done using Coccinelle. Semantic patch used for this
    is as follows:
    
    @@
    type T;
    identifier i;
    identifier f;
    constant C;
    @@
    - T i;
      ...when != i
         when strict
    (
      return -C;
    |
    - i =
    + return
         f(...);
    - return i;
    )
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Reviewed-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 10352c2a69f4aa2724f007a4922518c9ece7bf89
Author: Roberta Dobrescu <[email protected]>
Date:   Thu Sep 18 21:38:04 2014 +0300

    staging: dgnc: Move open brace on previous line
    
    This fixes the following checkpatch.pl errors:
    ERROR: that open brace { should be on the previous line
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 05a70e14035438e6866d7fcf8a79c67b8e1425e1
Author: Roberta Dobrescu <[email protected]>
Date:   Tue Sep 16 20:33:03 2014 +0300

    staging: dgnc: Do not initialise statics to 0 or NULL
    
    This fixes the following checkpatch.pl error:
    ERROR: do not initialise statics to 0 or NULL
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Acked-by: Daniel Baluta <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b051017fb4e593998fc46ec9a991ad390c9114b5
Author: Roberta Dobrescu <[email protected]>
Date:   Mon Sep 15 21:32:59 2014 +0300

    staging: dgnc: Replace kzalloc with kcalloc
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer kcalloc over kzalloc with multiply
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
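
The reason checkpatch prefers kcalloc is that it checks the count-times-size multiplication for overflow, whereas a hand-written product passed to kzalloc can wrap silently. A userspace sketch of the same idea follows; `zalloc_array` is a hypothetical stand-in built on calloc(), not kernel code.

```c
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of kcalloc(): refuse the request
 * when n * size would overflow size_t instead of allocating a
 * too-small buffer. The returned memory is zero-filled.
 */
void *zalloc_array(size_t n, size_t size)
{
	if (size != 0 && n > SIZE_MAX / size)
		return NULL;

	return calloc(n, size);
}
```

A caller that asks for an impossible element count gets NULL back and can handle it like any other allocation failure.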

commit f3dadd29f7197d93d0441391f5e3815bf008cce1
Author: Roberta Dobrescu <[email protected]>
Date:   Sun Sep 14 23:13:20 2014 +0300

    staging: dgnc: Fix warnings relating to printk()
    
    This fixes the following checkpatch.pl warnings:
    WARNING: printk() should include KERN_ facility level
    It replaces printk() with dev_dbg() in order to avoid the warning that a more
    specific function should be used.
    
    Signed-off-by: Roberta Dobrescu <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 2be13f7b7c63cecc439876c8c06a5b30afdf46f9
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 04:59:59 2014 +0530

    Staging: rtl8192ee: rtl8192ee: Fix missing blank line warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b9209a93edbccafb6c2f860bc0ddfe9eda1e3ccd
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 04:49:43 2014 +0530

    Staging: rtl8192ee: Fix else not useful style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1709a582e1f8977de040f02d9e9e52ec89f8603f
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 04:03:36 2014 +0530

    Staging: rtl8192ee: Fix break is not useful warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: break is not useful after a goto or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fe6dc85eaf8bb180ad3510a57bd69f3b8f9c2dbb
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 03:42:01 2014 +0530

    Staging: rtl8192ee: Fix else is not useful warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f41788b7c933127863435f72f456ec46ed5540b2
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 03:29:19 2014 +0530

    Staging: rtl8192ee: Fix missing blank line warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Missing a blank line after declarations
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ad39fe743419d58f9bc29373189c93ba2251e675
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:43:26 2014 +0530

    Staging: rtl8192e: Fix printk debug style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer [subsystem eg: netdev]_dbg([subsystem]dev, ... then dev_dbg(dev,
     ... then pr_debug(...  to printk(KERN_DEBUG ...
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4344672830d8500eac97d82976b03e41580c3a04
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:40:43 2014 +0530

    Staging: rtl8192e: Fix printk style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 6af197672f2330045c171aed3ea90fb93d89ecc6
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:38:39 2014 +0530

    Staging: rtl8192e: Fix space before semicolon warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: space prohibited before semicolon
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 13402f7b76223e7f50ab42c82aac4788940c8277
Author: Mahati Chamarthy <[email protected]>
Date:   Sat Sep 20 02:36:31 2014 +0530

    Staging: rtl8192e: Fix else is not useful warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5110e40260d03fdb2d93a94fec06a31b81d57b0b
Author: Mahati Chamarthy <[email protected]>
Date:   Fri Sep 19 23:56:02 2014 +0530

    Staging: rtl8192e: Fix void function return statements style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING:  void function return statements are not generally useful
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 59422a74b55c616d500c3be721077ff0d00f7fb0
Author: Mahati Chamarthy <[email protected]>
Date:   Fri Sep 19 23:12:53 2014 +0530

    Staging: rtl8192e: Fix else is not useful style warning
    
    This fixes the following checkpatch.pl warnings:
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1f921b9f61b1a324366c8f6a02c5a8e89164ed52
Author: Vaishali Thakkar <[email protected]>
Date:   Fri Sep 19 22:22:19 2014 +0530

    Staging: rtl8192e: Fixed style warning relating to printk()
    
    This patch fixes following checkpatch.pl warning in file rtl_dm.c:
    
    WARNING: Prefer [subsystem eg: netdev]_info([subsystem]dev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO .
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 52e93b8ab435978bc12280aa4418ef25fd6e74f2
Author: Mahati Chamarthy <[email protected]>
Date:   Fri Sep 19 05:22:33 2014 +0530

    Staging: rtl8192e: Fix unnecessary parentheses style warning
    
    This fixes the following checkpatch.pl warning:
    WARNING: Unnecessary parentheses - maybe == should be = ?
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fee9d3e61d04422628a3d22ed5eb8370dcef259b
Author: Chris J Arges <[email protected]>
Date:   Wed Aug 27 13:26:53 2014 -0500

    ktest: add ability to skip during BISECT_MANUAL
    
    When doing a manual bisect, a build can fail or a test can be inconclusive.
    In these cases it would be helpful to be able to skip the test entirely.
    
    Link: http://lkml.kernel.org/r/[email protected]
    
    Reviewed-by: Satoru Takeuchi <[email protected]>
    Signed-off-by: Chris J Arges <[email protected]>
    Signed-off-by: Steven Rostedt <[email protected]>

commit 4af409f6c38029e1eda0a5e7bbf15e9b1b7d7fab
Author: Benedict Boerger <[email protected]>
Date:   Thu Sep 18 17:46:23 2014 +0200

    staging: rtl8192u: delete unused function CAM_read_entry
    
    Fix the sparse warning: symbol 'CAM_read_entry' was not declared. Should it be static?
    
    The function CAM_read_entry is not used and therefore deleted.
    
    Signed-off-by: Benedict Boerger <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 77baad9e4d71e75d7ad6ee83454113d4a6a7b04d
Author: Ragnar B. Johannsson <[email protected]>
Date:   Thu Sep 18 14:33:25 2014 +0000

    staging: rtl8192u: Move ieee80211_crypto_* declarations to ieee80211/ieee80211.h
    
    Move ieee80211_crypto*_init and _exit prototype declarations from r8192U_core.c to ieee80211/ieee80211.h. This fixes the following sparse warnings:
    
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt.c:203:12: warning: symbol 'ieee80211_crypto_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt.c:223:13: warning: symbol 'ieee80211_crypto_deinit' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_tkip.c:764:12: warning: symbol 'ieee80211_crypto_tkip_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_tkip.c:769:13: warning: symbol 'ieee80211_crypto_tkip_exit' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_ccmp.c:467:12: warning: symbol 'ieee80211_crypto_ccmp_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_ccmp.c:472:13: warning: symbol 'ieee80211_crypto_ccmp_exit' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_wep.c:281:12: warning: symbol 'ieee80211_crypto_wep_init' was not declared. Should it be static?
    drivers/staging/rtl8192u/ieee80211/ieee80211_crypt_wep.c:286:13: warning: symbol 'ieee80211_crypto_wep_exit' was not declared. Should it be static?
    
    Signed-off-by: Ragnar B. Johannsson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5635b82a553620c511dc6bc8cb0990c0a791e21e
Author: Mahati Chamarthy <[email protected]>
Date:   Thu Sep 18 15:43:53 2014 +0530

    Staging: rtl8192e: Fix style warnings relating to printk(KERN_DEBUG
    
    This fixes the following checkpatch.pl warnings:
    WARNING: Prefer [subsystem eg: netdev]_dbg([subsystem]dev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
    
    Signed-off-by: Mahati Chamarthy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit fe40a0b361de10ea794116160308cc7fd0b7fbeb
Author: Vaishali Thakkar <[email protected]>
Date:   Wed Sep 17 08:35:24 2014 +0530

    Staging: rtl8192e: rtl8192e: Remove unnecessary braces and space
    
    This patch removes following checkpatch.pl warnings in rtl_core.c file:
    
    WARNING: Braces {} are not necessary for single statement blocks
    WARNING: Space prohibited before semicolon
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5c8b3961da9a55762ea5481e8f9412c0d18dc684
Author: Vaishali Thakkar <[email protected]>
Date:   Wed Sep 17 08:02:43 2014 +0530

    Staging: rtl8192e: rtl8192e: Remove unnecessary variable
    
    This patch removes an unnecessary variable in file ret_core.c
    using a Coccinelle script. The semantic patch for this is as follows:
    
    @@
    identifier ret;
    @@
    
    -int ret = 0;
     ... when != ret
         when strict
    -return ret;
    +return 0;
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 23a0e1611b880bd8d94bbebcb3577c9f78029435
Author: Steven Rostedt (Red Hat) <[email protected]>
Date:   Fri Sep 19 20:10:39 2014 -0400

    ktest: Add PATCHCHECK_CHERRY
    
    Add a way to run a patchcheck test on the commits that are in one branch
    but not in another. This uses git cherry to find a list of commits to
    test each one with.
    
    Signed-off-by: Steven Rostedt <[email protected]>

commit 4309635f692192ddcc540964189d92cad0ade249
Author: Rajbinder Brar <[email protected]>
Date:   Tue Sep 16 11:25:31 2014 +0530

    Staging: vt6655: Break 80 character long line to remove checkpatch error
    
    This removes checkpatch.pl warning
    WARNING: line over 80 characters
    
    Signed-off-by: Rajbinder Brar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b377ed4cce004d7c3dbd92cffdbf2aa21d28e2e6
Author: Rajbinder Brar <[email protected]>
Date:   Wed Sep 17 21:27:03 2014 +0530

    Staging: vt6656: Removing else after break statement to fix warning
    
    This patch fixes the checkpatch.pl warning in baseband.c file
    WARNING: else is not useful after a break or return
    
    Signed-off-by: Rajbinder Brar <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit dbc6ee63d4355a51fd84ee8ebf127763180b1585
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 19:49:33 2014 +0300

    Staging: vt6655: Fix C99 style commenting.
    
    This patch fixes these error messages found by checkpatch.pl:
    ERROR: do not use C99 // comments
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit a600f4589fdbb51a0ad885408f996ec0f1f90be9
Author: Abel Moyo <[email protected]>
Date:   Thu Sep 18 21:49:10 2014 +0200

    Staging: gdm724x: gdm_usb: added error checking in do_tx()
    
    Added error checking for alloc_tx_struct in do_tx()
    
    Signed-off-by: Abel Moyo <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 37d963fb80d2fd944bd0124570b2adc5b826ccef
Author: Gulsah Kose <[email protected]>
Date:   Sat Sep 20 20:43:53 2014 +0300

    staging: gdm724x: Fix missing blank line warning.
    
    Fixes "Missing a blank line after declarations" checkpatch.pl warning in
    gdm_mux.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 492a1e7be585c88a04ba763bb77fc865700e209d
Author: Daeseok Youn <[email protected]>
Date:   Tue Sep 16 16:19:06 2014 +0900

    staging: dgap: use schedule_timeout_interruptible() instead of dgap_ms_sleep()
    
    Using schedule_timeout_interruptible() is exactly the same as
    setting the state of the current process and calling schedule_timeout().
    
    Remove dgap_ms_sleep(), because this function is used
    only when closing a tty channel in dgap_tty_close().
    Also remove ch_close_delay, which is always set to 250
    in dgap_tty_init().
    
    Signed-off-by: Daeseok Youn <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 335d9c85be096cf492cb3eaeef160b45e1f25d8d
Author: Ankita Patil <[email protected]>
Date:   Thu Sep 18 12:31:00 2014 +0530

    Staging: dgap: Remove unnecessary variable.
    
    This patch removes unnecessary variable in file dgap.c
    using Coccinelle. Semantic patch for this is as follows:
    
    @@
    expression ret;
    identifier f;
    @@
    
    -ret =
    +return
         f(...);
    -return ret;
    
    Also removed the unneeded variable manually.
    
    Signed-off-by: Ankita Patil <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 50d0a21b61f22b38f881fa21d2ada6ab4a61f93f
Author: Purnendu Kapadia <[email protected]>
Date:   Mon Sep 15 13:06:36 2014 +0100

    staging: android: sw_sync: checkpatch fixes
    
       - no space after cast
       - alignment should match open parenthesis
       - remove unnecessary new line
    
    Signed-off-by: Purnendu Kapadia <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1f0f6c9862b687db36f5e853402f76bc118ff0bf
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 01:58:29 2014 +0300

    Staging: rtl8723au: hal: Space prohibited before semicolon
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: Space prohibited before semicolon.
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8c09757d91703ccbf0da9fc67764de9714c9e615
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 02:23:53 2014 +0300

    Staging: rtl8723au: core: Fix unnecessary braces warning.
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING: braces {} are not necessary for single statement blocks
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 867ce1bd68fb1eadb70b82bcda1e451b27ff824a
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 02:33:38 2014 +0300

    Staging: rtl8723au: core: Fix "foo * bar" warning.
    
    This patch fixes these error messages found by checkpatch.pl:
    ERROR: "foo* bar" should be "foo *bar"
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c895a5df852ca9bbac1dee413747303a61aa4ebd
Author: Greg Donald <[email protected]>
Date:   Tue Sep 16 18:37:41 2014 -0500

    drivers: staging: rtl8723au: Fix "space required after that ','" errors
    
    Fix checkpatch.pl "space required after that ','" errors
    
    Signed-off-by: Greg Donald <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f78c0710cd60cd108d436490955909983f309c62
Author: Kieron Browne <[email protected]>
Date:   Tue Sep 16 23:28:09 2014 +0100

    staging: rtl8723au: fix sparse incorrect type assignment warnings
    
    Use cpu_to_le16 to cast int for assignment to __le16 members
    
    Signed-off-by: Kieron Browne <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
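
The point of cpu_to_le16 is that a field defined as little-endian must hold the same byte layout regardless of the host's byte order. The userspace sketch below shows that idea with a hypothetical helper, `put_le16`; it is not the kernel macro and writes into a byte array rather than returning a typed `__le16`.

```c
#include <stdint.h>

/*
 * Hypothetical helper: store a host-order 16-bit value in
 * little-endian byte order (least significant byte first).
 * Shifts and masks give the same result on any host.
 */
void put_le16(uint16_t value, uint8_t out[2])
{
	out[0] = (uint8_t)(value & 0xff);
	out[1] = (uint8_t)(value >> 8);
}
```

Assigning a plain int to such a field only happens to work on little-endian hosts, which is what the sparse warning flags.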

commit adabff85c9486c09ec700d835328e18ccfc9abf0
Author: MihaelaGaman <[email protected]>
Date:   Sun Sep 14 12:56:43 2014 +0300

    staging: rtl8723au: Fix checkpatch errors
    
    Fix checkpatch.pl "spaces required around":
    >, =, =, =, =, +=, >, >, <, <, :, <  errors.
    
    Signed-off-by: MihaelaGaman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 1703c17b8a02b7d1dd3080c4ce9d41a83e95a071
Author: Vaishali Thakkar <[email protected]>
Date:   Sun Sep 14 13:46:37 2014 +0530

    Staging: rtl8188eu: os_dep: Compression of lines for immediate return
    
    This patch compresses two lines into a single line in file rtw_android.c
    if an immediate return statement is found. It also removes variable
    bytes_written as it is no longer needed.
    
    This was done using Coccinelle with the following semantic
    patch:
    
    @@
    expression ret;
    identifier f;
    @@
    
    -ret =
    +return
         f(...);
    -return ret;
    
    Signed-off-by: Vaishali Thakkar <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 16e614e85025d69c87e9ce80b9e1b5238f0f4479
Author: Gulsah Kose <[email protected]>
Date:   Sun Sep 21 00:13:29 2014 +0300

    staging: rtl8188eu: core: Fixed wrong space error.
    
    This patch fixes "foo     * bar" should be "foo   *bar" checkpatch.pl error in rtw_cmd.c
    
    Signed-off-by: Gulsah Kose <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 69869c01ff148ef22d0ea1adec27b4543789792b
Author: Catalina Mocanu <[email protected]>
Date:   Fri Sep 19 14:54:54 2014 -0700

    staging: iio: impedance-analyzer: add blank line after declaration
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 714ab9bdd350413f48ad401bd25e11b3e9f257ab
Author: Catalina Mocanu <[email protected]>
Date:   Fri Sep 19 14:32:09 2014 -0700

    staging: iio: trigger: add blank lines after declarations
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 8a689c114796d8a3801c2bf3e25d3e21d6816036
Author: Aybuke Ozdemir <[email protected]>
Date:   Fri Sep 19 18:48:05 2014 +0300

    Staging: iio: resolver: Missing a blank line after declarations
    
    This patch fixes these warning messages found by checkpatch.pl:
    WARNING : Missing a blank line after declarations
    
    Signed-off-by: Aybuke Ozdemir <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 4b4c727519b510ab9d9b33de51ea41fc34b9ef27
Author: Catalina Mocanu <[email protected]>
Date:   Thu Sep 18 14:55:06 2014 -0700

    staging: iio: dummy: add blank lines after declarations.
    
    This fixes the following checkpatch.pl warning:
    WARNING: Missing a blank line after declarations.
    
    Signed-off-by: Catalina Mocanu <[email protected]>
    Reviewed-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b581c3d9a90772613e05e659b4e8defc81704212
Author: Tina Johnson <[email protected]>
Date:   Sat Sep 13 15:46:15 2014 +0530

    Staging: iio: meter: ade7753: Fixed checkpatch.pl warnings
    
    Clean-up patch to fix the following checkpatch.pl warnings:
    
    ade7753.c:325: WARNING: Missing a blank line after declarations
    ade7753.c:383: WARNING: Missing a blank line after declarations
    
    Signed-off-by: Tina Johnson <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 9034720a54738bbaf96b619f34f887199ac7efed
Author: Tina Johnson <[email protected]>
Date:   Sun Sep 14 16:30:05 2014 +0530

    Staging: iio: meter: ade7753: Merged assignment with immediately following return statement
    
    Saved one line of code by merging the assignment and return statements
    of variable ret. Also removed variable len, which was no longer useful.
    
    This patch was done using Coccinelle script and the following semantic
    patch was used:
    
    @@
    expression ret;
    identifier f;
    @@
    
    -ret =
    +return
          f(...);
    -return ret;
    
    Signed-off-by: Tina Johnson <[email protected]>
    Acked-by: Julia Lawall <[email protected]>
    Acked-by: Josh Triplett <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 18f340f90e087c078c634d5c4fed5e0d632d4fb6
Author: Paul Zimmerman <[email protected]>
Date:   Fri Sep 19 14:49:36 2014 -0700

    usb: dwc2: add T: line to MAINTAINERS showing Felipe's tree
    
    Starting with v3.18-rc, patches for dwc2 will go through Felipe's
    tree. Add a T: line to MAINTAINERS to document this.
    
    Signed-off-by: Paul Zimmerman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5dce95554a1866339de039060ecd7122056a9d71
Author: Paul Zimmerman <[email protected]>
Date:   Tue Sep 16 13:47:27 2014 -0700

    usb: dwc2: handle DMA buffer unmapping sanely
    
    The driver's handling of DMA buffers for non-aligned transfers
    was kind of nuts. For IN transfers, it left the URB DMA buffer
    mapped until the transfer completed, then synced it, copied the
    data from the bounce buffer, then synced it again.
    
    Instead of that, just call usb_hcd_unmap_urb_for_dma() to unmap
    the buffer before starting the transfer. Then no syncing is
    required when doing the copy. This should also allow handling of
    other types of mappings besides just dma_map_single() ones.
    
    Also reduce the size of the bounce buffer allocation for Isoc
    endpoints to 3K, since that's the largest possible transfer size.
    
    Tested on Raspberry Pi and Altera SOCFPGA.
    
    Signed-off-by: Paul Zimmerman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit e8f8c14d9da7ab1b8a7b0f769cd7148ca2cc7d10
Author: Paul Zimmerman <[email protected]>
Date:   Tue Sep 16 13:47:26 2014 -0700

    usb: dwc2: clip max_transfer_size to 65535
    
    Clip max_transfer_size to 65535 for host. dwc2_hc_setup_align_buf()
    allocates coherent buffers with this size, and if it's too large we
    can exhaust the coherent DMA pool.
    
    Tested on Raspberry Pi and Altera SOCFPGA.
    
    Signed-off-by: Paul Zimmerman <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d00b41428042e72d9dc2557d9147434a4e3d631f
Author: Robert Baldyga <[email protected]>
Date:   Tue Sep 9 10:44:57 2014 +0200

    usb: dwc2/gadget: disable clock when it's not needed
    
    When the device is stopped or suspended the clock is not needed, so we
    can disable it during that time.
    
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b203d0a2e32dd28e87780078f0789322862e4da8
Author: Robert Baldyga <[email protected]>
Date:   Tue Sep 9 10:44:56 2014 +0200

    usb: dwc2/gadget: assign TX FIFO dynamically
    
    Because there is not enough memory for every TX FIFO to be at least
    3072 bytes (the maximum single packet size with 3 transactions per
    microframe), we create four FIFOs of length 1024 bytes and four of
    length 3072 bytes, and assign them to endpoints dynamically according
    to the maxpacket size of the given endpoint.
    
    Until now 16 TX FIFOs were initialized, but we use only 8 IN
    endpoints, so we can split the available memory among 8 FIFOs to have
    more memory for each one.
    
    This required small modifications in a few places in the code, because
    they assumed that the TX FIFO numbers assigned to endpoints are the
    same as the endpoint numbers, which no longer holds with dynamic
    FIFO assignment.
    
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
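
Note: the allocation policy described in this commit can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the driver's code: the names `tx_fifo`, `tx_fifo_assign` and `tx_fifo_release` are hypothetical, and "pick the smallest free FIFO that fits maxpacket" is one plausible reading of "according to maxpacket size".

```c
#include <assert.h>

#define NUM_TX_FIFOS 8

/* One TX FIFO: its size in bytes and whether an endpoint owns it. */
struct tx_fifo {
    unsigned int size;
    int in_use;
};

/* Four FIFOs of 1024 bytes and four of 3072 bytes, as in the commit text. */
static struct tx_fifo fifos[NUM_TX_FIFOS] = {
    {1024, 0}, {1024, 0}, {1024, 0}, {1024, 0},
    {3072, 0}, {3072, 0}, {3072, 0}, {3072, 0},
};

/*
 * Claim the smallest free FIFO that can hold maxpacket bytes.
 * Returns the FIFO index, or -1 if no suitable FIFO is free.
 */
static int tx_fifo_assign(unsigned int maxpacket)
{
    int best = -1;
    int i;

    for (i = 0; i < NUM_TX_FIFOS; i++) {
        if (fifos[i].in_use || fifos[i].size < maxpacket)
            continue;
        if (best < 0 || fifos[i].size < fifos[best].size)
            best = i;
    }
    if (best >= 0)
        fifos[best].in_use = 1;
    return best;
}

/* Give a FIFO back, for example when its endpoint is disabled. */
static void tx_fifo_release(int index)
{
    assert(index >= 0 && index < NUM_TX_FIFOS);
    fifos[index].in_use = 0;
}
```

Because the returned index is independent of the endpoint number, callers have to store it per endpoint, which is the kind of follow-up change the last paragraph of the commit message describes.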

commit cff9eb756e18a7763d7ab9c574c0ab191e712341
Author: Marek Szyprowski <[email protected]>
Date:   Tue Sep 9 10:44:55 2014 +0200

    usb: dwc2/gadget: ensure that all fifos have correct memory buffers
    
    Print warning if FIFOs are configured in such a way that they don't fit
    into the SPRAM available on the s3c hsotg module.
    
    Signed-off-by: Marek Szyprowski <[email protected]>
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
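
Note: a minimal sketch of the kind of check this commit describes, assuming the FIFO sizes and the SPRAM capacity are expressed in the same unit. The function name `fifos_fit` and its parameters are hypothetical; the real driver prints a warning, whereas this sketch only returns a flag the caller could act on.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return 1 if the configured FIFOs fit into the available SPRAM,
 * 0 otherwise. A caller would print a warning on 0.
 */
static int fifos_fit(const unsigned int *sizes, size_t count,
                     unsigned int spram_size)
{
    unsigned long total = 0;
    size_t i;

    assert(sizes != NULL || count == 0);
    for (i = 0; i < count; i++)
        total += sizes[i];
    return total <= spram_size;
}
```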

commit 1e01129373f757925a652ea4ea5b278f8c2b9222
Author: Marek Szyprowski <[email protected]>
Date:   Tue Sep 9 10:44:54 2014 +0200

    usb: dwc2/gadget: hide some not really needed debug messages
    
    Some DWC2/s3c-hsotg debug messages are of little use to a typical
    user, so hide them behind dev_dbg().
    
    Signed-off-by: Marek Szyprowski <[email protected]>
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d784f1e50977e58db23a79181971c3c0f62452e5
Author: Andrzej Pietrasiewicz <[email protected]>
Date:   Tue Sep 9 10:44:53 2014 +0200

    usb: dwc2/gadget: Fix comment text
    
    Adjust the debug text to the name of the printed variable.
    
    Signed-off-by: Andrzej Pietrasiewicz <[email protected]>
    Signed-off-by: Robert Baldyga <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 496a51bd64eb15f14cee3519f5b75b28d09567e3
Author: Julia Lawall <[email protected]>
Date:   Thu Sep 18 22:24:02 2014 +0200

    staging: lustre: llite: Use kzalloc and rewrite null tests
    
    This patch removes some kzalloc-related macros and rewrites the
    associated null tests to use !x rather than x == NULL.
    
    A simplified version of the semantic patch that makes this change is as
    follows: (http://coccinelle.lip6.fr/)
    
    // <smpl>
    @@
    expression ptr;
    statement S,S1;
    @@
    
      \(OBD_ALLOC\|OBD_ALLOC_WAIT\|OBD_ALLOC_PTR\|OBD_ALLOC_PTR_WAIT\)(ptr,...);
      if (
    +     !
          ptr
    -      == NULL
         ) S else S1
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC(ptr,size)
    + ptr = kzalloc(size, GFP_NOFS)
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC_WAIT(ptr,size)
    + ptr = kzalloc(size, GFP_KERNEL)
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC_PTR(ptr)
    + ptr = kzalloc(sizeof(*ptr), GFP_NOFS)
    
    @@
    expression ptr,size;
    @@
    
    - OBD_ALLOC_PTR_WAIT(ptr,size)
    + ptr = kzalloc(sizeof(*ptr), GFP_KERNEL)
    // </smpl>
    
    Signed-off-by: Julia Lawall <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>
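
Note: the resulting code shape can be illustrated in userspace, with `calloc()` standing in for `kzalloc()` (both return zeroed memory). The struct `item` and the helpers below are hypothetical and exist only for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

struct item {
    int id;
    char name[16];
};

/* Userspace stand-in for: ptr = kzalloc(sizeof(*ptr), GFP_NOFS); */
static struct item *item_alloc(void)
{
    struct item *ptr;

    ptr = calloc(1, sizeof(*ptr));
    if (!ptr)    /* the patch prefers this over: if (ptr == NULL) */
        return NULL;
    return ptr;
}

/* Usage: a freshly allocated item is zero-filled. */
void item_example(void)
{
    struct item *it = item_alloc();

    if (!it)
        return;
    assert(it->id == 0 && it->name[0] == '\0');
    free(it);
}
```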

commit cdbcd3305293d18f7ae73b2766699bddf634bb06
Author: Martin Kelly <[email protected]>
Date:   Mon Sep 15 21:16:15 2014 -0700

    Staging/bcm: Fix whitespace/comments in Ioctl.h
    
    Cleanup whitespace and comments in Ioctl.h in a few ways:
    - > 80 character cleanup
    - Comment clarification
    - More consistent vertical alignment
    
    Signed-off-by: Martin Kelly <[email protected]>
    Reviewed-by: Matthias Beyer <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 33b443e467f6c92c4cc797f5acf6a933fcfe9ec3
Author: Fabien Malfoy <[email protected]>
Date:   Mon Sep 15 09:02:36 2014 +0200

    staging: rtl8821ae: Remove space after unary operator in efuse.c
    
    Several pointer declarations have been fixed to match the coding style.
    
    Signed-off-by: Fabien Malfoy <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c24cdca05edb9c5435529afa37ce8c9c25ac4c5e
Author: Merlin Chlosta <[email protected]>
Date:   Mon Sep 15 01:56:10 2014 +0200

    staging: rtl8192u: sparse warnings: declare ieee80211_TURBO_Info static
    
    Declare ieee80211_TURBO_Info static to fix a sparse "symbol was not declared" warning.
    
    Signed-off-by: Merlin Chlosta <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 5b1ebbffc0b2dd47a45380ba68da36f792a2977e
Author: Vincenzo Scotti <[email protected]>
Date:   Sat Sep 13 13:39:20 2014 +0200

    staging: emxx_udc: fix compile warnings: discarding const qualifier
    
    Signed-off-by: Vincenzo Scotti <[email protected]>
    Reported-by: kbuild test robot <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit f02935c575cb00f2a164282866324816a1f52fc1
Author: Masanari Iida <[email protected]>
Date:   Sat Sep 13 01:14:30 2014 +0900

    staging: exxx_udc: Convert pr_warning to pr_warn
    
    Convert pr_warning to pr_warn.
    
    Signed-off-by: Masanari Iida <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 3aa2ec581903747d926765850212278c7c24be77
Author: Sudip Mukherjee <[email protected]>
Date:   Fri Sep 12 17:57:26 2014 +0530

    staging: unisys: uislib: uislib.c: sparse warning of context imbalance
    
    Fix the sparse warning: context imbalance in 'destroy_device' -
    unexpected unlock.
    
    This patch triggers checkpatch warnings for lines over 80
    characters, but those lines are user-visible strings, so they were
    left unmodified.
    
    Signed-off-by: Sudip Mukherjee <[email protected]>
    Tested-by: Benjamin Romer <[email protected]>
    Acked-by: Benjamin Romer <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 635ecc5f36438cdf8cf3b88421321ee7443eb2d1
Author: Luke Hart <[email protected]>
Date:   Fri Sep 12 10:48:33 2014 +0100

    staging: unisys: Fix sparse error - accessing __iomem directly
    
    Copy the channel type into a temporary buffer so that the code will
    work on architectures that don't support MMIO. This now works in the
    same way as the other tests in the same function.
    
    Signed-off-by: Luke Hart <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit cec78b98df2f87a396890c802dccbf0e604c6829
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:05 2014 +0100

    staging: et131x: logical continuations should be on the previous line
    
    Fix two occurrences of the checkpatch check:
    
    CHECK: Logical continuations should be on the previous line
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit d855b8935e211b285aa6eb3d42e2ea810b03e043
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:04 2014 +0100

    staging: et131x: Fix 'else is not generally useful after a break or return'
    
    Fix this checkpatch warning:
    
    WARNING: else is not generally useful after a break or return
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit b6cb966074d6863293b774327ca5738bb27a9b3a
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:08 2014 +0100

    staging: et131x: Use variable names instead of types in sizeof
    
    A few calls to sizeof() in et131x.c give the type as a parameter
    - use the equivalent variable name instead.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit ee60c8ec323167a02de357e9d9b44af850052ee3
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:07 2014 +0100

    staging: et131x: Use braces on all arms of if/else statements
    
    In some places in et131x.c, one arm of an if/else statement has braces
    and the other not - put braces on both arms where this happens.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit c13756784a6a16fb5d25585a4058dd6d284fd033
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:06 2014 +0100

    staging: et131x: Remove spaces after casts
    
    In three places in et131x.c, spaces exist after a cast. Remove them.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 48c8f78914720b39b9de27c6e58134abdf1f1a4c
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:02 2014 +0100

    staging: et131x: Add spinlock definition comments
    
    Checkpatch --strict advises that spinlocks should be described when
    defined. That seems a good idea, so this change does that.
    
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit 0c55fe2018f7f84e3620e85e4b0d5d06274862da
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:01 2014 +0100

    staging: et131x: Remove useless assignment to NULL
    
    The stack variable skb is no longer used after it's set to
    NULL. Don't set it to NULL.
    
    Reported-by: Dan Carpenter <[email protected]>
    Signed-off-by: Mark Einon <[email protected]>
    Signed-off-by: Greg Kroah-Hartman <[email protected]>

commit bacb71edb48050b46244a66ec8d49c55a89eec34
Author: Mark Einon <[email protected]>
Date:   Sun Sep 14 16:59:00 2014 +0100

    staging: et131x: Remove send_hw_lock spinlock
    
    We don't need to use this lock - the tx path is protected by the
    networking subsystem xmit_lock, so we don't also need it in
    nic_send_packet().
    
    The other use of this spinlock in et1310_enable_phy_coma() t…
ddstreet pushed a commit to ddstreet/linux that referenced this pull request Sep 25, 2014
ERROR: code indent should use tabs where possible
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
koct9i referenced this pull request in koct9i/linux Sep 27, 2014
GIT 005f800508eb391480f463dad3d54e5b4ec67d57

commit 35607b02dbef304fa5037236a3b43c1d8ab2aa52
Author: Alexei Starovoitov <[email protected]>
Date:   Tue Sep 23 13:50:10 2014 -0700

    sparc: bpf_jit: fix loads from negative offsets
    
    - fix BPF_LD|ABS|IND from negative offsets:
      make sure to sign extend lower 32 bits in 64-bit register
      before calling C helpers from JITed code, otherwise 'int k'
      argument of bpf_internal_load_pointer_neg_helper() function
      will be added as large unsigned integer, causing packet size
      check to trigger and abort the program.
    
      It's worth noting that JITed code for 'A = A op K' will affect
      upper 32 bits differently depending whether K is simm13 or not.
      Since small constants are sign extended, whereas large constants
      are stored in temp register and zero extended.
      That is ok and we don't have to pay a penalty of sign extension
      for every sethi, since all classic BPF instructions have 32-bit
      semantics and we only need to set correct upper bits when
      transitioning from JITed code into C.
    
    - though instructions 'A &= 0' and 'A *= 0' are odd, JIT compiler
      should not optimize them out
    
    Signed-off-by: Alexei Starovoitov <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
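
Note: the difference between zero extension and sign extension of the lower 32 bits can be shown in portable C. This is an illustration only; the helper names are hypothetical and nothing here is SPARC JIT code.

```c
#include <assert.h>
#include <stdint.h>

/* Zero-extend: the upper 32 bits of the 64-bit register are cleared. */
static uint64_t zero_extend32(uint64_t reg)
{
    return reg & 0xffffffffu;
}

/*
 * Sign-extend: bit 31 is copied into the upper 32 bits, which is what
 * a C helper taking an 'int' argument expects to see.
 * The uint32_t -> int32_t conversion is implementation-defined before
 * C23 but wraps on the usual two's complement compilers.
 */
static int64_t sign_extend32(uint64_t reg)
{
    return (int64_t)(int32_t)(uint32_t)reg;
}

/* Usage: the low 32 bits hold the offset -2. */
void sign_extend_example(void)
{
    uint64_t reg = 0xfffffffeu;

    assert(zero_extend32(reg) == 4294967294u); /* large unsigned value */
    assert(sign_extend32(reg) == -2);          /* intended negative offset */
}
```

Without the sign extension, the negative offset is seen as a large unsigned number, which is the failure the first bullet of the commit message describes.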

commit c899c3f36458c6af6daf4bb405a715400de39b84
Author: David S. Miller <[email protected]>
Date:   Wed Sep 24 13:53:53 2014 -0400

    parisc: Update defconfigs which were missing CONFIG_NET.
    
    Commit df568d8e ("scsi: Use 'depends' with LIBFC instead of
    'select'.")  removed what happened to be the only instance of 'select
    NET'. Defconfigs that were relying on the select now lack networking
    support.
    
    Signed-off-by: David S. Miller <[email protected]>

commit 95d77997fd8a2dc1eca9a46cde761ddb0742eec3
Author: David S. Miller <[email protected]>
Date:   Wed Sep 24 13:53:43 2014 -0400

    powerpc: Update defconfigs which were missing CONFIG_NET.
    
    Commit df568d8e ("scsi: Use 'depends' with LIBFC instead of
    'select'.")  removed what happened to be the only instance of 'select
    NET'. Defconfigs that were relying on the select now lack networking
    support.
    
    Signed-off-by: David S. Miller <[email protected]>

commit ff408ba1fc97aef86af5715641760a33f0928423
Author: David S. Miller <[email protected]>
Date:   Wed Sep 24 13:44:32 2014 -0400

    s390: Update defconfigs which were missing CONFIG_NET.
    
    Commit df568d8e ("scsi: Use 'depends' with LIBFC instead of
    'select'.")  removed what happened to be the only instance of 'select
    NET'. Defconfigs that were relying on the select now lack networking
    support.
    
    Signed-off-by: David S. Miller <[email protected]>

commit af4de1b56816fdde40801d9f6c4a0e251f5f7047
Author: David S. Miller <[email protected]>
Date:   Wed Sep 24 13:44:16 2014 -0400

    mips: Update some more defconfigs which were missing CONFIG_NET.
    
    Commit df568d8e ("scsi: Use 'depends' with LIBFC instead of
    'select'.")  removed what happened to be the only instance of 'select
    NET'. Defconfigs that were relying on the select now lack networking
    support.
    
    Signed-off-by: David S. Miller <[email protected]>

commit 1ab0b8b200ae54a03aaf63fa8ae5a241dd0cb499
Author: Michal Marek <[email protected]>
Date:   Tue Sep 23 17:44:04 2014 +0200

    sparc: Set CONFIG_NET=y in defconfigs
    
    Commit 5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
    instead of selecting NET") removed what happened to be the only instance
    of 'select NET'. Defconfigs that were relying on the select now lack
    networking support.
    
    Reported-by: Stephen Rothwell <[email protected]>
    Cc: [email protected]
    Signed-off-by: Michal Marek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 925f7fadadd2fcafa6fea19252e4a1de412b9b85
Author: Michal Marek <[email protected]>
Date:   Tue Sep 23 17:44:03 2014 +0200

    sh: Set CONFIG_NET=y in defconfigs
    
    Commit 5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
    instead of selecting NET") removed what happened to be the only instance
    of 'select NET'. Defconfigs that were relying on the select now lack
    networking support.
    
    Reported-by: Stephen Rothwell <[email protected]>
    Cc: [email protected]
    Signed-off-by: Michal Marek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 853e3e1d8e2f094cdb29d4a6e2359a96a44be0d8
Author: Michal Marek <[email protected]>
Date:   Tue Sep 23 17:44:02 2014 +0200

    powerpc: Set CONFIG_NET=y in defconfigs
    
    Commit 5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
    instead of selecting NET") removed what happened to be the only instance
    of 'select NET'. Defconfigs that were relying on the select now lack
    networking support.
    
    Reported-by: Stephen Rothwell <[email protected]>
    Cc: [email protected]
    Signed-off-by: Michal Marek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 25fee47f9ccd834bfb95c6f95e07033b0f2d5ddf
Author: Michal Marek <[email protected]>
Date:   Tue Sep 23 17:44:01 2014 +0200

    parisc: Set CONFIG_NET=y in defconfigs
    
    Commit 5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
    instead of selecting NET") removed what happened to be the only instance
    of 'select NET'. Defconfigs that were relying on the select now lack
    networking support.
    
    Reported-by: Stephen Rothwell <[email protected]>
    Cc: [email protected]
    Signed-off-by: Michal Marek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit d1630f9ef288978fc5a4cd14a5fdcb7be7a703e4
Author: Michal Marek <[email protected]>
Date:   Tue Sep 23 17:44:00 2014 +0200

    mips: Set CONFIG_NET=y in defconfigs
    
    Commit 5d6be6a5 ("scsi_netlink : Make SCSI_NETLINK dependent on NET
    instead of selecting NET") removed what happened to be the only instance
    of 'select NET'. Defconfigs that were relying on the select now lack
    networking support.
    
    Reported-by: Stephen Rothwell <[email protected]>
    Cc: [email protected]
    Signed-off-by: Michal Marek <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 0a30288da1aec914e158c2d7a3482a85f632750f
Author: Tejun Heo <[email protected]>
Date:   Tue Sep 23 15:24:32 2014 -0400

    blk-mq, percpu_ref: implement a kludge for SCSI blk-mq stall during probe
    
    blk-mq uses percpu_ref for its usage counter which tracks the number
    of in-flight commands and is used to synchronously drain the queue on
    freeze.  percpu_ref shutdown takes measurable wallclock time as it
    involves a sched RCU grace period.  This means that draining a blk-mq
    queue takes measurable wallclock time.  One would think that this
    shouldn't matter as queue shutdown should be a rare event which takes
    place asynchronously w.r.t. userland.
    
    Unfortunately, SCSI probing involves synchronously setting up and then
    tearing down a lot of request_queues back-to-back for non-existent
    LUNs.  This means that SCSI probing may take more than ten seconds
    when scsi-mq is used.
    
    This will be properly fixed by implementing a mechanism to keep
    q->mq_usage_counter in atomic mode till genhd registration; however,
    that involves rather big updates to percpu_ref which is difficult to
    apply late in the devel cycle (v3.17-rc6 at the moment).  As a
    stop-gap measure till the proper fix can be implemented in the next
    cycle, this patch introduces __percpu_ref_kill_expedited() and makes
    blk_mq_freeze_queue() use it.  This is heavy-handed but should work
    for testing the experimental SCSI blk-mq implementation.
    
    Signed-off-by: Tejun Heo <[email protected]>
    Reported-by: Christoph Hellwig <[email protected]>
    Link: http://lkml.kernel.org/g/[email protected]
    Fixes: add703fda981 ("blk-mq: use percpu_ref for mq usage count")
    Cc: Kent Overstreet <[email protected]>
    Cc: Jens Axboe <[email protected]>
    Tested-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit 7da4b29d496b1389d3a29b55d3668efecaa08ebd
Author: Mathias Krause <[email protected]>
Date:   Tue Sep 23 22:31:07 2014 +0200

    crypto: aesni - disable "by8" AVX CTR optimization
    
    The "by8" implementation introduced in commit 22cddcc7df8f ("crypto: aes
    - AES CTR x86_64 "by8" AVX optimization") is failing crypto tests as it
    handles counter block overflows differently. It only treats the
    rightmost 32 bits as a counter -- not the whole block as all other
    implementations do. This makes it fail the cryptomgr test #4 that
    specifically tests this corner case.
    
    As we're quite late in the release cycle, just disable the "by8" variant
    for now.
    
    Reported-by: Romain Francoise <[email protected]>
    Signed-off-by: Mathias Krause <[email protected]>
    Cc: Chandramouli Narayanan <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
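
Note: the corner case can be reproduced with two small counter-increment routines. This is a sketch of the two behaviours the commit message contrasts, not the AES-NI assembly; the function names are hypothetical and the counter block is treated as a big-endian integer.

```c
#include <assert.h>
#include <stdint.h>

#define AES_BLOCK_SIZE 16

/* Increment the whole 16-byte block as one big-endian integer. */
static void ctr_inc_full(uint8_t block[AES_BLOCK_SIZE])
{
    int i;

    assert(block);
    for (i = AES_BLOCK_SIZE - 1; i >= 0; i--) {
        if (++block[i] != 0)
            break;    /* no carry into the next byte */
    }
}

/* Increment only the rightmost 32 bits; a carry out of them is lost. */
static void ctr_inc_low32(uint8_t block[AES_BLOCK_SIZE])
{
    int i;

    assert(block);
    for (i = AES_BLOCK_SIZE - 1; i >= AES_BLOCK_SIZE - 4; i--) {
        if (++block[i] != 0)
            break;
    }
}
```

Starting from a block whose last four bytes are all 0xff, the first routine carries into byte 11 while the second wraps the low 32 bits to zero and leaves byte 11 untouched, so the two produce different keystream blocks from that point on.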

commit c9f21cb6388898bfe69886d001316dae7ecc9a4b
Author: Tom Lendacky <[email protected]>
Date:   Fri Sep 5 10:31:09 2014 -0500

    crypto: ccp - Check for CCP before registering crypto algs
    
    If the ccp is built as a built-in module, then ccp-crypto (whether
    built as a module or a built-in module) will be able to load and
    it will register its crypto algorithms.  If the system does not have
    a CCP this will result in -ENODEV being returned whenever a command
    is attempted to be queued by the registered crypto algorithms.
    
    Add an API, ccp_present(), that checks for the presence of a CCP
    on the system.  The ccp-crypto module can use this to determine if it
    should register its crypto algorithms.
    
    Cc: [email protected]
    Reported-by: Scot Doyle <[email protected]>
    Signed-off-by: Tom Lendacky <[email protected]>
    Tested-by: Scot Doyle <[email protected]>
    Signed-off-by: Herbert Xu <[email protected]>
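
Note: the control flow this commit introduces can be sketched as follows. Everything here is a userspace stand-in: `ccp_device_count`, `ccp_present_sketch()` and `ccp_crypto_init_sketch()` are hypothetical names, and the convention of returning 0 when a device is present and -ENODEV otherwise is an assumption of the sketch.

```c
#include <assert.h>
#include <errno.h>

static int ccp_device_count;    /* hypothetical: number of detected CCPs */
static int algs_registered;     /* set once the algorithms are registered */

/* Stand-in for the presence check: 0 if a CCP exists, -ENODEV if not. */
static int ccp_present_sketch(void)
{
    assert(ccp_device_count >= 0);
    return ccp_device_count > 0 ? 0 : -ENODEV;
}

/* Module init: bail out before registering anything if no CCP exists. */
static int ccp_crypto_init_sketch(void)
{
    int ret;

    ret = ccp_present_sketch();
    if (ret)
        return ret;

    algs_registered = 1;    /* the real module registers its algorithms here */
    return 0;
}
```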

commit d26a7730b5874a5fa6779c62f4ad7c5065a94723
Author: John David Anglin <[email protected]>
Date:   Mon Sep 22 20:54:50 2014 -0400

    parisc: Only use -mfast-indirect-calls option for 32-bit kernel builds
    
    In spite of what the GCC manual says, the -mfast-indirect-calls option has
    never been supported in the 64-bit parisc compiler. Indirect calls have
    always been done using function descriptors irrespective of the
    -mfast-indirect-calls option.
    
    Recently, it was noticed that a function descriptor was always requested
    when the -mfast-indirect-calls option was specified. This caused
    problems when the option was used in application code and doesn't make
    any sense because the whole point of the option is to avoid using a
    function descriptor for indirect calls.
    
    Fixing this broke 64-bit kernel builds.
    
    I will fix GCC but for now we need the attached change. This results in
    the same kernel code as before.
    
    Signed-off-by: John David Anglin <[email protected]>
    Cc: [email protected]  # v3.0+
    Signed-off-by: Helge Deller <[email protected]>

commit e8ee39e227d72823461907156f0046269d72ff15
Author: Tony Luck <[email protected]>
Date:   Mon Sep 22 09:35:11 2014 -0700

    [IA64] refresh arch/ia64/configs/* using "make savedefconfig"
    
    Prompted by a change to drivers/scsi/Kconfig which used to do a
    "select NET" but now does a "depends on NET". This meant that some
    configurations ended up without CONFIG_NET=y.
    
    Signed-off-by: Tony Luck <[email protected]>

commit f8adaf0ae978252c9f7e29e96aefcd8fcaf806ba
Author: Emil Goode <[email protected]>
Date:   Tue Sep 23 00:49:55 2014 +0200

    brcmfmac: Fix off by one bug in brcmf_count_20mhz_channels()
    
    In the brcmf_count_20mhz_channels function we are looping through a list
    of channels received from firmware. Since the index of the first channel
    is 0 the condition leads to an off by one bug. This is causing us to hit
    the WARN_ON_ONCE(1) calls in the brcmu_d11n_decchspec function, which is
    how I discovered the bug.
    
    Introduced by:
    commit b48d891676f756d48b4d0ee131e4a7a5d43ca417
    ("brcmfmac: rework wiphy structure setup")
    
    Acked-by: Arend van Spriel <[email protected]>
    Signed-off-by: Emil Goode <[email protected]>
    Signed-off-by: John W. Linville <[email protected]>
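
Note: the commit message does not quote the faulty condition, so the following is a generic sketch of this class of bug, assuming the common `<=` versus `<` mistake. The function `count_20mhz` and its flag array are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries flagged as 20 MHz in a list of 'total' channels. */
static size_t count_20mhz(const int *is_20mhz, size_t total)
{
    size_t i, count = 0;

    assert(is_20mhz != NULL || total == 0);
    /*
     * Valid indices run from 0 to total - 1, so the bound must be '<'.
     * With '<=' the loop would read one element past the end of the list.
     */
    for (i = 0; i < total; i++) {
        if (is_20mhz[i])
            count++;
    }
    return count;
}
```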

commit 370ce45b5986118fa496dddbcd7039e1aa1a418f
Author: Alex Deucher <[email protected]>
Date:   Tue Sep 23 10:20:13 2014 -0400

    drm/radeon/cik: use a separate counter for CP init timeout
    
    Otherwise we may fail to init the second compute ring.
    
    Noticed-by: Christian König <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>
    Cc: [email protected]
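
Note: a userspace sketch of why a shared loop counter can skip the second ring. It assumes the bug had the usual shape of an inner polling loop reusing the outer ring index; `hw_ready()` and both init functions are hypothetical.

```c
#include <assert.h>

#define NUM_RINGS 2
#define TIMEOUT   100

/* Hypothetical hardware poll: reports ready on the fourth attempt. */
static int hw_ready(int attempt)
{
    return attempt >= 3;
}

/* Buggy shape: the polling loop reuses the ring index i. */
static int init_rings_shared_counter(void)
{
    int i, done = 0;

    for (i = 0; i < NUM_RINGS; i++) {
        for (i = 0; i < TIMEOUT; i++) {
            if (hw_ready(i))
                break;
        }
        done++;    /* i is now 3, so the outer loop stops after one ring */
    }
    return done;
}

/* Fixed shape: the polling loop has its own counter j. */
static int init_rings_separate_counter(void)
{
    int i, j, done = 0;

    for (i = 0; i < NUM_RINGS; i++) {
        for (j = 0; j < TIMEOUT; j++) {
            if (hw_ready(j))
                break;
        }
        done++;
    }
    return done;
}

/* Usage: only the second variant initializes both rings. */
void counter_example(void)
{
    assert(init_rings_shared_counter() == 1);
    assert(init_rings_separate_counter() == NUM_RINGS);
}
```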

commit c84db77010877da6c5da119868ed54c43d59e726
Author: Jani Nikula <[email protected]>
Date:   Wed Sep 17 15:34:58 2014 +0300

    drm/i915/hdmi: fix hdmi audio state readout
    
    Check the correct bit for audio. Seems like a copy-paste error from the
    start:
    
    commit 9ed109a7b445e3f073d8ea72f888ec80c0532465
    Author: Daniel Vetter <[email protected]>
    Date:   Thu Apr 24 23:54:52 2014 +0200
    
        drm/i915: Track has_audio in the pipe config
    
    Reported-by: Martin Andersen <[email protected]>
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82756
    Cc: [email protected] # 3.16+
    Cc: Daniel Vetter <[email protected]>
    Reviewed-by: Daniel Vetter <[email protected]>
    Signed-off-by: Jani Nikula <[email protected]>

commit 22cb99af39b5d4aae075a5bc9da615ba245227cd
Author: Brad Volkin <[email protected]>
Date:   Mon Sep 22 08:25:21 2014 -0700

    drm/i915: Don't leak command parser tables on suspend/resume
    
    Ring init and cleanup are not balanced because we re-init the rings on
    resume without having cleaned them up on suspend. This leads to the
    driver leaking the parser's hash tables with a kmemleak signature such
    as this:
    
    unreferenced object 0xffff880405960980 (size 32):
      comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s)
      hex dump (first 32 bytes):
        d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00  ..F.............
        98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00  .`(.............
      backtrace:
        [<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0
        [<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0
        [<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915]
        [<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915]
        [<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915]
        [<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915]
        [<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915]
        [<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915]
        [<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm]
        [<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm]
        [<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915]
        [<ffffffff81436725>] local_pci_probe+0x45/0xa0
        [<ffffffff81437a69>] pci_device_probe+0xd9/0x130
        [<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0
        [<ffffffff815252d3>] __driver_attach+0x93/0xa0
        [<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0
    
    This patch extends the current convention of checking whether a
    resource is already allocated before allocating it during ring init.
    Longer term it might make sense to only init the rings once.
    
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794
    Tested-by: Kari Suvanto <[email protected]>
    Signed-off-by: Brad Volkin <[email protected]>
    Reviewed-by: Daniel Vetter <[email protected]>
    Cc: [email protected]
    Signed-off-by: Jani Nikula <[email protected]>
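
Note: the "allocate only if not already allocated" convention can be sketched in userspace as below. The struct `ring`, its fields and both functions are hypothetical stand-ins, with `calloc()` in place of kernel allocators.

```c
#include <assert.h>
#include <stdlib.h>

struct ring {
    int *cmd_table;     /* stand-in for the parser's hash table */
    int alloc_count;    /* how many times the table was allocated */
};

/* Safe to call again on resume: an existing table is reused, not leaked. */
static int ring_init(struct ring *ring)
{
    assert(ring);
    if (!ring->cmd_table) {
        ring->cmd_table = calloc(64, sizeof(*ring->cmd_table));
        if (!ring->cmd_table)
            return -1;
        ring->alloc_count++;
    }
    return 0;
}

/* Release the table and clear the pointer so a later init reallocates. */
static void ring_cleanup(struct ring *ring)
{
    assert(ring);
    free(ring->cmd_table);
    ring->cmd_table = NULL;
}
```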

commit f3670394c29ff3730638762c1760fd2f624e6d7b
Author: Linus Torvalds <[email protected]>
Date:   Mon Sep 22 23:05:49 2014 -0700

    Revert "x86/efi: Fixup GOT in all boot code paths"
    
    This reverts commit 9cb0e394234d244fe5a97e743ec9dd7ddff7e64b.
    
    It causes my Sony Vaio Pro 11 to immediately reboot at startup.
    
    Acked-by: Ingo Molnar <[email protected]>
    Cc: Peter Anvin <[email protected]>
    Cc: Maarten Lankhorst <[email protected]>
    Cc: Ard Biesheuvel <[email protected]>
    Cc: Matt Fleming <[email protected]>
    Signed-off-by: Linus Torvalds <[email protected]>

commit 7cbeb9f90db8e56856db7568520b735732d34d86
Author: Yinghai Lu <[email protected]>
Date:   Mon Sep 22 20:05:45 2014 -0600

    PCI: pciehp: Fix pcie_wait_cmd() timeout
    
    pcie_poll_cmd() takes msecs instead of jiffies, so convert timeout to msecs.
    
    Fixes: 40b960831cfa ("PCI: pciehp: Compute timeout from hotplug command start time")
    Signed-off-by: Yinghai Lu <[email protected]>
    Signed-off-by: Bjorn Helgaas <[email protected]>
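
Note: the unit mismatch is easy to see with a small conversion helper. This is a userspace sketch; the function name and the tick rate of 250 used in the usage note are assumptions, and the arithmetic ignores overflow for very large values.

```c
#include <assert.h>

/* Convert a tick count to milliseconds for a given tick rate (ticks/s). */
static unsigned int jiffies_to_msecs_sketch(unsigned long ticks,
                                            unsigned int hz)
{
    assert(hz > 0);
    return (unsigned int)(ticks * 1000UL / hz);
}
```

With an assumed tick rate of 250, a one-second timeout is 250 ticks. Passing 250 directly to a function that expects milliseconds yields a 250 ms wait, a quarter of what was intended, whereas `jiffies_to_msecs_sketch(250, 250)` gives 1000.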

commit 4a0c081eff43a11c65dee3ad6c457f7f58bcebe0
Author: Florian Fainelli <[email protected]>
Date:   Mon Sep 22 11:54:43 2014 -0700

    net: bcmgenet: call bcmgenet_dma_teardown in bcmgenet_fini_dma
    
    We should not be manipulating the DMA_CTRL registers directly by writing
    0 to them to disable DMA. This is an operation that needs to be timed to
    make sure the DMA engines have been properly stopped since their state
    machine stops on a packet boundary, not immediately.
    
    Make sure that bcmgenet_fini_dma() calls bcmgenet_dma_teardown() to
    ensure a proper DMA engine state. As a result, we need to reorder the
    function bodies to resolve the use dependency.
    
    Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
    Signed-off-by: Florian Fainelli <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 478a010c9235ca92e66cc5058b42e30e33275ad4
Author: Florian Fainelli <[email protected]>
Date:   Mon Sep 22 11:54:42 2014 -0700

    net: bcmgenet: fix TX reclaim accounting for fragments
    
    The GENET driver supports SKB fragments, and succeeds in transmitting
    them properly, but when reclaiming these transmitted fragments, we will
    only update the count of free buffer descriptors by 1, even for SKBs
    with fragments. This leads to the networking stack thinking it has more
    room than the hardware has when pushing new SKBs, and backing off
    consequently because we return NETDEV_TX_BUSY.
    
    Fix this by accounting for the SKB nr_frags plus one (itself) and update
    ring->free_bds accordingly with that value for each iteration loop in
    __bcmgenet_tx_reclaim().
    
    Fixes: 1c1008c793fa ("net: bcmgenet: add main driver file")
    Signed-off-by: Florian Fainelli <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>
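
Note: the accounting rule from this commit, reduced to a userspace sketch. The structs `tx_ring` and `skb_sketch` and the function `tx_reclaim_one` are hypothetical; only the "nr_frags plus one" rule comes from the commit message.

```c
#include <assert.h>

struct tx_ring {
    unsigned int free_bds;    /* free buffer descriptors */
};

struct skb_sketch {
    unsigned int nr_frags;    /* number of page fragments in the packet */
};

/* Reclaim one transmitted packet and return its descriptors to the ring. */
static void tx_reclaim_one(struct tx_ring *ring, const struct skb_sketch *skb)
{
    assert(ring && skb);
    /* one descriptor for the linear part plus one per fragment */
    ring->free_bds += skb->nr_frags + 1;
}
```

Adding only 1 per packet, as the old code did, would leave `free_bds` lower than the number of descriptors the hardware has actually released.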

commit a35165ca101695aa2cc5a6300ef69ae60be39a49
Author: Eric Dumazet <[email protected]>
Date:   Mon Sep 22 10:38:16 2014 -0700

    ipv4: do not use this_cpu_ptr() in preemptible context
    
    this_cpu_ptr() in preemptible context is generally bad:
    
    Sep 22 05:05:55 br kernel: [   94.608310] BUG: using smp_processor_id() in preemptible [00000000] code: ip/2261
    Sep 22 05:05:55 br kernel: [   94.608316] caller is tunnel_dst_set.isra.28+0x20/0x60 [ip_tunnel]
    Sep 22 05:05:55 br kernel: [   94.608319] CPU: 3 PID: 2261 Comm: ip Not tainted 3.17.0-rc5 #82
    
    We can simply use raw_cpu_ptr(), as preemption is safe in these
    contexts.
    
    Should fix https://bugzilla.kernel.org/show_bug.cgi?id=84991
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Reported-by: Joe <[email protected]>
    Fixes: 9a4aa9af447f ("ipv4: Use percpu Cache route in IP tunnels")
    Acked-by: Tom Herbert <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit ff1b129403aad9a5c7cc9a6eaaffe4bd5fc0c67f
Author: Alex Deucher <[email protected]>
Date:   Mon Sep 22 17:28:29 2014 -0400

    drm/radeon: add PX quirk for asus K53TK
    
    Seems to have problems turning the dGPU on/off.
    
    bug:
    https://bugzilla.kernel.org/show_bug.cgi?id=51381
    
    Signed-off-by: Alex Deucher <[email protected]>

commit 8aff6ad5a393b8e2ad00dce4d278ecf41397bf0d
Author: Alex Deucher <[email protected]>
Date:   Wed Sep 17 11:31:02 2014 -0400

    drm/radeon: add a backlight quirk for Amilo Xi 2550
    
    Only the acpi backlight seems to work.  Using the
    radeon backlight controller causes the backlight to
    go off.
    
    bug:
    https://bugs.freedesktop.org/show_bug.cgi?id=81382
    
    Signed-off-by: Alex Deucher <[email protected]>

commit bc13018b5eba26ca229b33763c9e61fac31a1925
Author: Alex Deucher <[email protected]>
Date:   Tue Sep 16 20:57:26 2014 -0400

    drm/radeon: add a module parameter for backlight control (v2)
    
    Add a module parameter to disable the radeon GPU backlight
    controller to override the automatic detection.  Some
    laptops seem to indicate that they use the integrated
    controller, but appear to actually use an external
    controller.
    
    bug:
    https://bugs.freedesktop.org/show_bug.cgi?id=81382
    
    v2: fix module parameter description
    
    Signed-off-by: Alex Deucher <[email protected]>

commit f55e03b975c230758c8f164347dfa10103f60e2c
Author: Michel Dänzer <[email protected]>
Date:   Fri Sep 19 12:22:10 2014 +0900

    drm/radeon: Update IH_RB_RPTR register after each processed interrupt
    
    This might decrease the chance of IH ring buffer overflows.
    
    Signed-off-by: Michel Dänzer <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>

commit 6cc2fda213d7a939e151ea1b5ec8033cce732c08
Author: Michel Dänzer <[email protected]>
Date:   Fri Sep 19 12:22:07 2014 +0900

    drm/radeon: Make IH ring overflow debugging output more useful
    
    Use the same format for all ring indices, and fix the calculation of the
    post-overflow RPTR.
    
    Signed-off-by: Michel Dänzer <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>

commit 11bab0ae9991b165b542294806013d1e06fb3294
Author: Michel Dänzer <[email protected]>
Date:   Fri Sep 19 12:07:11 2014 +0900

    drm/radeon: Clear RB_OVERFLOW bit earlier
    
    Otherwise the bit remains set in rdev->ih.rptr, so the wptr can never
    match that and we still have an infinite loop.
    
    This fix allows me to successfully recover from an IH ring buffer
    overflow.
    
    Signed-off-by: Michel Dänzer <[email protected]>
    Signed-off-by: Alex Deucher <[email protected]>

commit 1f2bb4acc125edc2c06db3ad3e8c699bc075ad52
Author: Christoffer Dall <[email protected]>
Date:   Mon Sep 22 22:10:36 2014 +0200

    arm/arm64: KVM: Fix unaligned access bug on gicv2 access
    
    We were using an atomic bitop on the vgic_v2.vgic_elrsr field which was
    not aligned to the natural size on 64-bit platforms.  This bug showed up
    after QEMU correctly identifies the pl011 line as being level-triggered,
    and not edge-triggered.
    
    These data structures are protected by a spinlock so simply use a
    non-atomic version of the accessor instead.
    
    Tested-by: Joel Schopp <[email protected]>
    Reported-by: Riku Voipio <[email protected]>
    Signed-off-by: Christoffer Dall <[email protected]>

commit 46f341ffcfb5d8530f7d1e60f3be06cce6661b62
Author: Jens Axboe <[email protected]>
Date:   Tue Sep 16 13:38:51 2014 -0600

    genhd: fix leftover might_sleep() in blk_free_devt()
    
    Commit 2da78092 changed the locking from a mutex to a spinlock,
    so we no longer sleep in this context. But there was a leftover
    might_sleep() in there, which now triggers since we do the final
    free from an RCU callback. Get rid of it.
    
    Reported-by: Pontus Fuchs <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit 09f3756bb9a82835b0c2a9b50f36b47aa42f2c61
Author: Tobias Klauser <[email protected]>
Date:   Fri Sep 19 16:16:25 2014 +0200

    dm9000: Return an ERR_PTR() in all error conditions of dm9000_parse_dt()
    
    In one error condition dm9000_parse_dt() returns NULL, however the
    return value is checked using IS_ERR() in dm9000_probe(), leading to the
    error not being properly propagated if CONFIG_OF is not enabled or the
    device tree data is not available. Fix this by also returning an
    ERR_PTR() in this case.
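
The mismatch is easy to see in a userspace approximation of the error-pointer convention. The `demo_*` helpers below are hypothetical re-implementations written for illustration (the kernel's own live in `<linux/err.h>`), and `demo_parse()` with its `-ENXIO` return value is an assumed example, not the dm9000 code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer. */
static inline void *demo_err_ptr(long error)
{
    assert(error < 0 && error >= -DEMO_MAX_ERRNO);
    return (void *)(intptr_t)error;
}

/* Recover the errno value from an error pointer. */
static inline long demo_ptr_err(const void *ptr)
{
    return (long)(intptr_t)ptr;
}

/* True only for pointers in the top DEMO_MAX_ERRNO addresses; NULL is not one. */
static inline int demo_is_err(const void *ptr)
{
    return (uintptr_t)ptr >= (uintptr_t)-DEMO_MAX_ERRNO;
}

static int demo_pdata;

/* Hypothetical parser: every failure path returns an error pointer, never NULL. */
static void *demo_parse(int have_data)
{
    if (!have_data)
        return demo_err_ptr(-ENXIO);
    return &demo_pdata;
}
```

Because `demo_is_err(NULL)` is false, a parser that returns NULL on failure slips past a caller that only checks with the `IS_ERR()`-style test.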
    
    Fixes: 0b8bf1baabe5 (net: dm9000: Allow instantiation using device tree)
    Signed-off-by: Tobias Klauser <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 2ba7d144d39a596cf5d09390ee7de21cfb69cf2b
Author: Wojciech Dubowik <[email protected]>
Date:   Thu Sep 18 08:30:41 2014 +0200

    ath9k: Fix NULL pointer dereference on early irq
    
    The ah struct might not have been initialized when an
    interrupt arrives, so check for it.
    
    Signed-off-by: Wojciech Dubowik <[email protected]>
    Signed-off-by: John W. Linville <[email protected]>

commit fa5c107cc887886a04ee2dbce05af86de220ae48
Author: Loic Poulain <[email protected]>
Date:   Tue Sep 16 14:53:58 2014 +0200

    net: rfkill: gpio: Fix clock status
    
    Clock is disabled when the device is blocked.
    So, clock_enabled is the logical negation of "blocked".
    
    Signed-off-by: Loic Poulain <[email protected]>
    Signed-off-by: John W. Linville <[email protected]>

commit 85911d71109d3dda8bb35515b78bcc1de6837785
Author: Dan Carpenter <[email protected]>
Date:   Fri Sep 19 13:40:25 2014 +0300

    r8169: fix an if condition
    
    There is an extra semi-colon so __rtl8169_set_features() is called every
    time.
    
    Fixes: 929a031dfd62 ('r8169: adjust __rtl8169_set_features')
    Signed-off-by: Dan Carpenter <[email protected]>
    Acked-by: Hayes Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit d70b1137233836be1d71bd53ae60bec6c9e7203c
Author: hayeswang <[email protected]>
Date:   Fri Sep 19 15:17:18 2014 +0800

    r8152: disable ALDPS
    
    If the hw is in ALDPS mode, it may not respond to accesses to most
    registers. Therefore, ALDPS should be disabled before accessing the
    hw in rtl_ops.init(), rtl_ops.disable(), rtl_ops.up(), and
    rtl_ops.down(). rtl_ops.enable() needs no such handling, because the
    hw does not enter ALDPS mode while the link is up. The hw enters
    ALDPS mode several seconds after the link goes down, if ALDPS is
    enabled.
    
    Signed-off-by: Hayes Wang <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit b49fe36208b45f76dfbcfcd3afd952a33fa9f5ce
Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 18 11:00:27 2014 -0700

    ipoib: validate struct ipoib_cb size
    
    To catch future errors sooner.
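
A size check of this kind can be done at compile time. The sketch below uses C11 `_Static_assert` in place of the kernel's own build-time check, and every `demo_*` type is a hypothetical stand-in; the 48-byte scratch area and the 20-byte hardware address are assumptions chosen to mirror the layout being discussed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_CB_SIZE 48   /* assumed size of the per-packet scratch area */

/* Hypothetical packet with a fixed-size control block. */
struct demo_skb {
    unsigned char cb[DEMO_CB_SIZE];
};

/* Hypothetical queueing-layer state kept at the start of cb[]. */
struct demo_qdisc_cb {
    unsigned char data[28];
};

/* Hypothetical driver state layered on top of the queueing-layer state. */
struct demo_ipoib_cb {
    struct demo_qdisc_cb qdisc_cb;
    unsigned char hwaddr[20];
};

/* The build fails as soon as either struct grows past the scratch area. */
_Static_assert(sizeof(struct demo_ipoib_cb) <= DEMO_CB_SIZE,
               "demo_ipoib_cb no longer fits in cb[]");

/* View the scratch area as the driver's control block. */
static struct demo_ipoib_cb *demo_cb(struct demo_skb *skb)
{
    assert(skb != NULL);
    return (struct demo_ipoib_cb *)skb->cb;
}
```

With these sizes the driver struct fills the scratch area exactly, so growing the inner struct by a single byte turns into a build error instead of silent corruption.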
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 257117862634d89de33fec74858b1a0ba5ab444b
Author: Eric Dumazet <[email protected]>
Date:   Thu Sep 18 08:02:05 2014 -0700

    net: sched: shrink struct qdisc_skb_cb to 28 bytes
    
    We cannot make struct qdisc_skb_cb bigger without impacting IPoIB,
    or increasing skb->cb[] size.
    
    Commit e0f31d849867 ("flow_keys: Record IP layer protocol in
    skb_flow_dissect()") broke IPoIB.
    
    Only current offender is sch_choke, and this one does not need an
    absolutely precise flow key.
    
    If we store 17 bytes of flow key, it's more than enough. (It's the actual
    size of flow_keys if it was a packed structure, but we might add new
    fields at the end of it later)
    
    Signed-off-by: Eric Dumazet <[email protected]>
    Fixes: e0f31d849867 ("flow_keys: Record IP layer protocol in skb_flow_dissect()")
    Signed-off-by: David S. Miller <[email protected]>

commit 476c18850c6cbaa3f2bb661ae9710645081563b9
Author: Vlad Yasevich <[email protected]>
Date:   Thu Sep 18 10:31:17 2014 -0400

    tg3: Work around HW/FW limitations with vlan encapsulated frames
    
    TG3 appears to have an issue performing TSO and checksum offloading
    correctly when the frame has been vlan encapsulated (non-accelerated).
    In these cases, tcp checksum is not correctly updated.
    
    This patch attempts to work around this issue.  After the patch,
    802.1ad vlans start working correctly over tg3 devices.
    
    CC: Prashant Sreedharan <[email protected]>
    CC: Michael Chan <[email protected]>
    Signed-off-by: Vladislav Yasevich <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 4e66cd13ff9cd7eaae69e2fae0335d8d99d8afdf
Author: sundarjdev <[email protected]>
Date:   Mon Sep 22 10:31:39 2014 -0700

    hwmon: (tmp103) Fix resource leak bug in tmp103 temperature sensor driver
    
    tmp103 temperature sensor driver registers with the hwmon framework by calling
    hwmon_device_register_with_groups but does not have a .remove method to call
    hwmon_device_unregister to unregister from the framework when the device is no
    longer needed. Fix this by calling devm_hwmon_device_register_with_groups.
    
    Signed-off-by: Sundar J Dev <[email protected]>
    Signed-off-by: Guenter Roeck <[email protected]>

commit 07d92d5cc977a7fe1e683e1d4a6f723f7f2778cb
Author: Nicolas Dichtel <[email protected]>
Date:   Wed Sep 17 10:08:08 2014 +0200

    macvlan: allow to enqueue broadcast pkt on virtual device
    
    Since commit 412ca1550cbe ("macvlan: Move broadcasts into a work queue"), the
    driver uses tx_queue_len of the master device as the limit of packets enqueuing.
    Problem is that virtual drivers have this value set to 0, thus all broadcast
    packets were rejected.
    Because tx_queue_len was arbitrarily chosen, I replace it with a static limit
    of 1000 (also arbitrarily chosen).
    
    CC: Herbert Xu <[email protected]>
    Reported-by: Thibaut Collet <[email protected]>
    Suggested-by: Thibaut Collet <[email protected]>
    Tested-by: Thibaut Collet <[email protected]>
    Signed-off-by: Nicolas Dichtel <[email protected]>
    Acked-by: Herbert Xu <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit 8b95741569eabc5eb17da71d1d3668cdb0bef86c
Author: Jens Axboe <[email protected]>
Date:   Fri Sep 19 13:10:29 2014 -0600

    blk-mq: use blk_mq_start_hw_queues() when running requeue work
    
    When requests are retried due to hw or sw resource shortages,
    we often stop the associated hardware queue. So ensure that we
    restart the queues when running the requeue work, otherwise the
    queue run will be a no-op.
    
    Signed-off-by: Jens Axboe <[email protected]>

commit 6b55e1f2d0a5e462e52678278ab749468f1db81c
Author: Jens Axboe <[email protected]>
Date:   Fri Sep 19 08:04:53 2014 -0600

    blk-mq: fix potential oops on out-of-memory in __blk_mq_alloc_rq_maps()
    
    __blk_mq_alloc_rq_maps() can be invoked multiple times, if we scale
    back the queue depth if we are low on memory. So don't clear
    set->tags when we fail, this is handled directly in
    the parent function, blk_mq_alloc_tag_set().
    
    Reported-by: Robert Elliott  <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit a57a178a490345c7236b0077b3de005754389ed6
Author: Christoph Hellwig <[email protected]>
Date:   Tue Sep 16 14:44:07 2014 -0700

    blk-mq: avoid infinite recursion with the FUA flag
    
    We should not insert requests into the flush state machine from
    blk_mq_insert_request.  All incoming flush requests come through
    blk_{m,s}q_make_request and are handled there, while blk_execute_rq_nowait
    should only be called for BLOCK_PC requests.  All other callers
    deal with requests that already went through the flush state machine
    and shouldn't be reinserted into it.
    
    Reported-by: Robert Elliott  <[email protected]>
    Debugged-by: Ming Lei <[email protected]>
    Signed-off-by: Christoph Hellwig <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit 683d0e126232d898a481daa3a4ca032c2b1a9660
Author: David Hildenbrand <[email protected]>
Date:   Thu Sep 18 11:04:31 2014 +0200

    blk-mq: Avoid race condition with uninitialized requests
    
    This patch should fix the bug reported in
    https://lkml.org/lkml/2014/9/11/249.
    
    We have to initialize at least the atomic_flags and the cmd_flags when
    allocating storage for the requests.
    
    Otherwise blk_mq_timeout_check() might dereference uninitialized
    pointers when racing with the creation of a request.
    
    Also move the reset of cmd_flags for the initializing code to the point
    where a request is freed. So we will never end up with pending flush
    request indicators that might trigger dereferences of invalid pointers
    in blk_mq_timeout_check().
    
    Cc: [email protected]
    Signed-off-by: David Hildenbrand <[email protected]>
    Reported-by: Paulo De Rezende Pinatti <[email protected]>
    Tested-by: Paulo De Rezende Pinatti <[email protected]>
    Acked-by: Christian Borntraeger <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit 538b75341835e3c2041ff066408de10d24fdc830
Author: Jens Axboe <[email protected]>
Date:   Tue Sep 16 10:37:37 2014 -0600

    blk-mq: request deadline must be visible before marking rq as started
    
    When we start the request, we set the deadline and flip the bits
    marking the request as started and non-complete. However, it's
    important that the deadline store is ordered before flipping the
    bits, otherwise we could have a small window where the request is
    marked started but with an invalid deadline. This can confuse the
    timeout handling.
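
The ordering requirement can be illustrated with C11 atomics. This sketch swaps in `memory_order_release` and `memory_order_acquire` for whatever barrier the kernel patch uses, and `demo_request` and both functions are hypothetical names; it also compares times without handling wraparound, which real timeout code has to do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical request: a deadline plus a "started" marker. */
struct demo_request {
    unsigned long deadline;
    atomic_bool started;
};

/* Publish the deadline first, then mark the request as started. */
static void demo_start_request(struct demo_request *rq,
                               unsigned long now, unsigned long timeout)
{
    assert(timeout > 0);
    rq->deadline = now + timeout;
    /* Release: the deadline store cannot be reordered after this store. */
    atomic_store_explicit(&rq->started, true, memory_order_release);
}

/* Timeout check: only trust the deadline once "started" is observed. */
static bool demo_timed_out(struct demo_request *rq, unsigned long now)
{
    /* Acquire pairs with the release above, so the deadline is valid here. */
    if (!atomic_load_explicit(&rq->started, memory_order_acquire))
        return false;
    return now >= rq->deadline;
}
```

If the two stores in `demo_start_request()` were swapped, a concurrent checker could see a started request with a stale deadline, which is the window the commit closes.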
    
    Suggested-by: Ming Lei <[email protected]>
    Signed-off-by: Jens Axboe <[email protected]>

commit 4e5f9ef380b00871995d638c03e7ae7c67244e31
Author: David S. Miller <[email protected]>
Date:   Mon Sep 22 13:25:51 2014 -0400

    pch_gbe: 'select' NET_PTP_CLASSIFY.
    
    Fixes the following randconfig build failure:
    
    > drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c: In function
    > ‘pch_ptp_match’:
    > drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c:130:2: error:
    > implicit declaration of function ‘ptp_classify_raw’
    > [-Werror=implicit-function-declaration]
    >   if (ptp_classify_raw(skb) == PTP_CLASS_NONE)
    >   ^
    > cc1: some warnings being treated as errors
    > make[5]: *** [drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.o] Error 1
    
    Reported-by: Jim Davis <[email protected]>
    Signed-off-by: David S. Miller <[email protected]>

commit df568d8e5250bf24e38c69ad4374baf0f8d279ba
Author: David S. Miller <[email protected]>
Date:   Mon Sep 22 13:14:33 2014 -0400

    scsi: Use 'depends' with LIBFC instead of 'select'.
    
    LIBFC depends upon SCSI_FC_ATTRS and selects CRC32C.
    
    The only alternative would be to 'select' CRC32C and all of
    SCSI_FC_ATTRS direct and indirect dependencies in the Kconfig section
    for every LIBFCOE user which makes little sense.
    
    Subsequently, use 'depends' instead of 'select' for LIBFCOE too.
    
    Signed-off-by: David S. Miller <[email protected]>

commit 25476b0209b2e48dfb689e1b4cf7278875082b1f
Author: Jack Morgenstein <[email protected]>
Date:   Thu Sep 11 14:11:20 2014 +0300

    IB/mlx4: Fix VF mac handling in RoCE
    
    We had several problems here.  First, a race condition on QP1 mac
    handling between mlx4_ib_update_qps and mlx4_ib_modify_qp, which is
    fixed by taking the qp mutex in mlx4_ib_update_qps.
    
    Also, qp->pri.smac_port was not updated in mlx4_ib_update_qps.
    
    Last, in __mlx4_ib_modify_qp we did not properly handle the case where
    the mac is zero, but port is non-zero.
    
    Signed-off-by: Jack Morgenstein <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 3dec48788817fce4a029cbab14f2b407ae78478f
Author: Jack Morgenstein <[email protected]>
Date:   Thu Sep 11 14:11:19 2014 +0300

    IB/mlx4: Do not allow APM under RoCE
    
    Automatic Path Migration is not supported under RoCE. Therefore,
    return a "not-supported" error if the caller attempts to set an
    alternate path in a QP context.
    
    In addition, if there are no IB ports configured, do not report
    APM capability in the device flags returned by mlx4_ib_query_device.
    
    Signed-off-by: Jack Morgenstein <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit d24d9f43384b5933867a5786934e130efa8b5c92
Author: Jack Morgenstein <[email protected]>
Date:   Thu Sep 11 14:11:18 2014 +0300

    IB/mlx4: Don't update QP1 in native mode
    
    For native functions (non-SR-IOV), there's no reason to update
    the smac_index, as QP1 is a GSI QP.
    
    Signed-off-by: Jack Morgenstein <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 3e0629cb6c0518423c9e2671bbe8ec15dde5dcaf
Author: Jack Morgenstein <[email protected]>
Date:   Thu Sep 11 14:11:17 2014 +0300

    IB/mlx4: Avoid accessing netdevice when building RoCE qp1 header
    
    The source MAC is needed in RoCE when building the QP1 header.
    
    Currently, this is obtained from the source net device. However, the net
    device may not yet exist, or can be destroyed in parallel to this QP1 send
    operation (e.g through the VPI port change flow) so accessing it may cause
    a kernel crash.
    
    To fix this, we maintain a source MAC cache per port for the net device in
    struct mlx4_ib_roce.  This cached MAC is initialized to be the default MAC
    address obtained during HCA initialization via QUERY_PORT. This cached MAC
    is updated via the netdev event notifier handler.
    
    Since the cached MAC is held in an atomic64 object, we do not need locking
    when accessing it.
    
    Signed-off-by: Jack Morgenstein <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit f4fd40b26bd597e203639281859a758402550d62
Author: Jack Morgenstein <[email protected]>
Date:   Thu Sep 11 14:11:16 2014 +0300

    mlx4: Fix mlx4 reg/unreg mac to work properly with 0-mac addresses
    
    There is a chance that the VF mlx4 RoCE driver (mlx4_ib) may see a 0-mac
    as the current default MAC address when a RoCE interface first comes up.
    
    In this case, the RoCE driver registers the 0-mac to get its MAC index --
    used in the INIT2RTR transition when it creates its proxy QP1 QPs.
    
    If we do not allow QP1 to be created, the RoCE driver will not come up.
    If we do not register the 0-mac, but simply use a random mac-index,
    QP1 will attempt to send packets with someone else's source MAC, which
    will get the system into more trouble.
    
    Since a 0-mac was previously used to indicate a free slot, this leads to
    errors, both when the 0-mac is registered and when it is unregistered.
    
    The required fix is to check in addition that the slot containing the
    0-mac has a reference count of zero.
    
    Additionally, when comparing MAC addresses, need to mask out the 2 MSBs
    of the u64 mac on both sides of the comparison.
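
The two checks described above can be sketched as follows. This is a hypothetical model, not the mlx4 table code: the `demo_*` names are invented, and the mask assumes that "2 MSBs" means the two most significant bytes of the u64, since a MAC address occupies only the low 48 bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: only the low 48 bits of the u64 carry the MAC. */
#define DEMO_MAC_MASK 0xffffffffffffULL

/* Hypothetical MAC table slot. */
struct demo_mac_slot {
    uint64_t mac;
    unsigned int refs;   /* reference count; 0 means nobody registered it */
};

/* Compare MACs with the upper 16 bits masked out on both sides. */
static bool demo_mac_equal(uint64_t a, uint64_t b)
{
    return (a & DEMO_MAC_MASK) == (b & DEMO_MAC_MASK);
}

/* A zero MAC alone no longer means "free"; the refcount must be zero too. */
static bool demo_slot_is_free(const struct demo_mac_slot *slot)
{
    return slot->refs == 0 && demo_mac_equal(slot->mac, 0);
}

/* Return the index of the first free slot, or -1 if the table is full. */
static int demo_find_free_slot(const struct demo_mac_slot *table, int n)
{
    assert(table != NULL);
    for (int i = 0; i < n; i++)
        if (demo_slot_is_free(&table[i]))
            return i;
    return -1;
}
```

In this model a slot that holds a registered 0-mac (zero address, non-zero reference count) is skipped when looking for a free entry.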
    
    Note that when the EN driver (mlx4_en) comes up, it sets itself a proper
    MAC; the RoCE driver is notified of that and further handling
    is done with the update qp command, as was added by commit 9433c188915c
    ("IB/mlx4: Invoke UPDATE_QP for proxy QP1 on MAC changes").
    
    Signed-off-by: Jack Morgenstein <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit a59c5850f09b4c2d6ad2fc47e5e1be8d654529d6
Author: Matan Barak <[email protected]>
Date:   Tue Sep 2 15:32:34 2014 +0300

    IB/core: When marshaling uverbs path, clear unused fields
    
    When marshaling a user path to the kernel struct ib_sa_path, need
    to zero smac, dmac and set the vlan id to the "no vlan" value.
    
    Fixes: dd5f03beb4f7 ("IB/core: Ethernet L2 attributes in verbs/cm structures")
    Reported-by: Aleksey Senin <[email protected]>
    Signed-off-by: Matan Barak <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 4bf9715f184969dc703bde7be94919995024a6a9
Author: Moni Shoua <[email protected]>
Date:   Thu Aug 21 14:28:42 2014 +0300

    IB/mlx4: Avoid executing gid task when device is being removed
    
    When device is being removed (e.g during VPI port link type change
    from ETH to IB), tasks for gid table changes should not be executed.
    
    Flush the current queue of tasks and block further tasks from entering the queue.
    
    Signed-off-by: Moni Shoua <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit dba3ad2addcd74ec850e510f3b8a9d046cc24ef3
Author: Jack Morgenstein <[email protected]>
Date:   Thu Aug 21 14:28:41 2014 +0300

    IB/mlx4: Fix lockdep splat for the iboe lock
    
    Chuck Lever reported the following stack trace:
    
        =================================
        [ INFO: inconsistent lock state ]
        3.16.0-rc2-00024-g2e78883 #17 Tainted: G            E
        ---------------------------------
        inconsistent {SOFTIRQ-ON-W} -> {IN-SOFTIRQ-W} usage.
        swapper/0/0 [HC0[0]:SC1[1]:HE1:SE0] takes:
        (&(&iboe->lock)->rlock){+.?...}, at: [<ffffffffa065f68b>] mlx4_ib_addr_event+0xdb/0x1a0 [mlx4_ib]
        {SOFTIRQ-ON-W} state was registered at:
         [<ffffffff810b3110>] mark_irqflags+0x110/0x170
         [<ffffffff810b4806>] __lock_acquire+0x2c6/0x5b0
         [<ffffffff810b4bd9>] lock_acquire+0xe9/0x120
         [<ffffffff815f7f6e>] _raw_spin_lock+0x3e/0x80
         [<ffffffffa0661084>] mlx4_ib_scan_netdevs+0x34/0x260 [mlx4_ib]
         [<ffffffffa06612db>] mlx4_ib_netdev_event+0x2b/0x40 [mlx4_ib]
         [<ffffffff81522219>] register_netdevice_notifier+0x99/0x1e0
         [<ffffffffa06626e3>] mlx4_ib_add+0x743/0xbc0 [mlx4_ib]
         [<ffffffffa05ec168>] mlx4_add_device+0x48/0xa0 [mlx4_core]
         [<ffffffffa05ec2c3>] mlx4_register_interface+0x73/0xb0 [mlx4_core]
         [<ffffffffa05c505e>] cm_req_handler+0x13e/0x460 [ib_cm]
         [<ffffffff810002e2>] do_one_initcall+0x112/0x1c0
         [<ffffffff810e8264>] do_init_module+0x34/0x190
         [<ffffffff810ea62f>] load_module+0x5cf/0x740
         [<ffffffff810ea939>] SyS_init_module+0x99/0xd0
         [<ffffffff815f8fd2>] system_call_fastpath+0x16/0x1b
        irq event stamp: 336142
        hardirqs last  enabled at (336142): [<ffffffff810612f5>] __local_bh_enable_ip+0xb5/0xc0
        hardirqs last disabled at (336141): [<ffffffff81061296>] __local_bh_enable_ip+0x56/0xc0
        softirqs last  enabled at (336004): [<ffffffff8106123a>] _local_bh_enable+0x4a/0x50
        softirqs last disabled at (336005): [<ffffffff810617a4>] irq_exit+0x44/0xd0
    
        other info that might help us debug this:
        Possible unsafe locking scenario:
    
              CPU0
              ----
         lock(&(&iboe->lock)->rlock);
         <Interrupt>
           lock(&(&iboe->lock)->rlock);
    
        *** DEADLOCK ***
    
    The above problem was caused by the spin lock being taken both in the process
    context and in a soft-irq context (in a netdev notifier handler).
    
    The required fix is to use spin_lock/unlock_bh() instead of spin_lock/unlock
    on the iboe lock.
    
    Reported-by: Chuck Lever <[email protected]>
    Signed-off-by: Jack Morgenstein <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit bccb84f1dfab92ed180adf09c76cfa9ddc90edb9
Author: Moni Shoua <[email protected]>
Date:   Thu Aug 21 14:28:40 2014 +0300

    IB/mlx4: Get upper dev addresses as RoCE GIDs when port comes up
    
    When a RoCE port becomes active and the netdev of the port has an upper
    device (e.g. bond/team), GIDs derived from the upper dev should appear
    in the port's RoCE GID table.
    
    Signed-off-by: Moni Shoua <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 655b2aaefc353604f9975c31960d9722e6eda449
Author: Moni Shoua <[email protected]>
Date:   Thu Aug 21 14:28:39 2014 +0300

    IB/mlx4: Reorder steps in RoCE GID table initialization
    
    There's no need to reset the gid table twice and we need to do it only
    for Ethernet ports. Also, no need to actively scan netdevs since it's
    being done immediately after we register netdev notifiers.
    
    Signed-off-by: Moni Shoua <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit f5c4834d9328c4ed9fe5dcbec6128d6da16db69a
Author: Moni Shoua <[email protected]>
Date:   Thu Aug 21 14:28:38 2014 +0300

    IB/mlx4: Don't duplicate the default RoCE GID
    
    When reading the IPv6 addresses from the net-device, make sure to
    avoid adding a duplicate entry to the GID table because of equality
    between the default GID we generate and the default IPv6 link-local
    address of the device.
    
    Fixes: acc4fccf4eff ("IB/mlx4: Make sure GID index 0 is always occupied")
    Signed-off-by: Moni Shoua <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit e381835cf1b8e3b2857277dbc3b77d8c5350f70a
Author: Moni Shoua <[email protected]>
Date:   Thu Aug 21 14:28:37 2014 +0300

    IB/mlx4: Avoid null pointer dereference in mlx4_ib_scan_netdevs()
    
    When Ethernet netdev is not present for a port (e.g. when the link
    layer type of the port is InfiniBand) it's possible to dereference a
    null pointer when we do netdevice scanning.
    
    To fix that, we move a section of code that needs to run only when
    netdev is present to a proper if () statement.
    
    Fixes: ad4885d279b6 ("IB/mlx4: Build the port IBoE GID table properly under bonding")
    Reported-by: Dan Carpenter <[email protected]>
    Signed-off-by: Moni Shoua <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 61aabb3c91c1b03478ffc1a4a2573f825e7f35f9
Author: Or Gerlitz <[email protected]>
Date:   Tue Sep 2 17:08:43 2014 +0300

    IB/iser: Bump version to 1.4.1
    
    Signed-off-by: Roi Dayan <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 91eb1df39a1fba21bbc28895a84630782cd442ed
Author: Sagi Grimberg <[email protected]>
Date:   Tue Sep 2 17:08:42 2014 +0300

    IB/iser: Allow bind only when connection state is UP
    
    We need to fail the bind operation if the iser connection state != UP
    (started teardown) and this should be done under the state lock.
    
    Signed-off-by: Sagi Grimberg <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit c33b15f00bbfb9324dc38e5176f576a0f46e0873
Author: Roi Dayan <[email protected]>
Date:   Tue Sep 2 17:08:41 2014 +0300

    IB/iser: Fix RX/TX CQ resource leak on error flow
    
    When failing to allocate TX CQ we already allocated RX CQ, so we need to make
    sure we release it. Also, when failing to register notification to the RX CQ
    we currently leak both RX and TX CQs of the current index, fix that too.
    
    Signed-off-by: Roi Dayan <[email protected]>
    Signed-off-by: Sagi Grimberg <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit f0c2c225dfe9dfb668fe72eadabb8a3ec74ca036
Author: [email protected] <[email protected]>
Date:   Fri Sep 5 15:09:49 2014 +0530

    RDMA/ocrdma: Use right macro in query AH
    
    ocrdma_query_ah() does not use correct macro, and checks the wrong bit
    for the validity of address handle in vector table.  Fix this.
    
    Signed-off-by: Devesh Sharma <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 1be528bcb88d0b854dda1d60b31f4f8f7310f034
Author: [email protected] <[email protected]>
Date:   Fri Sep 5 15:09:48 2014 +0530

    RDMA/ocrdma: Resolve L2 address when creating user AH
    
    Because of IP-based GIDs, userspace AHs must have MAC and VLAN ID
    resolved separately.  Presently, user AHs are broken for ocrdma.  This
    patch resolves L2 addresses while creating user AH and obtains the
    right DMAC and VLAN ID before creating AH.
    
    Signed-off-by: Devesh Sharma <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit 4ff0acca7344c93fd9ed778b4c3ce16d95c594e4
Author: Matan Barak <[email protected]>
Date:   Thu Sep 11 13:18:37 2014 +0300

    mlx4: Correct error flows in rereg_mr
    
    This patch addresses feedback from Sagi Grimberg on the rereg_mr
    implementation of mlx4.  The following are fixed:
    
    1. Set the correct pd_flags
    2. Make sure we change the iova and size MR fields only after
       successful write and allocation of the MTTs.
    3. Make the error checking more robust
    
    Fixes: e630664c8383 ("mlx4_core: Add helper functions to support MR re-registration")
    Signed-off-by: Matan Barak <[email protected]>
    Signed-off-by: Or Gerlitz <[email protected]>
    Signed-off-by: Roland Dreier <[email protected]>

commit f2d5a94436cc7cc0221b9a81bba2276a25187dd3
Author: Anton Altaparmakov <[email protected]>
Date:   Mon Sep 22 01:53:03 2014 +0100

    Fix nasty 32-bit overflow bug in buffer i/o code.
    
    On 32-bit architectures, the legacy buffer_head functions are not always
    handling the sector number with the proper 64-bit types, and will thus
    fail on 4TB+ disks.
    
    Any code that uses __getblk() (and thus bread(), breadahead(),
    sb_bread(), sb_breadahead(), sb_getblk()), and calls it using a 64-bit
    block on a 32-bit arch (where "long" is 32-bit) causes an infinite loop
    in __getblk_slow() with an infinite stream of errors logged to dmesg
    like this:
    
      __find_get_block_slow() failed. block=6740375944, b_blocknr=2445408648
      b_state=0x00000020, b_size=512
      device sda1 blocksize: 512
    
    Note how in hex block is 0x191C1F988 and b_blocknr is 0x91C1F988 i.e. the
    top 32-bits are missing (in this case the 0x1 at the top).
    
    This is because grow_dev_page() is broken and has a 32-bit overflow: it
    left-shifts the page index value (a pgoff_t, which is just 32 bits on
    32-bit architectures) to obtain the block number, and the top bits get
    lost because the pgoff_t is not type cast to sector_t / 64-bit before
    the shift.
    
    This patch fixes this issue by type casting "index" to sector_t before
    doing the left shift.
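
The truncation can be reproduced in userspace with fixed-width types. In this sketch `demo_pgoff_t` and `demo_sector_t` are hypothetical models of a 32-bit pgoff_t and a 64-bit sector_t, and the shift of 3 assumes 512-byte blocks in 4096-byte pages; the block numbers are the ones from the log above.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t demo_pgoff_t;    /* page index as on a 32-bit arch */
typedef uint64_t demo_sector_t;   /* 64-bit block number */

/* Buggy form: the shift happens in 32 bits, so the top bits are gone
 * before the result is widened to 64 bits. */
static demo_sector_t demo_block_truncated(demo_pgoff_t index,
                                          unsigned int sizebits)
{
    assert(sizebits < 32);
    return (demo_pgoff_t)(index << sizebits);
}

/* Fixed form: widen to 64 bits first, then shift. */
static demo_sector_t demo_block_widened(demo_pgoff_t index,
                                        unsigned int sizebits)
{
    assert(sizebits < 32);
    return (demo_sector_t)index << sizebits;
}
```

For page index 842546993 (block 6740375944 divided by 8), the truncated form yields 2445408648 and the widened form yields 6740375944, matching the b_blocknr and block values in the error message.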
    
    Note this is not a theoretical bug but has been seen in the field on a
    4TiB hard drive with logical sector size 512 bytes.
    
    This patch has been verified to fix the infinite loop problem on 3.17-rc5
    kernel using a 4TB disk image mounted using "-o loop".  Without this patch
    doing a "find /nt" where /nt is an NTFS volume causes the infinite loop
    100% reproducibly whilst with the patch it works fine as expected.
    
    Signed-off-by: Anton Altaparmakov <[email protected]>
    Cc: [email protected]
    Signed-off-by: Linus Torvalds <[email protected]>
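
The overflow described in the commit above can be reproduced in user space. The following is only a sketch under stated assumptions, not the kernel code: `pgoff32_t` and `sector64_t` are hypothetical stand-ins for `pgoff_t` on a 32-bit architecture and for a 64-bit `sector_t`, and the shift count of 3 assumes 4096-byte pages with the 512-byte block size from the log. The index value is the logged block number shifted right by that amount.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for pgoff_t on a 32-bit architecture. */
typedef uint32_t pgoff32_t;
/* Hypothetical stand-in for a 64-bit sector_t. */
typedef uint64_t sector64_t;

/* Buggy form: the shift is evaluated in 32 bits, so the top bits are
 * already gone by the time the result is widened to 64 bits. */
sector64_t first_block_buggy(pgoff32_t index, unsigned int sizebits)
{
	assert(sizebits < 32);
	return (sector64_t)(pgoff32_t)(index << sizebits);
}

/* Fixed form: widen to 64 bits first, then shift. */
sector64_t first_block_fixed(pgoff32_t index, unsigned int sizebits)
{
	assert(sizebits < 32);
	return (sector64_t)index << sizebits;
}
```

With index 842546993 and a shift of 3, the buggy form yields 2445408648 (0x91C1F988) and the fixed form yields 6740375944 (0x191C1F988), which match the b_blocknr and block values in the logged error.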

commit 27fbe64bfa63cfb9da025975b59d96568caa2d53
Author: Sam Bobroff <[email protected]>
Date:   Fri Sep 19 09:40:41 2014 +1000

    KVM: correct null pid check in kvm_vcpu_yield_to()
    
    Correct a simple mistake of checking the wrong variable
    before a dereference, resulting in the dereference not being
    properly protected by rcu_dereference().
    
    Signed-off-by: Sam Bobroff <[email protected]>
    Signed-off-by: Paolo Bonzini <[email protected]>

commit e76bf634870e3c5e3a767ad575f1d404c9f1cab8
Author: Daniel Mack <[email protected]>
Date:   Sun Sep 21 23:55:38 2014 +0200

    ALSA: snd-usb-caiaq: Fix LED commands for Kore controller
    
    KoreController and KoreController2 need an EP1_CMD_DIMM_LEDS command to set
    their LEDs, not EP1_CMD_WRITE_IO.
    
    Signed-off-by: Daniel Mack <[email protected]>
    Reported-and-tested-by: Brad Wilson <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>

commit a9960e6a293e6fc3ed414643bb4e4106272e4d0a
Author: Clemens Ladisch <[email protected]>
Date:   Sun Sep 21 22:50:57 2014 +0200

    ALSA: pcm: fix fifo_size frame calculation
    
    The calculated frame size was wrong because snd_pcm_format_physical_width()
    actually returns the number of bits, not bytes.
    
    Use snd_pcm_format_size() instead, which not only returns bytes, but also
    simplifies the calculation.
    
    Fixes: 8bea869c5e56 ("ALSA: PCM midlevel: improve fifo_size handling")
    Signed-off-by: Clemens Ladisch <[email protected]>
    Cc: <[email protected]>
    Signed-off-by: Takashi Iwai <[email protected]>
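
The bits-versus-bytes mix-up in the commit above is easy to see with small numbers. This sketch does not call the ALSA helpers; `frame_size_buggy`, `frame_size_bytes` and `fifo_frames` are hypothetical names, and it assumes the FIFO size is being converted from bytes to frames and that the sample width is a whole number of bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Wrong: treats a width given in bits as if it were a byte count. */
size_t frame_size_buggy(unsigned int width_bits, unsigned int channels)
{
	return (size_t)width_bits * channels;
}

/* Right: convert the sample width to bytes before multiplying.
 * Simplification: only widths that are multiples of 8 bits are handled. */
size_t frame_size_bytes(unsigned int width_bits, unsigned int channels)
{
	assert(width_bits % 8 == 0);
	return (size_t)(width_bits / 8) * channels;
}

/* Convert a FIFO size in bytes to a FIFO size in frames. */
size_t fifo_frames(size_t fifo_bytes, unsigned int width_bits,
		   unsigned int channels)
{
	size_t frame = frame_size_bytes(width_bits, channels);

	assert(frame != 0);
	return fifo_bytes / frame;
}
```

For 16-bit stereo the correct frame size is 4 bytes, while the buggy calculation gives 32, so a 256-byte FIFO would be reported as 8 frames instead of 64.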

commit b8cb6b4c121e1bf1963c16ed69e7adcb1bc301cd
Author: NeilBrown <[email protected]>
Date:   Thu Sep 18 11:09:04 2014 +1000

    md/raid1: fix_read_error should act on all non-faulty devices.
    
    If a device is being recovered it is not InSync and is not Faulty.
    
    If a read error is experienced on that device, fix_read_error()
    will be called, but it ignores non-InSync devices.  So it will
    neither fix the error nor fail the device.
    
    It is incorrect that fix_read_error() ignores non-InSync devices.
    It should only ignore Faulty devices.  So fix it.
    
    This became a bug when we allowed reading from a device that was being
    recovered.  It is suitable for any subsequent -stable kernel.
    
    Fixes: da8840a747c0dbf49506ec906757a6b87b9741e9
    Cc: [email protected] (v3.5+)
    Reported-by: Alexander Lyakas <[email protected]>
    Tested-by: Alexander Lyakas <[email protected]>
    Signed-off-by: NeilBrown <[email protected]>
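
The difference between "ignore non-InSync devices" and "ignore only Faulty devices" comes down to which flag is tested. A minimal sketch, assuming two hypothetical flag bits (`DEV_IN_SYNC`, `DEV_FAULTY`) rather than the real md device flags:

```c
#include <stdbool.h>

/* Hypothetical per-device state bits, for illustration only. */
enum {
	DEV_IN_SYNC = 1 << 0,
	DEV_FAULTY  = 1 << 1
};

/* Old policy: only in-sync devices are considered, so a device that is
 * still being recovered (neither bit set) is silently skipped. */
bool should_fix_old(unsigned int flags)
{
	return (flags & DEV_IN_SYNC) != 0;
}

/* New policy: every device that is not faulty is considered. */
bool should_fix_new(unsigned int flags)
{
	return (flags & DEV_FAULTY) == 0;
}
```

A recovering device has neither bit set, so the old predicate rejects it and the new one accepts it; in-sync and faulty devices are treated the same way by both.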

commit 34e97f170149bfa14979581c4c748bc9b4b79d5b
Author: NeilBrown <[email protected]>
Date:   Tue Sep 16 12:14:14 2014 +1000

    md/raid1: count resync requests in nr_pending.
    
    Both normal IO and resync IO can be retried with reschedule_retry()
    and so be counted into ->nr_queued, but only normal IO gets counted in
    ->nr_pending.
    
    Before the recent improvement to RAID1 resync there could only
    possibly have been one or the other on the queue.  When handling a
    read failure it could only be normal IO.  So when handle_read_error()
    called freeze_array() the fact that freeze_array only compares
    ->nr_queued against ->nr_pending was safe.
    
    But now that these two types can interleave, we can have both normal
    and resync IO requests queued, so we need to count them both in
    nr_pending.
    
    This error can lead to freeze_array() hanging if there is a read
    error, so it is suitable for -stable.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    cc: [email protected] (v3.13+)
    Reported-by: Brassow Jonathan <[email protected]>
    Signed-off-by: NeilBrown <[email protected]>

commit c2fd4c94deedb89ac1746c4a53219be499372c06
Author: NeilBrown <[email protected]>
Date:   Wed Sep 10 16:01:24 2014 +1000

    md/raid1: update next_resync under resync_lock.
    
    raise_barrier() uses next_resync as part of its calculations, so it
    really should be updated first, instead of afterwards.
    
    next_resync is always used under resync_lock so update it under
    resync_lock too, just before it is used.  That is safest.
    
    This could cause normal IO and resync IO to interact badly so
    it is suitable for -stable.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    cc: [email protected] (v3.13+)
    Signed-off-by: NeilBrown <[email protected]>

commit 235549605eb7f1c5a37cef8b09d12e6d412c5cd6
Author: NeilBrown <[email protected]>
Date:   Wed Sep 10 15:56:57 2014 +1000

    md/raid1: Don't use next_resync to determine how far resync has progressed
    
    next_resync is (approximately) the location for the next resync request.
    However it does *not* reliably determine the earliest location
    at which resync might be happening.
    This is because resync requests can complete out of order, and
    we only limit the number of current requests, not the distance
    from the earliest pending request to the latest.
    
    mddev->curr_resync_completed is a reliable indicator of the earliest
    position at which resync could be happening.   It is updated less
    frequently, but is actually reliable which is more important.
    
    So use it to determine if a write request is before the region
    being resynced and so safe from conflict.
    
    This error can allow resync IO to interfere with normal IO which
    could lead to data corruption. Hence: stable.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    cc: [email protected] (v3.13+)
    Signed-off-by: NeilBrown <[email protected]>

commit 2f73d3c55d09ce60647b96ad2a9b539c95a530ee
Author: NeilBrown <[email protected]>
Date:   Wed Sep 10 15:01:49 2014 +1000

    md/raid1: make sure resync waits for conflicting writes to complete.
    
    The resync/recovery process for raid1 was recently changed
    so that writes could happen in parallel with resync providing
    they were in different regions of the device.
    
    There is a problem though:  While a write request will always
    wait for conflicting resync to complete, a resync request
    will *not* always wait for conflicting writes to complete.
    
    Two changes are needed to fix this:
    
    1/ raise_barrier (which waits until it is safe to do resync)
       must wait until current_window_requests is zero
    2/ wait_barrier (which waits at the start of a new write request)
       must update current_window_requests if the request could
       possibly conflict with a concurrent resync.
    
    As concurrent writes and resync can lead to data loss,
    this patch is suitable for -stable.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    Cc: [email protected] (v3.13+)
    Cc: majianpeng <[email protected]>
    Signed-off-by: NeilBrown <[email protected]>

commit 669cc7ba77864e7b1ac39c9f2b2afb8730f341f4
Author: NeilBrown <[email protected]>
Date:   Thu Sep 4 16:30:38 2014 +1000

    md/raid1: clean up request counts properly in close_sync()
    
    If there are outstanding writes when close_sync is called,
    the change to ->start_next_window might cause them to
    decrement the wrong counter when they complete.  Fix this
    by merging the two counters into the one that will be decremented.
    
    Having an incorrect value in a counter can cause raise_barrier()
    to hang, so this is suitable for -stable.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    cc: [email protected] (v3.13+)
    Signed-off-by: NeilBrown <[email protected]>

commit 8e2c8717c1812628b5538c05250057b37c66fdbe
Author: Frank Schaefer <[email protected]>
Date:   Thu Sep 18 17:55:45 2014 -0300

    [media] em28xx-v4l: get rid of field "users" in struct em28xx_v4l2"
    
    This reverts commit 747dba7de2a51a3db58b665ed3bc8c07921546ec.
    
    It breaks concurrent vbi and video capturing:
    While v4l2->users is the number of users of the whole device (all device nodes),
    v4l2_fh_is_singular() only checks the number of users of a specific device node.
    As a result, if one device node is open and a second device node is opened
    (closed), the device is reinitialized (streaming is stopped).
    
    Reported-by: Hans Verkuil <[email protected]>
    Tested-by: Mauro Carvalho Chehab <[email protected]>
    Signed-off-by: Frank Schäfer <[email protected]>
    Cc: [email protected]
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>

commit c6d119cf1b5a778e9ed60a006e2a434fcc4471a2
Author: NeilBrown <[email protected]>
Date:   Tue Sep 9 13:49:46 2014 +1000

    md/raid1:  be more cautious where we read-balance during resync.
    
    commit 79ef3a8aa1cb1523cc231c9a90a278333c21f761 made
    it possible for reads to happen concurrently with resync.
    This means that we need to be more careful where read_balancing
    is allowed during resync - we can no longer be sure that any
    resync that has already started will definitely finish.
    
    So keep read_balancing to before recovery_cp, which is conservative
    but safe.
    
    This bug makes it possible to read from a device that doesn't
    have up-to-date data, so it can cause data corruption.
    So it is suitable for any kernel since 3.11.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    cc: [email protected] (v3.13+)
    Signed-off-by: NeilBrown <[email protected]>

commit f0cc9a057151892b885be21a1d19b0185568281d
Author: NeilBrown <[email protected]>
Date:   Mon Sep 22 10:06:23 2014 +1000

    md/raid1: initialise start_next_window for READ case to avoid hang
    
    r1_bio->start_next_window is not initialised in the READ
    case, so allow_barrier may incorrectly decrement
       conf->current_window_requests
    which can cause raise_barrier() to block forever.
    
    Fixes: 79ef3a8aa1cb1523cc231c9a90a278333c21f761
    cc: [email protected] (v3.13+)
    Reported-by: Brassow Jonathan <[email protected]>
    Signed-off-by: NeilBrown <[email protected]>

commit c7854c2c5d692a329b4d9a9a73bcf36ae137ee7c
Author: Mauro Carvalho Chehab <[email protected]>
Date:   Fri Sep 19 13:02:12 2014 -0300

    [media] em28xx: fix VBI handling logic
    
    When both VBI and video are streaming, and video stream is stopped,
    a subsequent trial to restart it will fail, because S_FMT will
    return -EBUSY.
    
    That prevents applications like zvbi from working properly.
    
    Please note that, while this fixes it fully for zvbi, the best
    approach is to get rid of the streaming_users and res_get logic as a whole.
    
    However, this single-line patch is better suited for merging into -stable.
    
    Cc: [email protected]
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>

commit 91235537bc4b53f0b6f953acf963bcbb6215c49c
Author: Hans Verkuil <[email protected]>
Date:   Sat Sep 20 16:16:37 2014 -0300

    [media] DocBook media: improve the poll() documentation
    
    The poll documentation was incomplete: document how events (POLLPRI)
    are handled and fix the documentation of what poll does for display devices
    and streaming I/O.
    
    Signed-off-by: Hans Verkuil <[email protected]>
    Acked-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>

commit acf92046a0a666051f9c6b4a53d874c618203173
Author: Hans Verkuil <[email protected]>
Date:   Sat Sep 20 16:16:36 2014 -0300

    [media] DocBook media: fix the poll() 'no QBUF' documentation
    
    Clarify what poll() returns if STREAMON was called but not QBUF.
    Make explicit the different behavior for this scenario for
    capture and output devices.
    
    Signed-off-by: Hans Verkuil <[email protected]>
    Acked-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Mauro Carvalho Chehab <[email protected]>

commit 58d75f4b1ce26324b4d809b18f94819843a98731
Author: Hans Verkuil <[email protected]>
Date:   Sat Sep 20 16:16:35 2014 -0300

    [media] vb2: fix VBI/poll regression
    
    The recent conversion of saa7134 to vb2 uncovered a poll() bug that
    broke the teletext applications alevt and mtt. These applications
    expect that calling poll() without having called VIDIOC_STREAMON will
    cause poll() to return POLLERR. That did not happen in vb2.
    
    This patch fixes that behavior. It also fixes what should happen when
    poll() is called when STREAMON is called but no buffers have been
    queued. In that case poll() will also return POLLERR, but only for
    capture queues since output queues will always return POLLOUT
    anyway in that situation.
    
    This brings the vb2 behavior in line with the old videobuf behavior.
    
    Signed-off-by: Hans Verkuil <[email protected]>
    Acked-by: Laurent Pinchart <[email protected]>
    Signed-off-by: Mauro Carvalho …
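
The two poll() cases described in the vb2 commit above can be written as a small decision function. This is a sketch, not the vb2 implementation: `struct queue_view` and `early_poll_mask` are hypothetical names, and only the two early-exit cases from the commit message are modelled.

```c
#include <poll.h>
#include <stdbool.h>

/* Hypothetical, simplified view of a buffer queue. */
struct queue_view {
	bool streaming;       /* STREAMON has been called */
	bool is_output;       /* output queue rather than capture queue */
	unsigned int queued;  /* number of buffers queued so far */
};

/*
 * Returns a poll mask for the two early-exit cases from the commit
 * message, or 0 when neither applies and the caller would continue
 * to its normal wait path.
 */
int early_poll_mask(const struct queue_view *q)
{
	/* poll() before STREAMON reports an error. */
	if (!q->streaming)
		return POLLERR;

	/* Streaming but nothing queued: error for capture queues,
	 * while output queues are writable anyway. */
	if (q->queued == 0)
		return q->is_output ? POLLOUT : POLLERR;

	return 0;
}
```

An application that polls a capture device without having started streaming would therefore see POLLERR straight away instead of blocking, which is the behaviour the commit message says alevt and mtt rely on.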
koct9i pushed a commit to koct9i/linux that referenced this pull request Sep 27, 2014
ERROR: code indent should use tabs where possible
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#37: FILE: include/linux/mmdebug.h:33:
+        do {^I^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#38: FILE: include/linux/mmdebug.h:34:
+                if (unlikely(cond)) {^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#39: FILE: include/linux/mmdebug.h:35:
+                        dump_mm(mm);^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#40: FILE: include/linux/mmdebug.h:36:
+                        BUG();^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

WARNING: please, no spaces at the start of a line
#41: FILE: include/linux/mmdebug.h:37:
+                }^I^I^I^I^I^I^I\$

ERROR: code indent should use tabs where possible
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: please, no spaces at the start of a line
#42: FILE: include/linux/mmdebug.h:38:
+        } while (0)$

WARNING: Prefer [subsystem eg: netdev]_alert([subsystem]dev, ... then dev_alert(dev, ... then pr_alert(...  to printk(KERN_ALERT ...
#74: FILE: mm/debug.c:171:
+	printk(KERN_ALERT

total: 6 errors, 7 warnings, 109 lines checked

NOTE: whitespace errors detected, you may wish to use scripts/cleanpatch or
      scripts/cleanfile

./patches/mm-introduce-vm_bug_on_mm.patch has style problems, please review.

If any of these errors are false positives, please report
them to the maintainer, see CHECKPATCH in MAINTAINERS.

Please run checkpatch prior to sending patches

Cc: Sasha Levin <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
tom3q pushed a commit to tom3q/linux that referenced this pull request Oct 2, 2014
aryabinin pushed a commit to aryabinin/linux that referenced this pull request Oct 3, 2014
bengal pushed a commit to bengal/linux that referenced this pull request Oct 7, 2014
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Oct 12, 2015
…stress

Prevent faults that would occur during this sequence of activity during
network stress:

    rmmod visornic
    modprobe visornic
    /etc/init.d/network restart

The problem fixed was that the back-end IO partition was holding onto
stale receive buffers after the "rmmod visornic", and erroneously
completing them after a subsequent "modprobe visornic".  This is fixed
in this patch as follows:

* Tell the back-end IO partition that we want it to employ its
  "incarnation mechanism" to ensure it does not complete stale receive
  buffers after the guest virtual device environment changes (e.g., by
  re-loading the driver), by setting the
  ULTRA_IO_DRIVER_SUPPORTS_ENHANCED_RCVBUF_CHECKING feature bit, and
  supplying a unique incarnation number in rcvpost.unique_num for each
  receive buffer posted.

* When visornic loads, make sure we drain and ignore any possibly-stale
  data in the channel before beginning network operation.

Prior to this patch, faults like this would occur almost every time if
you attempted to rmmod + modprobe the visornic driver and restart the
network service during heavy network activity:

    BUG: spinlock bad magic on CPU#0, ksoftirqd/0/3
     lock: 0xffff88002d8a56d8, .magic: ffff8800, .owner: <none>/-1,
                               .owner_cpu: 2304
    CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G         C
           4.3.0-rc3-ARCH+ #74

Signed-off-by: Tim Sell <[email protected]>
Signed-off-by: Benjamin Romer <[email protected]>
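
The "incarnation mechanism" mentioned above can be pictured as a generation check on each receive buffer. The sketch below is purely illustrative: the struct and function names are hypothetical, the real check is performed by the back-end IO partition, and using a simple counter as the incarnation number is an assumption, since the commit message only says the number must be unique.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical receive buffer tagged with the incarnation it was posted in. */
struct rcv_buffer {
	uint64_t unique_num;
};

/* Hypothetical guest device environment with a current incarnation. */
struct device_env {
	uint64_t incarnation;
};

/* Reloading the driver starts a new incarnation (assumed: a counter). */
void device_env_reload(struct device_env *env)
{
	env->incarnation++;
}

/* Post a buffer, stamping it with the current incarnation number. */
struct rcv_buffer post_rcv_buffer(const struct device_env *env)
{
	struct rcv_buffer buf = { env->incarnation };

	return buf;
}

/* A buffer may only be completed if it belongs to the current incarnation. */
bool may_complete(const struct device_env *env, const struct rcv_buffer *buf)
{
	return buf->unique_num == env->incarnation;
}
```

A buffer posted before the rmmod and modprobe cycle carries the old number, so it is rejected afterwards, while buffers posted by the reloaded driver pass the check.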
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Oct 13, 2015
…stress

Prevent faults that would occur during this sequence of activity during
network stress:

    rmmod visornic
    modprobe visornic
    /etc/init.d/network restart

The problem fixed was that the back-end IO partition was holding onto
stale receive buffers after the "rmmod visornic", and erroneously
completing them after a subsequent "modprobe visornic".  This is fixed
in this patch as follows:

* Tell the back-end IO partition that we want it to employ its
  "incarnation mechanism" to ensure it does not complete stale receive
  buffers after the guest virtual device environment changes (e.g., by
  re-loading the driver), by setting the
  ULTRA_IO_DRIVER_SUPPORTS_ENHANCED_RCVBUF_CHECKING feature bit, and
  supplying a unique incarnation number in rcvpost.unique_num for each
  receive buffer posted.

* When visornic loads, make sure we drain and ignore any possibly-stale
  data in the channel before beginning network operation.

Prior to this patch, faults like this would occur almost every time if
you attempted to rmmod + modprobe the visornic driver and restart the
network service during heavy network activity:

    BUG: spinlock bad magic on CPU#0, ksoftirqd/0/3
     lock: 0xffff88002d8a56d8, .magic: ffff8800, .owner: <none>/-1,
                               .owner_cpu: 2304
    CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G         C
           4.3.0-rc3-ARCH+ #74

Signed-off-by: Tim Sell <[email protected]>
Signed-off-by: Benjamin Romer <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Jan 13, 2016
I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on thps, along with a background process that
calls migratepages command repeatedly (doing ping-pong among different
NUMA nodes) for the first process:

  [   52.556731] Soft offlining page 0x60000 at 0x700000600000
  [   52.592620] __get_any_page: 0x60000 free buddy page
  [   52.593451] page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
  [   52.594767] flags: 0x1fffc0000000000()
  [   52.595402] page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
  [   52.596602] ------------[ cut here ]------------
  [   52.597339] kernel BUG at /src/linux-dev/include/linux/mm.h:342!
  [   52.598284] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
  [   52.599193] Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
  [   52.600579] CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
  [   52.600579] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  [   52.600579] task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
  [   52.600579] RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
  [   52.600579] RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
  [   52.600579] RAX: 0000000000000044 RBX: ffffea0001800000 RCX: 0000000000000000
  [   52.600579] RDX: ffff88011f50f570 RSI: 0000000000000000 RDI: ffff88011f50cc18
  [   52.600579] RBP: ffff88007c213e08 R08: 000000000000000a R09: 000000000000149c
  [   52.600579] R10: ffff8800dac927f8 R11: 000000000000149c R12: ffffea0001800000
  [   52.600579] R13: 0000000000060000 R14: ffffea0001800000 R15: 0000000000000065
  [   52.600579] FS:  00007feb79d7d740(0000) GS:ffff88011f500000(0000) knlGS:0000000000000000
  [   52.600579] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  [   52.600579] CR2: 00007f3032cd2000 CR3: 00000000da6c4000 CR4: 00000000000006e0
  [   52.600579] Stack:
  [   52.600579]  ffffea0001800000 ffff88007c213e28 ffffffff811eb2ee ffffea0001800000
  [   52.600579]  00000000fffffffb ffff88007c213e70 ffffffff811eccd1 0000000000000018
  [   52.600579]  ffff88007c213e50 0000700000600000 0000700000601000 0000160000000000
  [   52.600579] Call Trace:
  [   52.600579]  [<ffffffff811eb2ee>] put_hwpoison_page+0x4e/0x80
  [   52.600579]  [<ffffffff811eccd1>] soft_offline_page+0x501/0x520
  [   52.600579]  [<ffffffff811bd18c>] SyS_madvise+0x6bc/0x6f0
  [   52.600579]  [<ffffffff8104d0ac>] ? fpu__restore_sig+0xcc/0x320
  [   52.600579]  [<ffffffff810a0003>] ? do_sigaction+0x73/0x1b0
  [   52.600579]  [<ffffffff8109ceb2>] ? __set_task_blocked+0x32/0x70
  [   52.600579]  [<ffffffff81652757>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   52.600579] Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
  [   52.600579] RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
  [   52.600579]  RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls put_hwpoison_page(),
expecting that the target page is put back on the LRU list.  But it can also
be freed to buddy, so the second check needs to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[3.9+]
Signed-off-by: Andrew Morton <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Jan 14, 2016
I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on thps, along with a background process that
calls migratepages command repeatedly (doing ping-pong among different
NUMA nodes) for the first process:

  [   52.556731] Soft offlining page 0x60000 at 0x700000600000
  [   52.592620] __get_any_page: 0x60000 free buddy page
  [   52.593451] page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
  [   52.594767] flags: 0x1fffc0000000000()
  [   52.595402] page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
  [   52.596602] ------------[ cut here ]------------
  [   52.597339] kernel BUG at /src/linux-dev/include/linux/mm.h:342!
  [   52.598284] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
  [   52.599193] Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
  [   52.600579] CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
  [   52.600579] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  [   52.600579] task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
  [   52.600579] RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
  [   52.600579] RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
  [   52.600579] RAX: 0000000000000044 RBX: ffffea0001800000 RCX: 0000000000000000
  [   52.600579] RDX: ffff88011f50f570 RSI: 0000000000000000 RDI: ffff88011f50cc18
  [   52.600579] RBP: ffff88007c213e08 R08: 000000000000000a R09: 000000000000149c
  [   52.600579] R10: ffff8800dac927f8 R11: 000000000000149c R12: ffffea0001800000
  [   52.600579] R13: 0000000000060000 R14: ffffea0001800000 R15: 0000000000000065
  [   52.600579] FS:  00007feb79d7d740(0000) GS:ffff88011f500000(0000) knlGS:0000000000000000
  [   52.600579] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  [   52.600579] CR2: 00007f3032cd2000 CR3: 00000000da6c4000 CR4: 00000000000006e0
  [   52.600579] Stack:
  [   52.600579]  ffffea0001800000 ffff88007c213e28 ffffffff811eb2ee ffffea0001800000
  [   52.600579]  00000000fffffffb ffff88007c213e70 ffffffff811eccd1 0000000000000018
  [   52.600579]  ffff88007c213e50 0000700000600000 0000700000601000 0000160000000000
  [   52.600579] Call Trace:
  [   52.600579]  [<ffffffff811eb2ee>] put_hwpoison_page+0x4e/0x80
  [   52.600579]  [<ffffffff811eccd1>] soft_offline_page+0x501/0x520
  [   52.600579]  [<ffffffff811bd18c>] SyS_madvise+0x6bc/0x6f0
  [   52.600579]  [<ffffffff8104d0ac>] ? fpu__restore_sig+0xcc/0x320
  [   52.600579]  [<ffffffff810a0003>] ? do_sigaction+0x73/0x1b0
  [   52.600579]  [<ffffffff8109ceb2>] ? __set_task_blocked+0x32/0x70
  [   52.600579]  [<ffffffff81652757>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   52.600579] Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
  [   52.600579] RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
  [   52.600579]  RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls put_hwpoison_page(),
expecting the target page to be put back on the LRU list.  But the page can
also be freed to buddy, so the second check needs to handle that case.
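
The failure mode described above can be sketched as a small userspace
model.  This is not the kernel code: struct model_page, model_try_get(),
model_put() and model_second_check() are hypothetical stand-ins, and the
"only drop the reference if the retry actually took one" condition is an
assumed shape of the fix, since the message only says that the second
check needs to handle the freed-to-buddy case.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct model_page {
	int  refcount;  /* models page->_count */
	bool on_lru;    /* models PageLRU(page) */
	bool in_buddy;  /* page has been freed to the buddy allocator */
};

/* Models the retry: returns 1 if a reference was taken, 0 if the page is free. */
static int model_try_get(struct model_page *p)
{
	if (p->in_buddy)
		return 0;
	p->refcount++;
	return 1;
}

/*
 * Models dropping a reference.  Returns false where the real kernel would
 * hit VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0).
 */
static bool model_put(struct model_page *p)
{
	if (p->refcount == 0)
		return false;
	p->refcount--;
	return true;
}

/*
 * Models the second check after the retry.
 * check_ret == false: drop the reference whenever the page is not on the
 *                     LRU, even if the retry took no reference.
 * check_ret == true:  drop it only if the retry really took a reference
 *                     (assumed shape of the fix).
 * Returns false if an unbalanced put was attempted.
 */
static bool model_second_check(struct model_page *p, bool check_ret)
{
	int ret = model_try_get(p);

	if ((!check_ret || ret == 1) && !p->on_lru)
		return model_put(p);
	return true;
}
```

In this model, a page with refcount 0 that has been freed to buddy makes
model_second_check() report an unbalanced put when check_ret is false, and
is left alone when check_ret is true.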

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[3.9+]
Signed-off-by: Andrew Morton <[email protected]>
0day-ci pushed a commit to 0day-ci/linux that referenced this pull request Jan 15, 2016
I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

  [   52.556731] Soft offlining page 0x60000 at 0x700000600000
  [   52.592620] __get_any_page: 0x60000 free buddy page
  [   52.593451] page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
  [   52.594767] flags: 0x1fffc0000000000()
  [   52.595402] page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
  [   52.596602] ------------[ cut here ]------------
  [   52.597339] kernel BUG at /src/linux-dev/include/linux/mm.h:342!
  [   52.598284] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
  [   52.599193] Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
  [   52.600579] CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
  [   52.600579] Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  [   52.600579] task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
  [   52.600579] RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
  [   52.600579] RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
  [   52.600579] RAX: 0000000000000044 RBX: ffffea0001800000 RCX: 0000000000000000
  [   52.600579] RDX: ffff88011f50f570 RSI: 0000000000000000 RDI: ffff88011f50cc18
  [   52.600579] RBP: ffff88007c213e08 R08: 000000000000000a R09: 000000000000149c
  [   52.600579] R10: ffff8800dac927f8 R11: 000000000000149c R12: ffffea0001800000
  [   52.600579] R13: 0000000000060000 R14: ffffea0001800000 R15: 0000000000000065
  [   52.600579] FS:  00007feb79d7d740(0000) GS:ffff88011f500000(0000) knlGS:0000000000000000
  [   52.600579] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  [   52.600579] CR2: 00007f3032cd2000 CR3: 00000000da6c4000 CR4: 00000000000006e0
  [   52.600579] Stack:
  [   52.600579]  ffffea0001800000 ffff88007c213e28 ffffffff811eb2ee ffffea0001800000
  [   52.600579]  00000000fffffffb ffff88007c213e70 ffffffff811eccd1 0000000000000018
  [   52.600579]  ffff88007c213e50 0000700000600000 0000700000601000 0000160000000000
  [   52.600579] Call Trace:
  [   52.600579]  [<ffffffff811eb2ee>] put_hwpoison_page+0x4e/0x80
  [   52.600579]  [<ffffffff811eccd1>] soft_offline_page+0x501/0x520
  [   52.600579]  [<ffffffff811bd18c>] SyS_madvise+0x6bc/0x6f0
  [   52.600579]  [<ffffffff8104d0ac>] ? fpu__restore_sig+0xcc/0x320
  [   52.600579]  [<ffffffff810a0003>] ? do_sigaction+0x73/0x1b0
  [   52.600579]  [<ffffffff8109ceb2>] ? __set_task_blocked+0x32/0x70
  [   52.600579]  [<ffffffff81652757>] entry_SYSCALL_64_fastpath+0x12/0x6a
  [   52.600579] Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
  [   52.600579] RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
  [   52.600579]  RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls put_hwpoison_page(),
expecting the target page to be put back on the LRU list.  But the page can
also be freed to buddy, so the second check needs to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[3.9+]
Signed-off-by: Andrew Morton <[email protected]>
torvalds pushed a commit that referenced this pull request Jan 17, 2016
I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

   Soft offlining page 0x60000 at 0x700000600000
   __get_any_page: 0x60000 free buddy page
   page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
   flags: 0x1fffc0000000000()
   page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/include/linux/mm.h:342!
   invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
   Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
   CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
   RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
   RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
   Call Trace:
     put_hwpoison_page+0x4e/0x80
     soft_offline_page+0x501/0x520
     SyS_madvise+0x6bc/0x6f0
     entry_SYSCALL_64_fastpath+0x12/0x6a
   Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
   RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
    RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls
put_hwpoison_page(), expecting the target page to be put back on the LRU
list.  But the page can also be freed to buddy, so the second check needs
to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[3.9+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Feb 16, 2016
[ Upstream commit d96b339 ]

I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

   Soft offlining page 0x60000 at 0x700000600000
   __get_any_page: 0x60000 free buddy page
   page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
   flags: 0x1fffc0000000000()
   page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/include/linux/mm.h:342!
   invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
   Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
   CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
   RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
   RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
   Call Trace:
     put_hwpoison_page+0x4e/0x80
     soft_offline_page+0x501/0x520
     SyS_madvise+0x6bc/0x6f0
     entry_SYSCALL_64_fastpath+0x12/0x6a
   Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
   RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
    RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls
put_hwpoison_page(), expecting the target page to be put back on the LRU
list.  But the page can also be freed to buddy, so the second check needs
to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[3.9+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Feb 25, 2016
commit d96b339 upstream.

I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

   Soft offlining page 0x60000 at 0x700000600000
   __get_any_page: 0x60000 free buddy page
   page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
   flags: 0x1fffc0000000000()
   page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/include/linux/mm.h:342!
   invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
   Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
   CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
   RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
   RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
   Call Trace:
     put_hwpoison_page+0x4e/0x80
     soft_offline_page+0x501/0x520
     SyS_madvise+0x6bc/0x6f0
     entry_SYSCALL_64_fastpath+0x12/0x6a
   Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
   RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
    RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls
put_hwpoison_page(), expecting the target page to be put back on the LRU
list.  But the page can also be freed to buddy, so the second check needs
to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Noltari pushed a commit to Noltari/linux that referenced this pull request Feb 26, 2016
commit d96b339 upstream.

I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

   Soft offlining page 0x60000 at 0x700000600000
   __get_any_page: 0x60000 free buddy page
   page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
   flags: 0x1fffc0000000000()
   page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/include/linux/mm.h:342!
   invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
   Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
   CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
   RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
   RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
   Call Trace:
     put_hwpoison_page+0x4e/0x80
     soft_offline_page+0x501/0x520
     SyS_madvise+0x6bc/0x6f0
     entry_SYSCALL_64_fastpath+0x12/0x6a
   Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
   RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
    RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls
put_hwpoison_page(), expecting the target page to be put back on the LRU
list.  But the page can also be freed to buddy, so the second check needs
to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Jiri Slaby <[email protected]>
ddstreet pushed a commit to ddstreet/linux that referenced this pull request Feb 29, 2016
BugLink: http://bugs.launchpad.net/bugs/1540532

commit d96b339 upstream.

I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

   Soft offlining page 0x60000 at 0x700000600000
   __get_any_page: 0x60000 free buddy page
   page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
   flags: 0x1fffc0000000000()
   page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/include/linux/mm.h:342!
   invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
   Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
   CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
   RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
   RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
   Call Trace:
     put_hwpoison_page+0x4e/0x80
     soft_offline_page+0x501/0x520
     SyS_madvise+0x6bc/0x6f0
     entry_SYSCALL_64_fastpath+0x12/0x6a
   Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
   RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
    RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls
put_hwpoison_page(), expecting the target page to be put back on the LRU
list.  But the page can also be freed to buddy, so the second check needs
to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Kamal Mostafa <[email protected]>
sashalevin pushed a commit to sashalevin/linux-stable-security that referenced this pull request Apr 29, 2016
[ Upstream commit d96b339 ]

I saw the following BUG_ON triggered in a testcase where a process calls
madvise(MADV_SOFT_OFFLINE) on THPs while a background process repeatedly
runs the migratepages command against it (ping-ponging its pages among
different NUMA nodes):

   Soft offlining page 0x60000 at 0x700000600000
   __get_any_page: 0x60000 free buddy page
   page:ffffea0001800000 count:0 mapcount:-127 mapping:          (null) index:0x1
   flags: 0x1fffc0000000000()
   page dumped because: VM_BUG_ON_PAGE(atomic_read(&page->_count) == 0)
   ------------[ cut here ]------------
   kernel BUG at /src/linux-dev/include/linux/mm.h:342!
   invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
   Modules linked in: cfg80211 rfkill crc32c_intel serio_raw virtio_balloon i2c_piix4 virtio_blk virtio_net ata_generic pata_acpi
   CPU: 3 PID: 3035 Comm: test_alloc_gene Tainted: G           O    4.4.0-rc8-v4.4-rc8-160107-1501-00000-rc8+ #74
   Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
   task: ffff88007c63d5c0 ti: ffff88007c210000 task.ti: ffff88007c210000
   RIP: 0010:[<ffffffff8118998c>]  [<ffffffff8118998c>] put_page+0x5c/0x60
   RSP: 0018:ffff88007c213e00  EFLAGS: 00010246
   Call Trace:
     put_hwpoison_page+0x4e/0x80
     soft_offline_page+0x501/0x520
     SyS_madvise+0x6bc/0x6f0
     entry_SYSCALL_64_fastpath+0x12/0x6a
   Code: 8b fc ff ff 5b 5d c3 48 89 df e8 b0 fa ff ff 48 89 df 31 f6 e8 c6 7d ff ff 5b 5d c3 48 c7 c6 08 54 a2 81 48 89 df e8 a4 c5 01 00 <0f> 0b 66 90 66 66 66 66 90 55 48 89 e5 41 55 41 54 53 48 8b 47
   RIP  [<ffffffff8118998c>] put_page+0x5c/0x60
    RSP <ffff88007c213e00>

The root cause resides in get_any_page(), which retries getting a refcount
on the page to be soft-offlined.  This function calls
put_hwpoison_page(), expecting the target page to be put back on the LRU
list.  But the page can also be freed to buddy, so the second check needs
to handle that case.

Fixes: af8fae7 ("mm/memory-failure.c: clean up soft_offline_page()")
Signed-off-by: Naoya Horiguchi <[email protected]>
Cc: Sasha Levin <[email protected]>
Cc: Aneesh Kumar K.V <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Jerome Marchand <[email protected]>
Cc: Andrea Arcangeli <[email protected]>
Cc: Hugh Dickins <[email protected]>
Cc: Dave Hansen <[email protected]>
Cc: Mel Gorman <[email protected]>
Cc: Rik van Riel <[email protected]>
Cc: Steve Capper <[email protected]>
Cc: Johannes Weiner <[email protected]>
Cc: Michal Hocko <[email protected]>
Cc: Christoph Lameter <[email protected]>
Cc: David Rientjes <[email protected]>
Cc: <[email protected]>	[3.9+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 18, 2023
[ Upstream commit 066b867 ]

Assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel CPU stuck message
shown below. If this does not happen, just retry.
The output of bpftrace for the functions mentioned in that message is
also included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to work in the
   background, as an ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If, after 7. but before 9., a packet is sent to the ovs vport (which is
not deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (if the packet has an rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.
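
To see why a queue count of zero never terminates, here is a simplified userspace model of such a reduction loop. The function name `reduce_to_queue` and the `max_steps` cap are hypothetical; the cap exists only so the demonstration returns, and the loop described above has no such limit:

```c
#include <stdint.h>

/*
 * Userspace model of a "subtract until in range" loop.
 * Returns the number of subtraction steps taken, or -1 if the loop made no
 * progress within max_steps (hypothetical cap, for demonstration only).
 */
static int reduce_to_queue(uint16_t hash, uint16_t qcount, int max_steps,
                           uint16_t *out)
{
    int steps = 0;

    while (hash >= qcount) {
        if (steps == max_steps)
            return -1;          /* without the cap this would spin forever */
        hash -= qcount;         /* subtracting 0 never changes hash */
        steps++;
    }
    *out = hash;
    return steps;
}
```

With 32 queues, a recorded queue of 40 reduces to 8 in one step. With 0 queues the condition `hash >= 0` is always true for an unsigned value, so the model only returns because of the cap.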

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).
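
The decision described above can be sketched in userspace as follows. The `model_*` types and the `carrier_ok` field are hypothetical stand-ins for the kernel structures and the carrier check; this is an illustration of the idea, not the patched function:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and the vport. */
struct model_dev {
    bool carrier_ok;               /* models "the device has carrier" */
};

struct model_vport {
    struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * A vport that is missing, or whose device has no carrier, takes the same
 * drop path.
 */
static enum output_result model_do_output(const struct model_vport *vport)
{
    if (vport != NULL && vport->dev->carrier_ok)
        return OUTPUT_SENT;        /* would hand the packet to the vport */
    return OUTPUT_DROPPED;         /* would free the packet instead */
}
```

Treating "no carrier" like "no vport" means the window between steps 7 and 9 behaves the same as the state after step 9, so no packet reaches the transmit path of a device that is being unregistered.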

Additionally, we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.
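
A rough userspace sketch of such a guard. The message does not say how the warning is emitted, so the one-shot flag, the message text and the function name below are assumptions:

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

static bool warned_zero_queues;   /* hypothetical one-shot flag */

/*
 * Returns true if selecting a queue with this count would loop forever,
 * printing a warning the first time that happens.
 */
static bool warn_if_no_tx_queues(uint16_t qcount)
{
    if (qcount != 0)
        return false;
    if (!warned_zero_queues) {
        warned_zero_queues = true;
        fprintf(stderr, "warning: tx queue selection with zero real tx queues\n");
    }
    return true;
}
```

The warning alone does not avoid the loop; it only makes the condition visible if some other path reaches the hash with a zero queue count.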

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 18, 2023
[ Upstream commit 066b867 ]

Assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel CPU stuck message
shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. The network namespace is deleted, which calls
   `unregister_netdevice_many_notify` somewhere in the process
2. This first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. It then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. This is then handled by `dp_device_event`, which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. This removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to run in the
   background, as it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues and calls
   `netdev_unregister_kobject`, which sets `real_num_tx_queues` to 0
8. Port deletion continues (but details are not relevant for this issue)
9. At some future point the background task deletes the vport

If a packet is sent to the ovs vport after 7. but before 9. (the vport
is not yet deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.
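
The loop described above can be modeled in user space. The following is a simplified sketch, not the kernel source: the function names and the `MAX_ITERATIONS` cap are hypothetical and exist only so that the demonstration terminates. It assumes the mapping works by repeatedly subtracting the queue count from the recorded rx queue until the value falls below the queue count.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical cap so the model terminates. For any qcount >= 1 a 16-bit
 * value needs at most 65535 subtractions, so the cap is only ever hit
 * when qcount == 0. The loop described in the commit message has no cap.
 */
#define MAX_ITERATIONS 65536u

/*
 * Simplified model of mapping a recorded rx queue to a tx queue.
 * Returns true and stores the result in *txq when the loop terminates,
 * false when the cap is hit (which stands in for the soft lockup).
 */
static bool map_rx_to_tx_queue(uint16_t rx_queue, uint16_t qcount,
                               uint16_t *txq)
{
    uint16_t hash = rx_queue;
    unsigned int iterations = 0;

    while (hash >= qcount) {
        /* With qcount == 0 the condition is always true and the
         * subtraction below never changes hash. */
        if (iterations++ >= MAX_ITERATIONS)
            return false;
        hash -= qcount;
    }

    *txq = hash;
    return true;
}

/* Guarded variant: refuse to map when the device has no tx queues left. */
static bool map_rx_to_tx_queue_guarded(uint16_t rx_queue, uint16_t qcount,
                                       uint16_t *txq)
{
    if (qcount == 0)
        return false;
    return map_rx_to_tx_queue(rx_queue, qcount, txq);
}
```

Calling `map_rx_to_tx_queue(5, 0, &txq)` only returns because of the artificial cap; without it the loop would spin forever, which matches the soft lockup reported by the watchdog.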

To prevent this from happening, we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).

Additionally, we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.
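
The shape of the `do_output` change can be sketched as a small user-space model. All types and names below are hypothetical stand-ins rather than the kernel's own structures. The sketch assumes that a device in the unregistration path reports no carrier by the time its tx queue count has dropped to zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the parts of a network device used here. */
struct model_dev {
    bool carrier_ok;                 /* models the device's carrier state */
    unsigned int real_num_tx_queues; /* 0 once the kobject is unregistered */
};

/* Hypothetical stand-in for an ovs vport that wraps a device. */
struct model_vport {
    struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * Model of the output decision: a missing vport and a vport whose device
 * has no carrier both take the drop path, so the transmit path is never
 * entered for a device that is being torn down.
 */
static enum output_result model_do_output(const struct model_vport *vport)
{
    if (vport == NULL || !vport->dev->carrier_ok)
        return OUTPUT_DROPPED;
    return OUTPUT_SENT;
}
```

In this model the packet sent between steps 7 and 9 is dropped in the same way as a packet sent after step 9, instead of reaching the queue selection loop.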

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 18, 2023
[ Upstream commit 066b867 ]

Assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. Two network namespaces, "server" and "client"
3. Two ovs interfaces, "server" and "client", on the bridge
4. For each ovs interface, a veth pair with a matching name and 32 rx and
   tx queues
5. Move the ends of the veth pairs to the respective network namespaces
6. Assign IP addresses to each of the veth ends in the namespaces (they
   need to be in the same subnet)
7. Start an HTTP server in the server network namespace
8. Test that a client in the client namespace can reach the HTTP server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. Send a large number of parallel requests to the HTTP server (around
   3000 curls should work)
2. In parallel, delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)
There is a low chance that this will cause the kernel CPU stuck message
shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion as background
   work, because it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If, after 7. but before 9., a packet is sent to the ovs vport (which is
not deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (if the packet has an rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.
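
The loop described above can be modelled in user space. The sketch below is not the kernel source: `reduce_rx_queue` and its `max_steps` bound are hypothetical names, introduced only so the non-terminating case can be observed without hanging. It assumes the loop maps the recorded rx queue onto a tx queue by repeatedly subtracting the tx queue count.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * User-space model (an assumption, not kernel code) of a loop that maps a
 * recorded rx queue index onto a tx queue by repeated subtraction.
 * max_steps is a hypothetical bound so the qcount == 0 case can be observed.
 * Returns true and stores the result in *out if the loop finished in time.
 */
bool reduce_rx_queue(uint16_t rx_queue, uint16_t qcount,
                     unsigned int max_steps, uint16_t *out)
{
    uint16_t hash = rx_queue;
    unsigned int steps = 0;

    assert(out != NULL);

    while (hash >= qcount) {
        if (steps++ == max_steps)
            return false;   /* gave up: the loop made no progress */
        hash -= qcount;     /* subtracting 0 leaves hash unchanged */
    }
    *out = hash;
    return true;
}
```

With a non-zero `qcount` the loop ends after a few subtractions. With `qcount == 0` the condition `hash >= qcount` always holds for an unsigned value, so only the artificial bound stops it.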

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).
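
A minimal user-space sketch of that decision follows. The types `model_dev` and `model_vport`, the `carrier_ok` field and the function `model_do_output` are hypothetical stand-ins, not the openvswitch structures; the sketch only illustrates that "no vport" and "device without carrier" take the same path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * needed for the check are modelled. */
struct model_dev {
    bool carrier_ok;                  /* models the device carrier state */
    unsigned int real_num_tx_queues;  /* models the tx queue count */
};

struct model_vport {
    struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * Model of the output decision: a vport that is missing, or whose
 * device has no carrier, is handled the same way (the packet is dropped).
 */
enum output_result model_do_output(const struct model_vport *vport)
{
    if (vport != NULL && vport->dev != NULL && vport->dev->carrier_ok) {
        /* Assumption of this model: a device with carrier has tx queues. */
        assert(vport->dev->real_num_tx_queues > 0);
        return OUTPUT_SENT;     /* would continue to the transmit path */
    }
    return OUTPUT_DROPPED;      /* same handling as "vport not found" */
}
```

In this model an unregistering device is represented by `carrier_ok == false`, so the transmit path with zero tx queues is never reached.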

Additionally we now produce a warning in `skb_tx_hash` if we will hit
the infinite loop.
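
As an illustration of such a guard, here is a user-space sketch. `guarded_pick_queue` is a hypothetical name, and returning -1 after the warning is an assumption of this sketch; the commit message only states that a warning is produced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Model of a guarded queue pick (an assumption, not the kernel code):
 * report the zero-queue case instead of entering the subtraction loop.
 * Returns the selected queue index, or -1 if qcount is zero.
 */
int guarded_pick_queue(uint16_t rx_queue, uint16_t qcount)
{
    uint16_t hash = rx_queue;

    if (qcount == 0) {
        fprintf(stderr, "warning: tx queue count is zero, no queue selected\n");
        return -1;
    }
    while (hash >= qcount)
        hash -= qcount;

    assert(hash < qcount);  /* the loop only exits with a valid index */
    return hash;
}
```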

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the steps below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel CPU stuck message
shown below. If this does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion as background
   work, because it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport
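
The state changes made by the numbered steps can be sketched as a small
userspace model. This is only an illustration: `teardown_state`,
`apply_step` and `in_danger_window` are hypothetical names, and only the
three fields that the steps above are said to change are modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the state touched by the teardown steps above. */
struct teardown_state {
    bool rx_handler_registered;      /* cleared in step 5 */
    bool vport_present;              /* cleared in step 9 */
    unsigned int real_num_tx_queues; /* set to 0 in step 7 */
};

/* Apply the effect of one numbered step (1-9) to the model. */
void apply_step(struct teardown_state *s, int step)
{
    switch (step) {
    case 5:
        s->rx_handler_registered = false;
        break;
    case 7:
        s->real_num_tx_queues = 0;
        break;
    case 9:
        s->vport_present = false;
        break;
    default:
        break; /* the other steps do not change the modelled fields */
    }
}

/* True while the vport still exists but its device has no tx queues. */
bool in_danger_window(const struct teardown_state *s)
{
    return s->vport_present && s->real_num_tx_queues == 0;
}
```

In this model the window opens once step 7 has been applied and closes
again with step 9.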

If, after step 7 but before step 9, a packet is sent to the ovs vport
(which is not yet deleted at this point in time), the vport forwards it
to the `dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.
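
Why a zero queue count makes such a loop spin forever can be shown with
a userspace sketch. It assumes the loop reduces the recorded rx queue by
repeated subtraction of the queue count; `reduce_to_tx_queue` and its
`max_iter` guard are hypothetical and exist only so the demo terminates.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model of a queue-reduction loop. `hash` stands in for the
 * recorded rx queue and `qcount` for dev->real_num_tx_queues.
 * `max_iter` is a hypothetical guard added only so that this demo
 * terminates; the modelled loop has no such guard.
 *
 * Returns true and stores the chosen queue in *queue if the loop ends,
 * false if the guard trips (the unguarded loop would never finish).
 */
bool reduce_to_tx_queue(uint16_t hash, uint16_t qcount,
                        unsigned int max_iter, uint16_t *queue)
{
    unsigned int i = 0;

    while (hash >= qcount) {
        if (i++ >= max_iter)
            return false; /* with qcount == 0, subtracting makes no progress */
        hash -= qcount;
    }
    *queue = hash;
    return true;
}
```

With 32 queues, a recorded rx queue of 35 is reduced to 3 after one
subtraction. With 0 queues the condition `hash >= qcount` holds for every
value of `hash`, so only the guard stops the loop.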

To prevent this from happening, we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after step 9 is done).
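
The shape of such a guard can be sketched in userspace as follows. This
is not the kernel code: `model_dev`, `model_vport`, `model_do_output` and
the result enum are hypothetical stand-ins, and dropping the packet in
the "not found" case is an assumption of this sketch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_dev {
    bool carrier_ok;                 /* models the device carrier state */
    unsigned int real_num_tx_queues;
};

struct model_vport {
    struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * Sketch of the guard: a vport whose device has no carrier is treated
 * exactly like a missing vport, so the packet never enters the transmit
 * path of a device that is being torn down.
 */
enum output_result model_do_output(const struct model_vport *vport)
{
    if (vport != NULL && vport->dev->carrier_ok)
        return OUTPUT_SENT;    /* would be handed to the vport send path */
    return OUTPUT_DROPPED;     /* assumed handling for a missing vport */
}
```

The point of the design is that the check happens before any tx queue
selection, so a queue count of zero is never consulted for such a device.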

Additionally, we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.
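
A check of this kind might look like the following userspace sketch.
The function name, the message text and the once-only behaviour are
assumptions made for the demo and are not taken from the commit.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Userspace sketch of a "warn once" check on the tx queue count.
 * Returns true only for the call that actually emits the warning.
 */
bool warn_once_if_no_tx_queues(unsigned int qcount)
{
    static bool warned; /* remembers whether the warning was already printed */

    if (qcount != 0 || warned)
        return false;
    warned = true;
    fprintf(stderr, "warning: device has zero real tx queues\n");
    return true;
}
```

Note that a warning alone only makes the condition visible; in this
sketch the caller still has to avoid entering the loop.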

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

when following the actions below the host has a chance of getting a cpu
stuck in a infinite loop:
1. send a large amount of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

there is a low chance that this will cause the below kernel cpu stuck
message. If this does not happen just retry.
Below there is also the output of bpftrace for the functions mentioned
in the output.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packages to be sent to the device
6. `dp_device_event` then queues the vport deletion to work in
   background as a ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If after 7. but before 9. a packet is send to the ovs vport (which is
not deleted at this point in time) which forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit`) path there is
a while loop (if the packet has a rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.

To prevent this, we update `do_output` to handle devices without
carrier the same way as if the device is not found (which would be the
code path after step 9 is done).

Additionally, we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the steps below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. Send a large number of parallel requests to the HTTP server (around
   3000 curls should work)
2. In parallel, delete the network namespace (do not delete interfaces
   or stop the server, just kill the namespace)

There is a low chance that this will cause the kernel "CPU stuck"
message shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted, which calls
   `unregister_netdevice_many_notify` somewhere in the process
2. this first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event`, which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion as background work,
   because it needs an ovs_lock that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues and calls
   `netdev_unregister_kobject`, which sets `real_num_tx_queues` to 0
8. port deletion continues (the details are not relevant for this issue)
9. at some future point the background task deletes the vport

If, after step 7 but before step 9, a packet is sent to the ovs vport
(which is not yet deleted at this point in time), the vport forwards it
to the `dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.

To prevent this, we update `do_output` to handle devices without
carrier the same way as if the device is not found (which would be the
code path after step 9 is done).

Additionally, we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a cpu
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this triggers the kernel cpu stuck message
shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted, which calls
   `unregister_netdevice_many_notify` somewhere in the process
2. this first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to run in the
   background, as it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If a packet is sent to the ovs vport after 7. but before 9. (the vport
is not yet deleted at that point), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that never
terminates if `dev->real_num_tx_queues` is zero.
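
To make the non-termination concrete, here is a minimal userspace sketch
of a "reduce by repeated subtraction" loop of the kind described above.
It is not the kernel's `skb_tx_hash`; the function name
`model_reduce_rx_queue` and the `max_steps` bound are assumptions added
only so the model can report that the loop would keep spinning instead
of hanging.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Userspace model (not kernel code): map a recorded rx queue index into
 * [0, qcount) by repeated subtraction. max_steps is a hypothetical bound
 * that lets the model return false instead of looping forever.
 */
static bool model_reduce_rx_queue(uint16_t hash, uint16_t qcount,
                                  unsigned int max_steps, uint16_t *out)
{
    unsigned int steps = 0;

    while (hash >= qcount) {
        if (steps++ == max_steps)
            return false;   /* still spinning after max_steps rounds */
        hash -= qcount;     /* makes no progress when qcount == 0 */
    }
    *out = hash;
    return true;
}
```

With `qcount` at 32, a recorded queue of 37 reduces to 5 after one
subtraction. With `qcount` at 0 the condition `hash >= 0` always holds
and the subtraction changes nothing, so only the artificial bound ends
the loop.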

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).
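
The decision described above can be sketched in userspace as follows.
This is a simplified model under stated assumptions, not the actual
`do_output` code: the types `model_dev` and `model_vport`, the
`carrier_ok` field and the verdict enum are hypothetical stand-ins, and
the model assumes the unregistering device has already lost carrier by
the time its tx queue count drops to zero.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the device and vport; not kernel types. */
struct model_dev {
    bool carrier_ok;    /* models "device has carrier" */
};

struct model_vport {
    struct model_dev *dev;
};

enum model_verdict { MODEL_SENT, MODEL_DROPPED };

/*
 * Model of the output decision: a vport whose device has no carrier is
 * treated exactly like a vport that was not found, so the packet never
 * reaches the transmit path of the half-unregistered device.
 */
static enum model_verdict model_do_output(const struct model_vport *vport)
{
    if (vport != NULL && vport->dev->carrier_ok)
        return MODEL_SENT;
    return MODEL_DROPPED;
}
```

In this model both a missing vport (`NULL`) and a vport without carrier
end in `MODEL_DROPPED`, which mirrors the statement that the two cases
share one code path.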

Additionally we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.
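
A warn-once guard of this kind could look roughly like the userspace
sketch below. All names (`model_warn_on_once`, `model_tx_hash`,
`model_warn_count`) are hypothetical, and returning -1 after the warning
is a choice made for this model only; the commit message above only
states that a warning is produced.

```c
#include <stdbool.h>
#include <stdio.h>

/* Counts how often the warning was actually printed (model only). */
static unsigned int model_warn_count;

/* Prints the warning the first time cond is true; always returns cond. */
static bool model_warn_on_once(bool cond, const char *what)
{
    static bool warned;

    if (cond && !warned) {
        warned = true;
        model_warn_count++;
        fprintf(stderr, "WARNING: %s\n", what);
    }
    return cond;
}

/*
 * Model of a tx hash with a guard in front of the subtraction loop.
 * Returns -1 (an assumption of this sketch) when there are no tx queues.
 */
static int model_tx_hash(unsigned int rx_queue, unsigned int qcount)
{
    if (model_warn_on_once(qcount == 0, "no tx queues, loop would not end"))
        return -1;
    while (rx_queue >= qcount)
        rx_queue -= qcount;
    return (int)rx_queue;
}
```

Calling `model_tx_hash` repeatedly with `qcount` at 0 prints the warning
once and returns -1 each time, so a log is not flooded while the
condition persists.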

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 19, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a cpu
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this triggers the kernel cpu stuck message
shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted, which calls
   `unregister_netdevice_many_notify` somewhere in the process
2. this first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to run in the
   background, as it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If a packet is sent to the ovs vport after 7. but before 9. (the vport
is not yet deleted at that point), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that never
terminates if `dev->real_num_tx_queues` is zero.

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).

Additionally we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below the host has a chance of getting a cpu
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel cpu stuck message
shown below. If this does not happen just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion as background work,
   because it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport
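
The numbered steps above can be modelled in a few lines of user-space C to
make the dangerous window explicit. This is only a sketch: `model_state`,
`apply_step` and `in_race_window` are hypothetical names that do not exist
in the kernel, and the model tracks only the fields the description
mentions.

```c
#include <stdbool.h>

/* Hypothetical model of the state touched by steps 1-9 (not kernel code). */
struct model_state {
    bool unregistering;              /* set in step 2 (NETREG_UNREGISTERING) */
    bool rx_handler_attached;        /* cleared in step 5 */
    bool vport_present;              /* cleared in step 9 by background work */
    unsigned int real_num_tx_queues; /* set to 0 in step 7 */
};

/* Apply the effect of one numbered step (1..9) to the model. */
static void apply_step(struct model_state *s, int step)
{
    switch (step) {
    case 2: s->unregistering = true; break;
    case 5: s->rx_handler_attached = false; break;
    case 7: s->real_num_tx_queues = 0; break;
    case 9: s->vport_present = false; break;
    default: break; /* the other steps do not change this model */
    }
}

/*
 * The window described in the text: the vport can still be chosen as an
 * output target while the device behind it has no tx queues left.
 */
static bool in_race_window(const struct model_state *s)
{
    return s->vport_present && s->real_num_tx_queues == 0;
}
```

Starting from a state with 32 tx queues and the vport present,
`in_race_window` is false through step 6, becomes true at step 7 and
stays true until step 9 removes the vport.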

If a packet is sent to the ovs vport after 7. but before 9. (the vport
is not yet deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.
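
Why a queue count of zero makes such a loop spin can be shown with a small
user-space model. This is an illustration under assumptions, not the kernel
source: `pick_queue_bounded` and its `max_iters` cap are hypothetical and
exist only so that the demonstration terminates.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of reducing a recorded rx queue index into the range [0, qcount)
 * by repeated subtraction.
 *
 * Returns true and stores the selected queue in *out if the loop finishes,
 * or false if it is still running after max_iters iterations.
 */
static bool pick_queue_bounded(uint16_t rx_queue, uint16_t qcount,
                               unsigned int max_iters, uint16_t *out)
{
    uint16_t hash = rx_queue;
    unsigned int iters = 0;

    while (hash >= qcount) {
        if (iters++ >= max_iters)
            return false; /* no progress: subtracting 0 changes nothing */
        hash -= qcount;
    }
    *out = hash;
    return true;
}
```

With `qcount` of 32 the loop ends after at most a few subtractions. With
`qcount` of 0 the condition `hash >= 0` holds for every unsigned value and
the subtraction never changes `hash`, so only the artificial cap stops it.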

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).
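
The shape of that check can be sketched as follows. This is a simplified
user-space model and not the actual patch: `model_dev`, `model_vport` and
`model_do_output` are hypothetical stand-ins, and the model assumes that a
device in the unregistering state has already lost its carrier.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's net_device and vport. */
struct model_dev {
    bool carrier_ok;
    unsigned int real_num_tx_queues;
};

struct model_vport {
    struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * Transmit only if the vport exists and its device still has carrier.
 * A device without carrier is treated like a vport that was not found,
 * so the packet is dropped before any tx queue is selected.
 */
static enum output_result model_do_output(const struct model_vport *vport)
{
    if (vport != NULL && vport->dev != NULL && vport->dev->carrier_ok)
        return OUTPUT_SENT;
    return OUTPUT_DROPPED;
}
```

In this model a vport whose device has `carrier_ok == false` and zero tx
queues yields `OUTPUT_DROPPED`, the same result as passing `NULL`.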

Additionally we now produce a warning in `skb_tx_hash` if we will hit
the infinite loop.
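
A warn-once check of this kind might look like the sketch below. It is a
user-space analogue under assumptions: `warn_once_if_no_queues` is a
hypothetical helper, and the message text is invented for illustration
rather than taken from the kernel.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Report whether the queue count is zero, printing a diagnostic only the
 * first time that happens. A caller can use the return value to avoid
 * entering a loop that could not terminate.
 */
static bool warn_once_if_no_queues(unsigned int qcount)
{
    static bool warned;

    if (qcount != 0)
        return false;
    if (!warned) {
        warned = true;
        fprintf(stderr, "warning: tx queue count is 0\n");
    }
    return true;
}
```

The check only makes the condition visible; in the description above it is
the change in `do_output` that keeps packets away from such a device.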

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit 066b867 ]

assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below the host has a chance of getting a cpu
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel cpu stuck message
shown below. If this does not happen just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion as background work,
   because it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If a packet is sent to the ovs vport after 7. but before 9. (the vport
is not yet deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).

Additionally we now produce a warning in `skb_tx_hash` if we will hit
the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit 066b867 ]

Assume the following setup on a single machine:
1. an openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (they
   need to be in the same subnet)
7. start some http server in the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel, delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel "CPU stuck"
message shown below. If this does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted, calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event`, which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to run in the
   background, as an ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject`, which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If, after 7. but before 9., a packet is sent to the ovs vport (which is
not deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (if the packet has a rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.
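
The hang can be reproduced in a small userspace model. The sketch below
is not the kernel source: `reduce_to_queue` and its `max_iters` cap are
hypothetical and only mimic a subtract-until-in-range loop of the kind
described above, assuming the recorded rx queue and the queue count are
unsigned 16-bit values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model of the reduction loop described above (hypothetical
 * helper, not kernel code). The loop subtracts qcount from the recorded
 * rx queue until the value is below qcount. max_iters exists only so
 * that the model can report a hang instead of actually hanging.
 */
static bool reduce_to_queue(uint16_t rx_queue, uint16_t qcount,
                            unsigned int max_iters, uint16_t *out)
{
    uint16_t hash = rx_queue;
    unsigned int iters = 0;

    assert(out != NULL);

    while (hash >= qcount) {
        if (iters++ == max_iters)
            return false;   /* the uncapped loop would spin forever */
        hash -= qcount;     /* makes no progress when qcount == 0 */
    }

    *out = hash;
    return true;
}
```

With a queue count of 32 the loop ends after at most a few
subtractions. With a queue count of 0 the condition `hash >= qcount` is
always true for an unsigned value and the subtraction changes nothing,
so only the artificial cap stops the model.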

To prevent this from happening, we update `do_output` to handle devices
without carrier the same way as if the device were not found (which
would be the code path after 9. is done).
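
The intent of the `do_output` change can be sketched in a userspace
model. The types and `model_do_output` below are hypothetical stand-ins,
not the openvswitch code. The sketch assumes that an unregistering
device reports no carrier, as the description above implies.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's net_device and vport. */
struct model_dev {
    bool carrier_ok;                  /* false once the device is going away */
    unsigned int real_num_tx_queues;  /* 0 after step 7 above */
};

struct model_vport {
    struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * Model of the decision described above: a port whose device has no
 * carrier is treated exactly like a port that was not found, so the
 * packet is dropped before it can reach the transmit path.
 */
static enum output_result model_do_output(const struct model_vport *vport)
{
    if (vport != NULL && vport->dev != NULL && vport->dev->carrier_ok)
        return OUTPUT_SENT;
    return OUTPUT_DROPPED;
}
```

In this model a vport that still exists but whose device has already
lost its carrier yields `OUTPUT_DROPPED`, the same result as a missing
vport, so the queue selection loop is never entered with zero queues.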

Additionally, we now produce a warning in `skb_tx_hash` if we would
otherwise hit the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit 066b867 ]

Assume the following setup on a single machine:
1. an openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (they
   need to be in the same subnet)
7. start some http server in the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel, delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel "CPU stuck"
message shown below. If this does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted, calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this first sets `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event`, which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to run in the
   background, as an ovs_lock is needed that we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject`, which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If, after 7. but before 9., a packet is sent to the ovs vport (which is
not deleted at this point in time), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (if the packet has a rx_queue recorded) that is infinite if
`dev->real_num_tx_queues` is zero.

To prevent this from happening, we update `do_output` to handle devices
without carrier the same way as if the device were not found (which
would be the code path after 9. is done).

Additionally, we now produce a warning in `skb_tx_hash` if we would
otherwise hit the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
ammarfaizi2 pushed a commit to ammarfaizi2/linux-fork that referenced this pull request Apr 20, 2023
[ Upstream commit 066b867 ]

Assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel, delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel "CPU stuck" message
shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to a background
   work item, as it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport

If a packet is sent to the ovs vport after 7. but before 9. (when the
vport is not yet deleted), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.
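
The loop described above can be modelled in userspace. The following is a minimal sketch, not the kernel source: `reduce_rx_queue` and `MAX_ITERS` are hypothetical names, and the iteration cap exists only so the demonstration terminates. It assumes the reduction has the shape "subtract the queue count until the value fits", which makes no progress when the count is zero.

```c
#include <stdint.h>

/* Hypothetical cap so the model terminates; the loop being modelled has none. */
#define MAX_ITERS 1000u

/*
 * Models a "while (hash >= qcount) hash -= qcount;" reduction of a recorded
 * rx queue index onto the available tx queues.
 * Returns the number of subtractions performed and stores the result in *out,
 * or returns -1 if the cap was reached (standing in for "spins forever").
 */
static int reduce_rx_queue(uint32_t hash, uint16_t qcount, uint32_t *out)
{
	unsigned int iters = 0;

	while (hash >= qcount) {
		if (iters == MAX_ITERS)
			return -1;
		hash -= qcount;	/* subtracting 0 never makes progress */
		iters++;
	}
	*out = hash;
	return (int)iters;
}
```

With 32 queues, `reduce_rx_queue(40, 32, &q)` performs one subtraction and leaves `q == 8`. With `qcount == 0` the condition `hash >= 0` holds for every unsigned value, so the call reports -1.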

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).

Additionally we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.
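
The output-side guard can be sketched as follows. This is a simplified userspace model, not the patch itself: `model_dev`, `model_vport` and `model_do_output` are hypothetical stand-ins, and the sketch assumes, as the message implies, that a device being unregistered reports no carrier. In the kernel the carrier test would presumably be `netif_carrier_ok()`.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct model_dev {
	bool carrier_ok;			/* false once the device is going away */
	unsigned int real_num_tx_queues;	/* 0 after step 7 */
};

struct model_vport {
	struct model_dev *dev;
};

enum output_result { OUTPUT_SENT, OUTPUT_DROPPED };

/*
 * A vport whose device has no carrier is treated exactly like a vport
 * that was not found: the packet is dropped instead of being handed to
 * the transmit path.
 */
static enum output_result model_do_output(const struct model_vport *vport)
{
	if (vport != NULL && vport->dev->carrier_ok)
		return OUTPUT_SENT;
	return OUTPUT_DROPPED;
}
```

In this model a vport that still exists but whose device has `carrier_ok == false` yields `OUTPUT_DROPPED`, the same result as passing `NULL` for a vport that has already been deleted.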

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
waby38b pushed a commit to avolmat/linux that referenced this pull request Apr 20, 2023
Assume the following setup on a single machine:
1. An openvswitch instance with one bridge and default flows
2. two network namespaces "server" and "client"
3. two ovs interfaces "server" and "client" on the bridge
4. for each ovs interface a veth pair with a matching name and 32 rx and
   tx queues
5. move the ends of the veth pairs to the respective network namespaces
6. assign ip addresses to each of the veth ends in the namespaces (needs
   to be the same subnet)
7. start some http server on the server network namespace
8. test if a client in the client namespace can reach the http server

When following the actions below, the host has a chance of getting a CPU
stuck in an infinite loop:
1. send a large number of parallel requests to the http server (around
   3000 curls should work)
2. in parallel, delete the network namespace (do not delete interfaces or
   stop the server, just kill the namespace)

There is a low chance that this will cause the kernel "CPU stuck" message
shown below. If it does not happen, just retry.
The bpftrace output for the functions mentioned in that message is also
included below.

The series of events happening here is:
1. the network namespace is deleted calling
   `unregister_netdevice_many_notify` somewhere in the process
2. this sets first `NETREG_UNREGISTERING` on both ends of the veth and
   then runs `synchronize_net`
3. it then calls `call_netdevice_notifiers` with `NETDEV_UNREGISTER`
4. this is then handled by `dp_device_event` which calls
   `ovs_netdev_detach_dev` (if a vport is found, which is the case for
   the veth interface attached to ovs)
5. this removes the rx_handlers of the device but does not prevent
   packets from being sent to the device
6. `dp_device_event` then queues the vport deletion to a background
   work item, as it needs the ovs_lock, which we do not hold in the
   unregistration path
7. `unregister_netdevice_many_notify` continues to call
   `netdev_unregister_kobject` which sets `real_num_tx_queues` to 0
8. port deletion continues (but details are not relevant for this issue)
9. at some future point the background task deletes the vport
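
The ordering of the steps above can be expressed as a small state model. This is only an illustrative sketch: `teardown_state`, `apply_step` and `in_race_window` are hypothetical names rather than kernel types, and only the steps that change observable state are modelled.

```c
#include <stdbool.h>

/* Hypothetical model of the state touched by the steps above. */
struct teardown_state {
	bool unregistering;			/* set in step 2 */
	bool rx_handler_attached;		/* cleared in step 5 */
	bool vport_present;			/* cleared in step 9 */
	unsigned int real_num_tx_queues;	/* set to 0 in step 7 */
};

enum teardown_step {
	STEP_MARK_UNREGISTERING = 2,
	STEP_DETACH_RX_HANDLER = 5,
	STEP_ZERO_TX_QUEUES = 7,
	STEP_DELETE_VPORT = 9,
};

/* Applies the state change of one numbered step. */
static void apply_step(struct teardown_state *s, enum teardown_step step)
{
	switch (step) {
	case STEP_MARK_UNREGISTERING:
		s->unregistering = true;
		break;
	case STEP_DETACH_RX_HANDLER:
		s->rx_handler_attached = false;
		break;
	case STEP_ZERO_TX_QUEUES:
		s->real_num_tx_queues = 0;
		break;
	case STEP_DELETE_VPORT:
		s->vport_present = false;
		break;
	}
}

/* True while the vport can still be picked for output but has no tx queue. */
static bool in_race_window(const struct teardown_state *s)
{
	return s->vport_present && s->real_num_tx_queues == 0;
}
```

Starting from a state with the vport present and 32 tx queues, `in_race_window()` is false through step 5, becomes true once step 7 has run, and is false again after step 9.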

If a packet is sent to the ovs vport after 7. but before 9. (when the
vport is not yet deleted), the vport forwards it to the
`dev_queue_xmit` flow even though the device is unregistering.
In `skb_tx_hash` (which is called in the `dev_queue_xmit` path) there is
a while loop (taken if the packet has an rx_queue recorded) that is
infinite if `dev->real_num_tx_queues` is zero.

To prevent this from happening we update `do_output` to handle devices
without carrier the same as if the device is not found (which would
be the code path after 9. is done).

Additionally we now produce a warning in `skb_tx_hash` if we would hit
the infinite loop.

bpftrace (first word is function name):

__dev_queue_xmit server: real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 1, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 1
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 2, reg_state: 1
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 6, reg_state: 2
ovs_netdev_detach_dev server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, reg_state: 2
netdev_rx_handler_unregister server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
netdev_rx_handler_unregister ret server: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 27, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 22, reg_state: 2
dp_device_event server: real_num_tx_queues: 1 cpu 9, pid: 21024, tid: 21024, event 18, reg_state: 2
netdev_unregister_kobject: real_num_tx_queues: 1, cpu: 9, pid: 21024, tid: 21024
synchronize_rcu_expedited: cpu 9, pid: 21024, tid: 21024
ovs_vport_send server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
__dev_queue_xmit server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
netdev_core_pick_tx server: addr: 0xffff9f0a46d4a000 real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024, skb_addr: 0xffff9edb6f207000, reg_state: 2
broken device server: real_num_tx_queues: 0, cpu: 2, pid: 28024, tid: 28024
ovs_dp_detach_port server: real_num_tx_queues: 0 cpu 9, pid: 9124, tid: 9124, reg_state: 2
synchronize_rcu_expedited: cpu 9, pid: 33604, tid: 33604

stuck message:

watchdog: BUG: soft lockup - CPU#5 stuck for 26s! [curl:1929279]
Modules linked in: veth pktgen bridge stp llc ip_set_hash_net nft_counter xt_set nft_compat nf_tables ip_set_hash_ip ip_set nfnetlink_cttimeout nfnetlink openvswitch nsh nf_conncount nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 tls binfmt_misc nls_iso8859_1 input_leds joydev serio_raw dm_multipath scsi_dh_rdac scsi_dh_emc scsi_dh_alua sch_fq_codel drm efi_pstore virtio_rng ip_tables x_tables autofs4 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 multipath linear hid_generic usbhid hid crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel virtio_net ahci net_failover crypto_simd cryptd psmouse libahci virtio_blk failover
CPU: 5 PID: 1929279 Comm: curl Not tainted 5.15.0-67-generic #74-Ubuntu
Hardware name: OpenStack Foundation OpenStack Nova, BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:netdev_pick_tx+0xf1/0x320
Code: 00 00 8d 48 ff 0f b7 c1 66 39 ca 0f 86 e9 01 00 00 45 0f b7 ff 41 39 c7 0f 87 5b 01 00 00 44 29 f8 41 39 c7 0f 87 4f 01 00 00 <eb> f2 0f 1f 44 00 00 49 8b 94 24 28 04 00 00 48 85 d2 0f 84 53 01
RSP: 0018:ffffb78b40298820 EFLAGS: 00000246
RAX: 0000000000000000 RBX: ffff9c8773adc2e0 RCX: 000000000000083f
RDX: 0000000000000000 RSI: ffff9c8773adc2e0 RDI: ffff9c870a25e000
RBP: ffffb78b40298858 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff9c870a25e000
R13: ffff9c870a25e000 R14: ffff9c87fe043480 R15: 0000000000000000
FS:  00007f7b80008f00(0000) GS:ffff9c8e5f740000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7b80f6a0b0 CR3: 0000000329d66000 CR4: 0000000000350ee0
Call Trace:
 <IRQ>
 netdev_core_pick_tx+0xa4/0xb0
 __dev_queue_xmit+0xf8/0x510
 ? __bpf_prog_exit+0x1e/0x30
 dev_queue_xmit+0x10/0x20
 ovs_vport_send+0xad/0x170 [openvswitch]
 do_output+0x59/0x180 [openvswitch]
 do_execute_actions+0xa80/0xaa0 [openvswitch]
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? flow_lookup.constprop.0+0x5c/0x110 [openvswitch]
 ovs_execute_actions+0x4c/0x120 [openvswitch]
 ovs_dp_process_packet+0xa1/0x200 [openvswitch]
 ? ovs_ct_update_key.isra.0+0xa8/0x120 [openvswitch]
 ? ovs_ct_fill_key+0x1d/0x30 [openvswitch]
 ? ovs_flow_key_extract+0x2db/0x350 [openvswitch]
 ovs_vport_receive+0x77/0xd0 [openvswitch]
 ? __htab_map_lookup_elem+0x4e/0x60
 ? bpf_prog_680e8aff8547aec1_kfree+0x3b/0x714
 ? trace_call_bpf+0xc8/0x150
 ? kfree+0x1/0x250
 ? kfree+0x1/0x250
 ? kprobe_perf_func+0x4f/0x2b0
 ? kprobe_perf_func+0x4f/0x2b0
 ? __mod_memcg_lruvec_state+0x63/0xe0
 netdev_port_receive+0xc4/0x180 [openvswitch]
 ? netdev_port_receive+0x180/0x180 [openvswitch]
 netdev_frame_hook+0x1f/0x40 [openvswitch]
 __netif_receive_skb_core.constprop.0+0x23d/0xf00
 __netif_receive_skb_one_core+0x3f/0xa0
 __netif_receive_skb+0x15/0x60
 process_backlog+0x9e/0x170
 __napi_poll+0x33/0x180
 net_rx_action+0x126/0x280
 ? ttwu_do_activate+0x72/0xf0
 __do_softirq+0xd9/0x2e7
 ? rcu_report_exp_cpu_mult+0x1b0/0x1b0
 do_softirq+0x7d/0xb0
 </IRQ>
 <TASK>
 __local_bh_enable_ip+0x54/0x60
 ip_finish_output2+0x191/0x460
 __ip_finish_output+0xb7/0x180
 ip_finish_output+0x2e/0xc0
 ip_output+0x78/0x100
 ? __ip_finish_output+0x180/0x180
 ip_local_out+0x5e/0x70
 __ip_queue_xmit+0x184/0x440
 ? tcp_syn_options+0x1f9/0x300
 ip_queue_xmit+0x15/0x20
 __tcp_transmit_skb+0x910/0x9c0
 ? __mod_memcg_state+0x44/0xa0
 tcp_connect+0x437/0x4e0
 ? ktime_get_with_offset+0x60/0xf0
 tcp_v4_connect+0x436/0x530
 __inet_stream_connect+0xd4/0x3a0
 ? kprobe_perf_func+0x4f/0x2b0
 ? aa_sk_perm+0x43/0x1c0
 inet_stream_connect+0x3b/0x60
 __sys_connect_file+0x63/0x70
 __sys_connect+0xa6/0xd0
 ? setfl+0x108/0x170
 ? do_fcntl+0xe8/0x5a0
 __x64_sys_connect+0x18/0x20
 do_syscall_64+0x5c/0xc0
 ? __x64_sys_fcntl+0xa9/0xd0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? do_syscall_64+0x69/0xc0
 ? __sys_setsockopt+0xea/0x1e0
 ? exit_to_user_mode_prepare+0x37/0xb0
 ? syscall_exit_to_user_mode+0x27/0x50
 ? __x64_sys_setsockopt+0x1f/0x30
 ? do_syscall_64+0x69/0xc0
 ? irqentry_exit+0x1d/0x30
 ? exc_page_fault+0x89/0x170
 entry_SYSCALL_64_after_hwframe+0x61/0xcb
RIP: 0033:0x7f7b8101c6a7
Code: 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 2a 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 18 89 54 24 0c 48 89 34 24 89
RSP: 002b:00007ffffd6b2198 EFLAGS: 00000246 ORIG_RAX: 000000000000002a
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7b8101c6a7
RDX: 0000000000000010 RSI: 00007ffffd6b2360 RDI: 0000000000000005
RBP: 0000561f1370d560 R08: 00002795ad21d1ac R09: 0030312e302e302e
R10: 00007ffffd73f080 R11: 0000000000000246 R12: 0000561f1370c410
R13: 0000000000000000 R14: 0000000000000005 R15: 0000000000000000
 </TASK>

Fixes: 7f8a436 ("openvswitch: Add conntrack action")
Co-developed-by: Luca Czesla <[email protected]>
Signed-off-by: Luca Czesla <[email protected]>
Signed-off-by: Felix Huettner <[email protected]>
Reviewed-by: Eric Dumazet <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
Link: https://lore.kernel.org/r/ZC0pBXBAgh7c76CA@kernel-bug-kernel-bug
Signed-off-by: Jakub Kicinski <[email protected]>
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
KMSAN calls into RCU code, which then triggers a write that re-enters KMSAN,
leaving the system stuck in infinite recursion.
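
The re-entry pattern can be illustrated with a small userspace model. This is a general sketch of a re-entrancy guard, not necessarily the change made in this commit: every name below is hypothetical, a single global flag stands in for per-context state, and `DEPTH_CAP` exists only so the unguarded case terminates.

```c
#include <stdbool.h>

#define DEPTH_CAP 64u	/* hypothetical cap so the unguarded model terminates */

static bool in_instrumentation;	/* stand-in for a per-context "in runtime" flag */
static unsigned int depth, max_depth;

static void instrumented_load(bool guarded);

/* Stand-in for a helper that is itself instrumented (the RCU read lock in the trace). */
static void helper_with_instrumented_load(bool guarded)
{
	instrumented_load(guarded);
}

/* Stand-in for the instrumentation hook that runs on every load. */
static void instrumented_load(bool guarded)
{
	if (guarded && in_instrumentation)
		return;		/* already inside the hook: do not re-enter */
	if (depth == DEPTH_CAP)
		return;		/* model-only stop; real recursion would not stop */

	in_instrumentation = true;
	depth++;
	if (depth > max_depth)
		max_depth = depth;
	helper_with_instrumented_load(guarded);	/* the hook calls instrumented code */
	depth--;
	in_instrumentation = false;
}

/* Runs one top-level load and reports how deep the hook nested. */
static unsigned int max_recursion_depth(bool guarded)
{
	in_instrumentation = false;
	depth = 0;
	max_depth = 0;
	instrumented_load(guarded);
	return max_depth;
}
```

Without the guard the hook nests until the model's cap is hit, which mirrors the repeating frames in the backtrace below; with the guard it runs exactly once.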

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
torvalds#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
torvalds#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
torvalds#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
torvalds#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
torvalds#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
torvalds#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
torvalds#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
torvalds#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
torvalds#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
torvalds#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
torvalds#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
torvalds#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
torvalds#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
torvalds#93 0x0000000000000000 in ??
gyroninja added a commit to gyroninja/linux that referenced this pull request Jan 28, 2024
As of 5ec8e8e ("mm/sparsemem: fix race in accessing memory_section->usage"),
KMSAN calls into RCU tree code during kmsan_get_metadata(). This triggers a
write that re-enters KMSAN, leaving the system stuck in infinite
recursion.

#0  kmsan_get_context () at mm/kmsan/kmsan.h:106
#1  __msan_get_context_state () at mm/kmsan/instrumentation.c:331
#2  0xffffffff81495671 in get_current () at ./arch/x86/include/asm/current.h:42
#3  rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#4  __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#5  0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#6  pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#7  kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#8  virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#9  0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#10 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#11 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#12 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#13 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#14 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#15 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#16 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#17 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#18 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#19 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#20 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#21 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#22 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#23 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#24 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#25 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#26 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#27 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#28 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#29 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#30 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#31 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#32 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#33 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#34 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#35 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#36 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#37 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#38 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#39 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#40 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#41 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#42 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#43 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#44 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#45 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#46 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#47 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#48 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#49 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#50 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#51 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#52 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#53 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#54 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#55 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#56 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#57 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#58 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#59 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#60 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#61 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#62 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#63 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#64 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#65 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#66 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#67 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff8620d974 <init_task+1012>) at ./arch/x86/include/asm/kmsan.h:82
#68 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/shadow.c:75
#69 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff8620d974 <init_task+1012>, is_origin=false) at mm/kmsan/shadow.c:143
#70 kmsan_get_shadow_origin_ptr (address=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/shadow.c:97
#71 0xffffffff81b1dbd2 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=4, store=false) at mm/kmsan/instrumentation.c:36
#72 __msan_metadata_ptr_for_load_4 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:91
#73 0xffffffff8149568f in rcu_preempt_read_enter () at kernel/rcu/tree_plugin.h:379
#74 __rcu_read_lock () at kernel/rcu/tree_plugin.h:402
#75 0xffffffff81b2054b in rcu_read_lock () at ./include/linux/rcupdate.h:748
#76 pfn_valid (pfn=<optimized out>) at ./include/linux/mmzone.h:2016
#77 kmsan_virt_addr_valid (addr=addr@entry=0xffffffff86203c90) at ./arch/x86/include/asm/kmsan.h:82
#78 virt_to_page_or_null (vaddr=vaddr@entry=0xffffffff86203c90) at mm/kmsan/shadow.c:75
#79 0xffffffff81b2023c in kmsan_get_metadata (address=0xffffffff86203c90, is_origin=false) at mm/kmsan/shadow.c:143
#80 kmsan_get_shadow_origin_ptr (address=0xffffffff86203c90, size=8, store=false) at mm/kmsan/shadow.c:97
#81 0xffffffff81b1dc72 in get_shadow_origin_ptr (addr=0xffffffff8620d974 <init_task+1012>, size=8, store=false) at mm/kmsan/instrumentation.c:36
#82 __msan_metadata_ptr_for_load_8 (addr=0xffffffff8620d974 <init_task+1012>) at mm/kmsan/instrumentation.c:92
#83 0xffffffff814fdb9e in filter_irq_stacks (entries=<optimized out>, nr_entries=4) at kernel/stacktrace.c:397
#84 0xffffffff829520e8 in stack_depot_save_flags (entries=0xffffffff8620d974 <init_task+1012>, nr_entries=4, alloc_flags=0, depot_flags=0) at lib/stackdepot.c:500
#85 0xffffffff81b1e560 in __msan_poison_alloca (address=0xffffffff86203da0, size=24, descr=<optimized out>) at mm/kmsan/instrumentation.c:285
#86 0xffffffff8562821c in _printk (fmt=0xffffffff85f191a5 "\0016Attempting lock1") at kernel/printk/printk.c:2324
#87 0xffffffff81942aa2 in kmem_cache_create_usercopy (name=0xffffffff85f18903 "mm_struct", size=1296, align=0, flags=270336, useroffset=<optimized out>, usersize=<optimized out>, ctor=0x0 <fixed_percpu_data>) at mm/slab_common.c:296
#88 0xffffffff86f337a0 in mm_cache_init () at kernel/fork.c:3262
#89 0xffffffff86eacb8e in start_kernel () at init/main.c:932
#90 0xffffffff86ecdf94 in x86_64_start_reservations (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:555
#91 0xffffffff86ecde9b in x86_64_start_kernel (real_mode_data=0x140e0 <exception_stacks+28896> <error: Cannot access memory at address 0x140e0>) at arch/x86/kernel/head64.c:536
#92 0xffffffff810001d3 in secondary_startup_64 () at /pool/workspace/linux/arch/x86/kernel/head_64.S:461
#93 0x0000000000000000 in ??
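
The backtrace above shows the same ten frames repeating: an instrumentation hook calls a helper that is itself instrumented, which calls the hook again. As a rough illustration of how such a cycle can be broken in general, here is a minimal userspace sketch of a re-entrancy guard. It is not the fix applied by this commit and not kernel code; every name in it (metadata_hook, instrumented_helper, in_hook and the counters) is hypothetical, and a thread-local flag stands in for whatever per-task or per-CPU state a kernel would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical guard: true while this thread is inside the hook. */
static _Thread_local bool in_hook;

/* Counters used only to observe the behaviour of the sketch. */
static int lookups_done;
static int reentries_blocked;

static void metadata_hook(const void *addr);

/*
 * Stand-in for a helper (such as a lock primitive) that is compiled
 * with instrumentation: each memory access it performs is followed by
 * a call back into the hook.
 */
static void instrumented_helper(void)
{
	static int counter;

	counter++;
	/* Models the compiler-inserted call for the access above. */
	metadata_hook(&counter);
}

/* Stand-in for the instrumentation hook that looks up metadata. */
static void metadata_hook(const void *addr)
{
	assert(addr != NULL);

	/* Re-entered from our own helper: skip the lookup and return. */
	if (in_hook) {
		reentries_blocked++;
		return;
	}

	in_hook = true;
	instrumented_helper();	/* without the guard this never returns */
	lookups_done++;
	in_hook = false;
}
```

Calling metadata_hook() once on any valid address performs one lookup and rejects one nested call. Deleting the in_hook check reproduces the unbounded recursion seen in the trace. The trade-off of this kind of guard is that accesses made while the flag is set go unchecked.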
jonhunter pushed a commit to jonhunter/linux that referenced this pull request Feb 28, 2024
Puranjay Mohan says:

====================
bpf, arm64: Support Exceptions

Changes in V2->V3:
V2: https://lore.kernel.org/all/[email protected]/
- Use unwinder from stacktrace.c rather than open coding the unwind logic.
- Fix a bug in the prologue related to BPF_FP (Xu Kuohai)

Changes in V1->V2:
V1: https://lore.kernel.org/all/[email protected]/
- Remove exceptions from DENYLIST.aarch64 as they are supported now.

The base support for exceptions was merged with [1] and it was enabled for
x86-64.

This patch set enables the support on ARM64; all selftests are passing:

# ./test_progs -a exceptions
#74/1    exceptions/exception_throw_always_1:OK
#74/2    exceptions/exception_throw_always_2:OK
#74/3    exceptions/exception_throw_unwind_1:OK
#74/4    exceptions/exception_throw_unwind_2:OK
#74/5    exceptions/exception_throw_default:OK
#74/6    exceptions/exception_throw_default_value:OK
#74/7    exceptions/exception_tail_call:OK
#74/8    exceptions/exception_ext:OK
#74/9    exceptions/exception_ext_mod_cb_runtime:OK
#74/10   exceptions/exception_throw_subprog:OK
#74/11   exceptions/exception_assert_nz_gfunc:OK
#74/12   exceptions/exception_assert_zero_gfunc:OK
#74/13   exceptions/exception_assert_neg_gfunc:OK
#74/14   exceptions/exception_assert_pos_gfunc:OK
#74/15   exceptions/exception_assert_negeq_gfunc:OK
#74/16   exceptions/exception_assert_poseq_gfunc:OK
#74/17   exceptions/exception_assert_nz_gfunc_with:OK
#74/18   exceptions/exception_assert_zero_gfunc_with:OK
#74/19   exceptions/exception_assert_neg_gfunc_with:OK
#74/20   exceptions/exception_assert_pos_gfunc_with:OK
#74/21   exceptions/exception_assert_negeq_gfunc_with:OK
#74/22   exceptions/exception_assert_poseq_gfunc_with:OK
#74/23   exceptions/exception_bad_assert_nz_gfunc:OK
#74/24   exceptions/exception_bad_assert_zero_gfunc:OK
#74/25   exceptions/exception_bad_assert_neg_gfunc:OK
#74/26   exceptions/exception_bad_assert_pos_gfunc:OK
#74/27   exceptions/exception_bad_assert_negeq_gfunc:OK
#74/28   exceptions/exception_bad_assert_poseq_gfunc:OK
#74/29   exceptions/exception_bad_assert_nz_gfunc_with:OK
#74/30   exceptions/exception_bad_assert_zero_gfunc_with:OK
#74/31   exceptions/exception_bad_assert_neg_gfunc_with:OK
#74/32   exceptions/exception_bad_assert_pos_gfunc_with:OK
#74/33   exceptions/exception_bad_assert_negeq_gfunc_with:OK
#74/34   exceptions/exception_bad_assert_poseq_gfunc_with:OK
#74/35   exceptions/exception_assert_range:OK
#74/36   exceptions/exception_assert_range_with:OK
#74/37   exceptions/exception_bad_assert_range:OK
#74/38   exceptions/exception_bad_assert_range_with:OK
#74/39   exceptions/non-throwing fentry -> exception_cb:OK
#74/40   exceptions/throwing fentry -> exception_cb:OK
#74/41   exceptions/non-throwing fexit -> exception_cb:OK
#74/42   exceptions/throwing fexit -> exception_cb:OK
#74/43   exceptions/throwing extension (with custom cb) -> exception_cb:OK
#74/44   exceptions/throwing extension -> global func in exception_cb:OK
#74/45   exceptions/exception_ext_mod_cb_runtime:OK
#74/46   exceptions/throwing extension (with custom cb) -> global func in exception_cb:OK
#74/47   exceptions/exception_ext:OK
#74/48   exceptions/non-throwing fentry -> non-throwing subprog:OK
#74/49   exceptions/throwing fentry -> non-throwing subprog:OK
#74/50   exceptions/non-throwing fentry -> throwing subprog:OK
#74/51   exceptions/throwing fentry -> throwing subprog:OK
#74/52   exceptions/non-throwing fexit -> non-throwing subprog:OK
#74/53   exceptions/throwing fexit -> non-throwing subprog:OK
#74/54   exceptions/non-throwing fexit -> throwing subprog:OK
#74/55   exceptions/throwing fexit -> throwing subprog:OK
#74/56   exceptions/non-throwing fmod_ret -> non-throwing subprog:OK
#74/57   exceptions/non-throwing fmod_ret -> non-throwing global subprog:OK
#74/58   exceptions/non-throwing extension -> non-throwing subprog:OK
#74/59   exceptions/non-throwing extension -> throwing subprog:OK
#74/60   exceptions/non-throwing extension -> non-throwing subprog:OK
#74/61   exceptions/non-throwing extension -> throwing global subprog:OK
#74/62   exceptions/throwing extension -> throwing global subprog:OK
#74/63   exceptions/throwing extension -> non-throwing global subprog:OK
#74/64   exceptions/non-throwing extension -> main subprog:OK
#74/65   exceptions/throwing extension -> main subprog:OK
#74/66   exceptions/reject_exception_cb_type_1:OK
#74/67   exceptions/reject_exception_cb_type_2:OK
#74/68   exceptions/reject_exception_cb_type_3:OK
#74/69   exceptions/reject_exception_cb_type_4:OK
#74/70   exceptions/reject_async_callback_throw:OK
#74/71   exceptions/reject_with_lock:OK
#74/72   exceptions/reject_subprog_with_lock:OK
#74/73   exceptions/reject_with_rcu_read_lock:OK
#74/74   exceptions/reject_subprog_with_rcu_read_lock:OK
#74/75   exceptions/reject_with_rbtree_add_throw:OK
#74/76   exceptions/reject_with_reference:OK
#74/77   exceptions/reject_with_cb_reference:OK
#74/78   exceptions/reject_with_cb:OK
#74/79   exceptions/reject_with_subprog_reference:OK
#74/80   exceptions/reject_throwing_exception_cb:OK
#74/81   exceptions/reject_exception_cb_call_global_func:OK
#74/82   exceptions/reject_exception_cb_call_static_func:OK
#74/83   exceptions/reject_multiple_exception_cb:OK
#74/84   exceptions/reject_exception_throw_cb:OK
#74/85   exceptions/reject_exception_throw_cb_diff:OK
#74/86   exceptions/reject_set_exception_cb_bad_ret1:OK
#74/87   exceptions/reject_set_exception_cb_bad_ret2:OK
#74/88   exceptions/check_assert_eq_int_min:OK
#74/89   exceptions/check_assert_eq_int_max:OK
#74/90   exceptions/check_assert_eq_zero:OK
#74/91   exceptions/check_assert_eq_llong_min:OK
#74/92   exceptions/check_assert_eq_llong_max:OK
#74/93   exceptions/check_assert_lt_pos:OK
#74/94   exceptions/check_assert_lt_zero:OK
#74/95   exceptions/check_assert_lt_neg:OK
#74/96   exceptions/check_assert_le_pos:OK
#74/97   exceptions/check_assert_le_zero:OK
#74/98   exceptions/check_assert_le_neg:OK
#74/99   exceptions/check_assert_gt_pos:OK
#74/100  exceptions/check_assert_gt_zero:OK
#74/101  exceptions/check_assert_gt_neg:OK
#74/102  exceptions/check_assert_ge_pos:OK
#74/103  exceptions/check_assert_ge_zero:OK
#74/104  exceptions/check_assert_ge_neg:OK
#74/105  exceptions/check_assert_range_s64:OK
#74/106  exceptions/check_assert_range_u64:OK
#74/107  exceptions/check_assert_single_range_s64:OK
#74/108  exceptions/check_assert_single_range_u64:OK
#74/109  exceptions/check_assert_generic:OK
#74/110  exceptions/check_assert_with_return:OK
#74      exceptions:OK
Summary: 1/110 PASSED, 0 SKIPPED, 0 FAILED

[1] https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/commit/?h=for-next&id=ec6f1b4db95b7eedb3fe85f4f14e08fa0e9281c3
====================

Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Alexei Starovoitov <[email protected]>