Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for the Nintendo Switch Online SNES controller #4

Open
wants to merge 1,030 commits into
base: ogorchock
Choose a base branch
from
Open

Add support for the Nintendo Switch Online SNES controller #4

wants to merge 1,030 commits into from

Conversation

nadiaholmquist
Copy link

This pull request adds support for the Switch SNES controller.

Since I have one of these controllers I decided to try and take a stab at adding support for it in the driver. I'm completely unfamiliar with kernel module development so some things might not be entirely right, but from my testing with my European model of the controller, what I've added seems to work well.

piastry and others added 30 commits October 17, 2019 13:47
commit c82e5ac upstream.

Currently the client indicates that a dentry is stale when inode
numbers or type types between a local inode and a remote file
don't match. If this is the case attributes is not being copied
from remote to local, so, it is already known that the local copy
has stale metadata. That's why the inode needs to be marked for
revalidation in order to tell the VFS to lookup the dentry again
before openning a file. This prevents unexpected stale errors
to be returned to the user space when openning a file.

Cc: <[email protected]>
Signed-off-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 0b3d0ef upstream.

Mark inode for force revalidation if LOOKUP_REVAL flag is set.
This tells the client to actually send a QueryInfo request to
the server to obtain the latest metadata in case a directory
or a file were changed remotely. Only do that if the client
doesn't have a lease for the file to avoid unneeded round
trips to the server.

Cc: <[email protected]>
Signed-off-by: Pavel Shilovsky <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit cb24881 upstream.

Commit 487317c ("cifs: add spinlock for the openFileList to
cifsInodeInfo") added cifsInodeInfo->open_file_lock spin_lock to protect
the openFileList, but missed a few places where cifs_inode->openFileList
was enumerated.  Change these remaining tcon->open_file_lock to
cifsInodeInfo->open_file_lock to avoid panic in is_size_safe_to_change.

[17313.245641] RIP: 0010:is_size_safe_to_change+0x57/0xb0 [cifs]
[17313.245645] Code: 68 40 48 89 ef e8 19 67 b7 f1 48 8b 43 40 48 8d 4b 40 48 8d 50 f0 48 39 c1 75 0f eb 47 48 8b 42 10 48 8d 50 f0 48 39 c1 74 3a <8b> 80 88 00 00 00 83 c0 01 a8 02 74 e6 48 89 ef c6 07 00 0f 1f 40
[17313.245649] RSP: 0018:ffff94ae1baefa30 EFLAGS: 00010202
[17313.245654] RAX: dead000000000100 RBX: ffff88dc72243300 RCX: ffff88dc72243340
[17313.245657] RDX: dead0000000000f0 RSI: 00000000098f7940 RDI: ffff88dd3102f040
[17313.245659] RBP: ffff88dd3102f040 R08: 0000000000000000 R09: ffff94ae1baefc40
[17313.245661] R10: ffffcdc8bb1c4e80 R11: ffffcdc8b50adb08 R12: 00000000098f7940
[17313.245663] R13: ffff88dc72243300 R14: ffff88dbc8f19600 R15: ffff88dc72243428
[17313.245667] FS:  00007fb145485700(0000) GS:ffff88dd3e000000(0000) knlGS:0000000000000000
[17313.245670] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[17313.245672] CR2: 0000026bb46c6000 CR3: 0000004edb110003 CR4: 00000000007606e0
[17313.245753] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[17313.245756] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[17313.245759] PKRU: 55555554
[17313.245761] Call Trace:
[17313.245803]  cifs_fattr_to_inode+0x16b/0x580 [cifs]
[17313.245838]  cifs_get_inode_info+0x35c/0xa60 [cifs]
[17313.245852]  ? kmem_cache_alloc_trace+0x151/0x1d0
[17313.245885]  cifs_open+0x38f/0x990 [cifs]
[17313.245921]  ? cifs_revalidate_dentry_attr+0x3e/0x350 [cifs]
[17313.245953]  ? cifsFileInfo_get+0x30/0x30 [cifs]
[17313.245960]  ? do_dentry_open+0x132/0x330
[17313.245963]  do_dentry_open+0x132/0x330
[17313.245969]  path_openat+0x573/0x14d0
[17313.245974]  do_filp_open+0x93/0x100
[17313.245979]  ? __check_object_size+0xa3/0x181
[17313.245986]  ? audit_alloc_name+0x7e/0xd0
[17313.245992]  do_sys_open+0x184/0x220
[17313.245999]  do_syscall_64+0x5b/0x1b0

Fixes: 487317c ("cifs: add spinlock for the openFileList to cifsInodeInfo")

CC: Stable <[email protected]>
Signed-off-by: Dave Wysochanski <[email protected]>
Reviewed-by: Ronnie Sahlberg <[email protected]>
Signed-off-by: Steve French <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit b0f53db upstream.

Partially revert 16db3d3 ("kernel/sysctl.c: threads-max observe
limits") because the patch is causing a regression to any workload which
needs to override the auto-tuning of the limit provided by kernel.

set_max_threads is implementing a boot time guesstimate to provide a
sensible limit of the concurrently running threads so that runaways will
not deplete all the memory.  This is a good thing in general but there
are workloads which might need to increase this limit for an application
to run (reportedly WebSpher MQ is affected) and that is simply not
possible after the mentioned change.  It is also very dubious to
override an admin decision by an estimation that doesn't have any direct
relation to correctness of the kernel operation.

Fix this by dropping set_max_threads from sysctl_max_threads so any
value is accepted as long as it fits into MAX_THREADS which is important
to check because allowing more threads could break internal robust futex
restriction.  While at it, do not use MIN_THREADS as the lower boundary
because it is also only a heuristic for automatic estimation and admin
might have a good reason to stop new threads to be created even when
below this limit.

This became more severe when we switched x86 from 4k to 8k kernel
stacks.  Starting since 6538b8e ("x86_64: expand kernel stack to
16K") (3.16) we use THREAD_SIZE_ORDER = 2 and that halved the auto-tuned
value.

In the particular case

  3.12
  kernel.threads-max = 515561

  4.4
  kernel.threads-max = 200000

Neither of the two values is really insane on 32GB machine.

I am not sure we want/need to tune the max_thread value further.  If
anything the tuning should be removed altogether if proven not useful in
general.  But we definitely need a way to override this auto-tuning.

Link: http://lkml.kernel.org/r/[email protected]
Fixes: 16db3d3 ("kernel/sysctl.c: threads-max observe limits")
Signed-off-by: Michal Hocko <[email protected]>
Reviewed-by: "Eric W. Biederman" <[email protected]>
Cc: Heinrich Schuchardt <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 5b6807d upstream.

There's a really hard to reproduce race in z3fold between z3fold_free()
and z3fold_reclaim_page().  z3fold_reclaim_page() can claim the page
after z3fold_free() has checked if the page was claimed and
z3fold_free() will then schedule this page for compaction which may in
turn lead to random page faults (since that page would have been
reclaimed by then).

Fix that by claiming page in the beginning of z3fold_free() and not
forgetting to clear the claim in the end.

[[email protected]: v2]
  Link: http://lkml.kernel.org/r/20190928113456.152742cf@bigdell
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Vitaly Wool <[email protected]>
Reported-by: Markus Linnala <[email protected]>
Cc: Dan Streetman <[email protected]>
Cc: Vlastimil Babka <[email protected]>
Cc: Henry Burns <[email protected]>
Cc: Shakeel Butt <[email protected]>
Cc: Markus Linnala <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 234fdce upstream.

On architectures like s390, arch_free_page() could mark the page unused
(set_page_unused()) and any access later would trigger a kernel panic.
Fix it by moving arch_free_page() after all possible accessing calls.

 Hardware name: IBM 2964 N96 400 (z/VM 6.4.0)
 Krnl PSW : 0404e00180000000 0000000026c2b96e (__free_pages_ok+0x34e/0x5d8)
            R:0 T:1 IO:0 EX:0 Key:0 M:1 W:0 P:0 AS:3 CC:2 PM:0 RI:0 EA:3
 Krnl GPRS: 0000000088d43af7 0000000000484000 000000000000007c 000000000000000f
            000003d080012100 000003d080013fc0 0000000000000000 0000000000100000
            00000000275cca48 0000000000000100 0000000000000008 000003d080010000
            00000000000001d0 000003d000000000 0000000026c2b78a 000000002717fdb0
 Krnl Code: 0000000026c2b95c: ec1100b30659 risbgn %r1,%r1,0,179,6
            0000000026c2b962: e32014000036 pfd 2,1024(%r1)
           #0000000026c2b968: d7ff10001000 xc 0(256,%r1),0(%r1)
           >0000000026c2b96e: 41101100  la %r1,256(%r1)
            0000000026c2b972: a737fff8  brctg %r3,26c2b962
            0000000026c2b976: d7ff10001000 xc 0(256,%r1),0(%r1)
            0000000026c2b97c: e31003400004 lg %r1,832
            0000000026c2b982: ebff1430016a asi 5168(%r1),-1
 Call Trace:
 __free_pages_ok+0x16a/0x5d8)
 memblock_free_all+0x206/0x290
 mem_init+0x58/0x120
 start_kernel+0x2b0/0x570
 startup_continue+0x6a/0xc0
 INFO: lockdep is turned off.
 Last Breaking-Event-Address:
 __free_pages_ok+0x372/0x5d8
 Kernel panic - not syncing: Fatal exception: panic_on_oops
 00: HCPGIR450W CP entered; disabled wait PSW 00020001 80000000 00000000 26A2379C

In the past, only kernel_poison_pages() would trigger this but it needs
"page_poison=on" kernel cmdline, and I suspect nobody tested that on
s390.  Recently, kernel_init_free_pages() (commit 6471384 ("mm:
security: introduce init_on_alloc=1 and init_on_free=1 boot options"))
was added and could trigger this as well.

[[email protected]: add comment]
Link: http://lkml.kernel.org/r/[email protected]
Fixes: 8823b1d ("mm/page_poison.c: enable PAGE_POISONING as a separate option")
Fixes: 6471384 ("mm: security: introduce init_on_alloc=1 and init_on_free=1 boot options")
Signed-off-by: Qian Cai <[email protected]>
Reviewed-by: Heiko Carstens <[email protected]>
Acked-by: Christian Borntraeger <[email protected]>
Acked-by: Michal Hocko <[email protected]>
Cc: "Kirill A. Shutemov" <[email protected]>
Cc: Vasily Gorbik <[email protected]>
Cc: Alexander Duyck <[email protected]>
Cc: <[email protected]>	[5.3+]
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 518a867 upstream.

The "mode" and "level" variables are enums and in this context GCC will
treat them as unsigned ints so the error handling is never triggered.

I also removed the bogus initializer because it isn't required any more
and it's sort of confusing.

[[email protected]: reduce implicit and explicit typecasting]
[[email protected]: fix return value, add comment, per Matthew]
Link: http://lkml.kernel.org/r/20190925110449.GO3264@mwanda
Fixes: 3cadfa2 ("mm/vmpressure.c: convert to use match_string() helper")
Signed-off-by: Dan Carpenter <[email protected]>
Reviewed-by: Andy Shevchenko <[email protected]>
Acked-by: David Rientjes <[email protected]>
Reviewed-by: Matthew Wilcox <[email protected]>
Cc: Greg Kroah-Hartman <[email protected]>
Cc: Thomas Gleixner <[email protected]>
Cc: Enrico Weigelt <[email protected]>
Cc: Kate Stewart <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
Signed-off-by: Linus Torvalds <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1cbe866 upstream.

rdma_for_each_port is already incrementing the iterator's value it
receives therefore, after the first iteration the iterator is increased by
2 which eventually causing wrong queries and possible traces.

Fix the above by removing the old redundant incrementation that was used
before rdma_for_each_port() macro.

Cc: <[email protected]>
Fixes: ea1075e ("RDMA: Add and use rdma_for_each_port")
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Mohamad Heib <[email protected]>
Reviewed-by: Erez Alfasi <[email protected]>
Signed-off-by: Leon Romanovsky <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
[ Upstream commit 442f1e7 ]

Commit 4b708b7 ("firmware: google: check if size is valid when
decoding VPD data") adds length checks, but the new vpd_decode_entry()
function botched the logic -- it adds the key length twice, instead of
adding the key and value lengths separately.

On my local system, this means vpd.c's vpd_section_create_attribs() hits
an error case after the first attribute it parses, since it's no longer
looking at the correct offset. With this patch, I'm back to seeing all
the correct attributes in /sys/firmware/vpd/...

Fixes: 4b708b7 ("firmware: google: check if size is valid when decoding VPD data")
Cc: <[email protected]>
Cc: Hung-Te Lin <[email protected]>
Signed-off-by: Brian Norris <[email protected]>
Reviewed-by: Stephen Boyd <[email protected]>
Reviewed-by: Guenter Roeck <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit be7ae45 ]

Since commit ec75700 ("gpio: Enable nonexclusive gpiods from DT
nodes") we are able to get GPIOD_FLAGS_BIT_NONEXCLUSIVE marked gpios.
Currently the gpiolib uses the wrong flags variable for the check. We
need to check the gpiod_flags instead of the of_gpio_flags else we
return -EBUSY for GPIOD_FLAGS_BIT_NONEXCLUSIVE marked and requested
gpiod's.

Fixes: ec75700 gpio: Enable nonexclusive gpiods from DT nodes
Cc: [email protected]
Signed-off-by: Marco Felsch <[email protected]>
[Bartosz: the function was moved to gpiolib-of.c so updated the patch]
Signed-off-by: Bartosz Golaszewski <[email protected]>
[Bartosz: backported to v5.3.y]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit e735244 ]

When emulating open-drain/open-source by not actively driving the output
lines - we're simply changing their mode to input. This is wrong as it
will then make it impossible to change the value of such line - it's now
considered to actually be in input mode. If we want to still use the
direction_input() callback for simplicity then we need to set FLAG_IS_OUT
manually in gpiod_direction_output() and not clear it in
gpio_set_open_drain_value_commit() and
gpio_set_open_source_value_commit().

Fixes: c663e5f ("gpio: support native single-ended hardware drivers")
Cc: [email protected]
Reported-by: Kent Gibson <[email protected]>
Signed-off-by: Bartosz Golaszewski <[email protected]>
[Bartosz: backported to v5.3, v4.19]
Signed-off-by: Bartosz Golaszewski <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
commit 1fac4a5 upstream.

[BUG]
One user reported a reproducible KASAN report about use-after-free:

  BTRFS info (device sdi1): balance: start -dvrange=1256811659264..1256811659265
  BTRFS info (device sdi1): relocating block group 1256811659264 flags data|raid0
  ==================================================================
  BUG: KASAN: use-after-free in btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
  Write of size 8 at addr ffff88856f671710 by task kworker/u24:10/261579

  CPU: 2 PID: 261579 Comm: kworker/u24:10 Tainted: P           OE     5.2.11-arch1-1-kasan #4
  Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./X99 Extreme4, BIOS P3.80 04/06/2018
  Workqueue: btrfs-endio-write btrfs_endio_write_helper [btrfs]
  Call Trace:
   dump_stack+0x7b/0xba
   print_address_description+0x6c/0x22e
   ? btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
   __kasan_report.cold+0x1b/0x3b
   ? btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
   kasan_report+0x12/0x17
   __asan_report_store8_noabort+0x17/0x20
   btrfs_init_reloc_root+0x2cd/0x340 [btrfs]
   record_root_in_trans+0x2a0/0x370 [btrfs]
   btrfs_record_root_in_trans+0xf4/0x140 [btrfs]
   start_transaction+0x1ab/0xe90 [btrfs]
   btrfs_join_transaction+0x1d/0x20 [btrfs]
   btrfs_finish_ordered_io+0x7bf/0x18a0 [btrfs]
   ? lock_repin_lock+0x400/0x400
   ? __kmem_cache_shutdown.cold+0x140/0x1ad
   ? btrfs_unlink_subvol+0x9b0/0x9b0 [btrfs]
   finish_ordered_fn+0x15/0x20 [btrfs]
   normal_work_helper+0x1bd/0xca0 [btrfs]
   ? process_one_work+0x819/0x1720
   ? kasan_check_read+0x11/0x20
   btrfs_endio_write_helper+0x12/0x20 [btrfs]
   process_one_work+0x8c9/0x1720
   ? pwq_dec_nr_in_flight+0x2f0/0x2f0
   ? worker_thread+0x1d9/0x1030
   worker_thread+0x98/0x1030
   kthread+0x2bb/0x3b0
   ? process_one_work+0x1720/0x1720
   ? kthread_park+0x120/0x120
   ret_from_fork+0x35/0x40

  Allocated by task 369692:
   __kasan_kmalloc.part.0+0x44/0xc0
   __kasan_kmalloc.constprop.0+0xba/0xc0
   kasan_kmalloc+0x9/0x10
   kmem_cache_alloc_trace+0x138/0x260
   btrfs_read_tree_root+0x92/0x360 [btrfs]
   btrfs_read_fs_root+0x10/0xb0 [btrfs]
   create_reloc_root+0x47d/0xa10 [btrfs]
   btrfs_init_reloc_root+0x1e2/0x340 [btrfs]
   record_root_in_trans+0x2a0/0x370 [btrfs]
   btrfs_record_root_in_trans+0xf4/0x140 [btrfs]
   start_transaction+0x1ab/0xe90 [btrfs]
   btrfs_start_transaction+0x1e/0x20 [btrfs]
   __btrfs_prealloc_file_range+0x1c2/0xa00 [btrfs]
   btrfs_prealloc_file_range+0x13/0x20 [btrfs]
   prealloc_file_extent_cluster+0x29f/0x570 [btrfs]
   relocate_file_extent_cluster+0x193/0xc30 [btrfs]
   relocate_data_extent+0x1f8/0x490 [btrfs]
   relocate_block_group+0x600/0x1060 [btrfs]
   btrfs_relocate_block_group+0x3a0/0xa00 [btrfs]
   btrfs_relocate_chunk+0x9e/0x180 [btrfs]
   btrfs_balance+0x14e4/0x2fc0 [btrfs]
   btrfs_ioctl_balance+0x47f/0x640 [btrfs]
   btrfs_ioctl+0x119d/0x8380 [btrfs]
   do_vfs_ioctl+0x9f5/0x1060
   ksys_ioctl+0x67/0x90
   __x64_sys_ioctl+0x73/0xb0
   do_syscall_64+0xa5/0x370
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  Freed by task 369692:
   __kasan_slab_free+0x14f/0x210
   kasan_slab_free+0xe/0x10
   kfree+0xd8/0x270
   btrfs_drop_snapshot+0x154c/0x1eb0 [btrfs]
   clean_dirty_subvols+0x227/0x340 [btrfs]
   relocate_block_group+0x972/0x1060 [btrfs]
   btrfs_relocate_block_group+0x3a0/0xa00 [btrfs]
   btrfs_relocate_chunk+0x9e/0x180 [btrfs]
   btrfs_balance+0x14e4/0x2fc0 [btrfs]
   btrfs_ioctl_balance+0x47f/0x640 [btrfs]
   btrfs_ioctl+0x119d/0x8380 [btrfs]
   do_vfs_ioctl+0x9f5/0x1060
   ksys_ioctl+0x67/0x90
   __x64_sys_ioctl+0x73/0xb0
   do_syscall_64+0xa5/0x370
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

  The buggy address belongs to the object at ffff88856f671100
   which belongs to the cache kmalloc-4k of size 4096
  The buggy address is located 1552 bytes inside of
   4096-byte region [ffff88856f671100, ffff88856f672100)
  The buggy address belongs to the page:
  page:ffffea0015bd9c00 refcount:1 mapcount:0 mapping:ffff88864400e600 index:0x0 compound_mapcount: 0
  flags: 0x2ffff0000010200(slab|head)
  raw: 02ffff0000010200 dead000000000100 dead000000000200 ffff88864400e600
  raw: 0000000000000000 0000000000070007 00000001ffffffff 0000000000000000
  page dumped because: kasan: bad access detected

  Memory state around the buggy address:
   ffff88856f671600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff88856f671680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  >ffff88856f671700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                           ^
   ffff88856f671780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
   ffff88856f671800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
  ==================================================================
  BTRFS info (device sdi1): 1 enospc errors during balance
  BTRFS info (device sdi1): balance: ended with status: -28

[CAUSE]
The problem happens when finish_ordered_io() get called with balance
still running, while the reloc root of that subvolume is already dead.
(Tree is swap already done, but tree not yet deleted for possible qgroup
usage.)

That means root->reloc_root still exists, but that reloc_root can be
under btrfs_drop_snapshot(), thus we shouldn't access it.

The following race could cause the use-after-free problem:

                CPU1              |                CPU2
--------------------------------------------------------------------------
                                  | relocate_block_group()
                                  | |- unset_reloc_control(rc)
                                  | |- btrfs_commit_transaction()
btrfs_finish_ordered_io()         | |- clean_dirty_subvols()
|- btrfs_join_transaction()       |    |
   |- record_root_in_trans()      |    |
      |- btrfs_init_reloc_root()  |    |
         |- if (root->reloc_root) |    |
         |                        |    |- root->reloc_root = NULL
         |                        |    |- btrfs_drop_snapshot(reloc_root);
         |- reloc_root->last_trans|
                 = trans->transid |
	    ^^^^^^^^^^^^^^^^^^^^^^
            Use after free

[FIX]
Fix it by the following modifications:

- Test if the root has dead reloc tree before accessing root->reloc_root
  If the root has BTRFS_ROOT_DEAD_RELOC_TREE, then we don't need to
  create or update root->reloc_tree

- Clear the BTRFS_ROOT_DEAD_RELOC_TREE flag until we have fully dropped
  reloc tree
  To co-operate with above modification, so as long as
  BTRFS_ROOT_DEAD_RELOC_TREE is still set, we won't try to re-create
  reloc tree at record_root_in_trans().

Reported-by: Cebtenzzre <[email protected]>
Fixes: d2311e6 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
CC: [email protected] # 5.1+
Reviewed-by: Josef Bacik <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Qu Wenruo <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 11a19a9 upstream.

A user reported a lockdep splat

 ======================================================
 WARNING: possible circular locking dependency detected
 5.2.11-gentoo #2 Not tainted
 ------------------------------------------------------
 kswapd0/711 is trying to acquire lock:
 000000007777a663 (sb_internal){.+.+}, at: start_transaction+0x3a8/0x500

but task is already holding lock:
 000000000ba86300 (fs_reclaim){+.+.}, at: __fs_reclaim_acquire+0x0/0x30

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}:
 kmem_cache_alloc+0x1f/0x1c0
 btrfs_alloc_inode+0x1f/0x260
 alloc_inode+0x16/0xa0
 new_inode+0xe/0xb0
 btrfs_new_inode+0x70/0x610
 btrfs_symlink+0xd0/0x420
 vfs_symlink+0x9c/0x100
 do_symlinkat+0x66/0xe0
 do_syscall_64+0x55/0x1c0
 entry_SYSCALL_64_after_hwframe+0x49/0xbe

-> #0 (sb_internal){.+.+}:
 __sb_start_write+0xf6/0x150
 start_transaction+0x3a8/0x500
 btrfs_commit_inode_delayed_inode+0x59/0x110
 btrfs_evict_inode+0x19e/0x4c0
 evict+0xbc/0x1f0
 inode_lru_isolate+0x113/0x190
 __list_lru_walk_one.isra.4+0x5c/0x100
 list_lru_walk_one+0x32/0x50
 prune_icache_sb+0x36/0x80
 super_cache_scan+0x14a/0x1d0
 do_shrink_slab+0x131/0x320
 shrink_node+0xf7/0x380
 balance_pgdat+0x2d5/0x640
 kswapd+0x2ba/0x5e0
 kthread+0x147/0x160
 ret_from_fork+0x24/0x30

other info that might help us debug this:

 Possible unsafe locking scenario:

 CPU0 CPU1
 ---- ----
 lock(fs_reclaim);
 lock(sb_internal);
 lock(fs_reclaim);
 lock(sb_internal);
commit 7a54789 upstream.

Currently, the command:

	btrfs balance start -dconvert=single,soft .

on a Raspberry Pi produces the following kernel message:

	BTRFS error (device mmcblk0p2): balance: invalid convert data profile single

This fails because we use is_power_of_2(unsigned long) to validate
the new data profile, the constant for 'single' profile uses bit 48,
and there are only 32 bits in a long on ARM.

Fix by open-coding the check using u64 variables.

Tested by completing the original balance command on several Raspberry
Pis.

Fixes: 818255f ("btrfs: use common helper instead of open coding a bit test")
CC: [email protected] # 4.20+
Signed-off-by: Zygo Blaxell <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit c67d970 upstream.

When we have a buffered write that starts at an offset greater than or
equals to the file's size happening concurrently with a full ranged
fiemap, we can end up leaking an extent state structure.

Suppose we have a file with a size of 1Mb, and before the buffered write
and fiemap are performed, it has a single extent state in its io tree
representing the range from 0 to 1Mb, with the EXTENT_DELALLOC bit set.

The following sequence diagram shows how the memory leak happens if a
fiemap a buffered write, starting at offset 1Mb and with a length of
4Kb, are performed concurrently.

          CPU 1                                                  CPU 2

  extent_fiemap()
    --> it's a full ranged fiemap
        range from 0 to LLONG_MAX - 1
        (9223372036854775807)

    --> locks range in the inode's
        io tree
      --> after this we have 2 extent
          states in the io tree:
          --> 1 for range [0, 1Mb[ with
              the bits EXTENT_LOCKED and
              EXTENT_DELALLOC_BITS set
          --> 1 for the range
              [1Mb, LLONG_MAX[ with
              the EXTENT_LOCKED bit set

                                                  --> start buffered write at offset
                                                      1Mb with a length of 4Kb

                                                  btrfs_file_write_iter()

                                                    btrfs_buffered_write()
                                                      --> cached_state is NULL

                                                      lock_and_cleanup_extent_if_need()
                                                        --> returns 0 and does not lock
                                                            range because it starts
                                                            at current i_size / eof

                                                      --> cached_state remains NULL

                                                      btrfs_dirty_pages()
                                                        btrfs_set_extent_delalloc()
                                                          (...)
                                                          __set_extent_bit()

                                                            --> splits extent state for range
                                                                [1Mb, LLONG_MAX[ and now we
                                                                have 2 extent states:

                                                                --> one for the range
                                                                    [1Mb, 1Mb + 4Kb[ with
                                                                    EXTENT_LOCKED set
                                                                --> another one for the range
                                                                    [1Mb + 4Kb, LLONG_MAX[ with
                                                                    EXTENT_LOCKED set as well

                                                            --> sets EXTENT_DELALLOC on the
                                                                extent state for the range
                                                                [1Mb, 1Mb + 4Kb[
                                                            --> caches extent state
                                                                [1Mb, 1Mb + 4Kb[ into
                                                                @cached_state because it has
                                                                the bit EXTENT_LOCKED set

                                                    --> btrfs_buffered_write() ends up
                                                        with a non-NULL cached_state and
                                                        never calls anything to release its
                                                        reference on it, resulting in a
                                                        memory leak

Fix this by calling free_extent_state() on cached_state if the range was
not locked by lock_and_cleanup_extent_if_need().

The same issue can happen if anything else other than fiemap locks a range
that covers eof and beyond.

This could be triggered, sporadically, by test case generic/561 from the
fstests suite, which makes duperemove run concurrently with fsstress, and
duperemove does plenty of calls to fiemap. When CONFIG_BTRFS_DEBUG is set
the leak is reported in dmesg/syslog when removing the btrfs module with
a message like the following:

  [77100.039461] BTRFS: state leak: start 6574080 end 6582271 state 16402 in tree 0 refs 1

Otherwise (CONFIG_BTRFS_DEBUG not set) detectable with kmemleak.

CC: [email protected] # 4.16+
Reviewed-by: Josef Bacik <[email protected]>
Signed-off-by: Filipe Manana <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 4203e96 upstream.

We've historically had reports of being unable to mount file systems
because the tree log root couldn't be read.  Usually this is the "parent
transid failure", but could be any of the related errors, including
"fsid mismatch" or "bad tree block", depending on which block got
allocated.

The modification of the individual log root items are serialized on the
per-log root root_mutex.  This means that any modification to the
per-subvol log root_item is completely protected.

However we update the root item in the log root tree outside of the log
root tree log_mutex.  We do this in order to allow multiple subvolumes
to be updated in each log transaction.

This is problematic however because when we are writing the log root
tree out we update the super block with the _current_ log root node
information.  Since these two operations happen independently of each
other, you can end up updating the log root tree in between writing out
the dirty blocks and setting the super block to point at the current
root.

This means we'll point at the new root node that hasn't been written
out, instead of the one we should be pointing at.  Thus whatever garbage
or old block we end up pointing at complains when we mount the file
system later and try to replay the log.

Fix this by copying the log's root item into a local root item copy.
Then once we're safely under the log_root_tree->log_mutex we update the
root item in the log_root_tree.  This way we do not modify the
log_root_tree while we're committing it, fixing the problem.

CC: [email protected] # 4.4+
Reviewed-by: Chris Mason <[email protected]>
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Josef Bacik <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit c5f4987 upstream.

Coverity caught a case where we could return with a uninitialized value
in ret in process_leaf.  This is actually pretty likely because we could
very easily run into a block group item key and have a garbage value in
ret and think there was an errror.  Fix this by initializing ret to 0.

Reported-by: Colin Ian King <[email protected]>
Fixes: fd708b8 ("Btrfs: add a extent ref verify tool")
CC: [email protected] # 4.19+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 031d73e upstream.

When a series of O_DIRECT reads or writes are truncated, either due to
eof or due to an error, then we should return the number of contiguous
bytes that were received/sent starting at the offset specified by the
application.

Currently, we are failing to correctly check contiguity, and so we're
failing the generic/465 in xfstests when the race between the read
and write RPCs causes the file to get extended while the 2 reads are
outstanding. If the first read RPC call wins the race and returns with
eof set, we should treat the second read RPC as being truncated.

Reported-by: Su Yanjun <[email protected]>
Fixes: 1ccbad9 ("nfs: fix DIO good bytes calculation")
Cc: [email protected] # 4.1+
Signed-off-by: Trond Myklebust <[email protected]>
Signed-off-by: Anna Schumaker <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 2f2b4fd upstream.

GCC 9.x automatically enables support for Loongson MMI instructions when
using some -march= flags, and then errors out when -msoft-float is
specified with:

  cc1: error: ‘-mloongson-mmi’ must be used with ‘-mhard-float’

The kernel shouldn't be using these MMI instructions anyway, just as it
doesn't use floating point instructions. Explicitly disable them in
order to fix the build with GCC 9.x.

Signed-off-by: Paul Burton <[email protected]>
Fixes: 3702bba ("MIPS: Loongson: Add GCC 4.4 support for Loongson2E")
Fixes: 6f7a251 ("MIPS: Loongson: Add basic Loongson 2F support")
Fixes: 5188129 ("MIPS: Loongson-3: Improve -march option and move it to Platform")
Cc: Huacai Chen <[email protected]>
Cc: Jiaxun Yang <[email protected]>
Cc: [email protected] # v2.6.32+
Cc: [email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 38dffe1 upstream.

A Golang developer reported MIPS hwcap isn't reflecting instructions
that the processor actually supported so programs can't apply optimized
code at runtime.

Thus we export the ASEs that can be used in userspace programs.

Reported-by: Meng Zhuo <[email protected]>
Signed-off-by: Jiaxun Yang <[email protected]>
Cc: [email protected]
Cc: Paul Burton <[email protected]>
Cc: <[email protected]> # 4.14+
Signed-off-by: Paul Burton <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 18545e8 upstream.

An extra kfree cleanup was missed since these are now deallocated by core.

Link: https://lore.kernel.org/r/[email protected]
Cc: <[email protected]>
Fixes: 68e326d ("RDMA: Handle SRQ allocations by IB/core")
Signed-off-by: Adit Ranadive <[email protected]>
Reviewed-by: Vishnu Dasa <[email protected]>
Reviewed-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Jason Gunthorpe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Commit bbd1b70 upstream.

ACPI 6.3 adds a flag to the CPU node to indicate whether
the given PE is a thread. Add a function to return that
information for a given linux logical CPU.

Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Reviewed-by: Robert Richter <[email protected]>
Acked-by: Rafael J. Wysocki <[email protected]>
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Commit 98dc199 upstream.

ACPI 6.3 adds a thread flag to represent if a CPU/PE is
actually a thread. Given that the MPIDR_MT bit may not
represent this information consistently on homogeneous machines
we should prefer the PPTT flag if its available.

Signed-off-by: Jeremy Linton <[email protected]>
Reviewed-by: Sudeep Holla <[email protected]>
Reviewed-by: Robert Richter <[email protected]>
[will: made acpi_cpu_is_threaded() return 'bool']
Signed-off-by: Will Deacon <[email protected]>
Signed-off-by: John Garry <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
[ Upstream commit 1436a78 ]

Since commit ebd457d ("iio: light: vcnl4000 add devicetree hooks")
the of_match_table is supported but the data shouldn't be a string.
Instead it shall be one of 'enum vcnl4000_device_ids'. Also the matching
logic for the vcnl4020 was wrong. Since the data retrieve mechanism is
still based on the i2c_device_id no failures did appeared till now.

Fixes: ebd457d ("iio: light: vcnl4000 add devicetree hooks")
Signed-off-by: Marco Felsch <[email protected]>
Reviewed-by: Angus Ainslie (Purism) [email protected]
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
commit d4f4de5 upstream.

There are two problems in dcache_readdir() - one is that lockless traversal
of the list needs non-trivial cooperation of d_alloc() (at least a switch
to list_add_rcu(), and probably more than just that) and another is that
it assumes that no removal will happen without the directory locked exclusive.
Said assumption had always been there, never had been stated explicitly and
is violated by several places in the kernel (devpts and selinuxfs).

        * replacement of next_positive() with different calling conventions:
it returns struct list_head * instead of struct dentry *; the latter is
passed in and out by reference, grabbing the result and dropping the original
value.
        * scan is under ->d_lock.  If we run out of timeslice, cursor is moved
after the last position we'd reached and we reschedule; then the scan continues
from that place.  To avoid livelocks between multiple lseek() (with cursors
getting moved past each other, never reaching the real entries) we always
skip the cursors, need_resched() or not.
        * returned list_head * is either ->d_child of dentry we'd found or
->d_subdirs of parent (if we got to the end of the list).
        * dcache_readdir() and dcache_dir_lseek() switched to new helper.
dcache_readdir() always holds a reference to dentry passed to dir_emit() now.
Cursor is moved to just before the entry where dir_emit() has failed or into
the very end of the list, if we'd run out.
        * move_cursor() eliminated - it had sucky calling conventions and
after fixing that it became simply list_move() (in lseek and scan_positives)
or list_move_tail() (in readdir).

        All operations with the list are under ->d_lock now, and we do not
depend upon having all file removals done with parent locked exclusive
anymore.

Cc: [email protected]
Reported-by: "zhengbin (A)" <[email protected]>
Signed-off-by: Al Viro <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit dc78909 upstream.

The officially validated plane width limit is 4k on skl+, however
we already had people using 5k displays before we started to enforce
the limit. Also it seems Windows allows 5k resolutions as well
(though not sure if they do it with one plane or two).

According to hw folks 5k should work with the possible
exception of the following features:
- Ytile (already limited to 4k)
- FP16 (already limited to 4k)
- render compression (already limited to 4k)
- KVMR sprite and cursor (don't care)
- horizontal panning (need to verify this)
- pipe and plane scaling (need to verify this)

So apart from last two items on that list we are already
fine. We should really verify what happens with those last
two items but I don't have a 5k display on hand atm so it'll
have to wait.

In the meantime let's just bump the limit back up to 5k since
several users have already been using it without apparent issues.
At least we'll be no worse off than we were prior to lowering
the limits.

Cc: [email protected]
Cc: Sean Paul <[email protected]>
Cc: José Roberto de Souza <[email protected]>
Tested-by: Leho Kraav <[email protected]>
Fixes: 372b9ff ("drm/i915: Fix skl+ max plane width")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111501
Signed-off-by: Ville Syrjälä <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Reviewed-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Sean Paul <[email protected]>
(cherry picked from commit bed34ef)
Signed-off-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 282b7fd upstream.

This allows userspace to use "legacy" mode for push constants, where
they are committed at 3DPRIMITIVE or flush time, rather than being
committed at 3DSTATE_BINDING_TABLE_POINTERS_XS time.  Gen6-8 and Gen11
both use the "legacy" behavior - only Gen9 works in the "new" way.

Conflating push constants with binding tables is painful for userspace,
we would like to be able to avoid doing so.

Signed-off-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
Reviewed-by: Chris Wilson <[email protected]>
Signed-off-by: Chris Wilson <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 0606259)
Signed-off-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit b925708 upstream.

Since dropping the set-to-gtt-domain in commit a679f58 ("drm/i915:
Flush pages on acquisition"), we no longer mark the contents as dirty on
a write fault. This has the issue of us then not marking the pages as
dirty on releasing the buffer, which means the contents are not written
out to the swap device (should we ever pick that buffer as a victim).
Notably, this is visible in the dumb buffer interface used for cursors.
Having updated the cursor contents via mmap, and swapped away, if the
shrinker should evict the old cursor, upon next reuse, the cursor would
be invisible.

E.g. echo 80 > /proc/sys/kernel/sysrq ; echo f > /proc/sysrq-trigger

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111541
Fixes: a679f58 ("drm/i915: Flush pages on acquisition")
Signed-off-by: Chris Wilson <[email protected]>
Cc: Matthew Auld <[email protected]>
Cc: Ville Syrjälä <[email protected]>
Cc: <[email protected]> # v5.2+
Reviewed-by: Matthew Auld <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
(cherry picked from commit 5028851)
Signed-off-by: Rodrigo Vivi <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 9f61419 upstream.

Looks like the dma_sync calls don't do what we want on armv7 either.
Fixes:

  Unable to handle kernel paging request at virtual address 50001000
  pgd = (ptrval)
  [50001000] *pgd=00000000
  Internal error: Oops: 805 [#1] SMP ARM
  Modules linked in:
  CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc6-00271-g9f159ae07f07 #4
  Hardware name: Freescale i.MX53 (Device Tree Support)
  PC is at v7_dma_clean_range+0x20/0x38
  LR is at __dma_page_cpu_to_dev+0x28/0x90
  pc : [<c011c76c>]    lr : [<c01181c4>]    psr: 20000013
  sp : d80b5a88  ip : de96c000  fp : d840ce6c
  r10: 00000000  r9 : 00000001  r8 : d843e010
  r7 : 00000000  r6 : 00008000  r5 : ddb6c000  r4 : 00000000
  r3 : 0000003f  r2 : 00000040  r1 : 50008000  r0 : 50001000
  Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment none
  Control: 10c5387d  Table: 70004019  DAC: 00000051
  Process swapper/0 (pid: 1, stack limit = 0x(ptrval))

Signed-off-by: Rob Clark <[email protected]>
Fixes: 3de433c ("drm/msm: Use the correct dma_sync calls in msm_gem")
Tested-by: Fabio Estevam <[email protected]>
Signed-off-by: Fabio Estevam <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 30045f2 upstream.

Since commit c2b7146 ("USB: core: Fix bug caused by duplicate
interface PM usage counter") USB drivers must always balance their
runtime PM gets and puts, including when the driver has already been
unbound from the interface.

Leaving the interface with a positive PM usage counter would prevent a
later bound driver from suspending the device.

Note that runtime PM has never actually been enabled for this driver
since the support_autosuspend flag in its usb_driver struct is not set.

Fixes: c2b7146 ("USB: core: Fix bug caused by duplicate interface PM usage counter")
Cc: stable <[email protected]>
Acked-by: Mauro Carvalho Chehab <[email protected]>
Signed-off-by: Johan Hovold <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
@DanielOgorchock DanielOgorchock force-pushed the ogorchock branch 5 times, most recently from 9da4b73 to 411ce08 Compare May 31, 2020 03:58
DanielOgorchock pushed a commit that referenced this pull request Jun 6, 2020
[ Upstream commit 20a785a ]

This BUG halt was reported a while back, but the patch somehow got
missed:

PID: 2879   TASK: c16adaa0  CPU: 1   COMMAND: "sctpn"
 #0 [f418dd28] crash_kexec at c04a7d8c
 #1 [f418dd7c] oops_end at c0863e02
 #2 [f418dd90] do_invalid_op at c040aaca
 #3 [f418de28] error_code (via invalid_op) at c08631a5
    EAX: f34baac0  EBX: 00000090  ECX: f418deb0  EDX: f5542950  EBP: 00000000
    DS:  007b      ESI: f34ba800  ES:  007b      EDI: f418dea0  GS:  00e0
    CS:  0060      EIP: c046fa5e  ERR: ffffffff  EFLAGS: 00010286
 #4 [f418de5c] add_timer at c046fa5e
 #5 [f418de68] sctp_do_sm at f8db8c77 [sctp]
 #6 [f418df30] sctp_primitive_SHUTDOWN at f8dcc1b5 [sctp]
 #7 [f418df48] inet_shutdown at c080baf9
 #8 [f418df5c] sys_shutdown at c079eedf
 #9 [f418df70] sys_socketcall at c079fe88
    EAX: ffffffda  EBX: 0000000d  ECX: bfceea90  EDX: 0937af98
    DS:  007b      ESI: 0000000c  ES:  007b      EDI: b7150ae4
    SS:  007b      ESP: bfceea7c  EBP: bfceeaa8  GS:  0033
    CS:  0073      EIP: b775c424  ERR: 00000066  EFLAGS: 00000282

It appears that the side effect that starts the shutdown timer was processed
multiple times, which can happen as multiple paths can trigger it.  This of
course leads to the BUG halt in add_timer getting called.

Fix seems pretty straightforward, just check before the timer is added if its
already been started.  If it has mod the timer instead to min(current
expiration, new expiration)

Its been tested but not confirmed to fix the problem, as the issue has only
occured in production environments where test kernels are enjoined from being
installed.  It appears to be a sane fix to me though.  Also, recentely,
Jere found a reproducer posted on list to confirm that this resolves the
issues

Signed-off-by: Neil Horman <[email protected]>
CC: Vlad Yasevich <[email protected]>
CC: "David S. Miller" <[email protected]>
CC: [email protected]
CC: [email protected]
CC: [email protected]
Acked-by: Marcelo Ricardo Leitner <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DanielOgorchock pushed a commit that referenced this pull request Jun 6, 2020
commit 2b86cb8 upstream.

Be there a platform with the following layout:

      Regular NIC
       |
       +----> DSA master for switch port
               |
               +----> DSA master for another switch port

After changing DSA back to static lockdep class keys in commit
1a33e10 ("net: partially revert dynamic lockdep key changes"), this
kernel splat can be seen:

[   13.361198] ============================================
[   13.366524] WARNING: possible recursive locking detected
[   13.371851] 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988 Not tainted
[   13.377874] --------------------------------------------
[   13.383201] swapper/0/0 is trying to acquire lock:
[   13.388004] ffff0000668ff298 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.397879]
[   13.397879] but task is already holding lock:
[   13.403727] ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.413593]
[   13.413593] other info that might help us debug this:
[   13.420140]  Possible unsafe locking scenario:
[   13.420140]
[   13.426075]        CPU0
[   13.428523]        ----
[   13.430969]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.435946]   lock(&dsa_slave_netdev_xmit_lock_key);
[   13.440924]
[   13.440924]  *** DEADLOCK ***
[   13.440924]
[   13.446860]  May be due to missing lock nesting notation
[   13.446860]
[   13.453668] 6 locks held by swapper/0/0:
[   13.457598]  #0: ffff800010003de0 ((&idev->mc_ifc_timer)){+.-.}-{0:0}, at: call_timer_fn+0x0/0x400
[   13.466593]  #1: ffffd4d3fb478700 (rcu_read_lock){....}-{1:2}, at: mld_sendpack+0x0/0x560
[   13.474803]  #2: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: ip6_finish_output2+0x64/0xb10
[   13.483886]  #3: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.492793]  #4: ffff0000661a1698 (&dsa_slave_netdev_xmit_lock_key){+.-.}-{2:2}, at: __dev_queue_xmit+0x84c/0xbe0
[   13.503094]  #5: ffffd4d3fb478728 (rcu_read_lock_bh){....}-{1:2}, at: __dev_queue_xmit+0x6c/0xbe0
[   13.512000]
[   13.512000] stack backtrace:
[   13.516369] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc4-02121-gc32a05ecd7af-dirty #988
[   13.530421] Call trace:
[   13.532871]  dump_backtrace+0x0/0x1d8
[   13.536539]  show_stack+0x24/0x30
[   13.539862]  dump_stack+0xe8/0x150
[   13.543271]  __lock_acquire+0x1030/0x1678
[   13.547290]  lock_acquire+0xf8/0x458
[   13.550873]  _raw_spin_lock+0x44/0x58
[   13.554543]  __dev_queue_xmit+0x84c/0xbe0
[   13.558562]  dev_queue_xmit+0x24/0x30
[   13.562232]  dsa_slave_xmit+0xe0/0x128
[   13.565988]  dev_hard_start_xmit+0xf4/0x448
[   13.570182]  __dev_queue_xmit+0x808/0xbe0
[   13.574200]  dev_queue_xmit+0x24/0x30
[   13.577869]  neigh_resolve_output+0x15c/0x220
[   13.582237]  ip6_finish_output2+0x244/0xb10
[   13.586430]  __ip6_finish_output+0x1dc/0x298
[   13.590709]  ip6_output+0x84/0x358
[   13.594116]  mld_sendpack+0x2bc/0x560
[   13.597786]  mld_ifc_timer_expire+0x210/0x390
[   13.602153]  call_timer_fn+0xcc/0x400
[   13.605822]  run_timer_softirq+0x588/0x6e0
[   13.609927]  __do_softirq+0x118/0x590
[   13.613597]  irq_exit+0x13c/0x148
[   13.616918]  __handle_domain_irq+0x6c/0xc0
[   13.621023]  gic_handle_irq+0x6c/0x160
[   13.624779]  el1_irq+0xbc/0x180
[   13.627927]  cpuidle_enter_state+0xb4/0x4d0
[   13.632120]  cpuidle_enter+0x3c/0x50
[   13.635703]  call_cpuidle+0x44/0x78
[   13.639199]  do_idle+0x228/0x2c8
[   13.642433]  cpu_startup_entry+0x2c/0x48
[   13.646363]  rest_init+0x1ac/0x280
[   13.649773]  arch_call_rest_init+0x14/0x1c
[   13.653878]  start_kernel+0x490/0x4bc

Lockdep keys themselves were added in commit ab92d68 ("net: core:
add generic lockdep keys"), and it's very likely that this splat existed
since then, but I have no real way to check, since this stacked platform
wasn't supported by mainline back then.

>From Taehee's own words:

  This patch was considered that all stackable devices have LLTX flag.
  But the dsa doesn't have LLTX, so this splat happened.
  After this patch, dsa shares the same lockdep class key.
  On the nested dsa interface architecture, which you illustrated,
  the same lockdep class key will be used in __dev_queue_xmit() because
  dsa doesn't have LLTX.
  So that lockdep detects deadlock because the same lockdep class key is
  used recursively although actually the different locks are used.
  There are some ways to fix this problem.

  1. using NETIF_F_LLTX flag.
  If possible, using the LLTX flag is a very clear way for it.
  But I'm so sorry I don't know whether the dsa could have LLTX or not.

  2. using dynamic lockdep again.
  It means that each interface uses a separate lockdep class key.
  So, lockdep will not detect recursive locking.
  But this way has a problem that it could consume lockdep class key
  too many.
  Currently, lockdep can have 8192 lockdep class keys.
   - you can see this number with the following command.
     cat /proc/lockdep_stats
     lock-classes:                         1251 [max: 8192]
     ...
     The [max: 8192] means that the maximum number of lockdep class keys.
  If too many lockdep class keys are registered, lockdep stops to work.
  So, using a dynamic(separated) lockdep class key should be considered
  carefully.
  In addition, updating lockdep class key routine might have to be existing.
  (lockdep_register_key(), lockdep_set_class(), lockdep_unregister_key())

  3. Using lockdep subclass.
  A lockdep class key could have 8 subclasses.
  The different subclass is considered different locks by lockdep
  infrastructure.
  But "lock-classes" is not counted by subclasses.
  So, it could avoid stopping lockdep infrastructure by an overflow of
  lockdep class keys.
  This approach should also have an updating lockdep class key routine.
  (lockdep_set_subclass())

  4. Using nonvalidate lockdep class key.
  The lockdep infrastructure supports nonvalidate lockdep class key type.
  It means this lockdep is not validated by lockdep infrastructure.
  So, the splat will not happen but lockdep couldn't detect real deadlock
  case because lockdep really doesn't validate it.
  I think this should be used for really special cases.
  (lockdep_set_novalidate_class())

Further discussion here:
https://patchwork.ozlabs.org/project/netdev/patch/[email protected]/

There appears to be no negative side-effect to declaring lockless TX for
the DSA virtual interfaces, which means they handle their own locking.
So that's what we do to make the splat go away.

Patch tested in a wide variety of cases: unicast, multicast, PTP, etc.

Fixes: ab92d68 ("net: core: add generic lockdep keys")
Suggested-by: Taehee Yoo <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DanielOgorchock pushed a commit that referenced this pull request Aug 16, 2020
commit 254503a upstream.

The drm/omap driver was fixed to correct an issue where using a
divider of 32 breaks the DSS despite the TRM stating 32 is a valid
number.  Through experimentation, it appears that 31 works, and
it is consistent with the value used by the drm/omap driver.

This patch fixes the divider for fbdev driver instead of the drm.

Fixes: f76ee89 ("omapfb: copy omapdss & displays for omapfb")
Cc: <[email protected]> #4.5+
Signed-off-by: Adam Ford <[email protected]>
Reviewed-by: Tomi Valkeinen <[email protected]>
Cc: Dave Airlie <[email protected]>
Cc: Rob Clark <[email protected]>
[b.zolnierkie: mark patch as applicable to stable 4.5+ (was 4.9+)]
Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
Link: https://patchwork.freedesktop.org/patch/msgid/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DanielOgorchock pushed a commit that referenced this pull request Sep 5, 2020
[ Upstream commit ab0db04 ]

When running with -o enospc_debug you can get the following splat if one
of the dump_space_info's trip

  ======================================================
  WARNING: possible circular locking dependency detected
  5.8.0-rc5+ #20 Tainted: G           OE
  ------------------------------------------------------
  dd/563090 is trying to acquire lock:
  ffff9e7dbf4f1e18 (&ctl->tree_lock){+.+.}-{2:2}, at: btrfs_dump_free_space+0x2b/0xa0 [btrfs]

  but task is already holding lock:
  ffff9e7e2284d428 (&cache->lock){+.+.}-{2:2}, at: btrfs_dump_space_info+0xaa/0x120 [btrfs]

  which lock already depends on the new lock.

  the existing dependency chain (in reverse order) is:

  -> #3 (&cache->lock){+.+.}-{2:2}:
	 _raw_spin_lock+0x25/0x30
	 btrfs_add_reserved_bytes+0x3c/0x3c0 [btrfs]
	 find_free_extent+0x7ef/0x13b0 [btrfs]
	 btrfs_reserve_extent+0x9b/0x180 [btrfs]
	 btrfs_alloc_tree_block+0xc1/0x340 [btrfs]
	 alloc_tree_block_no_bg_flush+0x4a/0x60 [btrfs]
	 __btrfs_cow_block+0x122/0x530 [btrfs]
	 btrfs_cow_block+0x106/0x210 [btrfs]
	 commit_cowonly_roots+0x55/0x300 [btrfs]
	 btrfs_commit_transaction+0x4ed/0xac0 [btrfs]
	 sync_filesystem+0x74/0x90
	 generic_shutdown_super+0x22/0x100
	 kill_anon_super+0x14/0x30
	 btrfs_kill_super+0x12/0x20 [btrfs]
	 deactivate_locked_super+0x36/0x70
	 cleanup_mnt+0x104/0x160
	 task_work_run+0x5f/0x90
	 __prepare_exit_to_usermode+0x1bd/0x1c0
	 do_syscall_64+0x5e/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #2 (&space_info->lock){+.+.}-{2:2}:
	 _raw_spin_lock+0x25/0x30
	 btrfs_block_rsv_release+0x1a6/0x3f0 [btrfs]
	 btrfs_inode_rsv_release+0x4f/0x170 [btrfs]
	 btrfs_clear_delalloc_extent+0x155/0x480 [btrfs]
	 clear_state_bit+0x81/0x1a0 [btrfs]
	 __clear_extent_bit+0x25c/0x5d0 [btrfs]
	 clear_extent_bit+0x15/0x20 [btrfs]
	 btrfs_invalidatepage+0x2b7/0x3c0 [btrfs]
	 truncate_cleanup_page+0x47/0xe0
	 truncate_inode_pages_range+0x238/0x840
	 truncate_pagecache+0x44/0x60
	 btrfs_setattr+0x202/0x5e0 [btrfs]
	 notify_change+0x33b/0x490
	 do_truncate+0x76/0xd0
	 path_openat+0x687/0xa10
	 do_filp_open+0x91/0x100
	 do_sys_openat2+0x215/0x2d0
	 do_sys_open+0x44/0x80
	 do_syscall_64+0x52/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #1 (&tree->lock#2){+.+.}-{2:2}:
	 _raw_spin_lock+0x25/0x30
	 find_first_extent_bit+0x32/0x150 [btrfs]
	 write_pinned_extent_entries.isra.0+0xc5/0x100 [btrfs]
	 __btrfs_write_out_cache+0x172/0x480 [btrfs]
	 btrfs_write_out_cache+0x7a/0xf0 [btrfs]
	 btrfs_write_dirty_block_groups+0x286/0x3b0 [btrfs]
	 commit_cowonly_roots+0x245/0x300 [btrfs]
	 btrfs_commit_transaction+0x4ed/0xac0 [btrfs]
	 close_ctree+0xf9/0x2f5 [btrfs]
	 generic_shutdown_super+0x6c/0x100
	 kill_anon_super+0x14/0x30
	 btrfs_kill_super+0x12/0x20 [btrfs]
	 deactivate_locked_super+0x36/0x70
	 cleanup_mnt+0x104/0x160
	 task_work_run+0x5f/0x90
	 __prepare_exit_to_usermode+0x1bd/0x1c0
	 do_syscall_64+0x5e/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  -> #0 (&ctl->tree_lock){+.+.}-{2:2}:
	 __lock_acquire+0x1240/0x2460
	 lock_acquire+0xab/0x360
	 _raw_spin_lock+0x25/0x30
	 btrfs_dump_free_space+0x2b/0xa0 [btrfs]
	 btrfs_dump_space_info+0xf4/0x120 [btrfs]
	 btrfs_reserve_extent+0x176/0x180 [btrfs]
	 __btrfs_prealloc_file_range+0x145/0x550 [btrfs]
	 cache_save_setup+0x28d/0x3b0 [btrfs]
	 btrfs_start_dirty_block_groups+0x1fc/0x4f0 [btrfs]
	 btrfs_commit_transaction+0xcc/0xac0 [btrfs]
	 btrfs_alloc_data_chunk_ondemand+0x162/0x4c0 [btrfs]
	 btrfs_check_data_free_space+0x4c/0xa0 [btrfs]
	 btrfs_buffered_write.isra.0+0x19b/0x740 [btrfs]
	 btrfs_file_write_iter+0x3cf/0x610 [btrfs]
	 new_sync_write+0x11e/0x1b0
	 vfs_write+0x1c9/0x200
	 ksys_write+0x68/0xe0
	 do_syscall_64+0x52/0xb0
	 entry_SYSCALL_64_after_hwframe+0x44/0xa9

  other info that might help us debug this:

  Chain exists of:
    &ctl->tree_lock --> &space_info->lock --> &cache->lock

   Possible unsafe locking scenario:

	 CPU0                    CPU1
	 ----                    ----
    lock(&cache->lock);
				 lock(&space_info->lock);
				 lock(&cache->lock);
    lock(&ctl->tree_lock);

   *** DEADLOCK ***

  6 locks held by dd/563090:
   #0: ffff9e7e21d18448 (sb_writers#14){.+.+}-{0:0}, at: vfs_write+0x195/0x200
   #1: ffff9e7dd0410ed8 (&sb->s_type->i_mutex_key#19){++++}-{3:3}, at: btrfs_file_write_iter+0x86/0x610 [btrfs]
   #2: ffff9e7e21d18638 (sb_internal#2){.+.+}-{0:0}, at: start_transaction+0x40b/0x5b0 [btrfs]
   #3: ffff9e7e1f05d688 (&cur_trans->cache_write_mutex){+.+.}-{3:3}, at: btrfs_start_dirty_block_groups+0x158/0x4f0 [btrfs]
   #4: ffff9e7e2284ddb8 (&space_info->groups_sem){++++}-{3:3}, at: btrfs_dump_space_info+0x69/0x120 [btrfs]
   #5: ffff9e7e2284d428 (&cache->lock){+.+.}-{2:2}, at: btrfs_dump_space_info+0xaa/0x120 [btrfs]

  stack backtrace:
  CPU: 3 PID: 563090 Comm: dd Tainted: G           OE     5.8.0-rc5+ #20
  Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./890FX Deluxe5, BIOS P1.40 05/03/2011
  Call Trace:
   dump_stack+0x96/0xd0
   check_noncircular+0x162/0x180
   __lock_acquire+0x1240/0x2460
   ? wake_up_klogd.part.0+0x30/0x40
   lock_acquire+0xab/0x360
   ? btrfs_dump_free_space+0x2b/0xa0 [btrfs]
   _raw_spin_lock+0x25/0x30
   ? btrfs_dump_free_space+0x2b/0xa0 [btrfs]
   btrfs_dump_free_space+0x2b/0xa0 [btrfs]
   btrfs_dump_space_info+0xf4/0x120 [btrfs]
   btrfs_reserve_extent+0x176/0x180 [btrfs]
   __btrfs_prealloc_file_range+0x145/0x550 [btrfs]
   ? btrfs_qgroup_reserve_data+0x1d/0x60 [btrfs]
   cache_save_setup+0x28d/0x3b0 [btrfs]
   btrfs_start_dirty_block_groups+0x1fc/0x4f0 [btrfs]
   btrfs_commit_transaction+0xcc/0xac0 [btrfs]
   ? start_transaction+0xe0/0x5b0 [btrfs]
   btrfs_alloc_data_chunk_ondemand+0x162/0x4c0 [btrfs]
   btrfs_check_data_free_space+0x4c/0xa0 [btrfs]
   btrfs_buffered_write.isra.0+0x19b/0x740 [btrfs]
   ? ktime_get_coarse_real_ts64+0xa8/0xd0
   ? trace_hardirqs_on+0x1c/0xe0
   btrfs_file_write_iter+0x3cf/0x610 [btrfs]
   new_sync_write+0x11e/0x1b0
   vfs_write+0x1c9/0x200
   ksys_write+0x68/0xe0
   do_syscall_64+0x52/0xb0
   entry_SYSCALL_64_after_hwframe+0x44/0xa9

This is because we're holding the block_group->lock while trying to dump
the free space cache.  However we don't need this lock, we just need it
to read the values for the printk, so move the free space cache dumping
outside of the block group lock.

Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
DanielOgorchock pushed a commit that referenced this pull request Sep 5, 2020
commit 9f4aa52 upstream.

During initialization of the DASD DIAG driver a request is issued
that has a bio structure that resides on the stack. With virtually
mapped kernel stacks this bio address might be in virtual storage
which is unsuitable for usage with the diag250 call.
In this case the device can not be set online using the DIAG
discipline and fails with -EOPNOTSUP.
In the system journal the following error message is presented:

dasd: X.X.XXXX Setting the DASD online with discipline DIAG failed
with rc=-95

Fix by allocating the bio structure instead of having it on the stack.

Fixes: ce3dc44 ("s390: add support for virtually mapped kernel stacks")
Signed-off-by: Stefan Haberland <[email protected]>
Reviewed-by: Peter Oberparleiter <[email protected]>
Cc: [email protected] #4.20
Signed-off-by: Jens Axboe <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DanielOgorchock pushed a commit that referenced this pull request Sep 5, 2020
commit 18c850f upstream.

There's long existed a lockdep splat because we open our bdev's under
the ->device_list_mutex at mount time, which acquires the bd_mutex.
Usually this goes unnoticed, but if you do loopback devices at all
suddenly the bd_mutex comes with a whole host of other dependencies,
which results in the splat when you mount a btrfs file system.

======================================================
WARNING: possible circular locking dependency detected
5.8.0-0.rc3.1.fc33.x86_64+debug #1 Not tainted
------------------------------------------------------
systemd-journal/509 is trying to acquire lock:
ffff970831f84db0 (&fs_info->reloc_mutex){+.+.}-{3:3}, at: btrfs_record_root_in_trans+0x44/0x70 [btrfs]

but task is already holding lock:
ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs]

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

 -> #6 (sb_pagefaults){.+.+}-{0:0}:
       __sb_start_write+0x13e/0x220
       btrfs_page_mkwrite+0x59/0x560 [btrfs]
       do_page_mkwrite+0x4f/0x130
       do_wp_page+0x3b0/0x4f0
       handle_mm_fault+0xf47/0x1850
       do_user_addr_fault+0x1fc/0x4b0
       exc_page_fault+0x88/0x300
       asm_exc_page_fault+0x1e/0x30

 -> #5 (&mm->mmap_lock#2){++++}-{3:3}:
       __might_fault+0x60/0x80
       _copy_from_user+0x20/0xb0
       get_sg_io_hdr+0x9a/0xb0
       scsi_cmd_ioctl+0x1ea/0x2f0
       cdrom_ioctl+0x3c/0x12b4
       sr_block_ioctl+0xa4/0xd0
       block_ioctl+0x3f/0x50
       ksys_ioctl+0x82/0xc0
       __x64_sys_ioctl+0x16/0x20
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #4 (&cd->lock){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       sr_block_open+0xa2/0x180
       __blkdev_get+0xdd/0x550
       blkdev_get+0x38/0x150
       do_dentry_open+0x16b/0x3e0
       path_openat+0x3c9/0xa00
       do_filp_open+0x75/0x100
       do_sys_openat2+0x8a/0x140
       __x64_sys_openat+0x46/0x70
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #3 (&bdev->bd_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       __blkdev_get+0x6a/0x550
       blkdev_get+0x85/0x150
       blkdev_get_by_path+0x2c/0x70
       btrfs_get_bdev_and_sb+0x1b/0xb0 [btrfs]
       open_fs_devices+0x88/0x240 [btrfs]
       btrfs_open_devices+0x92/0xa0 [btrfs]
       btrfs_mount_root+0x250/0x490 [btrfs]
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       vfs_kern_mount.part.0+0x71/0xb0
       btrfs_mount+0x119/0x380 [btrfs]
       legacy_get_tree+0x30/0x50
       vfs_get_tree+0x28/0xc0
       do_mount+0x8c6/0xca0
       __x64_sys_mount+0x8e/0xd0
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #2 (&fs_devs->device_list_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       btrfs_run_dev_stats+0x36/0x420 [btrfs]
       commit_cowonly_roots+0x91/0x2d0 [btrfs]
       btrfs_commit_transaction+0x4e6/0x9f0 [btrfs]
       btrfs_sync_file+0x38a/0x480 [btrfs]
       __x64_sys_fdatasync+0x47/0x80
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #1 (&fs_info->tree_log_mutex){+.+.}-{3:3}:
       __mutex_lock+0x7b/0x820
       btrfs_commit_transaction+0x48e/0x9f0 [btrfs]
       btrfs_sync_file+0x38a/0x480 [btrfs]
       __x64_sys_fdatasync+0x47/0x80
       do_syscall_64+0x52/0xb0
       entry_SYSCALL_64_after_hwframe+0x44/0xa9

 -> #0 (&fs_info->reloc_mutex){+.+.}-{3:3}:
       __lock_acquire+0x1241/0x20c0
       lock_acquire+0xb0/0x400
       __mutex_lock+0x7b/0x820
       btrfs_record_root_in_trans+0x44/0x70 [btrfs]
       start_transaction+0xd2/0x500 [btrfs]
       btrfs_dirty_inode+0x44/0xd0 [btrfs]
       file_update_time+0xc6/0x120
       btrfs_page_mkwrite+0xda/0x560 [btrfs]
       do_page_mkwrite+0x4f/0x130
       do_wp_page+0x3b0/0x4f0
       handle_mm_fault+0xf47/0x1850
       do_user_addr_fault+0x1fc/0x4b0
       exc_page_fault+0x88/0x300
       asm_exc_page_fault+0x1e/0x30

other info that might help us debug this:

Chain exists of:
  &fs_info->reloc_mutex --> &mm->mmap_lock#2 --> sb_pagefaults

Possible unsafe locking scenario:

     CPU0                    CPU1
     ----                    ----
 lock(sb_pagefaults);
                             lock(&mm->mmap_lock#2);
                             lock(sb_pagefaults);
 lock(&fs_info->reloc_mutex);

 *** DEADLOCK ***

3 locks held by systemd-journal/509:
 #0: ffff97083bdec8b8 (&mm->mmap_lock#2){++++}-{3:3}, at: do_user_addr_fault+0x12e/0x4b0
 #1: ffff97083144d598 (sb_pagefaults){.+.+}-{0:0}, at: btrfs_page_mkwrite+0x59/0x560 [btrfs]
 #2: ffff97083144d6a8 (sb_internal){.+.+}-{0:0}, at: start_transaction+0x3f8/0x500 [btrfs]

stack backtrace:
CPU: 0 PID: 509 Comm: systemd-journal Not tainted 5.8.0-0.rc3.1.fc33.x86_64+debug #1
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
 dump_stack+0x92/0xc8
 check_noncircular+0x134/0x150
 __lock_acquire+0x1241/0x20c0
 lock_acquire+0xb0/0x400
 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 ? lock_acquire+0xb0/0x400
 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 __mutex_lock+0x7b/0x820
 ? btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 ? kvm_sched_clock_read+0x14/0x30
 ? sched_clock+0x5/0x10
 ? sched_clock_cpu+0xc/0xb0
 btrfs_record_root_in_trans+0x44/0x70 [btrfs]
 start_transaction+0xd2/0x500 [btrfs]
 btrfs_dirty_inode+0x44/0xd0 [btrfs]
 file_update_time+0xc6/0x120
 btrfs_page_mkwrite+0xda/0x560 [btrfs]
 ? sched_clock+0x5/0x10
 do_page_mkwrite+0x4f/0x130
 do_wp_page+0x3b0/0x4f0
 handle_mm_fault+0xf47/0x1850
 do_user_addr_fault+0x1fc/0x4b0
 exc_page_fault+0x88/0x300
 ? asm_exc_page_fault+0x8/0x30
 asm_exc_page_fault+0x1e/0x30
RIP: 0033:0x7fa3972fdbfe
Code: Bad RIP value.

Fix this by not holding the ->device_list_mutex at this point.  The
device_list_mutex exists to protect us from modifying the device list
while the file system is running.

However it can also be modified by doing a scan on a device.  But this
action is specifically protected by the uuid_mutex, which we are holding
here.  We cannot race with opening at this point because we have the
->s_mount lock held during the mount.  Not having the
->device_list_mutex here is perfectly safe as we're not going to change
the devices at this point.

CC: [email protected] # 4.19+
Signed-off-by: Josef Bacik <[email protected]>
Reviewed-by: David Sterba <[email protected]>
[ add some comments ]
Signed-off-by: David Sterba <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
DanielOgorchock pushed a commit that referenced this pull request Sep 5, 2020
… set

[ Upstream commit 1101c87 ]

We received an error report that perf-record caused 'Segmentation fault'
on a newly system (e.g. on the new installed ubuntu).

  (gdb) backtrace
  #0  __read_once_size (size=4, res=<synthetic pointer>, p=0x14) at /root/0-jinyao/acme/tools/include/linux/compiler.h:139
  #1  atomic_read (v=0x14) at /root/0-jinyao/acme/tools/include/asm/../../arch/x86/include/asm/atomic.h:28
  #2  refcount_read (r=0x14) at /root/0-jinyao/acme/tools/include/linux/refcount.h:65
  #3  perf_mmap__read_init (map=map@entry=0x0) at mmap.c:177
  #4  0x0000561ce5c0de39 in perf_evlist__poll_thread (arg=0x561ce68584d0) at util/sideband_evlist.c:62
  #5  0x00007fad78491609 in start_thread (arg=<optimized out>) at pthread_create.c:477
  #6  0x00007fad7823c103 in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

The root cause is, evlist__add_bpf_sb_event() just returns 0 if
HAVE_LIBBPF_SUPPORT is not defined (inline function path). So it will
not create a valid evsel for side-band event.

But perf-record still creates BPF side band thread to process the
side-band event, then the error happpens.

We can reproduce this issue by removing the libelf-dev. e.g.
1. apt-get remove libelf-dev
2. perf record -a -- sleep 1

  root@test:~# ./perf record -a -- sleep 1
  perf: Segmentation fault
  Obtained 6 stack frames.
  ./perf(+0x28eee8) [0x5562d6ef6ee8]
  /lib/x86_64-linux-gnu/libc.so.6(+0x46210) [0x7fbfdc65f210]
  ./perf(+0x342e74) [0x5562d6faae74]
  ./perf(+0x257e39) [0x5562d6ebfe39]
  /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7fbfdc990609]
  /lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fbfdc73b103]
  Segmentation fault (core dumped)

To fix this issue,

1. We either install the missing libraries to let HAVE_LIBBPF_SUPPORT
   be defined.
   e.g. apt-get install libelf-dev and install other related libraries.

2. Use this patch to skip the side-band event setup if HAVE_LIBBPF_SUPPORT
   is not set.

Committer notes:

The side band thread is not used just with BPF, it is also used with
--switch-output-event, so narrow the ifdef to the BPF specific part.

Fixes: 23cbb41 ("perf record: Move side band evlist setup to separate routine")
Signed-off-by: Jin Yao <[email protected]>
Acked-by: Jiri Olsa <[email protected]>
Cc: Alexander Shishkin <[email protected]>
Cc: Andi Kleen <[email protected]>
Cc: Jin Yao <[email protected]>
Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Link: http://lore.kernel.org/lkml/[email protected]
Signed-off-by: Arnaldo Carvalho de Melo <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
@DanielOgorchock DanielOgorchock force-pushed the ogorchock branch 4 times, most recently from c4875e1 to 6ee682b Compare July 25, 2021 19:37
@nfp0
Copy link

nfp0 commented Oct 29, 2022

What benefits does this bring for the SNES controller?
Is it for battery reading and LED control?

@nadiaholmquist
Copy link
Author

Is it for battery reading and LED control?

Yes, as well as making it actually work over USB and not just wirelessly, and a consistent button mapping with other controllers supported by Linux.


I should probably look at getting my updated version of this patch that also supports the NES controllers mainlined unless someone else has done the work to support these controllers since last I looked at it, which by now is quite a long time ago.

@SuperSamus
Copy link

I saw this: https://github.com/emilyst/hid-nx-dkms.
It also supports the N64 and Genesis controllers.
(I never tried it)

@nfp0
Copy link

nfp0 commented Oct 30, 2022

@nadiaholmquist Thank you for the clarification. Yes, I would love to see your work mainlined!

@SuperSamus Interesting. I did not know that exists. It's useful for me because I also game with the Genesis controller and plan to get the N64 one (if it ever gets back in stock). This could also get mainlined.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.