-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Linux 5.17.4 #133
Linux 5.17.4 #133
Conversation
commit 5b6547e upstream. Steve reported that ChromeOS encounters the forceidle balancer being ran from rt_mutex_setprio()'s balance_callback() invocation and explodes. Now, the forceidle balancer gets queued every time the idle task gets selected, set_next_task(), which is strictly too often. rt_mutex_setprio() also uses set_next_task() in the 'change' pattern: queued = task_on_rq_queued(p); /* p->on_rq == TASK_ON_RQ_QUEUED */ running = task_current(rq, p); /* rq->curr == p */ if (queued) dequeue_task(...); if (running) put_prev_task(...); /* change task properties */ if (queued) enqueue_task(...); if (running) set_next_task(...); However, rt_mutex_setprio() will explicitly not run this pattern on the idle task (since priority boosting the idle task is quite insane). Most other 'change' pattern users are pidhash based and would also not apply to idle. Also, the change pattern doesn't contain a __balance_callback() invocation and hence we could have an out-of-band balance-callback, which *should* trigger the WARN in rq_pin_lock() (which guards against this exact anti-pattern). So while none of that explains how this happens, it does indicate that having it in set_next_task() might not be the most robust option. Instead, explicitly queue the forceidle balancer from pick_next_task() when it does indeed result in forceidle selection. Having it here, ensures it can only be triggered under the __schedule() rq->lock instance, and hence must be ran from that context. This also happens to clean up the code a little, so win-win. Fixes: d2dfa17 ("sched: Trivial forced-newidle balancer") Reported-by: Steven Rostedt <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Tested-by: T.J. Alumbaugh <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 386ef21 upstream. try_steal_cookie() looks at task_struct::cpus_mask to decide if the task could be moved to `this' CPU. It ignores that the task might be in a migration disabled section while not on the CPU. In this case the task must not be moved otherwise per-CPU assumption are broken. Use is_cpu_allowed(), as suggested by Peter Zijlstra, to decide if the a task can be moved. Fixes: d2dfa17 ("sched: Trivial forced-newidle balancer") Signed-off-by: Sebastian Andrzej Siewior <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 1cd5f05 upstream. Paolo reported that the instruction sequence that is used to replace: call __static_call_return0 namely: 66 66 48 31 c0 data16 data16 xor %rax,%rax decodes to something else on i386, namely: 66 66 48 data16 dec %ax 31 c0 xor %eax,%eax Which is a nonsensical sequence that happens to have the same outcome. *However* an important distinction is that it consists of 2 instructions which is a problem when the thing needs to be overwriten with a regular call instruction again. As such, replace the instruction with something that decodes the same on both i386 and x86_64. Fixes: 3f2a8fc ("static_call/x86: Add __static_call_return0()") Reported-by: Paolo Bonzini <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Link: https://lkml.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 334865b upstream. Bernardo reported an error that Nathan bisected down to (x86_64) defconfig+LTO_CLANG_FULL+X86_PMEM_LEGACY. LTO vmlinux.o ld.lld: error: <instantiation>:1:13: redefinition of 'found' .set found, 0 ^ <inline asm>:29:1: while in macro instantiation extable_type_reg reg=%eax, type=(17 | ((0) << 16)) ^ This appears to be another LTO specific issue similar to what was folded into commit 4b5305d ("x86/extable: Extend extable functionality"), where the `.set found, 0` in DEFINE_EXTABLE_TYPE_REG in arch/x86/include/asm/asm.h conflicts with the symbol for the static function `found` in arch/x86/kernel/pmem.c. Assembler .set directive declare symbols with global visibility, so the assembler may not rename such symbols in the event of a conflict. LTO could rename static functions if there was a conflict in C sources, but it cannot see into symbols defined in inline asm. The symbols are also retained in the symbol table, regardless of LTO. Give the symbols .L prefixes making them locally visible, so that they may be renamed for LTO to avoid conflicts, and to drop them from the symbol table regardless of LTO. Fixes: 4b5305d ("x86/extable: Extend extable functionality") Reported-by: Bernardo Meurer Costa <[email protected]> Debugged-by: Nathan Chancellor <[email protected]> Signed-off-by: Nick Desaulniers <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Reviewed-by: Nathan Chancellor <[email protected]> Tested-by: Nathan Chancellor <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
…duling commit af27e41 upstream. The way KVM drives GICv4.{0,1} is as follows: - vcpu_load() makes the VPE resident, instructing the RD to start scanning for interrupts - just before entering the guest, we check that the RD has finished scanning and that we can start running the vcpu - on preemption, we deschedule the VPE by making it invalid on the RD However, we are preemptible between the first two steps. If it so happens *and* that the RD was still scanning, we nonetheless write to the GICR_VPENDBASER register while Dirty is set, and bad things happen (we're in UNPRED land). This affects both the 4.0 and 4.1 implementations. Make sure Dirty is cleared before performing the deschedule, meaning that its_clear_vpend_valid() becomes a sort of full VPE residency barrier. Reported-by: Jingyi Wang <[email protected]> Tested-by: Nianyao Tang <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Fixes: 57e3ceb ("KVM: arm64: Delay the polling of the GICR_VPENDBASER.Dirty bit") Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit af41d28 upstream. Using conditional branches between two files is hasardous, they may get linked too far from each other. arch/powerpc/kvm/book3s_64_entry.o:(.text+0x3ec): relocation truncated to fit: R_PPC64_REL14 (stub) against symbol `system_reset_common' defined in .text section in arch/powerpc/kernel/head_64.o Reorganise the code to use non conditional branches. Fixes: 89d35b2 ("KVM: PPC: Book3S HV P9: Implement the rest of the P9 path in C") Signed-off-by: Christophe Leroy <[email protected]> [mpe: Avoid odd-looking bne ., use named local labels] Signed-off-by: Michael Ellerman <[email protected]> Link: https://lore.kernel.org/r/89cf27bf43ee07a0b2879b9e8e2f5cd6386a3645.1648366338.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 544808f upstream. At the moment the GIC IRQ domain translation routine happily converts ACPI table GSI numbers below 16 to GIC SGIs (Software Generated Interrupts aka IPIs). On the Devicetree side we explicitly forbid this translation, actually the function will never return HWIRQs below 16 when using a DT based domain translation. We expect SGIs to be handled in the first part of the function, and any further occurrence should be treated as a firmware bug, so add a check and print to report this explicitly and avoid lengthy debug sessions. Fixes: 64b499d ("irqchip/gic-v3: Configure SGIs as standard interrupts") Signed-off-by: Andre Przywara <[email protected]> Signed-off-by: Marc Zyngier <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit a431dbb upstream. The gcc 12 compiler reports a "'mem_section' will never be NULL" warning on the following code: static inline struct mem_section *__nr_to_section(unsigned long nr) { #ifdef CONFIG_SPARSEMEM_EXTREME if (!mem_section) return NULL; #endif if (!mem_section[SECTION_NR_TO_ROOT(nr)]) return NULL; : It happens with CONFIG_SPARSEMEM_EXTREME off. The mem_section definition is #ifdef CONFIG_SPARSEMEM_EXTREME extern struct mem_section **mem_section; #else extern struct mem_section mem_section[NR_SECTION_ROOTS][SECTIONS_PER_ROOT]; #endif In the !CONFIG_SPARSEMEM_EXTREME case, mem_section is a static 2-dimensional array and so the check "!mem_section[SECTION_NR_TO_ROOT(nr)]" doesn't make sense. Fix this warning by moving the "!mem_section[SECTION_NR_TO_ROOT(nr)]" check up inside the CONFIG_SPARSEMEM_EXTREME block and adding an explicit NR_SECTION_ROOTS check to make sure that there is no out-of-bound array access. Link: https://lkml.kernel.org/r/[email protected] Fixes: 3e34726 ("sparsemem extreme implementation") Signed-off-by: Waiman Long <[email protected]> Reported-by: Justin Forbes <[email protected]> Cc: "Kirill A . Shutemov" <[email protected]> Cc: Ingo Molnar <[email protected]> Cc: Rafael Aquini <[email protected]> Signed-off-by: Andrew Morton <[email protected]> Signed-off-by: Linus Torvalds <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 8fd4ddd upstream. System.map shows that vmlinux contains several instances of __static_call_return0(): c0004fc0 t __static_call_return0 c0011518 t __static_call_return0 c00d8160 t __static_call_return0 arch_static_call_transform() uses the middle one to check whether we are setting a call to __static_call_return0 or not: c0011520 <arch_static_call_transform>: c0011520: 3d 20 c0 01 lis r9,-16383 <== r9 = 0xc001 << 16 c0011524: 39 29 15 18 addi r9,r9,5400 <== r9 += 0x1518 c0011528: 7c 05 48 00 cmpw r5,r9 <== r9 has value 0xc0011518 here So if static_call_update() is called with one of the other instances of __static_call_return0(), arch_static_call_transform() won't recognise it. In order to work properly, global single instance of __static_call_return0() is required. Fixes: 3f2a8fc ("static_call/x86: Add __static_call_return0()") Signed-off-by: Christophe Leroy <[email protected]> Signed-off-by: Peter Zijlstra (Intel) <[email protected]> Acked-by: Josh Poimboeuf <[email protected]> Link: https://lkml.kernel.org/r/30821468a0e7d28251954b578e5051dc09300d04.1647258493.git.christophe.leroy@csgroup.eu Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 584b018 upstream. In preparation for not necessarily having a file assigned at prep time, defer any initialization associated with the file to when the opcode handler is run. Cc: [email protected] # v5.15+ Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 5106dd6 upstream. We'll need this in a future patch, when we could be assigning the file after the prep stage. While at it, get rid of the io_file_get() helper, it just makes the code harder to read. Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 6bf9c47 upstream. If an application uses direct open or accept, it knows in advance what direct descriptor value it will get as it picks it itself. This allows combined requests such as: sqe = io_uring_get_sqe(ring); io_uring_prep_openat_direct(sqe, ..., file_slot); sqe->flags |= IOSQE_IO_LINK | IOSQE_CQE_SKIP_SUCCESS; sqe = io_uring_get_sqe(ring); io_uring_prep_read(sqe,file_slot, buf, buf_size, 0); sqe->flags |= IOSQE_FIXED_FILE; io_uring_submit(ring); where we prepare both a file open and read, and only get a completion event for the read when both have completed successfully. Currently links are fully prepared before the head is issued, but that fails if the dependent link needs a file assigned that isn't valid until the head has completed. Conversely, if the same chain is performed but the fixed file slot is already valid, then we would be unexpectedly returning data from the old file slot rather than the newly opened one. Make sure we're consistent here. Allow deferral of file setup, which makes this documented case work. Cc: [email protected] # v5.15+ Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit d536123 upstream. io_uring tracks requests that are referencing an io_uring descriptor to be able to cancel without worrying about loops in the references. Since we now assign the file at execution time, the easier approach is to drop a potentially problematic reference before we punt the request. This eliminates the need to special case these types of files beyond just marking them as such, and simplifies cancelation quite a bit. This also fixes a recent issue where an async punted tee operation would with the io_uring descriptor as the output file would crash when attempting to get a reference to the file from the io-wq worker. We could have worked around that, but this is the much cleaner fix. Fixes: 6bf9c47 ("io_uring: defer file assignment") Reported-by: [email protected] Signed-off-by: Jens Axboe <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
Link: https://lore.kernel.org/r/[email protected] Tested-by: Ronald Warsow <[email protected]> Tested-by: Ron Economos <[email protected]> Tested-by: Justin M. Forbes <[email protected]> Tested-by: Shuah Khan <[email protected]> Tested-by: Fox Chen <[email protected]> Tested-by: Zan Aziz <[email protected]> Tested-by: Guenter Roeck <[email protected]> Tested-by: Linux Kernel Functional Testing <[email protected]> Tested-by: Jiri Slaby <[email protected]> Tested-by: Rudi Heitbaum <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit e7031d8 upstream. [Why] To debug when p-state is being blocked and avoid PMFW hangs when it does occur. [How] Re-use the DCN10 hardware sequencer by adding a new interface for verifying p-state high on the hubbub. The interface is mostly the same as the DCN10 interface, but the bit definitions have changed for the debug bus. Signed-off-by: Nicholas Kazlauskas <[email protected]> Reviewed-by: Eric Yang <[email protected]> Reviewed-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 3107e1a upstream. [Why] It changed since dcn30 but the hubbub31 constructor hasn't been modified to reflect this. [How] Update the value in the constructor to 0x6 so we're checking the right bits for p-state allow. It worked before by accident, but can falsely assert 0 depending on HW state transitions. The most frequent of which appears to be when all pipes turn off during IGT tests. Cc: Harry Wentland <[email protected]> Fixes: e7031d8 ("drm/amd/display: Add pstate verification and recovery for DCN31") Signed-off-by: Nicholas Kazlauskas <[email protected]> Reviewed-by: Eric Yang <[email protected]> Acked-by: Harry Wentland <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 01f6c73 upstream. Currently the first thing checked is whether the PCSI cpu_suspend function has been initialized. Another change will be overloading `acpi_processor_ffh_lpi_probe` and calling it sooner. So make the `has_lpi` check the first thing checked to prepare for that change. Reviewed-by: Sudeep Holla <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit eb087f3 upstream. When `osc_pc_lpi_support_confirmed` is set through `_OSC` and `_LPI` is populated then the cpuidle driver assumes that LPI is fully functional. However currently the kernel only provides architectural support for LPI on ARM. This leads to high power consumption on X86 platforms that otherwise try to enable LPI. So probe whether or not LPI support is implemented before enabling LPI in the kernel. This is done by overloading `acpi_processor_ffh_lpi_probe` to check whether it returns `-EOPNOTSUPP`. It also means that all future implementations of `acpi_processor_ffh_lpi_probe` will need to follow these semantics as well. Reviewed-by: Sudeep Holla <[email protected]> Signed-off-by: Mario Limonciello <[email protected]> Signed-off-by: Rafael J. Wysocki <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 907e772 upstream. Currently there is no way for Realtek DSA subdrivers to serialize consecutive regmap accesses. In preparation for a bugfix relating to indirect PHY register access - which involves a series of regmap reads and writes - add a facility for subdrivers to serialize their regmap access. Specifically, a mutex is added to the driver private data structure and the standard regmap is initialized with custom lock/unlock ops which use this mutex. Then, a "nolock" variant of the regmap is added, which is functionally equivalent to the existing regmap except that regmap locking is disabled. Functions that wish to serialize a sequence of regmap accesses may then lock the newly introduced driver-owned mutex before using the nolock regmap. Doing things this way means that subdriver code that doesn't care about serialized register access - i.e. the vast majority of code - needn't worry about synchronizing register access with an external lock: it can just continue to use the original regmap. Another advantage of this design is that, while regmaps with locking disabled do not expose a debugfs interface for obvious reasons, there still exists the original regmap which does expose this interface. This interface remains safe to use even combined with driver codepaths that use the nolock regmap, because said codepaths will use the same mutex to synchronize access. With respect to disadvantages, it can be argued that having near-duplicate regmaps is confusing. However, the naming is rather explicit, and examples will abound. Finally, while we are at it, rename realtek_smi_mdio_regmap_config to realtek_smi_regmap_config. This makes it consistent with the naming realtek_mdio_regmap_config in realtek-mdio.c. Signed-off-by: Alvin Šipraga <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Signed-off-by: David S. Miller <[email protected]> [alsi: backport to 5.16: s/priv/smi/g and remove realtek-mdio changes] Signed-off-by: Alvin Šipraga <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 2796728 upstream. Realtek switches in the rtl8365mb family can access the PHY registers of the internal PHYs via the switch registers. This method is called indirect access. At a high level, the indirect PHY register access method involves reading and writing some special switch registers in a particular sequence. This works for both SMI and MDIO connected switches. Currently the rtl8365mb driver does not take any care to serialize the aforementioned access to the switch registers. In particular, it is permitted for other driver code to access other switch registers while the indirect PHY register access is ongoing. Locking is only done at the regmap level. This, however, is a bug: concurrent register access, even to unrelated switch registers, risks corrupting the PHY register value read back via the indirect access method described above. Arınç reported that the switch sometimes returns nonsense data when reading the PHY registers. In particular, a value of 0 causes the kernel's PHY subsystem to think that the link is down, but since most reads return correct data, the link then flip-flops between up and down over a period of time. The aforementioned bug can be readily observed by: 1. Enabling ftrace events for regmap and mdio 2. Polling BSMR PHY register for a connected port; it should always read the same (e.g. 0x79ed) 3. Wait for step 2 to give a different value Example command for step 2: while true; do phytool read swp2/2/0x01; done On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY register within a matter of seconds. The interleaved register access it then evident in the trace log: kworker/3:4-70 [003] ....... 1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd phytool-16816 [002] ....... 1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0 kworker/3:4-70 [003] ....... 1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0 phytool-16816 [002] ....... 1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69 kworker/3:4-70 [003] ....... 1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0 phytool-16816 [002] ....... 1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041 kworker/3:4-70 [003] ....... 1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0 phytool-16816 [002] ....... 1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1 kworker/3:4-70 [003] ....... 1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be phytool-16816 [002] ....... 1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0 kworker/3:4-70 [003] ....... 1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0 phytool-16816 [002] ....... 1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0 phytool-16816 [002] ....... 1927.142641: mdio_access: SMI-0 read phy:0x02 reg:0x01 val:0x0000 <- ?! kworker/3:4-70 [003] ....... 1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0 kworker/3:4-70 [003] ....... 1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89 kworker/3:4-70 [003] ....... 1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be kworker/3:4-70 [003] ....... 1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0 kworker/3:4-70 [003] ....... 1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0 kworker/3:4-70 [003] ....... 1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6 The kworker here is polling MIB counters for stats, as evidenced by the register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This polling is performed every 3 seconds, but is just one example of such unsynchronized access. In Arınç's case, the driver was not using the switch IRQ, so the PHY subsystem was itself doing polling analogous to phytool in the above example. A test module was created [see second Link] to simulate such spurious switch register accesses while performing indirect PHY register reads and writes. Realtek was also consulted to confirm whether this is a known issue or not. The conclusion of these lines of inquiry is as follows: 1. Reading of PHY registers via indirect access will be aborted if, after executing the read operation (via a write to the INDIRECT_ACCESS_CTRL_REG), any register is accessed, other than INDIRECT_ACCESS_STATUS_REG. 2. The PHY register indirect read is only complete when INDIRECT_ACCESS_STATUS_REG reads zero. 3. The INDIRECT_ACCESS_DATA_REG, which is read to get the result of the PHY read, will contain the result of the last successful read operation. If there was spurious register access and the indirect read was aborted, then this register is not guaranteed to hold anything meaningful and the PHY read will silently fail. 4. PHY writes do not appear to be affected by this mechanism. 5. Other similar access routines, such as for MIB counters, although similar to the PHY indirect access method, are actually table access. Table access is not affected by spurious reads or writes of other registers. However, concurrent table access is not allowed. Currently this is protected via mib_lock, so there is nothing to fix. The above statements are corroborated both via the test module and through consultation with Realtek. In particular, Realtek states that this is simply a property of the hardware design and is not a hardware bug. To fix this problem, one must guard against regmap access while the PHY indirect register read is executing. Fix this by using the newly introduced "nolock" regmap in all PHY-related functions, and by aquiring the regmap mutex at the top level of the PHY register access callbacks. Although no issue has been observed with PHY register _writes_, this change also serializes the indirect access method there. This is done purely as a matter of convenience and for reasons of symmetry. Fixes: 4af2950 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC") Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7LXKofQ@mail.gmail.com/ Link: https://lore.kernel.org/netdev/[email protected]/ Reported-by: Arınç ÜNAL <[email protected]> Reported-by: Luiz Angelo Daros de Luca <[email protected]> Signed-off-by: Alvin Šipraga <[email protected]> Reviewed-by: Vladimir Oltean <[email protected]> Reviewed-by: Linus Walleij <[email protected]> Signed-off-by: David S. Miller <[email protected]> [alsi: backport to 5.16: s/priv/smi/g] Cc: [email protected] # v5.16+ Signed-off-by: Alvin Šipraga <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 109d899 upstream. The kernel test robot reported build warnings with a randconfig that built realtek-{smi,mdio} without CONFIG_OF set. Since both interface drivers are using OF and will not probe without, add the corresponding dependency to Kconfig. Link: https://lore.kernel.org/all/[email protected]/ Link: https://lore.kernel.org/all/[email protected]/ Fixes: aac9400 ("net: dsa: realtek: add new mdio interface for drivers") Fixes: 765c39a ("net: dsa: realtek: convert subdrivers into modules") Signed-off-by: Alvin Šipraga <[email protected]> Reviewed-by: Andrew Lunn <[email protected]> Acked-by: Luiz Angelo Daros de Luca <[email protected]> Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Jakub Kicinski <[email protected]> [alsi: backport to 5.16: remove mdio part] Cc: [email protected] # v5.16+ Signed-off-by: Alvin Šipraga <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit ad3fc79 upstream. After commit 92082d4 ("btrfs: integrate page status update for data read path into begin/end_page_read"), the 'nr' counter at btrfs_do_readpage() is no longer used, we increment it but we never read from it. So just remove it. Reviewed-by: Johannes Thumshirn <[email protected]> Signed-off-by: Filipe Manana <[email protected]> Signed-off-by: David Sterba <[email protected]> Cc: Nathan Chancellor <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…ps() commit 6d4a6b5 upstream. Clang's version of -Wunused-but-set-variable recently gained support for unary operations, which reveals two unused variables: fs/btrfs/block-group.c:2949:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable] int num_started = 0; ^ fs/btrfs/block-group.c:3116:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable] int num_started = 0; ^ 2 errors generated. These variables appear to be unused from their introduction, so just remove them to silence the warnings. Fixes: c9dc4c6 ("Btrfs: two stage dirty block group writeout") Fixes: 1bbc621 ("Btrfs: allow block group cache writeout outside critical section in commit") CC: [email protected] # 5.4+ Link: ClangBuiltLinux#1614 Signed-off-by: Nathan Chancellor <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 8c3ce49 upstream. We might have RISC-V systems (such as QEMU) where VMID is not part of the TLB entry tag so these systems will have to flush all TLB entries upon any change in hgatp.VMID. Currently, we zero-out hgatp CSR in kvm_arch_vcpu_put() and we re-program hgatp CSR in kvm_arch_vcpu_load(). For above described systems, this will flush all TLB entries whenever VCPU exits to user-space hence reducing performance. This patch fixes above described performance issue by not clearing hgatp CSR in kvm_arch_vcpu_put(). Fixes: 34bde9d ("RISC-V: KVM: Implement VCPU world-switch") Cc: [email protected] Signed-off-by: Anup Patel <[email protected]> Signed-off-by: Anup Patel <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 3ae87d2 upstream. Fix firmware file names assignment in si2157 tuner, allow for running devices without firmware files needed. modprobe gives error: unknown chip version Si2147-A30 ROM 0x50 Device initialization is interrupted. Caused by: 1. table si2157_tuners has swapped fields rom_id and required vs struct si2157_tuner_info. 2. both firmware file names can be null for devices with required == false - device uses build-in firmware in this case Tested on this device: m07ca:1871 AVerMedia Technologies, Inc. TD310 DVB-T/T2/C dongle [mchehab: fix mangled patch] Link: https://bugzilla.kernel.org/show_bug.cgi?id=215726 Link: https://lore.kernel.org/lkml/[email protected]/ Link: https://lore.kernel.org/linux-media/[email protected] Fixes: 1c35ba3 ("media: si2157: use a different namespace for firmware") Cc: [email protected] # 5.17.x Signed-off-by: Piotr Chmura <[email protected]> Tested-by: Robert Schlabbach <[email protected]> Signed-off-by: Mauro Carvalho Chehab <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 55037ed upstream. Add include guard wrapper define to uapi/linux/stddef.h to prevent macro redefinition errors when stddef.h is included more than once. This was not needed before since the only contents already used a redefinition test. Signed-off-by: Tadeusz Struk <[email protected]> Link: https://lore.kernel.org/r/[email protected] Fixes: 50d7bd3 ("stddef: Introduce struct_group() helper macro") Cc: [email protected] Signed-off-by: Kees Cook <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 887f75c upstream. DP/HDMI audio on AMD PRO VII stops working after S3: [ 149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset [ 149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset [ 149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset [ 149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot [ 150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot ... [ 155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535 The offending commit is daf8de0 ("drm/amdgpu: always reset the asic in suspend (v2)"). Commit 34452ac ("drm/amdgpu: don't use BACO for reset in S3 ") doesn't help, so the issue is something different. Assuming that to make HDA resume to D0 fully realized, it needs to be successfully put to D3 first. And this guesswork proves working, by moving amdgpu_asic_reset() to noirq callback, so it's called after HDA function is in D3. Fixes: daf8de0 ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Kai-Heng Feng <[email protected]> Signed-off-by: Alex Deucher <[email protected]> Cc: [email protected] Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 6d82ad1 upstream. Running generic/406 causes the following WARNING in btrfs_destroy_inode() which tells there are outstanding extents left. In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding extents with btrfs_delalloc_reserve_metadata() (or indirectly from btrfs_delalloc_reserve_space(()). We then release the outstanding extents with btrfs_delalloc_release_extents(). However, the "len" can be modified in the COW case, which releases fewer outstanding extents than expected. Fix it by calling btrfs_delalloc_release_extents() for the original length. To reproduce the warning, the filesystem should be 1 GiB. It's triggering a short-write, due to not being able to allocate a large extent and instead allocating a smaller one. WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs] Modules linked in: btrfs blake2b_generic xor lzo_compress lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram zsmalloc CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014 RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs] RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000 RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78 RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8 R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800 R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68 FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> destroy_inode+0x33/0x70 dispose_list+0x43/0x60 evict_inodes+0x161/0x1b0 generic_shutdown_super+0x2d/0x110 kill_anon_super+0xf/0x20 btrfs_kill_super+0xd/0x20 [btrfs] deactivate_locked_super+0x27/0x90 cleanup_mnt+0x12c/0x180 task_work_run+0x54/0x80 exit_to_user_mode_prepare+0x152/0x160 syscall_exit_to_user_mode+0x12/0x30 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f854a000fb7 Fixes: f0bfa76 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range") CC: [email protected] # 5.17 Reviewed-by: Johannes Thumshirn <[email protected]> Tested-by: Johannes Thumshirn <[email protected]> Reviewed-by: Filipe Manana <[email protected]> Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit acee08a upstream. This restores the logic from commit 46bcff2 ("btrfs: fix compressed write bio blkcg attribution") which added cgroup attribution to btrfs writeback. It also adds back the REQ_CGROUP_PUNT flag for these ios. Fixes: 9150724 ("btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write") CC: [email protected] # 5.16+ Signed-off-by: Dennis Zhou <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
commit 820c363 upstream. Return the allocated block group from do_chunk_alloc(). This is a preparation patch for the next patch. CC: [email protected] # 5.16+ Reviewed-by: Johannes Thumshirn <[email protected]> Tested-by: Johannes Thumshirn <[email protected]> Signed-off-by: Naohiro Aota <[email protected]> Signed-off-by: David Sterba <[email protected]> Signed-off-by: Greg Kroah-Hartman <[email protected]>
…02204200842 Signed-off-by: Jeremy Soller <[email protected]>
Looks like Aquantia ethernet on my desktop is not causing any problems with this kernel 🎉 |
Kernel Releases Tests
|
Kernel Releases Tests
|
As usual, please test NVIDIA, VirtualBox, and ZFS DKMS packages |
It was requested that we test that bluetooth audio is working. |
Tested with kudu6; zfs, nvidia, and virtualbox worked fine and DKMS packages updated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The darp6 won't connect to two sets of bluetooth headphones, but will connect to and plays through a bluetooth speaker.
Update
This may specific to the darp6 as both headphones and the speaker connected and played fine on the gaze15.
Seems to be a darp6 problem, not a regression with this kernel
kudu6 is not resuming from suspend with power button; only way to wake is by closing and opening the lid |
Is this a regression? |
Yes it is a regression. On clean install with 5.16.19 resume worked with the power button on the kudu6. Tested on pang11 and resume worked with both kernels so may be unique to the kudu |
@linuxgnuru Did you have the 510.68.02 driver installed as well when this suspend issues happened? |
Yes the 510 driver was installed but I was in integrated mode |
Thanks for the info. |
Which 510 driver? If you are testing this change, it should be the one we have released, not one from another PR. |
Sorry, it has the 510.60.02 from a clean install |
Welp, since we have now released a newer NVIDIA driver, it might be worth trying it out |
Obsolete, see #135. I'd recommend testing the kudu6 issue first. |
[ Upstream commit 7f74563 ] LE Create CIS command shall not be sent before all CIS Established events from its previous invocation have been processed. Currently it is sent via hci_sync but that only waits for the first event, but there can be multiple. Make it wait for all events, and simplify the CIS creation as follows: Add new flag HCI_CONN_CREATE_CIS, which is set if Create CIS has been sent for the connection but it is not yet completed. Make BT_CONNECT state to mean the connection wants Create CIS. On events after which new Create CIS may need to be sent, send it if possible and some connections need it. These events are: hci_connect_cis, iso_connect_cfm, hci_cs_le_create_cis, hci_le_cis_estabilished_evt. The Create CIS status/completion events shall queue new Create CIS only if at least one of the connections transitions away from BT_CONNECT, so that we don't loop if controller is sending bogus events. This fixes sending multiple CIS Create for the same CIS in the "ISO AC 6(i) - Success" BlueZ test case: < HCI Command: LE Create Co.. (0x08|0x0064) plen 9 #129 [hci0] Number of CIS: 2 CIS Handle: 257 ACL Handle: 42 CIS Handle: 258 ACL Handle: 42 > HCI Event: Command Status (0x0f) plen 4 #130 [hci0] LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1 Status: Success (0x00) > HCI Event: LE Meta Event (0x3e) plen 29 #131 [hci0] LE Connected Isochronous Stream Established (0x19) Status: Success (0x00) Connection Handle: 257 ... < HCI Command: LE Setup Is.. (0x08|0x006e) plen 13 #132 [hci0] ... > HCI Event: Command Complete (0x0e) plen 6 #133 [hci0] LE Setup Isochronous Data Path (0x08|0x006e) ncmd 1 ... < HCI Command: LE Create Co.. (0x08|0x0064) plen 5 #134 [hci0] Number of CIS: 1 CIS Handle: 258 ACL Handle: 42 > HCI Event: Command Status (0x0f) plen 4 #135 [hci0] LE Create Connected Isochronous Stream (0x08|0x0064) ncmd 1 Status: ACL Connection Already Exists (0x0b) > HCI Event: LE Meta Event (0x3e) plen 29 #136 [hci0] LE Connected Isochronous Stream Established (0x19) Status: Success (0x00) Connection Handle: 258 ... Fixes: c09b80b ("Bluetooth: hci_conn: Fix not waiting for HCI_EVT_LE_CIS_ESTABLISHED") Signed-off-by: Pauli Virtanen <[email protected]> Signed-off-by: Luiz Augusto von Dentz <[email protected]> Signed-off-by: Sasha Levin <[email protected]>
https://cdn.kernel.org/pub/linux/kernel/v5.x/ChangeLog-5.17.4