Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/sof es8336 quirk #26

Closed
wants to merge 21 commits into from
Closed

Fix/sof es8336 quirk #26

wants to merge 21 commits into from

Conversation

yangxiaohua2009
Copy link

@yangxiaohua2009 yangxiaohua2009 commented Feb 18, 2022

The reason why ES8326 codec driver cannot be probed is that there is something wrong with the ACPI so that every module related to it can not be probed, for example the i2c_hid_acpi module. The i2c bus driver is successfully probed though.

Anyway this should be a harmless update which adds support for ES8326 codec and can be tested by the users.

ASoC: codecs: add support for ES8326

The ES8326 codec is not compatible with ES8316 and requires a dedicated driver.

Signed-off-by: David Yang [email protected]

plbossart and others added 19 commits February 11, 2022 12:56
Add missing dmic_num mention and clarify that 'links' mean 'SoundWire
links', not to be used for other links.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
The platform driver may have information on which I2S/TDM link(s) to
enable in the machine driver. In the case of Intel devices, this may
be extracted from NHLT tables in platform firmware. This link
information is necessary to make sure machine driver and topology are
aligned.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
We currently extract the DMIC number only for HDaudio or SoundWire
platforms. For I2S/TDM platforms, this wasn't necessary until now, but
with devices with ES8336 we need to find a solution to detect dmics
more reliably than with a DMI quirk.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
The NHLT information can be used to figure out which SSPs are enabled
in a platform.

The 'SSP' link type is too broad for machine drivers, since it can
cover the Bluetooth sideband and the analog audio codec connections,
so this helper exposes a parameter to filter with the device
type (DEVICE_I2S refers to analog audio codec in NHLT parlance).

The helper returns a mask, since more than one SSP may be used for
analog audio, e.g. the NHLT spec describes the use of SSP0 for
amplifiers and SSP1 for headset codec. Note that if more than one bit
is set, it's impossible to determine which SSP is connected to what
external component. Additional platform-specific information based on
e.g. DMI quirks would still be required in the machine driver to
configure the relevant dailinks.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
For devices designed for Windows, the SSP information should be listed
in the NHLT, and when present can be used to set quirks automatically
in the machine driver.

The NHLT information exposes BT and analog audio connections
separately, for now we are only interested in the analog audio parts.

The use of dev_info() for the SSP mask is intentional so that we can
immediately flag devices with an ES8336 codec. Since NHLT is not used
for recent Chromebooks these messages should be rare.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Different topology filenames may be required depending on which SSP is
used, and whether or not digital mics are present.

This patch adds a tplg_quirk_mask and in the case of the SOF driver
adds the relevant configurations.

This is a short-term solution to the ES8336 support issues.

In a long-term solution, we would need an interface where the machine
driver or platform driver have the ability to alter the topology
hard-coded low-level hardware support, e.g. by substituting an
interface for another, or disabling an interface that is not supported
on a given skew.

BugLink: thesofproject#3248
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Do not fail if the GPIO used for speakers is not present, at least the
headphone, headset and internal mics should work.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
…arch

We have an existing 'adev' handle from which we can find the codec
device, no need for an I2C bus search.

This change aligns this driver will all other I2S-based machine
drivers.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Since we see a proliferation of devices with various configurations,
we want to automatically set the DMIC and SSP information. This patch
removes existing DMI quirks to rely on the information extracted from
NHLT.

Note that NHLT can report multiple SSPs, choosing from the
ssp_link_mask in an MSB-first manner was found experimentally to work
fine.

The only thing that cannot be detected is the GPIO type, and users may
want to use the quirk override parameter if the 'wrong' solution is
provided.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
We only saw ESSX8336 so far, but now with reports of 'ESSX8326' we
need to expand to a full list. Let's reuse the 'snd_soc_acpi_codecs'
structure to store the information.

Reported-by: anthony tonitch <[email protected]>
Signed-off-by: Pierre-Louis Bossart <[email protected]>
We only saw ESSX8336 so far, but now with reports of 'ESSX8326' we
need to expand to a full list. Let's reuse the 'snd_soc_acpi_codecs'
structure to store the information.

Note that ES8326 will need a dedicated codec driver, but the plan is
to use the same machine driver for all Everest Audio devices.

Reported-by: anthony tonitch <[email protected]>
Signed-off-by: Pierre-Louis Bossart <[email protected]>
The mclk_id is specified in the topology. 95% of devices typically use
the MCLK_ID 0. In rare cases, some Chromebooks use MCLK_ID 1.

For the ES8336-based platforms, the same ratio applies but we have no
information from NHLT on what the mclk_id is. It might be possible to
fetch an NHLT blob and reverse-engineer the registers, but that's a
lot of work for limited benefits.

This patch suggests a kernel parameter instead, where users can try to
override the default mclk_id, and it saves us the hassle of
maintaining yet another permutation for these ES8336 platforms.

options snd_sof sof_ssp_mclk_id=1

Signed-off-by: Pierre-Louis Bossart <[email protected]>
We only logged the SSP quirk, make sure the GPIO and DMIC quirks are
exposed.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Additional code was added and the comment on the speaker GPIO needs to
be moved before we actually try to get the GPIO.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
The codec driver exposes a set of properties that can be set from the
machine driver - as done in bytcht_es8316.c

Start by adding the JD_INVERTED quirk.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
Additional code was added and the comment on the speaker GPIO needs to
be moved before we actually try to get the GPIO.

Signed-off-by: Pierre-Louis Bossart <[email protected]>
The ES8326 requires a different codec driver than ES8316/8336, fixup
the codec name and dai name depending on the ACPI _HID exposed in the
DSDT.

Also add the select in Kconfig

Signed-off-by: Pierre-Louis Bossart <[email protected]>
We're missing this check for the CNL PCI id

Reported-by: Nikolai Kostrigin <[email protected]>
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Add the support for all ES83x6 devices

Signed-off-by: Nikolai Kostrigin <[email protected]>
Signed-off-by: Pierre-Louis Bossart <[email protected]>
Copy link
Owner

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't even being to review this unless I see a Signed-off-by tag.

How difficult is this to add this to the commit message:

ASoC: codecs: add support for ES8326

The ES8326 codec is not compatible with ES8316 and requires a dedicated driver.

Signed-off-by: David Yang <[email protected]>

@yangxiaohua2009
Copy link
Author

edited

Copy link
Owner

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good start. You have a lot of style issues, please run checkpatch and fix the indentation/style issues. The more important problem is how properties are handled, they need to be read in the component probe so that the machine driver can change the values of the defaults.

sound/soc/codecs/es8326.c Show resolved Hide resolved
sound/soc/codecs/es8326.c Outdated Show resolved Hide resolved
sound/soc/codecs/es8326.c Outdated Show resolved Hide resolved
sound/soc/codecs/es8326.c Show resolved Hide resolved
sound/soc/codecs/es8326.c Outdated Show resolved Hide resolved
dev_dbg(&i2c->dev, "mclk-rate return %d", ret);
es8326->mclk_rate = 19200000;
}
dev_dbg(&i2c->dev, "mclk-rate %u",es8326->mclk_rate);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this can work, we need to read the properties in the component probe if you want the machine driver to modify them. This is how it works for es8316.c

In addition, please use device_property instead of of_property so that it also works for ACPI platforms.

dev_dbg(&i2c->dev, "mclk-rate return %d", ret);
es8326->mclk_rate = 19200000;
}
dev_dbg(&i2c->dev, "mclk-rate %u",es8326->mclk_rate);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about jack detect inversion, is this not supported?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

defined as ES8326_HP_DET_JACK_POL. Is a device properity named jd_inverted necessary?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, if you want to use this codec driver with the same machine driver, then the device properties and quirks have to be supported for all codec drivers. I don't want to special case this es8326 driver for property handling.

dev_dbg(&i2c->dev, "mclk-rate return %d", ret);
es8326->mclk_rate = 19200000;
}
dev_dbg(&i2c->dev, "mclk-rate %u",es8326->mclk_rate);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you will also need DT bindings for all these properties.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These properities are related to hp detection and mic source. The dapm can handle them, but a custom UCM would be needed.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think you got my point. a DT binding is REQUIRED for upstream. not my decision, it's just the way things are.

If you don't provide this DT binding then this code will NEVER land upstream.

sound/soc/codecs/es8326.h Outdated Show resolved Hide resolved
sound/soc/codecs/es8326.c Show resolved Hide resolved
/*
* Note that this should be called from init rather than from hw_params.
*/
static int es8326_set_dai_sysclk(struct snd_soc_dai *codec_dai,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can I get rid of the constraints? I tried to remove them but then I see a kernel panic

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that's your code, I don't know what this does.

@plbossart
Copy link
Owner

@yangxiaohua2009 please squash patches and do a git push --force to update this PR. I don't want to review white space changes.

@plbossart
Copy link
Owner

Sorry, @yangxiaohua2009 you need to clean-up your code before asking for reviews. I am not being painful or setting the bar high, this is just basic stuff.

scripts/checkpatch.pl --strict --codespell *.patch &> checkpatch.log
total: 708 errors, 92 warnings, 64 checks, 1038 lines checked

checkpatch.log

@yangxiaohua2009
Copy link
Author

Sorry, @yangxiaohua2009 you need to clean-up your code before asking for reviews. I am not being painful or setting the bar high, this is just basic stuff.

scripts/checkpatch.pl --strict --codespell *.patch &> checkpatch.log total: 708 errors, 92 warnings, 64 checks, 1038 lines checked

checkpatch.log

already cleaned

@plbossart
Copy link
Owner

Sorry, @yangxiaohua2009 you need to clean-up your code before asking for reviews. I am not being painful or setting the bar high, this is just basic stuff.
scripts/checkpatch.pl --strict --codespell *.patch &> checkpatch.log total: 708 errors, 92 warnings, 64 checks, 1038 lines checked
checkpatch.log

already cleaned

come on, man. Even when I squash your patches, you still have quite a few issues in there - the worst being no Signed-off-by tags.

WARNING: Missing commit description - Add an appropriate one

WARNING: please write a paragraph that describes the config symbol fully
#31: FILE: sound/soc/codecs/Kconfig:867:
+config SND_SOC_ES8326

ERROR: do not set execute permissions for source files
#59: FILE: sound/soc/codecs/es8326.c

CHECK: Alignment should match open parenthesis
#88: FILE: sound/soc/codecs/es8326.c:25:
+static int es8326_set_bias_level(struct snd_soc_component *codec,
+				enum snd_soc_bias_level level);

CHECK: struct mutex definition without comment
#135: FILE: sound/soc/codecs/es8326.c:72:
+	struct mutex lock;


CHECK: Blank lines aren't necessary after an open brace '{'
#516: FILE: sound/soc/codecs/es8326.c:453:
+{
+

CHECK: Alignment should match open parenthesis
#567: FILE: sound/soc/codecs/es8326.c:504:
+static int es8326_set_bias_level(struct snd_soc_component *codec,
+				enum snd_soc_bias_level level)

CHECK: Alignment should match open parenthesis
#742: FILE: sound/soc/codecs/es8326.c:679:
+static void es8326_enable_jack_detect(struct snd_soc_component *component,
+				struct snd_soc_jack *jack)

CHECK: Alignment should match open parenthesis
#778: FILE: sound/soc/codecs/es8326.c:715:
+static int es8326_set_jack(struct snd_soc_component *component,
+			struct snd_soc_jack *jack, void *data)

ERROR: do not set execute permissions for source files
#887: FILE: sound/soc/codecs/es8326.h

WARNING: Improper SPDX comment style for 'sound/soc/codecs/es8326.h', please use '/*' instead
#892: FILE: sound/soc/codecs/es8326.h:1:
+// SPDX-License-Identifier: GPL-2.0-only

WARNING: Missing or malformed SPDX-License-Identifier tag in line 1
#892: FILE: sound/soc/codecs/es8326.h:1:
+// SPDX-License-Identifier: GPL-2.0-only

ERROR: trailing whitespace
#1005: FILE: sound/soc/codecs/es8326.h:114:
+#define ES8326_S16_LE^I(3 << 2)^I^I$

CHECK: spaces preferred around that '|' (ctx:VxV)
#1009: FILE: sound/soc/codecs/es8326.h:118:
+#define ES8326_DAIFMT_MASK	((1 << 5)|(3 << 0))
                           	         ^
ERROR: trailing whitespace
#1012: FILE: sound/soc/codecs/es8326.h:121:
+#define ES8326_DAIFMT_DSP_A^I^I(3 << 0)^I^I$

CHECK: spaces preferred around that '|' (ctx:VxV)
#1013: FILE: sound/soc/codecs/es8326.h:122:
+#define ES8326_DAIFMT_DSP_B		((1 << 5)|(3 << 0))
                            		         ^
CHECK: spaces preferred around that '|' (ctx:ExV)
#1036: FILE: sound/soc/codecs/es8326.h:145:
+		|(ES8326_ADC_SRC_ANA_INV_SW1 << ES8326_ADC1_SHIFT))
 		^

CHECK: spaces preferred around that '|' (ctx:ExV)
#1038: FILE: sound/soc/codecs/es8326.h:147:
+		|(ES8326_ADC_SRC_DMIC_SDIN2 << ES8326_ADC1_SHIFT))
 		^

ERROR: Missing Signed-off-by: line(s)

@yangxiaohua2009
Copy link
Author

devicetree bindings added. Didn't see properity "everest,jack-detect-inverted" in Documentation/devicetree/bindings/sound/everest,es8316.yaml
Is it defined elsewhere?

@plbossart
Copy link
Owner

devicetree bindings added. Didn't see properity "everest,jack-detect-inverted" in Documentation/devicetree/bindings/sound/everest,es8316.yaml Is it defined elsewhere?

it's a miss. Not sure how this went undetected during reviews. This needs to be fixed with a correction reflecting all the properties supported by this driver.

@plbossart plbossart force-pushed the fix/sof-es8336-quirk branch 2 times, most recently from 0434208 to 92ba636 Compare March 3, 2022 19:37
The ES8326 codec is not compatible with ES8316 and requires a dedicated driver.

Signed-off-by: David Yang <[email protected]>
Copy link
Owner

@plbossart plbossart left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much better @yangxiaohua2009 but the clk handling needs to be revisited and you need to align with es8316.c when possible, and explain differences if required. This will be very useful for maintainers.

maximum: 0x29
default: 0x19

amic2-src:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't you need the 'everest,' prefix for all properties?


jack-pol:
description:
just the value of reg 57. Bit(3) decides whether the jack polarity is inverted.
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how does this work since you also have the 'jack-inverted' value?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A default value is defined here which can be inverted by 'jack-inverted' ?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok

bool hp_det_level;
struct snd_soc_jack *jack;
int irq;
struct mutex lock;
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment on what this protects

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

copyed from es8316.c.
The lock protects the situation that an irq is generated while the previous irq is still being processed.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you want to add this comment in the code itself.

if (snd_soc_component_get_bias_level(codec) == SND_SOC_BIAS_ON) {
clk_disable_unprepare(es8326->mclk);
} else {
ret = clk_prepare_enable(es8326->mclk);
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and here you do use the clk API, so why did you need a property at all?

sound/soc/codecs/es8326.c Outdated Show resolved Hide resolved
sound/soc/codecs/es8326.c Show resolved Hide resolved
if (ret == 0)
disable_irq(es8326->irq);
else
dev_warn(&i2c->dev, "Getting irq failed.");
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again why not use the exact same sequence as es8316.c, and explain why the IRQF_ parameters are different?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IRQF_ difference is hardware issue. es8316 generates a high voltage as irq while es8326 generates an impulse.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a comment then that the es8316 is level-based while es8326 is edge-based.

sound/soc/codecs/es8326.c Outdated Show resolved Hide resolved
Signed-off-by: David Yang <[email protected]>
@plbossart
Copy link
Owner

@yangxiaohua2009 please resubmit this as a PR against the SOF tree https://github.com/thesofproject/linux branch topic/sof-dev, thanks!

@plbossart plbossart closed this Mar 8, 2022
plbossart pushed a commit that referenced this pull request Feb 10, 2023
Driver's probe allocates memory for RX FIFO (port->rx_fifo) based on
default RX FIFO depth, e.g. 16.  Later during serial startup the
qcom_geni_serial_port_setup() updates the RX FIFO depth
(port->rx_fifo_depth) to match real device capabilities, e.g. to 32.

The RX UART handle code will read "port->rx_fifo_depth" number of words
into "port->rx_fifo" buffer, thus exceeding the bounds.  This can be
observed in certain configurations with Qualcomm Bluetooth HCI UART
device and KASAN:

  Bluetooth: hci0: QCA Product ID   :0x00000010
  Bluetooth: hci0: QCA SOC Version  :0x400a0200
  Bluetooth: hci0: QCA ROM Version  :0x00000200
  Bluetooth: hci0: QCA Patch Version:0x00000d2b
  Bluetooth: hci0: QCA controller version 0x02000200
  Bluetooth: hci0: QCA Downloading qca/htbtfw20.tlv
  bluetooth hci0: Direct firmware load for qca/htbtfw20.tlv failed with error -2
  Bluetooth: hci0: QCA Failed to request file: qca/htbtfw20.tlv (-2)
  Bluetooth: hci0: QCA Failed to download patch (-2)
  ==================================================================
  BUG: KASAN: slab-out-of-bounds in handle_rx_uart+0xa8/0x18c
  Write of size 4 at addr ffff279347d578c0 by task swapper/0/0

  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rt5-00350-gb2450b7e00be-dirty #26
  Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
  Call trace:
   dump_backtrace.part.0+0xe0/0xf0
   show_stack+0x18/0x40
   dump_stack_lvl+0x8c/0xb8
   print_report+0x188/0x488
   kasan_report+0xb4/0x100
   __asan_store4+0x80/0xa4
   handle_rx_uart+0xa8/0x18c
   qcom_geni_serial_handle_rx+0x84/0x9c
   qcom_geni_serial_isr+0x24c/0x760
   __handle_irq_event_percpu+0x108/0x500
   handle_irq_event+0x6c/0x110
   handle_fasteoi_irq+0x138/0x2cc
   generic_handle_domain_irq+0x48/0x64

If the RX FIFO depth changes after probe, be sure to resize the buffer.

Fixes: f9d690b ("tty: serial: qcom_geni_serial: Allocate port->rx_fifo buffer in probe")
Cc: <[email protected]>
Signed-off-by: Krzysztof Kozlowski <[email protected]>
Reviewed-by: Jiri Slaby <[email protected]>
Link: https://lore.kernel.org/r/[email protected]
Signed-off-by: Greg Kroah-Hartman <[email protected]>
plbossart pushed a commit that referenced this pull request Jun 16, 2023
The cited commit adds a compeletion to remove dependency on rtnl
lock. But it causes a deadlock for multiple encapsulations:

 crash> bt ffff8aece8a64000
 PID: 1514557  TASK: ffff8aece8a64000  CPU: 3    COMMAND: "tc"
  #0 [ffffa6d14183f368] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14183f3f8] schedule at ffffffffb8ba8418
  #2 [ffffa6d14183f418] schedule_preempt_disabled at ffffffffb8ba8898
  #3 [ffffa6d14183f428] __mutex_lock at ffffffffb8baa7f8
  #4 [ffffa6d14183f4d0] mutex_lock_nested at ffffffffb8baabeb
  #5 [ffffa6d14183f4e0] mlx5e_attach_encap at ffffffffc0f48c17 [mlx5_core]
  #6 [ffffa6d14183f628] mlx5e_tc_add_fdb_flow at ffffffffc0f39680 [mlx5_core]
  #7 [ffffa6d14183f688] __mlx5e_add_fdb_flow at ffffffffc0f3b636 [mlx5_core]
  #8 [ffffa6d14183f6f0] mlx5e_tc_add_flow at ffffffffc0f3bcdf [mlx5_core]
  #9 [ffffa6d14183f728] mlx5e_configure_flower at ffffffffc0f3c1d1 [mlx5_core]
 #10 [ffffa6d14183f790] mlx5e_rep_setup_tc_cls_flower at ffffffffc0f3d529 [mlx5_core]
 #11 [ffffa6d14183f7a0] mlx5e_rep_setup_tc_cb at ffffffffc0f3d714 [mlx5_core]
 #12 [ffffa6d14183f7b0] tc_setup_cb_add at ffffffffb8931bb8
 #13 [ffffa6d14183f810] fl_hw_replace_filter at ffffffffc0dae901 [cls_flower]
 #14 [ffffa6d14183f8d8] fl_change at ffffffffc0db5c57 [cls_flower]
 #15 [ffffa6d14183f970] tc_new_tfilter at ffffffffb8936047
 #16 [ffffa6d14183fac8] rtnetlink_rcv_msg at ffffffffb88c7c31
 #17 [ffffa6d14183fb50] netlink_rcv_skb at ffffffffb8942853
 #18 [ffffa6d14183fbc0] rtnetlink_rcv at ffffffffb88c1835
 #19 [ffffa6d14183fbd0] netlink_unicast at ffffffffb8941f27
 #20 [ffffa6d14183fc18] netlink_sendmsg at ffffffffb8942245
 #21 [ffffa6d14183fc98] sock_sendmsg at ffffffffb887d482
 #22 [ffffa6d14183fcb8] ____sys_sendmsg at ffffffffb887d81a
 #23 [ffffa6d14183fd38] ___sys_sendmsg at ffffffffb88806e2
 #24 [ffffa6d14183fe90] __sys_sendmsg at ffffffffb88807a2
 #25 [ffffa6d14183ff28] __x64_sys_sendmsg at ffffffffb888080f
 #26 [ffffa6d14183ff38] do_syscall_64 at ffffffffb8b9b6a8
 #27 [ffffa6d14183ff50] entry_SYSCALL_64_after_hwframe at ffffffffb8c0007c
 crash> bt 0xffff8aeb07544000
 PID: 1110766  TASK: ffff8aeb07544000  CPU: 0    COMMAND: "kworker/u20:9"
  #0 [ffffa6d14e6b7bd8] __schedule at ffffffffb8ba7f45
  #1 [ffffa6d14e6b7c68] schedule at ffffffffb8ba8418
  #2 [ffffa6d14e6b7c88] schedule_timeout at ffffffffb8baef88
  #3 [ffffa6d14e6b7d10] wait_for_completion at ffffffffb8ba968b
  #4 [ffffa6d14e6b7d60] mlx5e_take_all_encap_flows at ffffffffc0f47ec4 [mlx5_core]
  #5 [ffffa6d14e6b7da0] mlx5e_rep_update_flows at ffffffffc0f3e734 [mlx5_core]
  #6 [ffffa6d14e6b7df8] mlx5e_rep_neigh_update at ffffffffc0f400bb [mlx5_core]
  #7 [ffffa6d14e6b7e50] process_one_work at ffffffffb80acc9c
  #8 [ffffa6d14e6b7ed0] worker_thread at ffffffffb80ad012
  #9 [ffffa6d14e6b7f10] kthread at ffffffffb80b615d
 #10 [ffffa6d14e6b7f50] ret_from_fork at ffffffffb8001b2f

After the first encap is attached, flow will be added to encap
entry's flows list. If neigh update is running at this time, the
following encaps of the flow can't hold the encap_tbl_lock and
sleep. If neigh update thread is waiting for that flow's init_done,
deadlock happens.

Fix it by holding lock outside of the for loop. If neigh update is
running, prevent encap flows from offloading. Since the lock is held
outside of the for loop, concurrent creation of encap entries is not
allowed. So remove unnecessary wait_for_completion call for res_ready.

Fixes: 95435ad ("net/mlx5e: Only access fully initialized flows in neigh update")
Signed-off-by: Chris Mi <[email protected]>
Reviewed-by: Roi Dayan <[email protected]>
Reviewed-by: Vlad Buslov <[email protected]>
Signed-off-by: Saeed Mahameed <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants