Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.14.2 - No USB audio dropouts , but no playback at 48000 samplerate #577

Closed
jactmp opened this issue Apr 29, 2014 · 8 comments
Closed

3.14.2 - No USB audio dropouts , but no playback at 48000 samplerate #577

jactmp opened this issue Apr 29, 2014 · 8 comments
Assignees

Comments

@jactmp
Copy link

jactmp commented Apr 29, 2014

Contiuous audio dropouts finally disappeared after upgrade from 3.10.25+ to 3.14.2 - for 32000 and 44100 samplerate. For 48000 - I hear only clicks every few seconds. I'm using PCM2707.

DEVICE:

Apr 29 23:19:26 rp kernel: [  384.300654] usb 1-1.2: new full-speed USB device number 7 using dwc_otg
Apr 29 23:19:26 rp kernel: [  384.404327] usb 1-1.2: New USB device found, idVendor=08bb, idProduct=2707
Apr 29 23:19:26 rp kernel: [  384.404361] usb 1-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Apr 29 23:19:26 rp kernel: [  384.404378] usb 1-1.2: Product: USB Audio DAC
Apr 29 23:19:26 rp kernel: [  384.404393] usb 1-1.2: Manufacturer: Burr-Brown from TI
Apr 29 23:19:26 rp kernel: [  384.415241] ALSA sound/usb/stream.c:710 7:1:1: add audio endpoint 0x2
Apr 29 23:19:26 rp kernel: [  384.415648] ALSA sound/usb/stream.c:710 7:1:2: add audio endpoint 0x2
Apr 29 23:19:26 rp kernel: [  384.416143] ALSA sound/usb/mixer.c:1312 [3] FU [PCM Playback Switch] ch = 1, val = 0/1/1
Apr 29 23:19:26 rp kernel: [  384.419306] ALSA sound/usb/mixer.c:474 cannot set ctl value: req = 0x4, wValue = 0x201, wIndex = 0x300, type = 4, data = 0x80/0x0
Apr 29 23:19:26 rp kernel: [  384.423225] ALSA sound/usb/mixer.c:1312 [3] FU [PCM Playback Volume] ch = 2, val = -32768/0/256
Apr 29 23:19:26 rp kernel: [  384.433274] input: Burr-Brown from TI USB Audio DAC    as /devices/platform/bcm2708_usb/usb1/1-1/1-1.2/1-1.2:1.2/0003:08BB:2707.0003/input/input2
Apr 29 23:19:26 rp kernel: [  384.442802] hid-generic 0003:08BB:2707.0003: input,hidraw0: USB HID v1.00 Device [Burr-Brown from TI USB Audio DAC   ] on usb-bcm2708_usb-1.2/input2
Apr 29 23:19:26 rp mtp-probe: checking bus 1, device 7: "/sys/devices/platform/bcm2708_usb/usb1/1-1/1-1.2"
Apr 29 23:19:26 rp mtp-probe: bus: 1, device: 7 was not an MTP device

PLAYBACK:

Apr 29 23:23:09 rp kernel: [  607.509482] ALSA sound/usb/pcm.c:490 setting usb interface 1:2
Apr 29 23:23:09 rp kernel: [  607.509520] ALSA sound/usb/endpoint.c:438 Creating new playback data endpoint #2
Apr 29 23:23:09 rp kernel: [  607.509882] ALSA sound/usb/endpoint.c:834 Setting params for ep #2 (type 0, 3 urbs), ret=0
Apr 29 23:23:09 rp kernel: [  607.509909] ALSA sound/usb/pcm.c:229 Starting data EP @dd27e000
Apr 29 23:23:09 rp kernel: [  607.513708] ALSA sound/usb/pcm.c:490 setting usb interface 1:1
Apr 29 23:23:09 rp kernel: [  607.513744] ALSA sound/usb/endpoint.c:438 Creating new playback data endpoint #2
Apr 29 23:23:09 rp kernel: [  607.513938] ALSA sound/usb/endpoint.c:834 Setting params for ep #2 (type 0, 3 urbs), ret=0
Apr 29 23:23:09 rp kernel: [  607.513964] ALSA sound/usb/pcm.c:229 Starting data EP @dd27c000
@P33M
Copy link
Contributor

P33M commented May 2, 2014

Please retest with the latest 3.14 branch. An update has been pushed which may affect this.

@jactmp
Copy link
Author

jactmp commented May 2, 2014

Done. Results as follows:
============= 3.14.2+ (latest) =============
$ cat ~/a.raw | aplay -r <any_samplerate> -c 2 -f S16_LE -t raw -
--- harsh sound ...

$ cat ~/a.raw | aplay -r <any_samplerate> -c 2 -f S16_LE -t raw -Dfront -
--- more harsh sound ...

============= 3.14.2 (before latest update) =============
$ cat ~/a.raw | aplay -r 44100 -c 2 -f S16_LE -t raw -Dfront -
.... sound OK ....

$ cat ~/a.raw | aplay -r 44100 -c 2 -f S16_LE -t raw -
$ cat ~/a.raw | aplay -r 48000 -c 2 -f S16_LE -t raw -
$ cat ~/a.raw | aplay -r 48000 -c 2 -f S16_LE -t raw -Dfront -
.... silence + clicks every few second ...

================ syslog (when playing) =================
May 2 15:30:46 rpi kernel: [ 511.479598] ALSA sound/usb/pcm.c:490 setting usb interface 1:1
May 2 15:30:46 rpi kernel: [ 511.479676] ALSA sound/usb/endpoint.c:430 Re-using EP 2 in iface 1,1 @ce816000
May 2 15:30:46 rpi kernel: [ 511.479902] ALSA sound/usb/endpoint.c:834 Setting params for ep #2 (type 0, 3 urbs), ret=0
May 2 15:30:46 rpi kernel: [ 511.479925] ALSA sound/usb/pcm.c:229 Starting data EP @ce81600

================ configuration =================
$ aplay -L
null
Discard all samples (playback) or generate zero samples (capture)
default:CARD=DAC
USB Audio DAC, USB Audio
Default Audio Device
sysdefault:CARD=DAC
USB Audio DAC, USB Audio
Default Audio Device
front:CARD=DAC,DEV=0
USB Audio DAC, USB Audio
Front speakers
.....

@jactmp
Copy link
Author

jactmp commented May 2, 2014

Also tested with different soundcard (C-Media). 3.14.2+ broke sound for this card as well (3.14.2 behaved more less properly - only few clicks every 20-30 seconds).

May 2 15:37:17 rpi kernel: [ 78.867313] usb 1-1.2: new full-speed USB device number 7 using dwc_otg
May 2 15:37:17 rpi kernel: [ 78.969558] usb 1-1.2: New USB device found, idVendor=0b05, idProduct=1743
May 2 15:37:17 rpi kernel: [ 78.969620] usb 1-1.2: New USB device strings: Mfr=3, Product=1, SerialNumber=0
May 2 15:37:17 rpi kernel: [ 78.969642] usb 1-1.2: Product: USB Advanced Audio Device
May 2 15:37:17 rpi kernel: [ 78.969657] usb 1-1.2: Manufacturer: C-Media Electronics Inc.
May 2 15:37:17 rpi kernel: [ 79.017807] input: C-Media Electronics Inc. USB Advanced Audio Device as /devices/platform/bcm2708_usb/usb1/1-1/1-1.2/1-1.2:1.3/0003:0B05:1743.0003/input/input2
May 2 15:37:17 rpi kernel: [ 79.018691] hid-generic 0003:0B05:1743.0003: input,hidraw0: USB HID v1.00 Device [C-Media Electronics Inc. USB Advanced Audio Device] on usb-bcm2708_usb-1.2/input3
May 2 15:37:17 rpi mtp-probe: checking bus 1, device 7: "/sys/devices/platform/bcm2708_usb/usb1/1-1/1-1.2"
May 2 15:37:17 rpi mtp-probe: bus: 1, device: 7 was not an MTP device
May 2 15:37:20 rpi kernel: [ 81.447768] usb 1-1.2: USB disconnect, device number 7
May 2 15:37:20 rpi kernel: [ 81.767457] usb 1-1.2: new full-speed USB device number 8 using dwc_otg
May 2 15:37:20 rpi kernel: [ 81.869571] usb 1-1.2: New USB device found, idVendor=0b05, idProduct=1743
May 2 15:37:20 rpi kernel: [ 81.869636] usb 1-1.2: New USB device strings: Mfr=3, Product=1, SerialNumber=0
May 2 15:37:20 rpi kernel: [ 81.869656] usb 1-1.2: Product: USB Advanced Audio Device
May 2 15:37:20 rpi kernel: [ 81.869672] usb 1-1.2: Manufacturer: C-Media Electronics Inc.
May 2 15:37:20 rpi kernel: [ 81.909109] input: C-Media Electronics Inc. USB Advanced Audio Device as /devices/platform/bcm2708_usb/usb1/1-1/1-1.2/1-1.2:1.3/0003:0B05:1743.0004/input/input3
May 2 15:37:20 rpi kernel: [ 81.911174] hid-generic 0003:0B05:1743.0004: input,hidraw0: USB HID v1.00 Device [C-Media Electronics Inc. USB Advanced Audio Device] on usb-bcm2708_usb-1.2/input3
May 2 15:37:20 rpi mtp-probe: checking bus 1, device 8: "/sys/devices/platform/bcm2708_usb/usb1/1-1/1-1.2"

@P33M
Copy link
Contributor

P33M commented May 2, 2014

Can you define "harsh"?

Do these devices work on 3.12?

@jactmp
Copy link
Author

jactmp commented May 2, 2014

"Explanation" below:
https://www.dropbox.com/s/gd76p5l26cj358i/recording20140502154830.wav

Generally they work more less properly with 3.10.y and 3.12.y. More less means - that clicks once in a while ( lost frames on USB ?).
Drops disappeaead when (again - more less) after adding dwc_otg.microframe_schedule=1 dwc_otg.speed=1 to cmdline.txt (I was forced to add it to resolve problem with Broadcom bluetooth dongle - which dropped data sent trough rfcomm to external android device).
The problem was solved finally in 3.14.2, but only for 44100.

popcornmix pushed a commit that referenced this issue Oct 8, 2014
Soothes the following checkpatch warnings:

    WARNING: line over 80 characters
    #151: FILE: drivers/mfd/ab8500-core.c:151:
    +	0, 1, 2, 3, 4, -1, -1, -1, -1, 11, 18, 19, 20, 21, 12, 13, 24, 5, 22, 23,

    ERROR: spaces required around that '=' (ctx:VxW)
    #325: FILE: drivers/mfd/ab8500-core.c:325:
    +	ret= mask_and_set_register_interruptible(ab8500, bank, reg,
     	   ^

    WARNING: line over 80 characters
    #418: FILE: drivers/mfd/ab8500-core.c:418:
    +		else if (offset >= AB9540_INT_GPIO50R && offset <= AB9540_INT_GPIO54R)

    WARNING: line over 80 characters
    #420: FILE: drivers/mfd/ab8500-core.c:420:
    +		else if (offset == AB8540_INT_GPIO43R || offset == AB8540_INT_GPIO44R)

    ERROR: spaces required around that '==' (ctx:VxV)
    #454: FILE: drivers/mfd/ab8500-core.c:454:
    +	if ((i==3) && (*offset >= 24))
     	      ^

    ERROR: code indent should use tabs where possible
    #576: FILE: drivers/mfd/ab8500-core.c:576:
    +        .map    = ab8500_irq_map,$

    WARNING: please, no spaces at the start of a line
    #576: FILE: drivers/mfd/ab8500-core.c:576:
    +        .map    = ab8500_irq_map,$

    ERROR: code indent should use tabs where possible
    #577: FILE: drivers/mfd/ab8500-core.c:577:
    +        .xlate  = irq_domain_xlate_twocell,$

    WARNING: please, no spaces at the start of a line
    #577: FILE: drivers/mfd/ab8500-core.c:577:
    +        .xlate  = irq_domain_xlate_twocell,$

    WARNING: char * array declaration might be better as static const
    #1554: FILE: drivers/mfd/ab8500-core.c:1554:
    +	static char *switch_off_status[] = {

    WARNING: char * array declaration might be better as static const
    #1563: FILE: drivers/mfd/ab8500-core.c:1563:
    +	static char *turn_on_status[] = {

    WARNING: sizeof *ab8500 should be sizeof(*ab8500)
    #1582: FILE: drivers/mfd/ab8500-core.c:1582:
    +	ab8500 = devm_kzalloc(&pdev->dev, sizeof *ab8500, GFP_KERNEL);

    ERROR: space required after that close brace '}'
    #1639: FILE: drivers/mfd/ab8500-core.c:1639:
    +	}/* Configure AB8500 or AB9540 IRQ */

    WARNING: line over 80 characters
    #1652: FILE: drivers/mfd/ab8500-core.c:1652:
    +	ab8500->oldmask = devm_kzalloc(&pdev->dev, ab8500->mask_size, GFP_KERNEL);

    WARNING: Prefer [subsystem eg: netdev]_cont([subsystem]dev, ... then dev_cont(dev, ... then pr_cont(...  to printk(KERN_CONT ...
    #1677: FILE: drivers/mfd/ab8500-core.c:1677:
    +				printk(KERN_CONT " \"%s\"",

    WARNING: Prefer [subsystem eg: netdev]_cont([subsystem]dev, ... then dev_cont(dev, ... then pr_cont(...  to printk(KERN_CONT ...
    #1682: FILE: drivers/mfd/ab8500-core.c:1682:
    +		printk(KERN_CONT "\n");

    WARNING: Prefer [subsystem eg: netdev]_cont([subsystem]dev, ... then dev_cont(dev, ... then pr_cont(...  to printk(KERN_CONT ...
    #1684: FILE: drivers/mfd/ab8500-core.c:1684:
    +		printk(KERN_CONT " None\n");

    WARNING: printk() should include KERN_ facility level
    #1695: FILE: drivers/mfd/ab8500-core.c:1695:
    +				printk("\"%s\" ", turn_on_status[i]);

    WARNING: printk() should include KERN_ facility level
    #1700: FILE: drivers/mfd/ab8500-core.c:1700:
    +		printk("None\n");

    total: 5 errors, 14 warnings, 1869 lines checked

Reviewed-by: Linus Walleij <[email protected]>
Signed-off-by: Lee Jones <[email protected]>
@Ruffio
Copy link

Ruffio commented Aug 10, 2016

@YTPL has this issue been resolved? If yes, then please close this issue.

@jactmp
Copy link
Author

jactmp commented Aug 10, 2016

I can't check it - my RPi is broken. Should I close it anyway?

2016-08-10 13:05 GMT+02:00 Rasmus Christiansen [email protected]:

@YTPL https://github.com/ytpl has this issue been resolved? If yes,
then please close this issue.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#577 (comment),
or mute the thread
https://github.com/notifications/unsubscribe-auth/AHGVJNWvZYknB6XmK-f28gqJ1RgzGNNeks5qebBugaJpZM4B2llE
.

@Ruffio
Copy link

Ruffio commented Aug 10, 2016

@YTPL Currently used version is 4.4.x so a lot has happened and since no one else has reported of this not working, I think yes.

@jactmp jactmp closed this as completed Aug 10, 2016
popcornmix pushed a commit that referenced this issue Apr 23, 2019
This fixes a possible circular locking dependency detected warning seen
with:
- CONFIG_PROVE_LOCKING=y
- consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")

When using the IIO consumer interface, e.g. iio_channel_get(), the consumer
device will likely call iio_read_channel_raw() or similar that rely on
'info_exist_lock' mutex.

typically:
...
	mutex_lock(&chan->indio_dev->info_exist_lock);
	if (chan->indio_dev->info == NULL) {
		ret = -ENODEV;
		goto err_unlock;
	}
	ret = do_some_ops()
err_unlock:
	mutex_unlock(&chan->indio_dev->info_exist_lock);
	return ret;
...

Same mutex is also hold in iio_device_unregister().

The following deadlock warning happens when:
- the consumer device has called an API like iio_read_channel_raw()
  at least once.
- the consumer driver is unregistered, removed (unbind from sysfs)

======================================================
WARNING: possible circular locking dependency detected
4.19.24 #577 Not tainted
------------------------------------------------------
sh/372 is trying to acquire lock:
(kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84

but task is already holding lock:
(&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&dev->info_exist_lock){+.+.}:
       __mutex_lock+0x70/0xa3c
       mutex_lock_nested+0x1c/0x24
       iio_read_channel_raw+0x1c/0x60
       iio_read_channel_info+0xa8/0xb0
       dev_attr_show+0x1c/0x48
       sysfs_kf_seq_show+0x84/0xec
       seq_read+0x154/0x528
       __vfs_read+0x2c/0x15c
       vfs_read+0x8c/0x110
       ksys_read+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbedefb60

-> #0 (kn->count#30){++++}:
       lock_acquire+0xd8/0x268
       __kernfs_remove+0x288/0x374
       kernfs_remove_by_name_ns+0x3c/0x84
       remove_files+0x34/0x78
       sysfs_remove_group+0x40/0x9c
       sysfs_remove_groups+0x24/0x34
       device_remove_attrs+0x38/0x64
       device_del+0x11c/0x360
       cdev_device_del+0x14/0x2c
       iio_device_unregister+0x24/0x60
       release_nodes+0x1bc/0x200
       device_release_driver_internal+0x1a0/0x230
       unbind_store+0x80/0x130
       kernfs_fop_write+0x100/0x1e4
       __vfs_write+0x2c/0x160
       vfs_write+0xa4/0x17c
       ksys_write+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbe906840

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->info_exist_lock);
                               lock(kn->count#30);
                               lock(&dev->info_exist_lock);
  lock(kn->count#30);

 *** DEADLOCK ***
...

cdev_device_del() can be called without holding the lock. It should be safe
as info_exist_lock prevents kernelspace consumers to use the exported
routines during/after provider removal. cdev_device_del() is for userspace.

Help to reproduce:
See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
sysv {
	compatible = "voltage-divider";
	io-channels = <&adc 0>;
	output-ohms = <22>;
	full-ohms = <222>;
};

First, go to iio:deviceX for the "voltage-divider", do one read:
$ cd /sys/bus/iio/devices/iio:deviceX
$ cat in_voltage0_raw

Then, unbind the consumer driver. It triggers above deadlock warning.
$ cd /sys/bus/platform/drivers/iio-rescale/
$ echo sysv > unbind

Note I don't actually expect stable will pick this up all the
way back into IIO being in staging, but if's probably valid that
far back.

Signed-off-by: Fabrice Gasnier <[email protected]>
Fixes: ac917a8 ("staging:iio:core set the iio_dev.info pointer to null on unregister")
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
popcornmix pushed a commit that referenced this issue Apr 30, 2019
commit 7f75591 upstream.

This fixes a possible circular locking dependency detected warning seen
with:
- CONFIG_PROVE_LOCKING=y
- consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")

When using the IIO consumer interface, e.g. iio_channel_get(), the consumer
device will likely call iio_read_channel_raw() or similar that rely on
'info_exist_lock' mutex.

typically:
...
	mutex_lock(&chan->indio_dev->info_exist_lock);
	if (chan->indio_dev->info == NULL) {
		ret = -ENODEV;
		goto err_unlock;
	}
	ret = do_some_ops()
err_unlock:
	mutex_unlock(&chan->indio_dev->info_exist_lock);
	return ret;
...

Same mutex is also hold in iio_device_unregister().

The following deadlock warning happens when:
- the consumer device has called an API like iio_read_channel_raw()
  at least once.
- the consumer driver is unregistered, removed (unbind from sysfs)

======================================================
WARNING: possible circular locking dependency detected
4.19.24 #577 Not tainted
------------------------------------------------------
sh/372 is trying to acquire lock:
(kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84

but task is already holding lock:
(&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&dev->info_exist_lock){+.+.}:
       __mutex_lock+0x70/0xa3c
       mutex_lock_nested+0x1c/0x24
       iio_read_channel_raw+0x1c/0x60
       iio_read_channel_info+0xa8/0xb0
       dev_attr_show+0x1c/0x48
       sysfs_kf_seq_show+0x84/0xec
       seq_read+0x154/0x528
       __vfs_read+0x2c/0x15c
       vfs_read+0x8c/0x110
       ksys_read+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbedefb60

-> #0 (kn->count#30){++++}:
       lock_acquire+0xd8/0x268
       __kernfs_remove+0x288/0x374
       kernfs_remove_by_name_ns+0x3c/0x84
       remove_files+0x34/0x78
       sysfs_remove_group+0x40/0x9c
       sysfs_remove_groups+0x24/0x34
       device_remove_attrs+0x38/0x64
       device_del+0x11c/0x360
       cdev_device_del+0x14/0x2c
       iio_device_unregister+0x24/0x60
       release_nodes+0x1bc/0x200
       device_release_driver_internal+0x1a0/0x230
       unbind_store+0x80/0x130
       kernfs_fop_write+0x100/0x1e4
       __vfs_write+0x2c/0x160
       vfs_write+0xa4/0x17c
       ksys_write+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbe906840

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->info_exist_lock);
                               lock(kn->count#30);
                               lock(&dev->info_exist_lock);
  lock(kn->count#30);

 *** DEADLOCK ***
...

cdev_device_del() can be called without holding the lock. It should be safe
as info_exist_lock prevents kernelspace consumers to use the exported
routines during/after provider removal. cdev_device_del() is for userspace.

Help to reproduce:
See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
sysv {
	compatible = "voltage-divider";
	io-channels = <&adc 0>;
	output-ohms = <22>;
	full-ohms = <222>;
};

First, go to iio:deviceX for the "voltage-divider", do one read:
$ cd /sys/bus/iio/devices/iio:deviceX
$ cat in_voltage0_raw

Then, unbind the consumer driver. It triggers above deadlock warning.
$ cd /sys/bus/platform/drivers/iio-rescale/
$ echo sysv > unbind

Note I don't actually expect stable will pick this up all the
way back into IIO being in staging, but if's probably valid that
far back.

Signed-off-by: Fabrice Gasnier <[email protected]>
Fixes: ac917a8 ("staging:iio:core set the iio_dev.info pointer to null on unregister")
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
popcornmix pushed a commit that referenced this issue Apr 30, 2019
commit 7f75591 upstream.

This fixes a possible circular locking dependency detected warning seen
with:
- CONFIG_PROVE_LOCKING=y
- consumer/provider IIO devices (ex: "voltage-divider" consumer of "adc")

When using the IIO consumer interface, e.g. iio_channel_get(), the consumer
device will likely call iio_read_channel_raw() or similar that rely on
'info_exist_lock' mutex.

typically:
...
	mutex_lock(&chan->indio_dev->info_exist_lock);
	if (chan->indio_dev->info == NULL) {
		ret = -ENODEV;
		goto err_unlock;
	}
	ret = do_some_ops()
err_unlock:
	mutex_unlock(&chan->indio_dev->info_exist_lock);
	return ret;
...

Same mutex is also hold in iio_device_unregister().

The following deadlock warning happens when:
- the consumer device has called an API like iio_read_channel_raw()
  at least once.
- the consumer driver is unregistered, removed (unbind from sysfs)

======================================================
WARNING: possible circular locking dependency detected
4.19.24 #577 Not tainted
------------------------------------------------------
sh/372 is trying to acquire lock:
(kn->count#30){++++}, at: kernfs_remove_by_name_ns+0x3c/0x84

but task is already holding lock:
(&dev->info_exist_lock){+.+.}, at: iio_device_unregister+0x18/0x60

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (&dev->info_exist_lock){+.+.}:
       __mutex_lock+0x70/0xa3c
       mutex_lock_nested+0x1c/0x24
       iio_read_channel_raw+0x1c/0x60
       iio_read_channel_info+0xa8/0xb0
       dev_attr_show+0x1c/0x48
       sysfs_kf_seq_show+0x84/0xec
       seq_read+0x154/0x528
       __vfs_read+0x2c/0x15c
       vfs_read+0x8c/0x110
       ksys_read+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbedefb60

-> #0 (kn->count#30){++++}:
       lock_acquire+0xd8/0x268
       __kernfs_remove+0x288/0x374
       kernfs_remove_by_name_ns+0x3c/0x84
       remove_files+0x34/0x78
       sysfs_remove_group+0x40/0x9c
       sysfs_remove_groups+0x24/0x34
       device_remove_attrs+0x38/0x64
       device_del+0x11c/0x360
       cdev_device_del+0x14/0x2c
       iio_device_unregister+0x24/0x60
       release_nodes+0x1bc/0x200
       device_release_driver_internal+0x1a0/0x230
       unbind_store+0x80/0x130
       kernfs_fop_write+0x100/0x1e4
       __vfs_write+0x2c/0x160
       vfs_write+0xa4/0x17c
       ksys_write+0x4c/0xac
       ret_fast_syscall+0x0/0x28
       0xbe906840

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(&dev->info_exist_lock);
                               lock(kn->count#30);
                               lock(&dev->info_exist_lock);
  lock(kn->count#30);

 *** DEADLOCK ***
...

cdev_device_del() can be called without holding the lock. It should be safe
as info_exist_lock prevents kernelspace consumers to use the exported
routines during/after provider removal. cdev_device_del() is for userspace.

Help to reproduce:
See example: Documentation/devicetree/bindings/iio/afe/voltage-divider.txt
sysv {
	compatible = "voltage-divider";
	io-channels = <&adc 0>;
	output-ohms = <22>;
	full-ohms = <222>;
};

First, go to iio:deviceX for the "voltage-divider", do one read:
$ cd /sys/bus/iio/devices/iio:deviceX
$ cat in_voltage0_raw

Then, unbind the consumer driver. It triggers above deadlock warning.
$ cd /sys/bus/platform/drivers/iio-rescale/
$ echo sysv > unbind

Note I don't actually expect stable will pick this up all the
way back into IIO being in staging, but if's probably valid that
far back.

Signed-off-by: Fabrice Gasnier <[email protected]>
Fixes: ac917a8 ("staging:iio:core set the iio_dev.info pointer to null on unregister")
Cc: <[email protected]>
Signed-off-by: Jonathan Cameron <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
popcornmix pushed a commit that referenced this issue Nov 3, 2020
…ions

There was a memory corruption bug happening while running the synthetic
event selftests:

 kmemleak: Cannot insert 0xffff8c196fa2afe5 into the object search tree (overlaps existing)
 CPU: 5 PID: 6866 Comm: ftracetest Tainted: G        W         5.9.0-rc5-test+ #577
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  dump_stack+0x8d/0xc0
  create_object.cold+0x3b/0x60
  slab_post_alloc_hook+0x57/0x510
  ? tracing_map_init+0x178/0x340
  __kmalloc+0x1b1/0x390
  tracing_map_init+0x178/0x340
  event_hist_trigger_func+0x523/0xa40
  trigger_process_regex+0xc5/0x110
  event_trigger_write+0x71/0xd0
  vfs_write+0xca/0x210
  ksys_write+0x70/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fef0a63a487
 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
 RSP: 002b:00007fff76f18398 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000039 RCX: 00007fef0a63a487
 RDX: 0000000000000039 RSI: 000055eb3b26d690 RDI: 0000000000000001
 RBP: 000055eb3b26d690 R08: 000000000000000a R09: 0000000000000038
 R10: 000055eb3b2cdb80 R11: 0000000000000246 R12: 0000000000000039
 R13: 00007fef0a70b500 R14: 0000000000000039 R15: 00007fef0a70b700
 kmemleak: Kernel memory leak detector disabled
 kmemleak: Object 0xffff8c196fa2afe0 (size 8):
 kmemleak:   comm "ftracetest", pid 6866, jiffies 4295082531
 kmemleak:   min_count = 1
 kmemleak:   count = 0
 kmemleak:   flags = 0x1
 kmemleak:   checksum = 0
 kmemleak:   backtrace:
      __kmalloc+0x1b1/0x390
      tracing_map_init+0x1be/0x340
      event_hist_trigger_func+0x523/0xa40
      trigger_process_regex+0xc5/0x110
      event_trigger_write+0x71/0xd0
      vfs_write+0xca/0x210
      ksys_write+0x70/0xf0
      do_syscall_64+0x33/0x40
      entry_SYSCALL_64_after_hwframe+0x44/0xa9

The cause came down to a use of strcat() that was adding an string that was
shorten, but the strcat() did not take that into account.

strcat() is extremely dangerous as it does not care how big the buffer is.
Replace it with seq_buf operations that prevent the buffer from being
overwritten if what is being written is bigger than the buffer.

Fixes: 10819e2 ("tracing: Handle synthetic event array field type checking correctly")
Reviewed-by: Tom Zanussi <[email protected]>
Tested-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
sigmaris pushed a commit to sigmaris/linux that referenced this issue Nov 7, 2020
…ions

[ Upstream commit 761a8c5 ]

There was a memory corruption bug happening while running the synthetic
event selftests:

 kmemleak: Cannot insert 0xffff8c196fa2afe5 into the object search tree (overlaps existing)
 CPU: 5 PID: 6866 Comm: ftracetest Tainted: G        W         5.9.0-rc5-test+ raspberrypi#577
 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016
 Call Trace:
  dump_stack+0x8d/0xc0
  create_object.cold+0x3b/0x60
  slab_post_alloc_hook+0x57/0x510
  ? tracing_map_init+0x178/0x340
  __kmalloc+0x1b1/0x390
  tracing_map_init+0x178/0x340
  event_hist_trigger_func+0x523/0xa40
  trigger_process_regex+0xc5/0x110
  event_trigger_write+0x71/0xd0
  vfs_write+0xca/0x210
  ksys_write+0x70/0xf0
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9
 RIP: 0033:0x7fef0a63a487
 Code: 64 89 02 48 c7 c0 ff ff ff ff eb bb 0f 1f 80 00 00 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
 RSP: 002b:00007fff76f18398 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
 RAX: ffffffffffffffda RBX: 0000000000000039 RCX: 00007fef0a63a487
 RDX: 0000000000000039 RSI: 000055eb3b26d690 RDI: 0000000000000001
 RBP: 000055eb3b26d690 R08: 000000000000000a R09: 0000000000000038
 R10: 000055eb3b2cdb80 R11: 0000000000000246 R12: 0000000000000039
 R13: 00007fef0a70b500 R14: 0000000000000039 R15: 00007fef0a70b700
 kmemleak: Kernel memory leak detector disabled
 kmemleak: Object 0xffff8c196fa2afe0 (size 8):
 kmemleak:   comm "ftracetest", pid 6866, jiffies 4295082531
 kmemleak:   min_count = 1
 kmemleak:   count = 0
 kmemleak:   flags = 0x1
 kmemleak:   checksum = 0
 kmemleak:   backtrace:
      __kmalloc+0x1b1/0x390
      tracing_map_init+0x1be/0x340
      event_hist_trigger_func+0x523/0xa40
      trigger_process_regex+0xc5/0x110
      event_trigger_write+0x71/0xd0
      vfs_write+0xca/0x210
      ksys_write+0x70/0xf0
      do_syscall_64+0x33/0x40
      entry_SYSCALL_64_after_hwframe+0x44/0xa9

The cause came down to a use of strcat() that was adding an string that was
shorten, but the strcat() did not take that into account.

strcat() is extremely dangerous as it does not care how big the buffer is.
Replace it with seq_buf operations that prevent the buffer from being
overwritten if what is being written is bigger than the buffer.

Fixes: 10819e2 ("tracing: Handle synthetic event array field type checking correctly")
Reviewed-by: Tom Zanussi <[email protected]>
Tested-by: Tom Zanussi <[email protected]>
Signed-off-by: Steven Rostedt (VMware) <[email protected]>
Signed-off-by: Sasha Levin <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants