Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SOF logs show prints which are actually not errors #939

Closed
sathyap-chrome opened this issue May 14, 2019 · 19 comments
Closed

SOF logs show prints which are actually not errors #939

sathyap-chrome opened this issue May 14, 2019 · 19 comments
Labels
bug Something isn't working

Comments

@sathyap-chrome
Copy link

During suspend cycles and upon boot, we see some kernel logs as below even when there are no errors

[ 344.605545] sof-audio-pci 0000:00:1f.3: error: rx list empty but received 0x0
[ 446.001250] sof-audio-pci 0000:00:1f.3: firmware boot complete

In some cases, (we are yet to receive details from customer) the logs are very frequent.
is there recent change / fix to supress these msgs being flagged as error ?

@plbossart
Copy link
Member

This looks like an older version of SOF, maybe 2-3 month old already? we don't have this rx list anymore

@sathyap-chrome
Copy link
Author

Hi Pierre, yes you are right.
So i tested with latest upstream set and see the other 2 lines @ every suspend call and resume thereafter.

sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0
sof-audio-pci 0000:00:1f.3: firmware boot complete

I'm still checking with reporter which scenario are these coming frequently, but i wanted to know if we can suppress few of these or move it under debug.

@xiulipan
Copy link

This should be rom reply message in DSP reset. We are still in discussion about how to make it look nicer.

@sathyap-chrome
Copy link
Author

Thanks xiulipan.
Will keep our issue tracker updated on the same

@sathyap-chrome
Copy link
Author

Hi @xiulipan These 2 prints come at every PCM open / close too.
So we do the below steps and evey cat will print these msgs twice.

Considering these are message reply of a success case, can we move it under debug ?

localhost /sys/devices/pci0000:00/0000:00:1f.3/ehdaudio0D2/widgets # while true; do cat 01/pcm_caps; sleep 1; date; done
[ 87.893276] sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0
[ 87.984228] sof-audio-pci 0000:00:1f.3: firmware boot complete
0x00000000
Tue Jun 25 13:43:25 PDT 2019
0x00000000
Tue Jun 25 13:43:26 PDT 2019
0x00000000
Tue Jun 25 13:43:27 PDT 2019
[ 91.194438] sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0
[ 91.285987] sof-audio-pci 0000:00:1f.3: firmware boot complete
0x00000000
Tue Jun 25 13:43:28 PDT 2019
0x00000000
Tue Jun 25 13:43:29 PDT 2019
0x00000000
Tue Jun 25 13:43:30 PDT 2019
[ 94.509771] sof-audio-pci 0000:00:1f.3: error: no reply expected, received 0x0
[ 94.601671] sof-audio-pci 0000:00:1f.3: firmware boot complete
0x00000000
Tue Jun 25 13:43:31 PDT 2019
0x00000000
Tue Jun 25 13:43:32 PDT 2019

@ranj063
Copy link
Collaborator

ranj063 commented Jul 11, 2019

Hi @xiulipan These 2 prints come at every PCM open / close too.

not really pcm open/close but really runtime resume and suspend. This happens once during FW boot every time you resume from runtime suspend.

@sathyap-chrome
Copy link
Author

@ranj063 thanks for correction. But can we improve this message , currently it shows as error, so lot of validation / triage starts from Audio for no reason :)

@ranj063
Copy link
Collaborator

ranj063 commented Jul 11, 2019

@sathyap-chrome I suppose we could change it to an info instead of an error, we're not really doing anything with the error anyway. @xiulipan @lyakh what do you think?

@sathyap-chrome
Copy link
Author

sounds good.

@kv2019i kv2019i added the bug Something isn't working label Nov 7, 2019
@kv2019i
Copy link
Collaborator

kv2019i commented Nov 7, 2019

@ranj063 ++votes -- having a systematic false error printout in such a common flow is very bad.

@paulstelian97
Copy link

Also agreed, this may be shown as a notice or perhaps warning but no need to show as error as it doesn't normally impact the flow. I wonder whether this should be a verbose trace though (I think not?).

@plbossart
Copy link
Member

@keyonjie and @aiChaoSONG had a proposal to remove this error. It went nowhere I am afraid

@keyonjie
Copy link

keyonjie commented Nov 7, 2019

yes, it's here: #1189
@aiChaoSONG can you please proceed it?

@wwittbrx
Copy link

Is there any update on this? This error has been corrupting logs for a long time now.

@plbossart
Copy link
Member

@keyonjie and @aiChaoSONG ping!

@kv2019i
Copy link
Collaborator

kv2019i commented Mar 19, 2020

This is now fixed with #1189 . Please reopen if still seen.

@kv2019i kv2019i closed this as completed Mar 19, 2020
@ggiuffre
Copy link

My Lenovo Thinkpad X1C7 running Ubuntu 20.04 has bee having this issue for a while, although I update my software regularly. I'm glad to provide more information, if this can help solve the issue.

@paulstelian97
Copy link

My Lenovo Thinkpad X1C7 running Ubuntu 20.04 has bee having this issue for a while, although I update my software regularly. I'm glad to provide more information, if this can help solve the issue.

Ubuntu 20.04 is on old version of the kernel (5.4 IIRC). I think some fixes might need to be backported there, though if it doesn't impact proper functionality I doubt it's actually gonna be backported.

@ggiuffre
Copy link

My Lenovo Thinkpad X1C7 running Ubuntu 20.04 has bee having this issue for a while, although I update my software regularly. I'm glad to provide more information, if this can help solve the issue.

Ubuntu 20.04 is on old version of the kernel (5.4 IIRC). I think some fixes might need to be backported there, though if it doesn't impact proper functionality I doubt it's actually gonna be backported.

I see. Thank you for the prompt reply, Paul.

plbossart pushed a commit that referenced this issue Mar 8, 2021
Lockdep with fstests test case btrfs/041 detected a unsafe locking
scenario when we allocate the log node on a zoned filesystem.

btrfs/041
 ============================================
 WARNING: possible recursive locking detected
 5.11.0-rc7+ #939 Not tainted
 --------------------------------------------
 xfs_io/698 is trying to acquire lock:
 ffff88810cd673a0 (&root->log_mutex){+.+.}-{3:3}, at: btrfs_sync_log+0x3d1/0xee0 [btrfs]

 but task is already holding lock:
 ffff88810b0fc3a0 (&root->log_mutex){+.+.}-{3:3}, at: btrfs_sync_log+0x313/0xee0 [btrfs]

 other info that might help us debug this:
  Possible unsafe locking scenario:

        CPU0
        ----
   lock(&root->log_mutex);
   lock(&root->log_mutex);

  *** DEADLOCK ***

  May be due to missing lock nesting notation

 2 locks held by xfs_io/698:
  #0: ffff88810cd66620 (sb_internal){.+.+}-{0:0}, at: btrfs_sync_file+0x2c3/0x570 [btrfs]
  #1: ffff88810b0fc3a0 (&root->log_mutex){+.+.}-{3:3}, at: btrfs_sync_log+0x313/0xee0 [btrfs]

 stack backtrace:
 CPU: 0 PID: 698 Comm: xfs_io Not tainted 5.11.0-rc7+ #939
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-0-gf21b5a4-rebuilt.opensuse.org 04/01/2014
 Call Trace:
  dump_stack+0x77/0x97
  __lock_acquire.cold+0xb9/0x32a
  lock_acquire+0xb5/0x400
  ? btrfs_sync_log+0x3d1/0xee0 [btrfs]
  __mutex_lock+0x7b/0x8d0
  ? btrfs_sync_log+0x3d1/0xee0 [btrfs]
  ? btrfs_sync_log+0x3d1/0xee0 [btrfs]
  ? find_first_extent_bit+0x9f/0x100 [btrfs]
  ? __mutex_unlock_slowpath+0x35/0x270
  btrfs_sync_log+0x3d1/0xee0 [btrfs]
  btrfs_sync_file+0x3a8/0x570 [btrfs]
  __x64_sys_fsync+0x34/0x60
  do_syscall_64+0x33/0x40
  entry_SYSCALL_64_after_hwframe+0x44/0xa9

This happens, because we are taking the ->log_mutex albeit it has already
been locked.

Also while at it, fix the bogus unlock of the tree_log_mutex in the error
handling.

Fixes: 3ddebf2 ("btrfs: zoned: reorder log node allocation on zoned filesystem")
Reviewed-by: Filipe Manana <[email protected]>
Signed-off-by: Johannes Thumshirn <[email protected]>
Reviewed-by: David Sterba <[email protected]>
Signed-off-by: David Sterba <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

9 participants