Miscellaneous dom0 kernel complaints about suspending #7377

Closed
logoerthiner1 opened this issue Mar 26, 2022 · 15 comments
Labels: affects-4.1, C: kernel, diagnosed, hardware support, P: default, r4.0-dom0-stable, r4.1-dom0-stable, r4.2-host-stable

Comments

@logoerthiner1

Qubes OS release

R4.1; freshly updated and kernel version is kernel-latest (5.16.13)

Brief summary

Several dom0 kernel complaints are visible in sudo dmesg, seemingly related to S3 sleep.

Not sure whether this indicates any problem.

[  704.831682] WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:522 switch_mm_irqs_off+0x218/0x4b0
[  704.831688] Modules linked in: loop vfat fat snd_hda_codec_hdmi intel_powerclamp snd_hda_codec_realtek snd_hda_codec_generic snd_soc_dmic snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp mt7921e snd_sof snd_soc_hdac_hda mt7921_common snd_hda_ext_core snd_soc_acpi_intel_match mt76_connac_lib snd_soc_acpi soundwire_bus mt76 snd_soc_core mac80211 snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel libarc4 snd_intel_dspcfg snd_intel_sdw_acpi cfg80211 snd_hda_codec snd_hda_core snd_hwdep snd_seq snd_seq_device mei_hdcp snd_pcm thinkpad_acpi think_lmi mei_pxp iTCO_wdt firmware_attributes_class wmi_bmof ucsi_acpi pcspkr intel_pmc_bxt intel_rapl_msr ee1004 iTCO_vendor_support typec_ucsi platform_profile joydev ledtrig_audio typec snd_timer idma64 rfkill wmi snd int3403_thermal soundcore processor_thermal_device_pci_legacy processor_thermal_device e1000e mei_me processor_thermal_rfim
[  704.831717]  processor_thermal_mbox processor_thermal_rapl intel_rapl_common int340x_thermal_zone intel_soc_dts_iosf igen6_edac intel_hid int3400_thermal mei sparse_keymap thunderbolt i2c_i801 i2c_smbus acpi_thermal_rel acpi_tad fuse xenfs ip_tables dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt crct10dif_pclmul crc32_pclmul nvme crc32c_intel i915 ghash_clmulni_intel serio_raw i2c_algo_bit ttm nvme_core drm_kms_helper xhci_pci cec xhci_pci_renesas xhci_hcd drm video pinctrl_tigerlake xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
[  704.831735] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 5.16.13-2.fc32.qubes.x86_64 #1
[  704.831737] Hardware name: LENOVO 20X4A00BCD/20X4A00BCD, BIOS R1JET53W (1.53 ) 11/23/2021
[  704.831737] RIP: e030:switch_mm_irqs_off+0x218/0x4b0
[  704.831740] Code: 48 01 d1 0f 82 a7 02 00 00 48 c7 c2 00 00 00 80 48 2b 15 9b 9b 66 01 48 01 ca 48 0b 15 09 81 7c 01 48 39 c2 0f 84 9d fe ff ff <0f> 0b e8 21 fb ff ff e9 91 fe ff ff e8 17 1b f8 ff 31 f6 bf 00 01
[  704.831741] RSP: e02b:ffffc9004011bec8 EFLAGS: 00010006
[  704.831742] RAX: 0000000002810000 RBX: ffff888175280000 RCX: ffff8881d8cd4000
[  704.831743] RDX: 0000000158cd4000 RSI: ffffffff82650ebb RDI: ffffffff8261227f
[  704.831744] RBP: ffffffff829d9b00 R08: 0000000000000000 R09: 0000000000000000
[  704.831744] R10: 0000000000000004 R11: 0000000000000000 R12: ffff8881070e4c80
[  704.831745] R13: ffff8881002f4000 R14: 0000000000000001 R15: 0000000000000001
[  704.831751] FS:  0000000000000000(0000) GS:ffff888175280000(0000) knlGS:0000000000000000
[  704.831752] CS:  10000e030 DS: 002b ES: 002b CR0: 0000000080050033
[  704.831753] CR2: 00007c3b7aecebc0 CR3: 0000000002810000 CR4: 0000000000050660
[  704.831758] Call Trace:
[  704.831759]  <TASK>
[  704.831762]  switch_mm+0x1c/0x30
[  704.831764]  play_dead_common+0xa/0x20
[  704.831766]  xen_pv_play_dead+0xa/0x50
[  704.831768]  do_idle+0xcf/0xe0
[  704.831770]  cpu_startup_entry+0x19/0x20
[  704.831772]  asm_cpu_bringup_and_idle+0x5/0x10
[  704.831774]  </TASK>
[  704.831775] ---[ end trace 870d68d9e7e3768c ]---

and, for each non-boot CPU:

[  707.003969] Enabling non-boot CPUs ...
[  707.003980] installing Xen timer for CPU 1
[  707.004032] BUG: using smp_processor_id() in preemptible [00000000] code: systemd-sleep/11251
[  707.004034] caller is is_xen_pmu+0x12/0x30
[  707.004038] CPU: 0 PID: 11251 Comm: systemd-sleep Tainted: G        W         5.16.13-2.fc32.qubes.x86_64 #1
[  707.004040] Hardware name: LENOVO 20X4A00BCD/20X4A00BCD, BIOS R1JET53W (1.53 ) 11/23/2021
[  707.004041] Call Trace:
[  707.004043]  <TASK>
[  707.004046]  dump_stack_lvl+0x48/0x5e
[  707.004049]  check_preemption_disabled+0xde/0xe0
[  707.004051]  is_xen_pmu+0x12/0x30
[  707.004052]  xen_smp_intr_init_pv+0x75/0x100
[  707.004055]  ? xen_read_cr0+0x20/0x20
[  707.004056]  xen_cpu_up_prepare_pv+0x3e/0x90
[  707.004057]  cpuhp_invoke_callback+0x2b8/0x460
[  707.004059]  ? _raw_spin_unlock_irq+0x1d/0x2f
[  707.004061]  cpuhp_up_callbacks+0x4b/0x170
[  707.004062]  _cpu_up+0xba/0x140
[  707.004064]  thaw_secondary_cpus.cold+0x50/0xaa
[  707.004066]  suspend_enter+0x11e/0x3b0
[  707.004069]  suspend_devices_and_enter+0x165/0x270
[  707.004070]  enter_state+0x125/0x176
[  707.004072]  pm_suspend.cold+0x20/0x6b
[  707.004074]  state_store+0x27/0x50
[  707.004080]  kernfs_fop_write_iter+0x121/0x1b0
[  707.004083]  new_sync_write+0x159/0x1f0
[  707.004087]  vfs_write+0x20d/0x2a0
[  707.004088]  ksys_write+0x67/0xe0
[  707.004089]  do_syscall_64+0x38/0x90
[  707.004091]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  707.004093] RIP: 0033:0x75f1567572f7
[  707.004096] Code: 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[  707.004097] RSP: 002b:00007fff032a64e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[  707.004099] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 000075f1567572f7
[  707.004100] RDX: 0000000000000004 RSI: 00007fff032a65d0 RDI: 0000000000000004
[  707.004101] RBP: 00007fff032a65d0 R08: 00005e1adb224ca0 R09: 000000000000000d
[  707.004101] R10: 00005e1adb220eb0 R11: 0000000000000246 R12: 0000000000000004
[  707.004102] R13: 00005e1adb2202d0 R14: 0000000000000004 R15: 000075f156829700
[  707.004103]  </TASK>
[  707.004736] cpu 1 spinlock event irq 131
[  707.004933] ACPI: \_SB_.PR01: Found 3 idle states
[  707.005463] CPU1 is up

Steps to reproduce

  1. Suspend the system (S3 sleep).
  2. Wake it up.
  3. Run sudo dmesg in dom0.

Expected behavior

No kernel warnings

Actual behavior

Several kernel warnings
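
The check in step 3 can be scripted rather than done by eye. Below is a minimal sketch, assuming the dmesg output uses the default `[seconds.microseconds]` timestamp prefix shown above; the function name `find_kernel_complaints` and the `SAMPLE` text are illustrative only and are not part of any Qubes OS tool.

```python
import re

# Headline of a kernel complaint as it appears in the logs above, e.g.
#   [  704.831682] WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:522 ...
#   [  707.004032] BUG: using smp_processor_id() in preemptible ...
# Assumption: default dmesg timestamps (not `dmesg -T` wall-clock format).
HEADLINE = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<kind>WARNING|BUG):\s*(?P<msg>.*)$"
)


def find_kernel_complaints(dmesg_text):
    """Return a list of (timestamp, kind, message) for WARNING/BUG headlines."""
    hits = []
    for line in dmesg_text.splitlines():
        match = HEADLINE.match(line)
        if match:
            hits.append(
                (float(match.group("ts")), match.group("kind"), match.group("msg"))
            )
    return hits


# Three lines copied from the report above; only two are complaint headlines.
SAMPLE = """\
[  704.831682] WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:522 switch_mm_irqs_off+0x218/0x4b0
[  707.003980] installing Xen timer for CPU 1
[  707.004032] BUG: using smp_processor_id() in preemptible [00000000] code: systemd-sleep/11251
"""

for ts, kind, msg in find_kernel_complaints(SAMPLE):
    print(ts, kind, msg)
```

To use it on a real system, the text passed to the function would be the saved output of `sudo dmesg` from dom0 after a suspend/resume cycle.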

@logoerthiner1 logoerthiner1 added P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug labels Mar 26, 2022
@andrewdavidwong andrewdavidwong added C: kernel hardware support needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Mar 27, 2022
@andrewdavidwong andrewdavidwong added this to the Release 4.1 updates milestone Mar 27, 2022
@logoerthiner1
Author

logoerthiner1 commented Mar 27, 2022

Update: with dom0 kernel 5.15.14-1.fc32.qubes.x86_64, only the first message appears, and it appears only once per Qubes OS boot. Suspend is also more stable than with 5.16.13: on 5.16.13, after one suspend, later suspend attempts seem to fail (the computer does not sleep and does not even shut down, as if dom0 were refusing or being blocked by something).

I have not experimented further, and #7340 is not easy to trigger within a few suspend attempts.

@marmarek
Member

Patch for the second one is on its way already: https://lore.kernel.org/xen-devel/[email protected]/T/#t

@logoerthiner1
Author

A warning from 5.15.14, seen while repeatedly suspending and resuming:

[ 2934.501737] ------------[ cut here ]------------
[ 2934.501738] cfs_rq->avg.load_avg || cfs_rq->avg.util_avg || cfs_rq->avg.runnable_avg
[ 2934.501741] installing Xen timer for CPU 5
[ 2934.501741] WARNING: CPU: 4 PID: 13559 at kernel/sched/fair.c:3339 __update_blocked_fair+0x49b/0x4b0
[ 2934.501746] Modules linked in: loop vfat fat snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic snd_soc_dmic snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel soundwire_generic_allocation soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_bus snd_soc_core mt7921e mt76_connac_lib mt76 snd_compress mac80211 ac97_bus snd_pcm_dmaengine libarc4 snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec intel_powerclamp cfg80211 ee1004 snd_hda_core snd_hwdep think_lmi thinkpad_acpi wmi_bmof snd_seq iTCO_wdt snd_seq_device intel_pmc_bxt iTCO_vendor_support mei_hdcp firmware_attributes_class ucsi_acpi typec_ucsi typec platform_profile intel_rapl_msr pcspkr snd_pcm ledtrig_audio joydev idma64 rfkill snd_timer snd intel_hid soundcore int3403_thermal mei_me processor_thermal_device_pci_legacy processor_thermal_device processor_thermal_rfim mei
[ 2934.501772]  processor_thermal_mbox processor_thermal_rapl intel_rapl_common wmi e1000e i2c_i801 i2c_smbus igen6_edac sparse_keymap thunderbolt int340x_thermal_zone intel_soc_dts_iosf int3400_thermal acpi_thermal_rel acpi_tad fuse xenfs ip_tables dm_thin_pool dm_persistent_data dm_bio_prison dm_crypt trusted asn1_encoder crct10dif_pclmul crc32_pclmul crc32c_intel i915 ghash_clmulni_intel i2c_algo_bit ttm serio_raw xhci_pci nvme drm_kms_helper xhci_pci_renesas cec drm xhci_hcd nvme_core video pinctrl_tigerlake xen_acpi_processor xen_privcmd xen_pciback xen_blkback xen_gntalloc xen_gntdev xen_evtchn uinput
[ 2934.501790] CPU: 4 PID: 13559 Comm: kworker/4:0 Tainted: G        W         5.15.14-1.fc32.qubes.x86_64 #1
[ 2934.501792] Hardware name: LENOVO 20X4A00BCD/20X4A00BCD, BIOS R1JET53W (1.53 ) 11/23/2021
[ 2934.501793] Workqueue:  0x0 (events)
[ 2934.501795] RIP: e030:__update_blocked_fair+0x49b/0x4b0
[ 2934.501797] Code: 6b fd ff ff 49 8b 96 48 01 00 00 48 89 90 50 09 00 00 e9 ff fc ff ff 48 c7 c7 78 de 5e 82 c6 05 43 88 9e 01 01 e8 1e 28 b4 00 <0f> 0b 41 8b 86 38 01 00 00 e9 c6 fc ff ff 0f 1f 80 00 00 00 00 0f
[ 2934.501798] RSP: e02b:ffffc90045e2fcd8 EFLAGS: 00010086
[ 2934.501799] RAX: 0000000000000000 RBX: 0000000000000020 RCX: ffff888175320a08
[ 2934.501800] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff888175320a00
[ 2934.501801] RBP: ffff888175331800 R08: 0000000000000048 R09: ffffc90045e2fc70
[ 2934.501802] R10: 0000000000000049 R11: 0000000000000000 R12: ffff888175331f80
[ 2934.501802] R13: ffff888175331e40 R14: ffff8881753316c0 R15: 0000000000000000
[ 2934.501807] FS:  0000000000000000(0000) GS:ffff888175300000(0000) knlGS:0000000000000000
[ 2934.501808] CS:  10000e030 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2934.501809] CR2: 00007bf7b8011726 CR3: 0000000002810000 CR4: 0000000000050660
[ 2934.501813] Call Trace:
[ 2934.501815]  <TASK>
[ 2934.501816]  update_blocked_averages+0xa8/0x180
[ 2934.501818]  newidle_balance+0x175/0x380
[ 2934.501819]  pick_next_task_fair+0x39/0x3f0
[ 2934.501820]  pick_next_task+0x51/0xbd0
[ 2934.501822]  ? dequeue_task_fair+0xba/0x390
[ 2934.501824]  __schedule+0x13a/0x570
[ 2934.501827]  schedule+0x44/0xa0
[ 2934.501827]  worker_thread+0xc2/0x310
[ 2934.501829]  ? process_one_work+0x390/0x390
[ 2934.501830]  kthread+0x10c/0x130
[ 2934.501832]  ? set_kthread_struct+0x40/0x40
[ 2934.501834]  ret_from_fork+0x1f/0x30
[ 2934.501837]  </TASK>
[ 2934.501837] ---[ end trace b5349e65b4982981 ]---

I have 8 vCPUs and went through about 15 suspend/resume cycles; this warning appeared only once, on one core.

I am not sure whether it also appears with the latest kernel.
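
To tell whether a warning recurs on particular cores across many suspend/resume cycles, the `WARNING: CPU: N PID: ... at file:line` headlines can be tallied per source location and CPU. This is only a sketch under the assumption that the headlines keep the format seen in the logs above; `count_warnings_by_cpu` and `SAMPLE` are hypothetical names used for illustration.

```python
import re
from collections import Counter

# Matches e.g. "WARNING: CPU: 4 PID: 13559 at kernel/sched/fair.c:3339 ..."
# Assumption: the headline format is the one shown in this report.
WARNING_LINE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: \d+ at (?P<where>\S+)"
)


def count_warnings_by_cpu(dmesg_text):
    """Count WARNING headlines, keyed by (source location, CPU number)."""
    counts = Counter()
    for match in WARNING_LINE.finditer(dmesg_text):
        counts[(match.group("where"), int(match.group("cpu")))] += 1
    return counts


# The two WARNING headlines quoted in this issue.
SAMPLE = """\
[  704.831682] WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:522 switch_mm_irqs_off+0x218/0x4b0
[ 2934.501741] WARNING: CPU: 4 PID: 13559 at kernel/sched/fair.c:3339 __update_blocked_fair+0x49b/0x4b0
"""

for (where, cpu), n in sorted(count_warnings_by_cpu(SAMPLE).items()):
    print(f"{where} on CPU {cpu}: {n}")
```

Feeding it the dmesg text collected after each batch of suspend attempts would show whether a given warning stays a one-off on a single core or spreads.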

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-5.16.18-1.fc25.qubes) has been pushed to the r4.0 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-5.16.18-1.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@andrewdavidwong andrewdavidwong added diagnosed Technical diagnosis has been performed (see issue comments). and removed needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels Mar 31, 2022
marmarek added a commit to QubesOS/qubes-linux-kernel that referenced this issue Apr 2, 2022
marmarek added a commit to QubesOS/qubes-linux-kernel that referenced this issue Apr 2, 2022
@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-5-4 (including package kernel-5.4.188-1.fc25.qubes) has been pushed to the r4.0 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.10.109-1.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-5-4 (including package kernel-5.4.188-1.fc25.qubes) has been pushed to the r4.0 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-5.16.18-2.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel-latest (including package kernel-latest-5.16.18-2.fc25.qubes) has been pushed to the r4.0 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.10.109-1.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

marmarek added a commit to QubesOS/qubes-linux-kernel that referenced this issue Jun 11, 2022
@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.15.46-2.fc32.qubes) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-5.15.52-1.fc32.qubes) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-6.1.26-1.qubes.fc32) has been pushed to the r4.1 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update

@qubesos-bot

Automated announcement from builder-github

The component linux-kernel (including package kernel-6.1.35-1.qubes.fc32) has been pushed to the r4.1 stable repository for dom0.
To install this update, please use the standard update command:

sudo qubes-dom0-update

Or update dom0 via Qubes Manager.

Changes included in this update

@andrewdavidwong andrewdavidwong added the affects-4.1 This issue affects Qubes OS 4.1. label Aug 8, 2023