debian-12 fails to start with in-VM kernel #8505

Closed
3hhh opened this issue Sep 10, 2023 · 19 comments
Labels
affects-4.1 This issue affects Qubes OS 4.1. C: Debian/Ubuntu C: kernel diagnosed Technical diagnosis has been performed (see issue comments). P: default Priority: default. Default priority for new issues, to be replaced given sufficient information.

Comments

@3hhh

3hhh commented Sep 10, 2023


Qubes OS release

4.1

Brief summary

[2023-09-10 11:21:33] .[30m.[47mWelcome to GRUB!
[2023-09-10 11:21:33] 
[2023-09-10 11:21:33] .[37m.[40m.[37m.[40m.[37m.[40m.[3;34H      [ grub-xen.cfg  424B  100%  11.50KiB/s ].[3;1Herror: no such device: /boot/xen/pvboot-x86_64.elf.
[2023-09-10 11:21:34] Reading (xen/xvda,gpt3/boot/grub/grub.cfg
[2023-09-10 11:21:34] .[H.[J.[1;1Herror: file `/boot/grub/fonts/unicode.pf2' not found.
[2023-09-10 11:21:34] error: no suitable video mode found.
[2023-09-10 11:21:34] error: no video mode activated.
[2023-09-10 11:21:34] .[4;34H      [ grub.cfg  15.44KiB  100%  19.21KiB/s ].[4;1H.[H.[J.[1;1H  Booting `Debian GNU/Linux'
[2023-09-10 11:21:34] 
[2023-09-10 11:21:34] Loading Li
pci_unplug: Xen Platform PCI: unrecognised magic value
[2023-09-10 11:21:36] [    0.234185] ACPI: No IOAPIC entries present
[2023-09-10 11:21:36] [    0.295817] PCI: Fatal: No config space access function found
[2023-09-10 11:21:36] [    0.326782] ACPI: OSL: SCI (ACPI GSI 9) not registered
[2023-09-10 11:21:36] [    0.328729] ACPI Error: No handler or method for GPE 00, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328771] ACPI Error: No handler or method for GPE 01, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328818] ACPI Error: No handler or method for GPE 03, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328853] ACPI Error: No handler or method for GPE 04, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328889] ACPI Error: No handler or method for GPE 05, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328924] ACPI Error: No handler or method for GPE 06, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:36] [    0.328959] ACPI Error: No handler or method for GPE 07, disabling event (20220331/evgpe-839)
[2023-09-10 11:21:42] .[2J.[3J.[-1;-1fSetting up swapspace version 1, size = 1073737728 bytes
[2023-09-10 11:21:44] UUID=307ee45f-d167-4674-bcbb-2a6eb51098bf
[2023-09-10 11:21:44] /dev/xvda3: clean, 195540/643376 files, 1643068/2569216 blocks
[2023-09-10 11:21:44] mount: mounting /dev/mapper/dmroot on /root failed: No such device
[2023-09-10 11:21:45] Failed to mount /dev/mapper/dmroot as root file system.
[2023-09-10 11:21:45] 
[2023-09-10 11:21:45] 
[2023-09-10 11:21:45] BusyBox v1.35.0 (Debian 1:1.35.0-4+b3) built-in shell (ash)
[2023-09-10 11:21:45] Enter 'help' for a list of built-in commands.

Steps to reproduce

  1. Install debian-12 template via qvm-template.
  2. Switch kernel to pvgrub2-pvh.
  3. Start the template.
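
The three steps above roughly correspond to the following dom0 commands. This is only a sketch and not taken from the report: it assumes the template is named debian-12, and the `run` helper with its `DRY_RUN` switch is a hypothetical convenience that prints each command instead of executing it.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Reproduce the report from dom0 (the template name is an assumption).
reproduce() {
    tpl="${1:-debian-12}"
    run qvm-template install "$tpl"           # step 1: install the template
    run qvm-prefs "$tpl" kernel pvgrub2-pvh   # step 2: switch to the in-VM kernel
    run qvm-start "$tpl"                      # step 3: start the template
}
```

With `DRY_RUN=1` set, calling `reproduce` only prints the three commands, which makes it easy to review them before running anything.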

Expected behavior

Starts.

Actual behavior

Fails to start.

Notes

My old debian-11 template works just fine that way.

@3hhh 3hhh added P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug labels Sep 10, 2023
@andrewdavidwong
Member

Might be related to #8493.

@andrewdavidwong andrewdavidwong added C: kernel C: Debian/Ubuntu needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. affects-4.1 This issue affects Qubes OS 4.1. labels Sep 10, 2023
@marmarek
Member

This should already be fixed by QubesOS/qubes-builder-debian@83fbd33. Which template version do you have, and can you check whether the file /etc/initramfs-tools/conf.d/99-template-build.conf exists there? If it does, remove it and regenerate the initramfs (update-initramfs).

@3hhh
Author

3hhh commented Sep 10, 2023 via email

@marmarek
Member

You need to update the initramfs for the kernel version in /boot, not for the currently running one (from dom0). So add -k all or something like that.
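
Put together, the suggestion from the two comments above might look like the sketch below, run inside the debian-12 template. The function name and the `run`/`DRY_RUN` helper are hypothetical; the path and the `-k all` option come from the comments, and `-u` is the usual update mode of update-initramfs.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove the leftover template-build config (if present), then rebuild the
# initramfs for every kernel installed in /boot, not only the running one.
regen_initramfs() {
    conf="${1:-/etc/initramfs-tools/conf.d/99-template-build.conf}"
    if [ -e "$conf" ]; then
        run sudo rm -f "$conf"
    fi
    run sudo update-initramfs -u -k all
}
```

Calling `regen_initramfs` with no argument uses the path named above; with `DRY_RUN=1` it only shows what would be run.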

@3hhh
Author

3hhh commented Sep 11, 2023 via email

@3hhh
Author

3hhh commented Oct 17, 2023

Still the same with current updates.

I guess the relevant error is "error: no such device: /boot/xen/pvboot-x86_64.elf". There is indeed no such file, but debian-11 doesn't have it either, and that works just fine.

@grnklod

grnklod commented Oct 18, 2023

You can try this in debian-12 template:

sudo apt --reinstall install linux-image*
sudo apt install grub2 qubes-kernel-vm-support
sudo grub-install /dev/xvda
sudo update-grub

https://forum.qubes-os.org/t/cannot-boot-to-native-fedora-37-minimal-kernel/15761/6

@3hhh
Author

3hhh commented Oct 18, 2023 via email

@grnklod

grnklod commented Oct 18, 2023

Maybe you have this issue:
#4974
#8465

@marmarek
Member

Try removing the "quiet" option from /etc/default/grub (and regenerate the grub config after that); hopefully you'll get more details then.
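
One way to do this is sketched below. The function name is hypothetical, the sed expression relies on GNU sed (as shipped in Debian), and it assumes "quiet" appears on a GRUB_CMDLINE_LINUX or GRUB_CMDLINE_LINUX_DEFAULT line.

```shell
# Strip the "quiet" option from the kernel command line variables in a
# grub defaults file. Only lines starting with GRUB_CMDLINE_LINUX or
# GRUB_CMDLINE_LINUX_DEFAULT are touched.
strip_quiet() {
    file="$1"
    sed -i -E '/^GRUB_CMDLINE_LINUX(_DEFAULT)?=/ s/\bquiet\b ?//g' "$file"
}
```

Inside the template this would be run as root on /etc/default/grub, followed by update-grub to regenerate the config.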

@3hhh
Author

3hhh commented Oct 18, 2023 via email

@marmarek
Member

Can you check xl dmesg (or /var/log/xen/console/hypervisor.log) around that time? Maybe the VM was killed by Xen for some reason.

@3hhh
Author

3hhh commented Oct 19, 2023

Looks like you got me on the right lead:

(XEN) Domain 27 (vcpu#1) crashed on cpu#2:
(XEN) ----[ Xen-4.14.6  x86_64  debug=n   Not tainted ]----
(XEN) CPU:    2
(XEN) RIP:    0010:[<ffffffff95bd88b7>]
(XEN) RFLAGS: 0000000000010246   CONTEXT: hvm guest (d27v1)
(XEN) rax: 0000000000000000   rbx: 0000000000000001   rcx: 0000000000001000
(XEN) rdx: ffffda63c03fa340   rsi: ffffda63c03fa380   rdi: ffff89394fe8d000
(XEN) rbp: ffff9c38400e39b8   rsp: ffff9c38400e38b0   r8:  0000000000000000
(XEN) r9:  00000000000dd61d   r10: ffff893a35d37740   r11: 0000000000000018
(XEN) r12: 0000000000000000   r13: ffff893a35d37740   r14: ffff893a39fd3600
(XEN) r15: 0000000000000282   cr0: 0000000080050033   cr4: 00000000001706e0
(XEN) cr3: 000000006a010001   cr2: 0000000000000000
(XEN) fsb: 0000000000000000   gsb: ffff893a35d00000   gss: 0000000000000000
(XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: 0010
(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
(XEN) domain_crash called from p2m-pod.c:1254
(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
(XEN) domain_crash called from p2m-pod.c:1254
(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
(XEN) domain_crash called from p2m-pod.c:1254

So this is an instance of #7023.

The debian-12 VM was configured to use the default 400MB - 4GB memory balancing & 2 vcpus.
In-VM kernel is 6.1.0.13.
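
To relate the numbers in the PoD message to the memory settings, the sketch below filters a Xen log for these lines and converts the counters to MiB. It assumes that tot= and ents= are counts of 4 KiB pages, which is not stated in the log itself; the function name is hypothetical.

```shell
# Read Xen log text on stdin and summarize "out of PoD memory" lines.
# Assumption: tot= and ents= are counts of 4 KiB pages.
pod_report() {
    grep 'out of PoD memory' |
    sed -E 's/.*Dom([0-9]+) out of PoD memory! \(tot=([0-9]+) ents=([0-9]+).*/\1 \2 \3/' |
    sort -u |
    while read -r dom tot ents; do
        # pages * 4 KiB / 1024 = MiB (integer division)
        echo "dom${dom}: tot=$((tot * 4 / 1024)) MiB, ents=$((ents * 4 / 1024)) MiB"
    done
}
```

In dom0 this could be fed with `xl dmesg | pod_report`. Under the 4 KiB assumption, the values above come out at roughly 400 MiB for tot and 3600 MiB for ents.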

@3hhh
Author

3hhh commented Oct 19, 2023

Starting it with 1GB fixed RAM works.
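
For reference, a fixed memory size can be set from dom0 roughly as sketched below. This assumes that setting maxmem to 0 disables memory balancing for the qube, as described in the qvm-prefs documentation; the function name, the 1024 MB default and the `run`/`DRY_RUN` helper are hypothetical.

```shell
# Hypothetical helper: print the command when DRY_RUN is set, otherwise run it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Give a qube a fixed amount of RAM (in MB) without memory balancing.
set_fixed_memory() {
    vm="$1"
    mb="${2:-1024}"
    run qvm-prefs "$vm" maxmem 0        # assumption: 0 turns memory balancing off
    run qvm-prefs "$vm" memory "$mb"    # fixed memory size in MB
}
```

For example, `set_fixed_memory debian-12 1024` would apply the workaround to the template from this report.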

@3hhh
Author

3hhh commented Oct 19, 2023

So a debian upstream issue apparently...

@adrelanos
Member

(XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)

I had a similar issue and could fix it by increasing the initial memory setting in Qubes VM Manager. See:
#8649

@grnklod

grnklod commented Nov 6, 2023

I've stumbled upon this myself and traced the issue to a problem with the debian-12 template and the max memory value.
If I create a qube based on the debian-12 template with the pvgrub2-pvh kernel, PVH mode, memory balancing enabled and max memory set to 3069-4031 MB, then the qube fails to start.
If I set max memory to any other value, it works.
If I change the qube's template to debian-12-xfce, it works with any max memory value.

@3hhh
Author

3hhh commented Dec 21, 2023

> (XEN) p2m_pod_demand_populate: Dom27 out of PoD memory! (tot=102416 ents=921600 dom27)
>
> I had a similar issue and could fix it by increasing the initial memory setting in Qubes VM Manager. See: #8649

Yes, that works, too.

The bug still exists with the newest debian-12 kernel btw.

I also wonder why debian-12 has so much higher memory requirements than debian-11. A memory footprint difference of more than 100MB per VM isn't nice when it comes to 20-50 VMs.

@marmarek
Member

marmarek commented May 3, 2024

With memory hotplug in R4.2, populate-on-demand (PoD) is not used anymore, so the crash on start due to too little memory (the way it happened here) shouldn't happen anymore. The feature isn't in R4.1, but since support for R4.1 ends soon and there is a simple workaround, I don't think it's worth fixing it in another way.

@marmarek marmarek closed this as completed May 3, 2024
@andrewdavidwong andrewdavidwong added diagnosed Technical diagnosis has been performed (see issue comments). and removed needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. labels May 3, 2024
5 participants