
Linux stubdom - qemu crashes with intensive disk I/O #3651

Closed
alcreator opened this issue Mar 3, 2018 · 10 comments
Labels
C: Xen
eol-4.0 (Closed because Qubes 4.0 has reached end-of-life (EOL))
P: default (Priority: default. Default priority for new issues, to be replaced given sufficient information.)
T: bug (Type: bug report. A problem or defect resulting in unintended behavior in something that exists.)

Comments

@alcreator

alcreator commented Mar 3, 2018

Qubes OS version:

Qubes R4.0 (rc4)

Affected component(s):

HVM using Linux stubdom (without PV drivers)


Steps to reproduce the behavior:

Perform an intensive I/O operation on the emulated disks, for example:

Using a Linux live CD (Arch Linux tested):
1. Boot it in a standalone HVM and add xen_nopv to the kernel command line.
2. Create and mount a standard partition and filesystem on an emulated disk (I tested with NTFS).
3. Enter the following command:
$ while true; do var=1; rm /path/to/mounted/filesystem/file;
while [ $var -lt 5 ]; do ((++var)); cat /dev/cdrom >> /path/to/mounted/filesystem/file; done;
diff /dev/cdrom /path/to/mounted/filesystem/file; done

/dev/urandom can also be used as the source, although this takes longer to trigger the crash.
I have been able to trigger this issue with both Linux and Windows guest operating systems (performing a Windows 10/Server 2016 installation seems to trigger it faster than the Linux test, although it doesn't occur on every installation attempt).

Expected behavior:

Qemu running under the Linux stubdom does not crash.

Actual behavior:

After some time (about 15 min on my hardware), qemu crashes with the error message "Bad ram offset 14787f000" in the device model console. The offset changes on each crash.

General notes:

This appears to have been introduced by qubes-vmm-xen commit 6dd581aaaa4506a9dd34eb48559aabd23a2da361 "stubdom-linux: Use mptsas1068 scsi controller". I haven't been able to reproduce the issue after a few hours of testing with the commit reverted.

With the commit reverted, qemu defaults to the lsi53c895a SCSI controller. I have also tested the megasas and megasas-gen2 SCSI controllers, and they exhibit the same issue.

From the observed triggers and symptoms, this bug may be related to the issue described in upstream bugfix "xen-mapcache: Fix the bug when overlapping emulated DMA operations may cause inconsistency in guest memory mappings" (discussion: https://lists.gnu.org/archive/html/qemu-devel/2017-07/msg02463.html)

The issue occurs significantly faster with a full 32-bit stubdom build (less than 1 minute; build details at https://gist.github.com/alcreator/8c21502abc99c92fccf2a9903c9cb346), and when the "performance" CPU scaling governor is used in dom0.
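(For anyone trying to reproduce this, the dom0 governor can be switched with xenpm; this is just the standard invocation, not anything specific to this bug:)
$ xenpm set-scaling-governor performance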


Related issues:

#3068

@andrewdavidwong andrewdavidwong added T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. C: Xen labels Mar 3, 2018
@andrewdavidwong andrewdavidwong added this to the Release 4.0 milestone Mar 3, 2018
@marmarek
Member

marmarek commented Mar 3, 2018

We have chosen this controller to ease Windows installation - the default one isn't supported by the Windows installer out of the box (see #3068). Ideally we'd have it configurable, but unfortunately support for stubdomains (and specifically stubdomains with the current qemu, instead of the ancient qemu-traditional fork) is very limited in libxl.
Does updating qemu help (QubesOS/qubes-vmm-xen-stubdom-linux#14)? It doesn't look like the fix you've linked got any attention...

cc @HW42

@alcreator
Author

Unfortunately it still occurs with the latest qemu. I don't know if anyone has tested the upstream fix, since there still isn't official Q35 support in qemu for Xen.
The only workaround I can think of, if the root cause can't be found and fixed, would be to test whether read-only disks still cause the crash. If they don't, then switch r/w disks back to IDE/AHCI (testing that these don't crash either) and keep r/o disks on the mptsas1068.

@awokd

awokd commented Sep 13, 2018

We have chosen this controller to ease Windows installation - the default one isn't supported by the Windows installer out of the box (see #3068). Ideally we'd have it configurable, but unfortunately support for stubdomains (and specifically stubdomains with the current qemu, instead of the ancient qemu-traditional fork) is very limited in libxl.

What would need to be done to make it configurable? I read #3068, and it seems like letting people set the hard drive controller to AHCI while leaving the CD/DVD controller as-is would resolve multiple outstanding issues with running different OSes under Qubes 4.0. For example, I've seen posts about ReactOS, Windows XP, Android emulation, some BSD flavours, etc. running on HVM that could all be resolved if the HVM could provide a supported disk controller.

@awokd

awokd commented Sep 26, 2018

Looks like #3494 beat me to it.

@alcreator
Author

Just to update this issue: qemu still crashed after some hours when using an AHCI drive. I did a ~30-hour test using the emulated IDE drive, and qemu didn't crash with it. Also, the issue seems to occur only when writing to a SCSI/AHCI drive; simply reading from it didn't crash qemu (tested for a few hours). So switching r/w disks to IDE and leaving r/o disks on the SCSI controller is a viable workaround.

@marmarek
Member

marmarek commented Oct 1, 2018

So switching r/w disks to IDE and leaving ro disks on the SCSI controller is a viable workaround.

The problem with this is that it may break (make unbootable) VMs installed with the SCSI controller in use. This means we'd need to make it configurable, which is non-trivial given where the disk controller is chosen.
Another problem is that QEMU supports at most 4 IDE disks. We're quite close to this limit:

  • root
  • private
  • volatile
  • modules (r/o, so will stay SCSI)
  • cdrom from which you install the system

This, for example, leaves no room for an additional disk connected with qvm-block attach --persistent.
At a minimum, we should prioritize which disks are set as IDE and keep the others as SCSI (instead of ignoring them, as was done previously), which makes the change even less trivial.
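For illustration, this is the kind of attachment that would no longer fit on the IDE controller (the VM and device names below are just examples):
$ qvm-block attach --persistent my-hvm sys-usb:sda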

@alcreator
Author

It's possible to change the cdrom drive to use either an AHCI controller or the SCSI controller (as a scsi-cd device), which would free up an additional slot on the default IDE controller. It's also a lot faster than using an IDE cdrom.
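Roughly, at the qemu command-line level that would look something like the following (IDs and the ISO path are illustrative; the actual command line is generated by the stubdom scripts):
-device mptsas1068,id=scsi0
-drive if=none,id=cd0,media=cdrom,file=/path/to/install.iso
-device scsi-cd,bus=scsi0.0,drive=cd0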

I've tested standard installs of Windows 7 and Windows 10, and they appear to be OK with switching from SCSI to IDE, as long as the SCSI adapter isn't removed from the VM (otherwise it will bluescreen when switching back to SCSI). Similarly, a Linux guest booted with xen_nopv boots fine after a controller change. I agree that changing the default adapter may be risky for certain guests, but since some people are reporting that they can't install certain OSes on SCSI, I think it's a better default option.

As for the crashing/bad ram offset issue:

While testing the AHCI controller with a 32-bit stubdomain, I noticed that there are some mapcache warnings when /proc/sys/vm/overcommit_memory is set to 0 and /proc/sys/vm/overcommit_ratio is set to 100:

Locked DMA mapping while invalidating mapcache! 000000000000d9bf -> 0xa9cebe00 is present
Locked DMA mapping while invalidating mapcache! 000000000000d9bf -> 0xa9ceb800 is present
Locked DMA mapping while invalidating mapcache! 000000000000d9bf -> 0xa9cebe00 is present
Locked DMA mapping while invalidating mapcache! 000000000000d9bf -> 0xa9ceb800 is present
...
(the same two lines keep repeating, alternating between the 0xa9cebe00 and 0xa9ceb800 mappings)
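(For reference, those overcommit settings correspond to the following sysctl values; this is just the generic way to set them, presumably inside the stubdomain, where qemu runs:)
$ sysctl -w vm.overcommit_memory=0
$ sysctl -w vm.overcommit_ratio=100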

The default NCQ queue depth for the AHCI controller is also smaller than that of the mptsas1068, which seems to correlate with the longer time AHCI takes to crash.

Default IDE:
cat /sys/block/sda/device/queue_depth
1

AHCI:
cat /sys/block/sda/device/queue_depth
31

mptsas1068:
cat /sys/block/sda/device/queue_depth
64

Changing the queue_depth to 1 on the mptsas1068 seems to increase the time to failure, but doesn't prevent it.
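(For anyone repeating that test, the depth can be changed at runtime from inside the guest; sda is just an example device:)
echo 1 > /sys/block/sda/device/queue_depth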

If someone wants to try debugging this, I would recommend using a 32-bit stubdomain build and attempting an install of Windows 10/Server 2016 or 2019 in a VM configured with 50G of disk space and a network connection. The Linux test case seems to be harder to reproduce since qemu 3.0.0, but moderate network traffic (3-4 MB/s) and writing to multiple disks trigger the crash a bit quicker.

Also, as a side note, changing the cache mode from writeback to writethrough for each emulated disk may help reduce the chance of data loss if qemu crashes.
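At the qemu level this corresponds to the cache= property on -drive, along the lines of the following (the file path and IDs are illustrative, a SCSI controller is assumed to be present, and the real command line is built by libxl/the stubdom scripts):
-drive file=/path/to/root.img,if=none,id=hd0,cache=writethrough
-device scsi-hd,drive=hd0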

@marmarek
Member

marmarek commented Aug 8, 2020

This happens quite frequently in HVM grub tests on fedora-32 (for example: https://openqa.qubes-os.org/tests/11142#step/TC_41_HVMGrub_fedora-32/1). Digging through the logs reveals:

[2020-08-07 22:20:21] Locked DMA mapping while invalidating mapcache! 00000000000000ea -> 0x794658371000 is present
[2020-08-07 22:20:21] Locked DMA mapping while invalidating mapcache! 00000000000000ea -> 0x794658371000 is present
[2020-08-07 22:20:21] Locked DMA mapping while invalidating mapcache! 00000000000000ea -> 0x794658371000 is present
...

In the above linked test, the sequence of commands in a standalone fedora-32 HVM is:

dnf clean expire-cache && dnf install -y qubes-kernel-vm-support grub2-tools
dnf install -y kernel-core
grub2-mkconfig -o /boot/grub2/grub.cfg

The third one never actually starts because of the issue.
More often, though, the crash happens when initially starting that VM.
Interestingly, this hits the fedora-32 template far more often than debian-10, possibly related to the I/O generated during startup.

@andrewdavidwong andrewdavidwong added needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. labels Aug 10, 2020
@brendanhoar

Just as an informational update:

Under Qubes R4.1, QEMU appears to present the default block devices (root/private/volatile) as emulated ATA/IDE devices when performing an HVM OS installation.

[At least, when I check post-install with Windows 7 and Windows 10.]

It appears that startup-attached CDROM ISOs are placed on the fourth ATA/IDE channel...at least when using qvm-start --install-windows-tools.

B

@andrewdavidwong andrewdavidwong added the eol-4.0 Closed because Qubes 4.0 has reached end-of-life (EOL) label Aug 5, 2023
@github-actions

github-actions bot commented Aug 5, 2023

This issue is being closed because it has been labeled eol-4.0: Qubes 4.0 has reached end-of-life (EOL).

If anyone believes that this issue should be reopened and reassigned to an active milestone, please leave a brief comment.
(For example, if a bug still affects Qubes OS 4.1, then the comment "Affects 4.1" will suffice.)

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Aug 5, 2023
@andrewdavidwong andrewdavidwong removed the needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. label Aug 5, 2023