
Setting several AppVMs to start on boot causes only one random VM to autostart #2666

Closed
Yethal opened this issue Mar 4, 2017 · 9 comments
Labels: C: core, r4.0-dom0-cur-test, T: bug (bug report: a problem or defect resulting in unintended behavior in something that exists)

Comments

Yethal commented Mar 4, 2017

Qubes OS version (e.g., R3.2):

3.2

Affected TemplateVMs (e.g., fedora-23, if applicable):

fedora-24, possibly others

Expected behavior:

All AppVMs set to autostart actually autostart.

Actual behavior:

The network-disconnected AppVM and one random network-connected AppVM autostart correctly; the rest need to be started manually.

Steps to reproduce the behavior (a rough command sketch follows the list):

  1. Create seven AppVMs (that's the number I use, but it's probably irrelevant)
  2. Set all seven AppVMs to autostart on boot
  3. Set one AppVM to be network-disconnected
  4. Reboot the physical machine
  5. Observe the state of all VMs in Qubes Manager
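For reference, the steps above might look roughly like this on the command line (hypothetical VM names; qvm-prefs property names assumed per R3.2):

    for i in 1 2 3 4 5 6 7; do
        qvm-create "app$i" --label red          # 1. create seven AppVMs
        qvm-prefs -s "app$i" autostart true     # 2. mark each to start on boot
    done
    qvm-prefs -s app7 netvm none                # 3. disconnect one from the network
    sudo reboot                                 # 4./5. reboot, then check Qubes Manager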

General notes:

I can provide whichever logs are needed; just tell me which ones.

Related issues:

unman (Member) commented Mar 4, 2017

This isn't my experience. I regularly open 9 or 10 qubes on start-up without problems. Almost all of those are based on minimal (or mini) templates, but not all. I don't have huge amounts of RAM, but a reasonable processor and an SSD. Even on a laptop with a regular HD I could start 5 or 6.
Can you provide some details on your box?
Also, does it make any difference if you set all the VMs with netvm none?
What about changing the template to a minimal template?

Yethal (Author) commented Mar 4, 2017

I have 32GB of RAM and an NVMe SSD, so I doubt it's caused by my spec.
I've just checked: all VMs autostart correctly after setting their NetVM to none.

marmarek (Member) commented Mar 4, 2017

Do those network-connected VMs use the default NetVM? The default NetVM is specifically started before other VMs to avoid the race condition you are experiencing.
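For illustration, a minimal sketch of that startup ordering (invented names, not the actual dom0 autostart code):

    import asyncio

    class VM:
        # Minimal stand-in for a Qubes VM object; illustrative only.
        def __init__(self, name, netvm=None):
            self.name = name
            self.netvm = netvm

        async def start(self):
            await asyncio.sleep(0)   # stand-in for the real boot sequence
            print("started", self.name)

    async def autostart(default_netvm, vms):
        # The default NetVM is started first, so autostarted VMs attached
        # to it never race against a NetVM that is still booting. VMs
        # attached to a non-default NetVM get no such guarantee.
        await default_netvm.start()
        await asyncio.gather(*(vm.start() for vm in vms
                               if vm is not default_netvm))

    sys_firewall = VM("sys-firewall")
    apps = [VM("app%d" % i, netvm=sys_firewall) for i in range(1, 8)]
    asyncio.run(autostart(sys_firewall, [sys_firewall] + apps))

A VM attached to anything other than the default NetVM falls outside this ordering guarantee, which is exactly the race being described.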

Yethal (Author) commented Mar 4, 2017

All VMs use the default sys-firewall VM. Although, I just checked: under Global settings I had sys-net set as the default NetVM, not sys-firewall, which was probably the cause of this issue. This, however, leads me to a question: why is it even possible to set sys-net as the default NetVM (aside from "users are free to shoot themselves in the foot if they want to")?

marmarek (Member) commented Mar 4, 2017

Because it's also a valid setting, in some cases.

Yethal (Author) commented Mar 4, 2017

Fair enough. Is the race condition something that is going to be worked on, or can I close this one out?

marmarek (Member) commented Mar 5, 2017

There are a few issues very close to this one (#1075, #1990, #1665), but not exactly this one, so let's keep this open.

@marmarek added the T: bug label Mar 5, 2017
@marmarek added this to the Release 3.1 updates milestone Mar 5, 2017
unman (Member) commented Apr 14, 2017

@andrewdavidwong Confirmed this issue still arises in the 3.2 milestone.

marmarek added a commit to marmarek/qubes-core-admin that referenced this issue Jun 3, 2017
While libvirt handles locking itself, there is also a Qubes-specific
startup part, especially starting qrexec-daemon and waiting until
qrexec-agent connects to it. When someone attempts to start the VM a
second time (or simply assumes it's already running), qrexec will not
be connected yet and the operation will fail. Solve the problem by
wrapping the whole vm.start() function with a lock, including the
check whether the VM is running and the wait for qrexec.

Also, do not throw an exception if the VM is already running.

This way, after a call to vm.start(), the VM will be started with qrexec
connected, regardless of who really started it.
Note that it will not solve the situation where someone manually checks
whether the VM is running, like:

    if not vm.is_running():
        yield from vm.start()

Such code should be changed to simply:

    yield from vm.start()

Fixes QubesOS/qubes-issues#2001
Fixes QubesOS/qubes-issues#2666
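For illustration, a minimal asyncio sketch of that locking idea (invented names, not the actual qubes-core-admin code):

    import asyncio

    class VM:
        # Illustrative stand-in, not the real qubes-core-admin class.
        def __init__(self, name):
            self.name = name
            self._qrexec_connected = False
            self._startup_lock = asyncio.Lock()  # serializes the whole start path

        def is_running(self):
            return self._qrexec_connected

        async def start(self):
            async with self._startup_lock:
                if self.is_running():
                    return                       # already running: no exception
                await asyncio.sleep(0)           # stand-in: libvirt domain start
                await asyncio.sleep(0)           # stand-in: wait for qrexec-agent
                self._qrexec_connected = True
            # On return, qrexec is connected, regardless of who started the VM.

    async def main():
        vm = VM("work")
        # Two concurrent starters: the second blocks on the lock, then returns
        # once qrexec is up, instead of failing against a half-started VM.
        await asyncio.gather(vm.start(), vm.start())

    asyncio.run(main())

Because the running check and the qrexec wait both happen under the lock, the manual "if not vm.is_running(): yield from vm.start()" pattern becomes unnecessary; a bare vm.start() is safe for every caller.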
qubesos-bot commented

Automated announcement from builder-github

The package qubes-core-dom0-4.0.1-1.fc25 has been pushed to the r4.0 testing repository for dom0.
To test this update, please install it with the following command:

sudo qubes-dom0-update --enablerepo=qubes-dom0-current-testing

Changes included in this update
