Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Memory ballooning issue is back for the fedora-41 template #9663

Open
lubellier opened this issue Dec 21, 2024 · 24 comments
Open

Memory ballooning issue is back for the fedora-41 template #9663

lubellier opened this issue Dec 21, 2024 · 24 comments
Assignees
Labels
affects-4.2 This issue affects Qubes OS 4.2. C: Fedora needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. P: major Priority: major. Between "default" and "critical" in severity. T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.

Comments

@lubellier
Copy link

Qubes OS release

Qubes OS 4.2.3

Brief summary

The fedora-41 template based AppVMs stay to the minimal memory limit.
Other users got also this issue, see the related topic in the forum.

Steps to reproduce

  1. Update your fedora-41 template with the fedora and QubesOS repositories
  2. Start the XX AppVM (with the fedora-41 template)
  3. Check XX prefs are memory 400MB / maxmem 4000MB
  4. Start Firefox and browse the web
  5. The XX current memory stays to 400MB

Expected behavior

The XX current memory should grow over 400MB.

Actual behavior

Related to Fedora 41 template / selinux

Done checks

Is there any manual action we can do to fix an already installed template, or do we need to reinstall from this new template?

Try removing /.qubes-relabeled in the template and restarting it - it should fix labels on startup; it may take some time, might require increasing qrexec_timeout property.

I did the /.qubes-relabeled removing / qrexec_timeout procedure in the fedora-41 template, the relabel job executed and re-created the /.qubes-relabeled file as expected:

fedora-41 logs:

[2024-12-21 14:34:23] [.[0m.[0;31m*     .[0m] Job qubes-relabel-root.service/start running (3s / no limit)
...
.[K[  .[0;31m*.[0;1;31m*.[0m.[0;31m* .[0m] Job qubes-relabel-root.service/start running (1min 9s / no limit)
...
[2024-12-21 14:35:28] [   71.312855] reboot: System halted

On next boot:

[user@tpl-f41 ~]$ ls -al /.qubes-relabeled 
-rw-r--r--. 1 root root 0 Dec 21 14:35 /.qubes-relabeled

Logs of the XX AppVM:

[2024-12-21 14:50:02] [   35.492145] audit: type=1400 audit(1734792602.359:123): avc:  denied  { read } for  pid=621 comm="qrexec-agent" name="meminfo-writer.pid" dev="tmpfs" ino=784 scontext=system_u:system_r:local_login_t:s0-s0:c0.c1023 tcontext=system_u:object_r:var_run_t:s0 tclass=file permissive=0

@lubellier lubellier added P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists. labels Dec 21, 2024
@marmarek
Copy link
Member

I think the issue is that %post selinux trigger should run also on selinux-policty-targeted update (%triggerin -- selinux-policy or %triggerin -- selinux-policy-targeted).
@DemiMarie what is the easiest way to do that, without duplicating the whole thing? I guess moving it to a macro that is used in both places would be very ugly, right? Maybe move it to a separate script and call that script when needed?
But also, can the relabel happen on update, or relabel on next boot is the only option?

@andrewdavidwong andrewdavidwong added C: Fedora needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. affects-4.2 This issue affects Qubes OS 4.2. labels Dec 21, 2024
@DemiMarie
Copy link

@marmarek Both a macro and a script would work. An online relabel is racy and might or might not work, or it might work most of the time and fail occasionally. However, I expect that most policy upgrades do not require a relabel, and in any case whether a relabel should happen is up to Fedora, not Qubes OS, except for upgrades to our own policy.

@marmarek
Copy link
Member

It seems that's not the correct diagnosis. After the update, file_contexts.subs is correct (has qubes parts). But, file_context is missing some. My current hypothesis is that some of the %selinux_relabel_* macros are missing.

@marmarek marmarek self-assigned this Dec 22, 2024
@marmarek marmarek added P: major Priority: major. Between "default" and "critical" in severity. and removed P: default Priority: default. Default priority for new issues, to be replaced given sufficient information. labels Dec 22, 2024
@marmarek
Copy link
Member

That's not it either...
The current behavior is weird. After update /run/meminfo-writer.pid gets wrong label (var_run_t). But if I run restorecon /run/meminfo-writer.pid, it gets corrected to qubes_meminfo_writer_var_run_t, so it looks like the policy does have appropriate part included. This behaves the same after reboot, so it isn't just some relabel during update.

@marmarek
Copy link
Member

And at the same time, files in /run/qubes seems to be correctly labeled (except /run/qubes/qubesdb.sock which also got var_run_t...).

@marmarek
Copy link
Member

As for /run/qubes/qubesdb.sock, it's really weird:

  1. After template boot: var_run_t
  2. After restorecon /run/qubes/qubesdb.sock: qubes_qubesdb_socket_t (this is the correct one)
  3. After systemctl restart qubes-db: qubes_var_run_t

I'm confused...

@Minimalist73
Copy link

While doing some tests, I noticed that reinstalling qubes-utils-selinux selinux-policy and selinux-policy-targeted fix the labels.
But, If i reinstall only qubes-utils-selinux, it doesn't.
Same for reinstalling qubes-utils-selinux selinux-policy or qubes-utils-selinux selinux-policy-targeted. It always need both selinux-policy-targeted selinux-policy and qubes-utils-selinux to work.
I'm not sure why it does work like this. The only thing different from F40 is that a restorecon is started in the selinux-policy-targeted scriptlet on both /usr/sbin and /usr/bin

@DemiMarie
Copy link

If I start with the Fedora 41 minimal template, install selinux-policy and a few other packages (I think qubes-gpg-sign, qubes-core-agent-passwordless-root, and some Salt stuff), enable SELinux on the kernel command line, and reboot a few times, the system eventually relabels itself and reaches a state where /run/qubes/qubesdb.sock has the correct context, as does /run/meminfo-writer.pid.

I think the best place to get help is from upstream Fedora.

@DemiMarie
Copy link

That's not it either... The current behavior is weird. After update /run/meminfo-writer.pid gets wrong label (var_run_t). But if I run restorecon /run/meminfo-writer.pid, it gets corrected to qubes_meminfo_writer_var_run_t, so it looks like the policy does have appropriate part included. This behaves the same after reboot, so it isn't just some relabel during update.

Which package was updated?

@DemiMarie
Copy link

@marmarek: can you check if a full relabel followed by two reboots solves the problem?

@DemiMarie DemiMarie self-assigned this Dec 23, 2024
@DemiMarie
Copy link

I'm not sure why it does work like this. The only thing different from F40 is that a restorecon is started in the selinux-policy-targeted scriptlet on both /usr/sbin and /usr/bin

If this happens while file_context is missing some entries, that could explain the problems.

@marmarek
Copy link
Member

marmarek commented Dec 23, 2024

@marmarek: can you check if a full relabel followed by two reboots solves the problem?

Why multiple relabels/reboots would change anything? We are talking about /run, which doesn't survive reboot...

@marmarek
Copy link
Member

FWIW manually forcing full relabel does not help.
But while at it, I found this:

-----w--w-. 1 root root 0 Nov 30 13:33 /.qubes-relabeled

interesting mode...

@marmarek
Copy link
Member

Diff of the resulting file_contexts file before and after updating selinux-policy-targeted contains this:

@@ -5149,7 +5153,6 @@
 /usr/bin/livecd-creator	--	system_u:object_r:livecd_exec_t:s0
 /usr/bin/lttng-sessiond	--	system_u:object_r:lttng_sessiond_exec_t:s0
 /usr/bin/mariadb-backup	--	system_u:object_r:mysqld_exec_t:s0
-/usr/bin/meminfo-writer	--	system_u:object_r:qubes_meminfo_writer_exec_t:s0
 /usr/bin/modules-update	--	system_u:object_r:kmod_exec_t:s0
 /usr/bin/mount\.ecryptfs	--	system_u:object_r:mount_ecryptfs_exec_t:s0
 /usr/bin/neutron-server	--	system_u:object_r:neutron_exec_t:s0
@@ -5160,7 +5163,6 @@
 /usr/bin/partition_uuid	--	system_u:object_r:fsadm_exec_t:s0
 /usr/bin/qemu-pr-helper	--	system_u:object_r:virtd_exec_t:s0
 /usr/bin/quantum-server	--	system_u:object_r:neutron_exec_t:s0
-/usr/bin/qubesdb-daemon	--	system_u:object_r:qubes_qubesdb_daemon_exec_t:s0
 /usr/bin/restart-dirsrv	--	system_u:object_r:initrc_exec_t:s0
 /usr/bin/roundup-server	--	system_u:object_r:roundup_exec_t:s0
 /usr/bin/samba-gpupdate	--	system_u:object_r:samba_gpupdate_exec_t:s0

This looks okay, as both files are in fact in /usr/sbin, not /usr/bin. And entries for /usr/sbin are still there. What I think is happening is that %post script of the policy package does /usr/sbin/fixfiles -C ${FILE_CONTEXT}.pre restore, which should relabel only files that changed labels. So far so good. But then we have /etc/selinux/targeted/contexts/files/file_contexts.subs_dist:

...
/usr/sbin            /usr/bin

So, I think fixfiles thinks /usr/sbin/qubesdb-daemon lost its entry too.

In fact, there is surprisingly little entries for /usr/sbin/ in the default policy (2 from qubes and 2 from other packages), but there are a lot of custom-labeled files in /usr/sbin. Maybe, due to the above subst rule we should have /usr/bin in the policy even though files live in /usr/sbin?

@marmarek
Copy link
Member

marmarek commented Dec 23, 2024

There is /usr/libexec/selinux/binsbin-convert.sh that adds /usr/bin entries for /usr/sbin lines. Running it (with targeted parameter) restores those entries. selinux-policy.spec has this:

%posttrans targeted
%checkConfigConsistency targeted
%{_libexecdir}/selinux/varrun-convert.sh targeted
%{_libexecdir}/selinux/binsbin-convert.sh targeted
...

%triggerin -- fapolicyd-selinux
%{_libexecdir}/selinux/binsbin-convert.sh targeted
%{_sbindir}/restorecon /usr/sbin/fapolicyd*

%triggerin -- usbguard-selinux
%{_libexecdir}/selinux/binsbin-convert.sh targeted
%{_sbindir}/restorecon /usr/sbin/usbguard*

So, theoretically it should be called after the update, but I guess something went wrong here. And as can be seen above, there are specific packages that has it applied to via triggers. This binsbin-convert.sh thing, and the aliasing of bin-sbin seems to be Fedora-specific. Currently we don't enable SELinux on other distributions, but as can be seen in this issue, it's very tricky interaction.

I think the most reliable solution would be to add /usr/bin entries to our policy for all /usr/sbin binaries manually (have them more or less duplicated). This should fix the situation on Fedora (to be tested), regardless if binsbin-convert.sh works. And also should not be a problem in the future on non-Fedora systems not having this magic merging.

DemiMarie added a commit to DemiMarie/qubes-core-qubesdb that referenced this issue Dec 23, 2024
/usr/sbin is now considered an alias for /usr/bin, so include rules for
/usr/bin to ensure correct labelling.  Do not remove the old rules to
avoid regression on Fedora 40.

Part-of: QubesOS/qubes-issues#9663
DemiMarie added a commit to DemiMarie/qubes-core-qubesdb that referenced this issue Dec 23, 2024
/usr/sbin is now considered an alias for /usr/bin, so include rules for
/usr/bin to ensure correct labelling.  Do not remove the old rules to
avoid regression on Fedora 40.

Suggested-by: Marek Marczykowski-Górecki <[email protected]>
Part-of: QubesOS/qubes-issues#9663
DemiMarie added a commit to DemiMarie/qubes-linux-utils that referenced this issue Dec 23, 2024
/usr/sbin is now considered an alias for /usr/bin, so include rules for
/usr/bin to ensure correct labelling.  Do not remove the old rules to
avoid regression on Fedora 40.

Suggested-by: Marek Marczykowski-Górecki <[email protected]>
Part-of: QubesOS/qubes-issues#9663
DemiMarie added a commit to DemiMarie/qubes-core-qubesdb that referenced this issue Dec 23, 2024
/usr/sbin is now considered an alias for /usr/bin, so include rules for
/usr/bin to ensure correct labelling.  Do not remove the old rules to
avoid regression on Fedora 40.

Suggested-by: Marek Marczykowski-Górecki <[email protected]>
Part-of: QubesOS/qubes-issues#9663
marmarek added a commit to marmarek/qubes-linux-utils that referenced this issue Dec 24, 2024
Adjust according to https://fedoraproject.org/wiki/SELinux/IndependentPolicy#Creating_the_Spec_File

Specifically, when %selinux_relabel_post is used, %selinux_relabel_pre
needs to be there too.

QubesOS/qubes-issues#9663
marmarek added a commit to marmarek/qubes-core-qrexec that referenced this issue Dec 24, 2024
Adjust according to https://fedoraproject.org/wiki/SELinux/IndependentPolicy#Creating_the_Spec_File

Specifically, when %selinux_relabel_post is used, %selinux_relabel_pre
needs to be there too.

QubesOS/qubes-issues#9663
@marmarek
Copy link
Member

Thanks @DemiMarie for the PRs. This seems to fix the issue when it's applied together with the problematic selinux-policy package update. But when applied on an already broken template, it doesn't always help. I've opened 3 more PRs, hopefully fixing this part.

@DemiMarie
Copy link

Thanks @marmarek for figuring out the spec file bugs. How did you figure out that the %pre was missing? Just looking at the Fedora project wiki?

@marmarek
Copy link
Member

How did you figure out that the %pre was missing? Just looking at the Fedora project wiki?

That did help a bit, but mostly at looking at content of those %selinux_ macros (rpm -q --scripts of the installed package). Relabel snippet uses a "pre" file for comparing, which wasn't created by any other snippet we had there.

@Minimalist73
Copy link

Thanks @DemiMarie for the PRs. This seems to fix the issue when it's applied together with the problematic selinux-policy package update. But when applied on an already broken template, it doesn't always help. I've opened 3 more PRs, hopefully fixing this part.

I built everything needed with all commits from this issue and it seems to work fine.

  • It works when installing along with selinux-policy and selinux-policy-targeted (Fix not yet impacted templates)
  • It works when only reinstalling selinux-policy and selinux-policy-targeted (Future-proof)

Only issue remaining now is the already broken templates. The rpm scriptlets randomly fails because of the low memory.
Since it's not really reliable, maybe we could use a Qubes Updater plugin to hotfix the label issue on meminfo-writer?

I did this one, it seems to work and brings the memory up relatively fast:

import subprocess


def fix_meminfo_writer_label(os_data, log, **kwargs):
    """
    Fix meminfo-writer SELinux label to make memory ballooning work again

    # https://github.com/QubesOS/qubes-issues/issues/9663
    """

    if os_data["id"] == "fedora":
        meminfo_path = "/usr/sbin/meminfo-writer"
        expected_label = "qubes_meminfo_writer_exec_t"

        label_changed = False
        try:
            output = subprocess.check_output(
                ["ls", "-Z", meminfo_path], universal_newlines=True
            )
            if expected_label not in output:
                subprocess.check_call(["chcon", "-t", expected_label, meminfo_path])
                log.info(
                    f"SELinux label for {meminfo_path} changed to '{expected_label}'"
                )
                label_changed = True
        except subprocess.CalledProcessError as e:
            log.error(f"Error processing {meminfo_path}: {e}")

        if label_changed:
            try:
                subprocess.check_call(["systemctl", "restart", "qubes-meminfo-writer"])
                log.info("qubes-meminfo-writer service restarted")
            except subprocess.CalledProcessError as e:
                log.error(f"Error restarting qubes-meminfo-writer service: {e}")

marmarek added a commit to marmarek/qubes-core-admin-linux that referenced this issue Dec 24, 2024
The relevant package updates are supposed to fix the label anyway, but
before that the template has too little memory to apply the update
reliably (most of the time it works, but not always). Apply the label
fix early to have more memory when installing updates.

Script by Minimalist <[email protected]>

QubesOS/qubes-issues#9663
@marmarek
Copy link
Member

Good idea

@Solomon1732
Copy link

Solomon1732 commented Dec 25, 2024

While doing some tests, I noticed that reinstalling qubes-utils-selinux selinux-policy and selinux-policy-targeted fix the labels. But, If i reinstall only qubes-utils-selinux, it doesn't. Same for reinstalling qubes-utils-selinux selinux-policy or qubes-utils-selinux selinux-policy-targeted. It always need both selinux-policy-targeted selinux-policy and qubes-utils-selinux to work. I'm not sure why it does work like this. The only thing different from F40 is that a restorecon is started in the selinux-policy-targeted scriptlet on both /usr/sbin and /usr/bin

I tried it and it works for broken templates. Maybe it's worth to note it in a news segment on Qubes as a workaround while a long term fix is on the way?

marmarek added a commit to QubesOS/qubes-core-qubesdb that referenced this issue Dec 25, 2024
* origin/pr/66:
  Ensure that qubesdb SELinux labels are correct on Fedora 41

Pull request description:

/usr/sbin is now considered an alias for /usr/bin, so include rules for /usr/bin to ensure correct labelling.  Do not remove the old rules to avoid regression on Fedora 40.

Suggested-by: Marek Marczykowski-Górecki <[email protected]>
Part-of: QubesOS/qubes-issues#9663
marmarek added a commit to QubesOS/qubes-linux-utils that referenced this issue Dec 25, 2024
* origin/pr/120:
  Ensure that qubesdb SELinux labels are correct on Fedora 41

Pull request description:

/usr/sbin is now considered an alias for /usr/bin, so include rules for /usr/bin to ensure correct labelling.  Do not remove the old rules to avoid regression on Fedora 40.

Suggested-by: Marek Marczykowski-Górecki <[email protected]>
Part-of: QubesOS/qubes-issues#9663
marmarek pushed a commit to QubesOS/qubes-linux-utils that referenced this issue Dec 25, 2024
/usr/sbin is now considered an alias for /usr/bin, so include rules for
/usr/bin to ensure correct labelling.  Do not remove the old rules to
avoid regression on Fedora 40.

Suggested-by: Marek Marczykowski-Górecki <[email protected]>
Part-of: QubesOS/qubes-issues#9663
(cherry picked from commit 43387c3)
marmarek added a commit to QubesOS/qubes-linux-utils that referenced this issue Dec 25, 2024
Adjust according to https://fedoraproject.org/wiki/SELinux/IndependentPolicy#Creating_the_Spec_File

Specifically, when %selinux_relabel_post is used, %selinux_relabel_pre
needs to be there too.

QubesOS/qubes-issues#9663

(cherry picked from commit 5c27d6e)
marmarek added a commit to QubesOS/qubes-core-admin-linux that referenced this issue Dec 25, 2024
The relevant package updates are supposed to fix the label anyway, but
before that the template has too little memory to apply the update
reliably (most of the time it works, but not always). Apply the label
fix early to have more memory when installing updates.

Script by Minimalist <[email protected]>

QubesOS/qubes-issues#9663

(cherry picked from commit e7dedd2)
@lubellier
Copy link
Author

The today stable fedora-41 template update fixed the issue. 👍

I noted some errors about the SELinux policy on package installation :

...
python3-aiosignal-1.3.2-1.fc41.noarch: Installed
qubes-db-4.2.7-1.fc41.x86_64: Upgrade
python3-tempora-5.7.0-1.fc41.noarch: Installed
python3-portend-3.2.0-6.fc41.noarch: Installed
python3-jaraco-classes-3.4.0-3.fc41.noarch: Installed
qubes-utils-selinux-4.2.19-1.fc41.x86_64: Upgrade
SELinux:  Could not load policy file /etc/selinux/targeted/policy/policy.33:  Cannot allocate memory
/sbin/load_policy:  Can't load policy:  Cannot allocate memory
libsemanage.semanage_reload_policy: load_policy returned error code 2. (No such file or directory).
SELinux:  Could not load policy file /etc/selinux/targeted/policy/policy.33:  Cannot allocate memory
/sbin/load_policy:  Can't load policy:  Cannot allocate memory
libsemanage.semanage_reload_policy: load_policy returned error code 2. (No such file or directory).
/usr/sbin/semodule:  Failed!
qubes-utils-libs-4.2.19-1.fc41.x86_64: Upgrade
qubes-db-vm-selinux-4.2.7-1.fc41.x86_64: Upgrade
SELinux:  Could not load policy file /etc/selinux/targeted/policy/policy.33:  Cannot allocate memory
/sbin/load_policy:  Can't load policy:  Cannot allocate memory
libsemanage.semanage_reload_policy: load_policy returned error code 2. (No such file or directory).
SELinux:  Could not load policy file /etc/selinux/targeted/policy/policy.33:  Cannot allocate memory
/sbin/load_policy:  Can't load policy:  Cannot allocate memory
libsemanage.semanage_reload_policy: load_policy returned error code 2. (No such file or directory).
/usr/sbin/semodule:  Failed!
python3-qubesimgconverter-4.2.19-1.fc41.x86_64: Upgrade
libxcrypt-4.4.37-1.fc41.x86_64: Upgrade
python3-aiohappyeyeballs-2.4.4-1.fc41.noarch: Installed
c-ares-1.34.4-1.fc41.x86_64: Installed
python3-pycares-4.3.0-9.fc41.x86_64: Installed
...

@Solomon1732
Copy link

The same updates prevented it for me in unaffected templates. I didn't notice such errors on my end, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-4.2 This issue affects Qubes OS 4.2. C: Fedora needs diagnosis Requires technical diagnosis from developer. Replace with "diagnosed" or remove if otherwise closed. P: major Priority: major. Between "default" and "critical" in severity. T: bug Type: bug report. A problem or defect resulting in unintended behavior in something that exists.
Projects
None yet
Development

No branches or pull requests

6 participants