Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lower kernel printk level after boot #1244

Closed
cgwalters opened this issue Jun 29, 2022 · 13 comments · Fixed by coreos/fedora-coreos-config#1840
Closed

Lower kernel printk level after boot #1244

cgwalters opened this issue Jun 29, 2022 · 13 comments · Fixed by coreos/fedora-coreos-config#1840
Labels

Comments

@cgwalters
Copy link
Member

Most other Fedora editions use anaconda, which for a long time has injected the quiet kernel command line argument because it has a history of use on desktop systems where we don't want users to see kernel output spew by default.

We do want server administrators to see kernel output by default.

Fedora CoreOS has never injected the quiet karg, and as a consequence too this means that for later runtime the kernel is much more verbose.

This is particularly bad in concert with #567
which causes this verbose output on slow serial consoles.

I think #567 is the primary fix but we should also consider at least quieting kmsg after switching to the real root successfully.

@dustymabe
Copy link
Member

I think #567 is the primary fix

Yep. Luckily I think we are close to finishing that work.

but we should also consider at least quieting kmsg after switching to the real root successfully.

I think the largest (most annoying) issue that I experience here is the audit messages getting printed out, which has its own separate issue and potential solution.

cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this issue Jul 13, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 before exiting the initramfs.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this issue Jul 13, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 before exiting the initramfs.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.
@cgwalters
Copy link
Member Author

I think the largest (most annoying) issue that I experience here is the audit messages getting printed out, which has its own #220 and #461.

Yeah, audit on FCOS is definitely a big part of it, but unfortunately we diverged here and rhcos includes auditd, so doesn't get hit by that.

But...an example of something that's common between these is this:

[root@cosa-devsh ~]# podman run --rm -ti busybox echo hello
✔ docker.io/library/busybox:latest
Trying to pull docker.io/library/busybox:latest...
Getting image source signatures
Copying blob 19d511225f94 done  
Copying config 62aedd01bd done  
Writing manifest to image destination
Storing signatures
[   23.636325] bridge: filtering via arp/ip/ip6tables is no longer available by default. Update your scripts to load br_netfilter if you need this.
[   23.643401] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[   23.645028] IPv6: ADDRCONF(NETDEV_UP): veth195b65cb: link is not ready
[   23.645921] IPv6: ADDRCONF(NETDEV_CHANGE): veth195b65cb: link becomes ready
[   23.646881] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[   23.647868] cni-podman0: port 1(veth195b65cb) entered blocking state
[   23.648726] cni-podman0: port 1(veth195b65cb) entered disabled state
[   23.649885] device veth195b65cb entered promiscuous mode
[   23.651199] cni-podman0: port 1(veth195b65cb) entered blocking state
[   23.652015] cni-podman0: port 1(veth195b65cb) entered forwarding state
[   23.831078] cgroup: cgroup: disabling cgroup2 socket matching due to net_prio or net_cls activation
hello
[   23.977840] cni-podman0: port 1(veth195b65cb) entered disabled state
[   23.978746] device veth195b65cb left promiscuous mode
[   23.979660] cni-podman0: port 1(veth195b65cb) entered disabled state
[root@cosa-devsh ~]# echo 4 > /proc/sys/kernel/printk
[root@cosa-devsh ~]# podman run --rm -ti busybox echo hello
hello
[root@cosa-devsh ~]# 

Not spewing all that stuff about the default bridge device we create every time we run a container will be an IMO enormous win that is orthogonal to auditd.

@cgwalters
Copy link
Member Author

PR in coreos/fedora-coreos-config#1840

@dustymabe
Copy link
Member

I'm still not convinced. I do think having the kernel messages hit the console are useful when there are no other ways to debug what went wrong. It feels like here we are optimizing for the "local typing on the console" case, which isn't really what we encourage users to do. Of course, sometimes a user needs to access the console to run commands to debug, but the kernel messages aren't that much of a bother in that case IMO.

This lack of clear benefit, and considering we're about to fix #567 anyway makes me lean towards not doing this. Though, I'd be supportive if this was done Fedora wide or in dracut upstream.

I'm interested in what others think. Added the meeting label for discussion.

@dustymabe dustymabe added the meeting topics for meetings label Jul 13, 2022
@cgwalters
Copy link
Member Author

cgwalters commented Jul 13, 2022

I do think having the kernel messages hit the console are useful when there are no other ways to debug what went wrong.

Note we do still get warning-level and above messages on the console. And we get all debug level messages from before the switch to the real root.

@miabbott
Copy link
Member

Not spewing all that stuff about the default bridge device we create every time we run a container will be an IMO enormous win that is orthogonal to auditd.

We've seen downstream complaints about these exact messages and how they can be alarming - https://bugzilla.redhat.com/show_bug.cgi?id=2064467

@jlebon
Copy link
Member

jlebon commented Jul 13, 2022

This was discussed in today's community meeting:

13:47:16 < jlebon> #info there's general agreement this would be good, but
                   implementation details are still to be fleshed out

@jlebon jlebon removed the meeting topics for meetings label Jul 13, 2022
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this issue Jul 13, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this issue Jul 13, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this issue Jul 13, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
cgwalters added a commit to cgwalters/fedora-coreos-config that referenced this issue Jul 15, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
@cgwalters
Copy link
Member Author

cgwalters added a commit to coreos/fedora-coreos-config that referenced this issue Jul 15, 2022
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
@dustymabe dustymabe added the status/pending-testing-release Fixed upstream. Waiting on a testing release. label Jul 19, 2022
@dustymabe dustymabe changed the title Change default to be equivalent of quiet Lower kernel printk level after boot Jul 19, 2022
@dustymabe
Copy link
Member

The fix for this went into next stream release 36.20220716.1.0. Please try out the new release and report issues.

@dustymabe
Copy link
Member

The fix for this went into testing stream release 36.20220716.2.0. Please try out the new release and report issues.

@dustymabe dustymabe added status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. and removed status/pending-testing-release Fixed upstream. Waiting on a testing release. labels Jul 19, 2022
@dustymabe
Copy link
Member

The fix for this went into stable stream release 36.20220716.3.1.

@dustymabe dustymabe removed the status/pending-stable-release Fixed upstream and in testing. Waiting on stable release. label Jul 28, 2022
@git1106
Copy link

git1106 commented Jul 28, 2022

When 36.20220716.3 was auto-updated I was surprised that the console log writings were missing and was directed here by your friendly IRC support.

For users who prefer the old spammy behavior, this can now be achieved using the following Terraform snippet

        {
          "overwrite" : true,
          "path" : "/etc/sysctl.d/99-nel.conf",
          "contents" : {
            "source" : "data:;base64,${base64encode("kernel.printk=7 4 1 7")}",
          },
          "mode" : local.S_IRUSR + local.S_IWUSR + local.S_IRGRP + local.S_IROTH,
        },

Sorry, we use Terraform with ignition directly, so I don't have a tested Butane version. But the translation into Butane should be straightforward I guess.

Our use case is:

  • We use AWS EC2 and normally the serial console is disabled, so we don't mind about spam.
  • However, for development purposes and when a machine fails to respond over the network we first connect to the serial console to see what is going on. We don't have any console login, but especially the audit messages (prio 5) can be useful to see what our machine is doing.
  • I don't want to claim this is the ultimate debugging method. Of course in difficult cases we need the disk image of the failed machine and access the the journal to get user space log messages.

@bgilbert
Copy link
Contributor

bgilbert commented Jul 28, 2022

The Butane equivalent is:

variant: fcos
version: 1.4.0
storage:
  files:
    - path: /etc/sysctl.d/99-printk.conf
      contents:
        inline: |
          kernel.printk=7 4 1 7

The overwrite flag isn't necessary because the file won't already be there, and the mode isn't necessary because 644 is the default.

HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
HuijingHei pushed a commit to HuijingHei/fedora-coreos-config that referenced this issue Oct 10, 2023
Closes: coreos/fedora-coreos-tracker#1244

A lot going on for this simple service.  See the tracker issue above
for more info, but briefly:

Anaconda has historically injected `quiet` into the kernel command
line in many cases, and this suppresses *both* kernel and systemd
output.  On computers in general, but particularly many bare metal
servers, the serial console can be slow.  This can cause reliability
issues.

However for servers, we usually *do* want to see informational
output when they boot.  For example, today the kernel "mitigations"
information for hardware vulnerabilities is output.

This change is a compromise; we boot up at the kernel's default
verbosity level (which for Fedora and derivatives is the upstream 7),
but switch to 4 very early on in the real root.  At that point, we've
gotten most of the boot time output, and our initramfs is not
extremely performance sensitive right now.

Also, we explicitly only *lower* the output log level, and only if
there isn't explicitly `debug` on the kernel command line.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants