How to enable kdump feature on Fedora CoreOS #622
Thanks for raising this. The crash initramfs is the same as the normal one, correct? Where do the kernel arguments for the crash kernel come from? If they're directly taken from the boot arguments to the running kernel, we may need to add some argument filtering, or some |
Yes, it's the same as the normal one. The kernel arguments for the crash kernel basically come from /proc/cmdline, so they are the same as the running kernel's.
Please let me confirm, just in case. Some arguments for Ignition are added on the first CoreOS boot only. If the system crashes on the first boot, the crash kernel (2nd kernel) would unintentionally rerun Ignition, because it boots with those first-boot-only arguments. This is the situation we want to avoid. If so, I think kexec-tools will be installed on the 2nd boot or later, so it looks like Ignition won't be rerun unintentionally. Are there any other situations that trigger Ignition? FYI, the following output is a sample from my environment.
|
That's useful.
Correct.
Presumably we'd want to ship kexec-tools as part of Fedora CoreOS. We generally discourage package overlays (
None that are supported. |
Thanks for your explanation and the reference to #401. I understand what you are concerned about. There are two ways to install kexec-tools: the "side yum repo" approach and the "part of Fedora CoreOS" approach. In either case, we may need some argument filtering to avoid the problem you mentioned. |
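To make the argument-filtering idea concrete, here is a minimal sketch. It is not how kexec-tools builds the crash kernel command line; the helper name and the list of arguments to drop are assumptions for illustration, with `ignition.firstboot` standing in for the first-boot-only Ignition arguments mentioned above.

```python
# Hypothetical list of arguments that should never reach the crash kernel.
# "ignition.firstboot" is used as an example; the real list is an open question.
FIRST_BOOT_ONLY = ("ignition.firstboot",)


def filter_crash_kernel_args(cmdline, drop=FIRST_BOOT_ONLY):
    """Return cmdline with every argument whose key is in `drop` removed.

    Arguments are split on whitespace, so quoted values that contain
    spaces are not handled by this sketch.
    """
    kept = []
    for arg in cmdline.split():
        key = arg.split("=", 1)[0]  # "foo=bar" -> "foo", "foo" -> "foo"
        if key in drop:
            continue
        kept.append(arg)
    return " ".join(kept)
```

In practice the input would be the contents of /proc/cmdline, and the filtered string would be handed to whatever loads the crash kernel.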
Yeah, I'm leaning that way myself. The tools aren't large, there's basically never going to be users who want different versions of the tools, there's not much that would actually run inside a container image, etc. |
Right, I really don't like regenerating the initramfs on boot. There's really a lot going on in kdump and how it interacts with ostree/rpm-ostree is quite potentially tricky. The fact that it has code running both on the host as a systemd service and in the initramfs and how those interact... One high level point: we can't enable kdump by default, because not every user will be willing to pay the cost in RAM for file dumping, network dump location needs to be configurable...basically it needs to be off by default because it really requires configuration to be useful. I think then it's interesting to look at what the UX should feel like for enabling it. Will circle back to this. |
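Since kdump only works when memory has been reserved for the crash kernel, one cheap check a service or test could do is look for a `crashkernel=` argument on the running kernel's command line. The sketch below assumes that check is useful here; the function name is hypothetical.

```python
def crashkernel_setting(cmdline):
    """Return the value of the last crashkernel= argument, or None if absent."""
    value = None
    for arg in cmdline.split():
        if arg.startswith("crashkernel="):
            value = arg.split("=", 1)[1]
    return value
```

A caller would pass in the contents of /proc/cmdline and treat `None` as "kdump has not been configured on this machine".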
@k-keiichi-rh can you test out patching the manifest to add In particular what I want to better understand is: if we take that path, do we need a separate kdump initramfs at all? Today rpm-ostree ships a generic initramfs - there's no configuration embedded. If we can configure the kdump initramfs solely via the kernel cmdline, that gets us out of a whole lot of problems. If actually kdump needs more complex configuration (and I think it does for |
@k-keiichi-rh btw if it helps feel free to schedule a realtime meeting on this! |
I looks like |
I will do that and share the result.
I understand that the most important thing is to avoid regenerating the initramfs, so that kdump stays simple.
I believe this test will help us move the discussion forward. |
Yes, /etc/kdump.conf and /etc/sysconfig/kdump are required to enable kdump.
kdumpctl is triggered by kdump.service via systemd.
User overrides are not supported. |
Thank you for your help. I am a newcomer to the CoreOS area, so this is really helpful for me. |
When adding kexec-tools by default into the OS image, kdump service doesn't automatically start in Fedora CoreOS. The following is my fedora-coreos-config to add kexec-tools:
The following are the changes when adding kexec-tools by default:
As for my next step, I will summarize what we can do at the moment. |
Yes, to do that just add it to e.g. But I think that's going to reveal a larger problem in that in this model our initramfs came pre-built with kdump code inside it - we just don't have configuration for it. I am pretty sure however we do this it's going to involve patching kdump to better integrate with rpm-ostree. See e.g. https://github.com/coreos/coreos-assembler/blob/master/README-devel.md#using-overrides for how to test out modifications to the kdump source code. I believe we also need to land coreos/rpm-ostree#2170 so that an administrator can configure kdump. |
Taking a step back, let's walk through what the "administrator experience" for kdump on a CoreOS system would be. I'm writing this out in terms of shell commands to run, but I think what we really want is to make this ergonomic to enable via Ignition (fcct) - e.g. fcct could have high level sugar that would generate a systemd unit to run these commands.
The fcct sugar could look like:
(BTW a higher level thing in this - for OpenShift 4 IMO we should make it completely trivial to have a cluster service collect kdump crashes - or even just have a cluster service detect a kernel crash, this is related to coreos/ignition#585 and also https://github.com/kubernetes/node-problem-detector - maybe even ship an operator via OLM for this. But anyways let's get the base mechanics working first) |
I agree with your suggestion. I would like to discuss the shell commands part.
It looks like "rpm-ostree initramfs-etc" wouldn't help to configure kdump. The following output is the arguments in my Fedora CoreOS:
So it's difficult to avoid regenerating the initramfs for kdump on boot, and I don't have a solution to this problem yet. |
The idea behind the work in progress initramfs-etc command is that it allows generating a secondary initramfs from local configuration, distinct from the "golden" initramfs created by the CoreOS build process. The bootloaders we care about should support multiple initrds and concatenate them dynamically. |
I tried to summarize what we know so far (mostly based off of the work that @k-keiichi-rh has done) and put together a WIP doc. Please correct me if I've misunderstood anything! |
@cgwalters I'm also not completely sure how initramfs-etc will help with kdump. Currently, the kdump initrd is generated with the help of a |
I hadn't realized that. Well it certainly simplifies things then; you're right that ostree/rpm-ostree probably don't need to be involved. |
That fact changes my "lean towards adding by default" stance. Since this needs manual configuration and a reboot anyways, it seems fine to leave as an extension. |
@cgwalters If we follow the steps outlined in @k-keiichi-rh's first comment, kdump seems to work fine. So should our next step be making enabling kdump more ergonomic by adding some fcct sugar, so that all configuration can be done through Ignition configs? |
Yeah, the deliverable here may just be changes to the FCOS docs describing the basics, and fcct sugar. |
Though we should also probably write at least a basic test for this too. |
Yeah agreed. And then in the future, we can keep optimizing it once we have kargs-via-Ignition and first-boot extensions. |
I submitted a PR to add kexec-tools to rhcos extensions which I could link here if we merged openshift/os#413 |
If kdump is enough of a core FCOS/RHCOS feature that we're adding FCCT sugar for it, it seems pretty clear to me that we should make it part of the base OS. Is there any reason not to do so? The reboot to add kdump kargs should probably be implemented using coreos/ignition#1051. It's still unsafe to reboot from a systemd unit during first boot after we've exited the initramfs. |
Perhaps we just want sugar for kargs and package installs, then we wouldn't need special kdump sugar.
It drags in some new random dependencies (that are probably unnecessary actually) and it wouldn't be on by default, so if explicit action is required it might as well be to install it as well. (To be clear I was leaning towards add by default originally, could still be convinced - at least kdump is something that applies universally across metal and cloud for example)
Yep. For OpenShift though the MCO is already the single point of rebooting, so all that's needed there is to ship kexec-tools as an extension; we already support writing its config via Ignition and also support kernel arguments explicitly. |
Well, except that we're an image-based OS and we discourage package installs. It's not obvious to me that we should only ship things that are enabled by default. For example, we ship software to cloud platforms that only makes sense on bare metal, since we have a unified image for all use cases. |
I think everyone agrees with that but as stated that sounds like an argument against #401 entirely. IOW this discussion is more about the nuance of this specific case. We can take this one perhaps to the next open discussion and do a vote? |
So the thing really blocking kdump from being enabled on FCOS is "It's still unsafe to reboot from a systemd unit during first boot after we've exited the initramfs.", right? If rebooting from a systemd unit after switch-root is not a problem, then we essentially just need to add a systemd unit through Ignition configs that performs the actions in #622 (comment) and reboot? |
Eh, not quite. People are obviously going to want to package layer, and it's good to avoid ~instantly breaking their systems when they do. But it's one thing to help folks who want to run FCOS for use cases at the margins of what we support, and it's another to push functionality that we actually support/recommend out into extensions. It's not clear to me that the latter is worth the usability and complexity tradeoffs. |
This is technically true, but you could say that for anything which wants to reboot during the firstboot, even Zincati. While we can and should design better solutions for things which need a reboot (e.g. kargs-via-Ignition and "live" extensions), we can't address all use cases for which a user will want to reboot on first-boot. We need a better solution here, which probably will involve some systemd changes (will look at filing an issue there; edit: found this issue and this PR), but meanwhile, I'd say doing |
This was discussed in the meeting today:
|
Answering the questions from #641 (comment):
Can the package be run from a container?
It probably could run in a container, though it's not designed to. So there would be some maintenance burden involved.
Can the tool be helpful in debugging container runtime issues?
It could be useful if the bug is in the kernel.
Can the tool be helpful in debugging networking issues?
It could be useful if the bug is in the kernel.
Is it possible to layer the package onto the base OS as a day 2 operation?
It is, and it works fine as a layered package. Though it increases friction for something that we want to consider part of the "host API". E.g. if we want a
Does the package have additional dependencies? (i.e. does it drag in Python, Perl, etc)
What is the size of the package and its dependencies?
So total 2.3M, though this includes docs, which we don't ship.
Can the packaging be adjusted to just deliver binaries?
Not applicable. A core desirable part of the package is the systemd service.
What is the intended use case of the package?
Being able to collect crashed kernel cores for analysis.
Can the tool be used to do things we’d rather users not be able to do in FCOS? (E.g. can it be abused as a Turing complete interpreter?)
I'm not very familiar with kdump, though it seems focused in functionality. It does add
Does the tool have a history of CVEs?
Scanned Bodhi as well as the Fedora and RHEL RPM changelog for CVEs and didn't find any. |
This was discussed in: coreos/fedora-coreos-tracker#622
Given that:
I think we can close this issue when the docs PR gets merged. I will open an FCCT specific issue to track kdump sugar support. Edit: coreos/butane#175 |
Makes sense to me! |
coreos/fedora-coreos-docs#198 is now merged. |
The kdump feature, which collects kernel crash dumps, is useful for finding the root cause of a system failure.
And Fedora already has a document to enable it:
https://fedoraproject.org/wiki/How_to_use_kdump_to_debug_kernel_crashes
So we can also enable the kdump feature on Fedora CoreOS, with minor adjustments, using the following steps:
=> kexec-tools, dracut-squash, snappy, ethtool and squashfs-tools are required.
=> Change the default path from "path /var/crash" to "path /sysroot/ostree/deploy/rhcos/var/crash"
After completing the above steps, we should be able to collect kernel crash dumps in Fedora CoreOS.
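The `path` change in the steps above can be sketched as a small text rewrite. This is only an illustration: the helper name is hypothetical, and it assumes /etc/kdump.conf uses one `path <directory>` directive per line, as in the quoted default.

```python
def set_kdump_path(conf_text, new_path):
    """Return kdump.conf text with its `path` directive set to new_path.

    Commented-out lines (e.g. "#path /var/crash") are left untouched.
    If no active `path` directive exists, one is appended.
    """
    out = []
    replaced = False
    for line in conf_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "path":
            out.append("path " + new_path)
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append("path " + new_path)
    return "\n".join(out) + "\n"
```

For example, calling it with the current file contents and `/sysroot/ostree/deploy/rhcos/var/crash` yields the text to write back to /etc/kdump.conf.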
If there are no issues, adding kdump instructions to the Troubleshooting part of the Fedora CoreOS docs would be helpful.
However, I think there are several discussion points, so I would like to discuss them before proposing the doc.
I would really appreciate it if anyone could point out issues other than the following.
1. Interactions with rpm-ostree around kdump initramfs
The kdump service in kexec-tools loads a kernel image and the corresponding initramfs
into the reserved memory space. However, kdump is configured outside of rpm-ostree,
so it is possible that the initramfs won't be updated when the kernel is upgraded.
If the initramfs isn't updated, a mismatch between kernel and initramfs might occur.
As far as I can see, there were no problems when upgrading to a different stream, such as
stable=>testing and testing=>next, in Fedora CoreOS.
This is because a new /boot/ostree/fedora-coreos-XXXX is generated during the upgrade, and the initramfs that matches
the upgraded kernel is also generated during boot.
So it looks like the mismatch won't occur if the boot directory is recreated every time we upgrade.
But I heard that some issues with this kind of interaction have been reported for OpenShift and Atomic Host.
Please correct me if my understanding is wrong or if I overlooked something.
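A test for the mismatch described in point 1 could compare the running kernel release with the release embedded in the kdump initramfs file name. The sketch below assumes the file is named `initramfs-<kernel release>kdump.img`; that naming, and the function name, are assumptions that should be checked against what kdumpctl actually produces on Fedora CoreOS.

```python
def kdump_initramfs_matches(kernel_release, initramfs_name):
    """Return True if initramfs_name is the kdump initramfs for kernel_release.

    Assumes the name has the form "initramfs-<kernel release>kdump.img".
    """
    prefix, suffix = "initramfs-", "kdump.img"
    if not (initramfs_name.startswith(prefix) and initramfs_name.endswith(suffix)):
        return False
    embedded = initramfs_name[len(prefix):-len(suffix)]
    return embedded == kernel_release
```

The kernel release would come from `uname -r` (or `os.uname().release` in Python), and the file name from the boot directory of the current deployment.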
2. Upgrades might fail when the current streams move from Fedora 32 to Fedora 33
All of the current streams (stable, testing and next) are based on Fedora 32.
But when these streams move to Fedora 33 or later, the older kexec-tools tries to
replace the dracut package with an older one. In this case, the version mismatch might cause the
upgrade to fail.
One solution is to remove the kexec-tools package before upgrading.
However, that seems less than ideal.
I need to test whether the above problem occurs. I will do that.