Change Ignition to handle pivot, or move it to installer #148
Comments
What'd be great here is if someone prototyped out writing an Ignition config that did exactly the above; you'd need to download the pivot binary (use the Ignition fetch code), etc. The |
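As a rough illustration of what such a prototype might look like, the sketch below builds a minimal Ignition config (laid out per the 2.2.0 spec) that fetches a binary and enables a one-shot unit to run it, and nothing else. The download URL, the install path, the unit name, and the bare `pivot` invocation are placeholders assumed for illustration; none of them come from this thread.

```python
import json

# Hypothetical values, assumed for illustration only.
PIVOT_URL = "https://example.com/pivot"   # placeholder download location
PIVOT_PATH = "/usr/local/bin/pivot"       # placeholder install path
UNIT_NAME = "pivot.service"               # placeholder unit name

UNIT_CONTENTS = """[Unit]
Description=Pivot into the target OS (illustrative)
After=network-online.target
Wants=network-online.target

[Service]
Type=oneshot
ExecStart={path}

[Install]
WantedBy=multi-user.target
"""


def build_pivot_config(url=PIVOT_URL, path=PIVOT_PATH, unit=UNIT_NAME):
    """Return an Ignition-style config dict that only fetches and runs a binary."""
    return {
        "ignition": {"version": "2.2.0"},
        "storage": {
            "files": [
                {
                    "filesystem": "root",
                    "path": path,
                    "mode": 0o755,  # serialized as decimal 493 in JSON
                    "contents": {"source": url},
                }
            ]
        },
        "systemd": {
            "units": [
                {
                    "name": unit,
                    "enabled": True,
                    "contents": UNIT_CONTENTS.format(path=path),
                }
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_pivot_config(), indent=2))
```

The config deliberately has no disk, filesystem, or user sections, which is one way to read "suppress everything not needed to do the pivot".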
And we need to change |
In an ideal scenario I would like for the post pivot scenario to look exactly like a normal spin up of a fresh RHCOS node. That way testing against one set is fundamentally the same as testing the other (minus the actual action of performing the pivot from Fedora AH). Following that vein you could get away with only setting the Where it would still need some solving currently is how to get it to not fetch the user config initially on boot into Fedora AH in favor of the special pivot config. |
Right. The problem we have to think about is all the things that aren't in the ostree. For example, disk layout. Unless explicitly suppressed, So part of this is probably having the Ignition config suppress everything not needed to do the pivot. And in fact after doing the pivot with |
Those could be explicitly suppressed inside of the pivot config and the base pre-emption config could be updated to un-suppress them. |
@dustymabe proposed doing the pivot inside the initramfs, before and independently of Ignition, which could simplify a lot of this. That still leaves the question of how FAH or FCOS should know to perform the pivot. It could be driven by some magic in the Ignition config, but that doesn't feel great. If we do the pivot from the initramfs, it's tempting to use a dedicated BootstrapOS image which is only kernel and initramfs, and whose only functionality is to pivot. |
💯% agree. I feel like doing this in initramfs makes some things simpler, with the drawback of making other things more complicated. Magic, IMHO, is bad. (aka 🎆 👎 😺)
Agreed. I'm not sure how much it's worth having FCOS, RHCOS, and a special one-time-only bootstrap OS in terms of maintenance. In theory the bootstrap OS would not be a lot of work (since it's very minimal), but I still worry about adding that when we also have two other distros and their derivatives to maintain. This doesn't mean we should avoid it, but I think we need to consider this carefully. |
One thing I'm wondering about is to what degree we can entirely hide BootstrapOS - for example, let's say that BootstrapOS has a very simplified Ignition-like system which only accepts the data necessary to execute the pivot, and does not write anything persistently to disk (e.g. we slap an overlayfs-tmpfs over |
@ashcrow Detecting when to perform the pivot is an issue regardless of approach. AIUI this bug contemplates solving it by writing an Ignition config which is actually two Ignition configs in one, which seems pretty unwieldy too. The maintenance requirement of a separate BootstrapOS would need to be traded off against the maintenance requirement of FCOS always needing to be able to replace itself with a different OS that potentially includes older components. |
Right, the example that jumps to mind here is that pretty soon XFS in Fedora is going to gain |
More generally, it introduces a direct coupling between FCOS and RHCOS instead of merely an upstream-downstream relationship. Do we have any details yet of how BootstrapOS versions are likely to be chosen across multiple OpenShift releases? We'll need to understand how much this will complicate the FCOS test matrix, which may in itself be enough reason to lifecycle BootstrapOS along with OpenShift. |
That's currently up to us to design and provide but we haven't gotten there yet. It's also up to us if we make bootstrapos different than FCOS or RHCOS. |
Another idea, which may or may not be feasible: during cluster bootstrap, OpenShift could generate an RHCOS image within the cluster's cloud account, and then instances could be launched directly from that image. That way we'd avoid having to pivot on each instance launch, which should be faster and more reliable. It also means that the image generation process could be arbitrarily clunky; e.g. it could attach a second disk and install onto it. |
My hesitations with that are:
But there are obviously upsides too like you said. |
Another random idea I had: what if we just encrypted the drive, and required the installer to use Ignition to unlock it? Offhand...one downside would be that we're now paying the penalty of dm-crypt for the OS. Fleshing this out a bit more, we'd have a key rotation scheme, where every |
@cgwalters I was thinking we'd still unpack a container image, just in advance. Creating AMIs would be another IAM permission, but not a dangerous one IMO. But yes, we'd need to prune etc. The dm-crypt thing could work. Note that Ignition support for dm-crypt is not complete, and the original implementation work had CL-specific code. We'd need to reprioritize it. |
This won't fly for business reasons. |
I suggest discussing this elsewhere because this is an argument about resources between two unrelated things - a community project and a commercial product. You know my position on this. |
Having a bootstrap whose job it is to pivot into a specific ostree content stream with minimal input seems pretty generally useful as the basis for a large class of "multipurpose uses". No one loves AMIs. NO ONE. An AMI that can quickly pivot is valuable for a wide range of "managing an AMI is hard, but managing a docker image that is similar across all the cloud providers is easy" cases. |
There will always be a perf penalty for a pivot, but there also is a penalty for a docker image. |
@smarterclayton I don't quite follow what you mean about this. Our teams have to build and maintain them so, for us, there is some relation. What am I missing?
I don't disagree about AMIs. Would providing a BootstrapOS (not RHCOS) and pointing to a container image be easier for the installer folks to work with? If so then it does seem like a no-brainer to change plans again. |
I don't know what you mean by change plans, but your concern was that maintenance of a minimal AMI everywhere + Fedora CoreOS everywhere was expensive. I was asking two questions (that were variants of your question):
Colin's
Should pivoting fundamentally be an ignition concern? Do we anticipate a world where ignition isn't used to fill out both metal and cloud machines with the correct content? Would we benefit in the long run if ignition runs faster (because the bootstrap OS is smaller) to get the right configuration + os content into place? |
An ask by OpenShift was to have a pipeline providing RHCOS AMIs ASAP so folks can use and test it in the installer and a new CI. Installer work is somewhat moving. I'm not aware of any movement with CI. This proposed change sounds like we would scrap the RHCOS part of the pipeline and replace it with providing a minimal BootstrapOS as AMIs, while continuing to provide container images with RHCOS content as we do today. To me, that's a change to the plan :-) |
The problem is that Ignition wants to run only once, and instance userdata can only be set once. There are workarounds for both, but we could avoid a whole set of complexity if we handled pivot in a separate tool and reserved the Ignition config for actually setting up the RHCOS node. |
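One way to picture the separation being proposed here is a small decision function: a separate pivot step only cares about which OS is booted, and the single Ignition run is reserved for the node once it is on the target OS. This is a toy model under stated assumptions, not a description of any existing tool; the step names and the OS identifiers are made up for illustration.

```python
def next_step(booted_os, target_os, provisioned):
    """Decide what a node should do on this boot.

    booted_os and target_os are opaque identifiers (for example an ostree ref);
    provisioned records whether the one-time Ignition run has happened.
    """
    if booted_os != target_os:
        return "pivot"          # separate tool; the Ignition config is untouched
    if not provisioned:
        return "run-ignition"   # the single Ignition run, on the target OS
    return "nothing"


def simulate(initial_os, target_os):
    """Replay boots until the node has nothing left to do."""
    booted, provisioned, steps = initial_os, False, []
    while True:
        step = next_step(booted, target_os, provisioned)
        steps.append(step)
        if step == "pivot":
            booted = target_os      # reboot into the target OS
        elif step == "run-ignition":
            provisioned = True
        else:
            return steps


if __name__ == "__main__":
    print(simulate("fedora-atomic", "rhcos"))
    print(simulate("rhcos", "rhcos"))
```

In this model a node launched directly from an RHCOS image and a node that pivoted into RHCOS go through the same Ignition step, which is the property discussed earlier in the thread.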
Yes, until the pivot is ready, proven, and a plan is set, RHCOS AMIs are the only deliverable that is useful. The plan was never to stop at RHCOS AMIs, which is why I was confused that you brought it up that way. |
Are pivot and ignition concerns overlapping? Pivot is about getting the right version of the entire OS onto the machine, including ratcheting concerns like filesystem versions (are there other concerns like filesystems?). Ignition is required to set up disks. Is the part of ignition that sets up disks closer to initramfs / pivot for OS content? What is the purpose of ignition? Is it to configure an OS? Or to configure a machine? Was there ever a Container Linux situation where a newer version dropped hardware support for something, such that the initial Container Linux boot would work but the next upgrade would fail? We'll probably hit cases like that more (if they're common enough) over time. |
To be clear I brought this up because I find the current situation confusing, hard to document, and hard to manage CI/CD for. But...that answer seems clear, we need to do both right? I think we can do that. |
That is in line with my previous and current understanding: we need to have an RHCOS AMI, and content (via container) which can be used to pivot an OSTree system (RHCOS, FCOS, others) into RHCOS. |
@smarterclayton I think you're right 👍. This issue is meant to be about whether pivot should be called in Ignition or by the installer. |
I have to say: if I have an OS that is designed to flexibly pull and use content, and is also designed to draw its config from ignition, it seems a bit strange that ignition-the-tool is at odds with the goals of the OS if the OS is designed around the tool. If we say ignition owns the OS disk, does that change how we think about what ignition should do? If ignition sets up the disk it has to choose content. If we say that ignition doesn't own the OS disk, does that change how we think about what ignition should do? If ignition doesn't own that disk, then it's the OS's problem. |
This dovetails with the question of whether we pivot everywhere, or only on EC2. If we're only pivoting on EC2, it seems as though the output of pivot should be a pristine, unprovisioned RHCOS image, since that's the starting point we'll have on other platforms. On those platforms, the use of ignition-disks is circumscribed anyway, since we can't e.g. blow away the root partition. If we're pivoting everywhere, we have more flexibility. If Ignition performed the pivot itself, I think there are fewer concerns; mainly a) we'd be provisioning using a kernel other than the one RHCOS ships with, and b) how we'd add that functionality to Ignition in a reasonably generic way. Having the installer arrange for userspace to perform the pivot leads to the complexity I mentioned before.
Historically, it's to configure an OS launched from a canned image.
Not intentionally, but it was certainly possible. In that case, the machine would upgrade, fail to come up after reboot, and roll back. (And try the upgrade again...) |
We would pivot on all public clouds, and I'd like to pivot on all private clouds and for the DIY metal crowd. |
Yeah, the closer we are to that the better I'd say. I was thinking about this a bit more and this actually seems like a great use for the "factory reset" model. Basically we:
There are just a few things we'd have to preserve like |
I think we are good on this now. The MCD is the flow through to the host and will understand how to execute |
I think this still needs to be discussed. The MCD only handles updates, not installations. It only uses Thanks for the ping; I had somehow forgotten about this issue and filed a similar one against the installer there: openshift/installer#267. As you can see from that issue, I had thought this was already decided and that the installer would take care of pivoting. |
No. The plan as of today is there will be no bootstrapos. The OS itself will join the cluster and update (or alternatively update, join the cluster, and then be checked for updates from then on via MCD).
NP 😄 |
Ahh gotcha. In that case, yeah I think we can close this issue! |
Thanks @jlebon. I'll defer to @cgwalters to close or denote it should remain open as the reporter of the issue. |
@cgwalters We can close this one, right? In openshift/installer#281, we agreed the installer will take care of pivoting. And we have a follow up in #307 on what needs to happen on the RHCOS side. |
Agreed. |
Despite all of our discussions of "Ignition runs once", we will almost certainly need it to run twice when doing a pivot. Alternatively, it can be the installer's job. We will need to decide this soon.
I can imagine the installer doing this by basically laying down ssh creds, then adding a systemd unit like this:
Then the other systemd units all do the opposite:
So we don't do Ignition twice, but all of the systemd units are conditionalized to only run in one case or the other.
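The unit snippets from the original post did not survive on this page. As a stand-in, here is a minimal sketch of the conditionalizing idea, assuming a hypothetical marker file that exists only after the pivot has completed. systemd's `ConditionPathExists=` directive (with `!` to negate) is real; the marker path, unit descriptions, and commands are placeholders and not the reporter's actual units.

```python
# Hypothetical marker file: assumed to be created once the pivot has completed.
MARKER = "/etc/pivot-done"


def render_unit(description, exec_start, before_pivot, marker=MARKER):
    """Render a oneshot unit that runs only before or only after the pivot."""
    # "!" negates the check, so the unit runs only while the marker is absent.
    condition = ("!" if before_pivot else "") + marker
    return "\n".join([
        "[Unit]",
        f"Description={description}",
        f"ConditionPathExists={condition}",
        "",
        "[Service]",
        "Type=oneshot",
        f"ExecStart={exec_start}",
        "",
        "[Install]",
        "WantedBy=multi-user.target",
        "",
    ])


if __name__ == "__main__":
    # The pivot unit runs only on the pre-pivot boot.
    print(render_unit("Pivot (illustrative)", "/usr/local/bin/pivot", True))
    # Every other unit does the opposite and runs only after the pivot.
    print(render_unit("Node setup (illustrative)", "/usr/local/bin/setup", False))
```

With this layout the same set of units can be written once, and each boot picks the half that applies.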