♲📦 Implement rpm-ostree rojig #1081
First, let's use the term
But for supportability reasons, I think we need both rpmostree-jigdo and "traditional" ostree repos (can I call that traditional yet?). Also implementation wise, I think it will be easier to reason about things if this is implemented as a magical postprocessing step given an ostree commit. Hence:
or so? And then we have a:
Which inverts things back, for ease of testing.
Is the plan for us to support both traditional ostree and OIRPM distribution methods forever? I know at least one thing that doesn't work for us in Fedora is that we don't keep around old rpms, so deploying older commits would probably not be possible (or even the 'latest' released commit, if it was released two weeks ago and a newer version of an rpm within that commit has since hit stable). Fedora is not the only use case and it's possible things could change there.
I see OIRPM as an augmentation to the traditional repository, i.e. the client could choose which one they want to use. The traditional repo would still be the source of truth, but the rpms could be used to distribute things via RPM channels if some people couldn't mirror ostrees around within a big organization for some reason. How do you see it? Augmenting a traditional ostree server or completely replacing it?
Also, how would we handle deltas? Maybe they are no longer needed if the rpms from the previous jigdo are cached and you only need to download the drpms of the new ones.
Doesn't currently keep older RPMs on the mirrors - yes, today, but it certainly seems to me we can fix that, or just use S3/Cloudfront/whatever. Or create a repo subset - the things interesting to pkglayer on AH I suspect is a lot smaller than the total.
We already called the `workaround_selinux_cross_labeling_recurse()` in the postprocessing path, there's no need to do it again during commit. Just making this change as I was going to do some SePolicy stuff for the [jigdo work](#1081) and stumbled again into the ugly mess that is the cross-labeling hack. Closes: #1082 Approved by: jlebon
Just to briefly respond regarding deltas: My thoughts are that we have another OIRPM type which has ostree deltas, and not deltaRPM, because...deltaRPM has a lot of flaws, or conversely ostree deltas are way better than deltaRPM. IOW we'd have
Briefly the big flaws of deltaRPM are:
Having carefully reread over the jigdo docs, I have a somewhat better grasp of the design here and it definitely sounds interesting. How large do you expect the OIRPM to be here? Also, what are your thoughts about how signature verification will work? We don't want to download all the RPMs ahead of time -- so it seems like we have to download the OIRPM first, unpack it, check the signature, and only then proceed to download new RPMs? Or alternatively use RPM signatures? Either way, it sounds like we'd have to download the whole OIRPM before checking that it has a valid signature, which is unfortunate. Or are you approaching this a different way?
The baseline OIRPM for Fedora Atomic Host is currently: As far as signatures - I think it's going to be annoying to people doing "mirror all RPMs with gpgcheck=1" unless we do RPM signatures. But just like static deltas we'd embed the ostree commit meta (hm I need to implement that). We'd unpack the RPM and check the ostree signature too (if enabled).
The "--ex" prefix here means it's an experimental option. A tremendous change here is that we start to support non-uid 0, but there are various things to fix there; the unpacker for example needs to learn to set imported objects fully based on the rpmfi information (i.e. default to uid 0, since libarchive gives the current uid by default). And even when run as uid 0, there are some bugs, though I'm not sure of any showstoppers yet. For example, dracut's `dracut-install` calls `cp --preserve=xattrs` which fails to copy the `user.ostreemeta` xattrs from a checkout (it shouldn't be copying that anyways...). Nevertheless, the infrastructure behind this really helps (is almost a hard requirement for) the [jigdo effort](#1081). Which is really only true due to SELinux - we need to import the packages, then generate the final tree to get the final policy, then use that policy to relabel all of the packages. Closes: #940 Approved by: jlebon
I'm still trying to wrap my head around this. IIUC, this means that in order to support deploying older commits, the content provider will also have to keep around every RPM referenced in those commits, right? Seems like a pretty big change in RPM management. Or I suppose we could at the very least just keep "delta-OIRPMs" to go from stable to the specific version? But those would still have to be carefully managed separately. Of course, right now on traditional yum/dnf-managed Fedora systems, you only have access to stable and updates snapshots too. Though having access to older versions is definitely one of the nice features of AH.
Both CentOS and RHEL keep older RPM versions. It's only Fedora that's a standout. That said clearly we would need to drive versioning of the rpm-md repos more consistently and expose it.
BTW just from today: https://lwn.net/Articles/740052/
OK so let's start talking about how this looks/feels on the client side. Strawman: Actually before we dive in here let's make a presupposition: we rework the Upgrading
Well OK I omitted layered pkg updates but eh. deploy
rebase
Adding the repo here automatically enables it; this is like
Tracking issue: #1081 To briefly recap: Let's experiment with doing ostree-in-RPM, basically the "compose" process injects additional data (SELinux labels for example) in an "ostree image" RPM, like `fedora-atomic-host-27.8-1.x86_64.rpm`. That "ostree image" RPM will contain the OSTree commit+metadata, and tell us what RPMs we need to download. For updates, like `yum update` we only download changed RPMs, plus the new "oirpm". But SELinux labeling, depsolving, etc. are still done server side, and we still have a reliable OSTree commit checksum. This is a lot like [Jigdo](http://atterer.org/jigdo/). Here we fully demonstrate the concept working end-to-end; we use the "traditional" `compose tree` to commit a bunch of RPMs to an OSTree repo, which has a checksum, version etc. Then the new `ex commit2jigdo` generates the "oirpm". This is the "server side" operation. Next, simulating the client side, `jigdo2commit` takes the OIRPM and uses it to download the "jigdo set" RPMs, fully regenerating *bit for bit* the final OSTree commit. If you want to play with this, I'd take a look at the `test-jigdo.sh`; from there you can find other useful bits like the example `fedora-atomic-host.spec` file (though the canonical copy of this will likely land in the [fedora-atomic](http://pagure.io/fedora-atomic) manifest git repo). Closes: #1103 Approved by: jlebon
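To make the round trip in that recap a bit more concrete, here is a toy Python model. It is only a sketch and not rpm-ostree code: a "commit" is just a mapping of path to bytes, the checksum is a plain SHA-256 over the sorted entries, and `commit_to_rojig` / `rojig_to_commit` are hypothetical stand-ins for what `ex commit2jigdo` and `jigdo2commit` do conceptually. The real on-disk formats are different.

```python
import hashlib


def commit_checksum(tree):
    """Toy stand-in for a commit checksum: hash the sorted (path, content) pairs."""
    h = hashlib.sha256()
    for path in sorted(tree):
        h.update(path.encode() + b"\0" + tree[path] + b"\0")
    return h.hexdigest()


def commit_to_rojig(tree, packages):
    """Server side (hypothetical): note which package provides each file and
    carry only the content that no package provides in the "oirpm"."""
    from_pkg = {}  # path -> name of the package that ships identical content
    extra = {}     # path -> bytes that only exist in the composed tree
    for path, content in tree.items():
        owner = next((name for name, files in packages.items()
                      if files.get(path) == content), None)
        if owner is None:
            extra[path] = content
        else:
            from_pkg[path] = owner
    return {"checksum": commit_checksum(tree), "from_pkg": from_pkg, "extra": extra}


def rojig_to_commit(oirpm, packages):
    """Client side (hypothetical): rebuild the tree from the downloaded packages
    plus the oirpm, then check that the result matches the recorded checksum."""
    tree = dict(oirpm["extra"])
    for path, name in oirpm["from_pkg"].items():
        tree[path] = packages[name][path]
    if commit_checksum(tree) != oirpm["checksum"]:
        raise ValueError("reassembled tree does not match the commit checksum")
    return tree


# Made-up package contents and a made-up composed tree.
packages = {
    "bash": {"/usr/bin/bash": b"bash-binary"},
    "coreutils": {"/usr/bin/ls": b"ls-binary"},
}
tree = {
    "/usr/bin/bash": b"bash-binary",
    "/usr/bin/ls": b"ls-binary",
    "/usr/lib/initramfs.img": b"generated-at-compose-time",
}
oirpm = commit_to_rojig(tree, packages)
rebuilt = rojig_to_commit(oirpm, packages)
```

In this model only the initramfs-like file travels inside the oirpm; everything else is referenced by package name and fetched as ordinary RPM content.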
Let me try to answer this:
For Fedora/CentOS/RHEL: my instinct here is that the answer has to be "completely replacing" - otherwise we're not really gaining the intended benefit of reducing releng/management overhead, right? For other organizations...I don't know? My instinct here is that the people interested in rpm-ostree are going to be happiest with jigdo but I can't say for sure. I would like rpm-ostree to work in "pure ostree" mode (I heard someone complain that rpm-ostree fails on start if there's no rpmdb). And the people who are using libostree today are likely happy enough with it...the audience there is a lot of Debian and OpenEmbedded. It seems not unlikely that some of those OpenEmbedded users will also want jigdo though? That said I don't see us deleting any of the existing functionality; or stated more positively, we're just adding an alternative in the code for now. The whole test suite continues to pass. But it is possible that some new feature would only be supported in jigdo mode perhaps?
That seems kinda cool, but feels a bit too heuristicky/mysterious. It seems conceptually simpler to just stick this information in the origin, no? E.g. have a In that case, rebasing to a jigdo origin that's not in the currently enabled repos would just use
How does this work? Is it just doing a pure pkg search for
It would be cool if we supported seamlessly moving back and forth between "ostree-based" and "jigdo-based" remotes. If we don't overload what remotes means, then keeping the
?
How about Basically I really like the idea of "repo pinning" the jigdo origin. (Not implemented in
Ooh those are cool ideas. In fact I may just go implement them now!
What's happened up till now is supporting `rojig://` in the same way as `ostree://`. However, part of the high level goal here is to reduce the need for system administrators to understand ostree. This patch set starts to introduce some of the ideas for client-side changes as part of jigdo ♲📦: coreos#1081 (comment) Basically, we can use the `NEVRA` of the jigdoRPM instead of displaying `Version`. Also, let's be opinionated here and entirely drop the `Commit` checksum by default. I believe the Cockpit guys were right here - versions are for humans. The fact that we have a checksum is powerful; and we still show it with `status -v`. The way I think of it is: the checksum shows we're really an image system. But we don't need to show it by default.
There are a few cases for knowing whether a commit has identical content to another commit. Some people want to do a "promotion workflow", where the content of a commit on a testing branch is then "promoted" to a production branch with `ostree commit --tree=ref`. Another use case I just hit in rpm-ostree deals with [jigdo](coreos/rpm-ostree#1081) where we're importing RPMs on both the client and server, and will be using the content checksum, since the client/server cases inject different metadata into the commit object. Closes: #1315 Closes: #1449 Approved by: jlebon
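A small Python sketch of why a separate content checksum helps here. This is a toy model under stated assumptions, not the libostree implementation: a commit is a dict with a `metadata` part and a `tree` part, and both checksums are SHA-256 over a JSON dump. The field names and metadata values are made up for illustration.

```python
import hashlib
import json


def commit_checksum(commit):
    """Toy commit checksum: covers the metadata *and* the content."""
    return hashlib.sha256(json.dumps(commit, sort_keys=True).encode()).hexdigest()


def content_checksum(commit):
    """Toy content checksum: covers only the tree, ignoring commit metadata."""
    return hashlib.sha256(json.dumps(commit["tree"], sort_keys=True).encode()).hexdigest()


# The same imported content, committed with different metadata on each side.
tree = {"/usr/bin/bash": "sha256-of-bash", "/usr/bin/ls": "sha256-of-ls"}
server_commit = {"metadata": {"source": "compose"}, "tree": tree}
client_commit = {"metadata": {"source": "client-import"}, "tree": tree}

same_commit = commit_checksum(server_commit) == commit_checksum(client_commit)
same_content = content_checksum(server_commit) == content_checksum(client_commit)
```

Here `same_commit` is false because the metadata differs, while `same_content` is true, which is the comparison the client/server import case needs.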
What's happened up till now is supporting `rojig://` in the same way as `ostree://`. However, part of the high level goal here is to reduce the need for system administrators to understand ostree. This patch set starts to introduce some of the ideas for client-side changes as part of jigdo ♲📦: #1081 (comment) Concretely, we start using `${repo}:${nevra}` instead of `rojig://`. (v2): Keep `Version` (plus timestamp) as a split out field for maximum visual aid. Also, let's be opinionated here and entirely drop the `Commit` checksum by default. I believe the Cockpit guys were right here - versions are for humans. The fact that we have a checksum is powerful; and we still show it with `status -v`. The way I think of it is: the checksum shows we're really an image system. But we don't need to show it by default. Closes: #1240 Approved by: jlebon
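For readers unfamiliar with the `${repo}:${nevra}` form, a minimal Python sketch of how such a string is put together. The helper names and the repo name `fedora-atomic` are assumptions for illustration only; the package fields come from the `fedora-atomic-host-27.8-1.x86_64.rpm` example used elsewhere in this thread.

```python
def format_nevra(name, version, release, arch, epoch=0):
    """Build a NEVRA string; the epoch is left out when it is 0."""
    evr = f"{version}-{release}" if epoch == 0 else f"{epoch}:{version}-{release}"
    return f"{name}-{evr}.{arch}"


def format_rojig_origin(repo, nevra):
    """Render an origin as repo:nevra (hypothetical helper)."""
    return f"{repo}:{nevra}"


nevra = format_nevra("fedora-atomic-host", "27.8", "1", "x86_64")
origin = format_rojig_origin("fedora-atomic", nevra)
print(origin)
```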
As part of [rpm-ostree jigdo ♲📦](coreos/rpm-ostree#1081) I'd like to make it cheaper to fetch metadata. For Fedora currently the filelists metadata is enormous. In jigdo mode, we don't need it, so let's add APIs to avoid fetching it. Closes: #420 Approved by: cgwalters
With this, rojig becomes significantly closer to a peer of ostree mode. Previously, providing a `--repo` argument was required. Now, if an OSTree repository is not specified, then the compose process assumes that *just* rojig output is desired. One of the nicer parts of rpm-ostree I think is the "change detection" bit. That's now implemented for the rojig side by performing a libdnf query of the input rpm-md repo to find the last rojig rpm, and since we now have the inputhash as a `Provides`, we can quickly and efficiently implement change detection there. The biggest gap left in my opinion is that rpm-md repositories don't have any *native* concept of versioning. coreos#1081
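The change detection described above can be sketched roughly as follows, in Python. This is a simplified model, not the rpm-ostree or libdnf code: the way the input hash is computed, the `rojig-inputhash(...)` Provides spelling, and the function names are all assumptions made for illustration.

```python
import hashlib

# Hypothetical spelling of the Provides entry that carries the input hash.
PROVIDES_PREFIX = "rojig-inputhash("


def compute_inputhash(treefile_text, nevras):
    """Hash the compose inputs: the manifest text plus the resolved package set."""
    h = hashlib.sha256()
    h.update(treefile_text.encode())
    for nevra in sorted(nevras):
        h.update(b"\0" + nevra.encode())
    return h.hexdigest()


def previous_inputhash(provides):
    """Pull the input hash out of the last rojig RPM's Provides, if present."""
    for entry in provides:
        if entry.startswith(PROVIDES_PREFIX) and entry.endswith(")"):
            return entry[len(PROVIDES_PREFIX):-1]
    return None


def needs_compose(treefile_text, nevras, last_rojig_provides):
    """A new compose is needed only if the inputs changed since the last rojig RPM."""
    return compute_inputhash(treefile_text, nevras) != previous_inputhash(last_rojig_provides)


treefile = "packages: [bash, coreutils]"
old_set = ["bash-4.4-1.x86_64", "coreutils-8.27-1.x86_64"]
new_set = ["bash-4.4-2.x86_64", "coreutils-8.27-1.x86_64"]
last_provides = [
    "fedora-atomic-host = 27.8-1",
    PROVIDES_PREFIX + compute_inputhash(treefile, old_set) + ")",
]

unchanged = needs_compose(treefile, old_set, last_provides)
changed = needs_compose(treefile, new_set, last_provides)
```

With identical inputs `unchanged` is false, so the compose can be skipped; once a package in the set is bumped, `changed` is true.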
rojig hasn't really gained traction so far as FCOS and most other variants use vanilla OSTree repos, and RHCOS uses containers. Anyway, as far as this ticket is concerned, it's been implemented now, so closing.
Update 2018-02-22: This effort has been renamed to "rojig" to clarify the distinction from the original "jigdo" project.
Fetching content via both ostree and libdnf (RPM) ends up mixing the tradeoffs of both, requires release engineering to manage both, and makes it harder to mirror content. Not to mention the fact that there's the whole OCI/Docker thing which also works in a completely different way, and admins need to manage/mirror that too.
Now while the "obvious" thing would be to try to align with OCI in some way, the complete lack of wire deltas there is very problematic for uses outside of server clusters, and given that we already have lots of extensive linkage to RPM via libdnf, it makes the most sense to move that direction. Another reason for rojig is to address the package layering cadence issue.
Hence, let's experiment with doing ostree-in-RPM, basically the "compose" process injects additional data (SELinux labels for example) in an "ostree image" RPM, like `fedora-atomic-host-27.8-1.x86_64.rpm`. That "ostree image" RPM will have extra data, including information of the "rojig set" of the other RPMs that are needed to download - basically we only download changed RPMs, like a traditional `yum` system. But SELinux labeling, depsolving, etc. are still done server side, and we still have a reliable OSTree commit checksum.
We're adopting the idea of Jigdo, hence the terminology. Though to be clear, much of the design and all of the implementation is fundamentally different, so we call this "rojig".
First PR: #1103
Current status: Proof of concept code works: we can take a set of RPMs + extra data (postprocessing), generating an OSTree commit (with reliable checksum), then convert that back into a "rojig set", which is the original RPMs plus a new "rojig RPM". Then a client side can take the "rojig set", just transferring RPM on the wire, reassembling bit for bit the original OSTree commit.
External discussion:
Followup issues:
External work trackers:
Implementation: server side
When doing a `compose tree`, import the packages, and after import check out the final tree and SELinux relabel the imports so that they're reliably updated (currently depends on some unified core work). Then, we can scan the pkg trees for objects, and find net-new objects that need to go in the oirpm.
I implemented that pretty quickly and hit on the issue that we have the initramfs 3 times in the tree (due to selinux labels), so I then implemented code to detect that; while we have 3 copies on disk, we can easily avoid that on the wire.
The big fundamental issue is SELinux - without that this would be way easier. The basic problem there is that we need to carry the SELinux labels in the oirpm, because the packages aren't labeled. So we need to carry a mapping from (pkg, file) → xattrs in the oirpm, and (on the client side) apply those when unpacking.
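As a rough sketch of the two server-side ideas above (the (pkg, file) → xattrs mapping, and avoiding duplicate content on the wire), here is a toy Python model. The data layout, the function names, the paths and the label strings are illustrative assumptions, not the actual oirpm format.

```python
import hashlib

# Hypothetical mapping carried in the oirpm: (package, path) -> xattrs.
# The label values are examples only.
xattr_map = {
    ("bash", "/usr/bin/bash"): {"security.selinux": "system_u:object_r:shell_exec_t:s0"},
    ("coreutils", "/usr/bin/ls"): {"security.selinux": "system_u:object_r:bin_t:s0"},
}


def unpack_with_labels(pkg, files, xattr_map):
    """Client side: attach the server-computed xattrs while unpacking a package."""
    return {path: (content, xattr_map.get((pkg, path), {}))
            for path, content in files.items()}


def dedup_for_wire(objects):
    """Server side: store each distinct content once, keyed by its hash, and keep
    a per-path reference that records the (possibly different) xattrs."""
    store = {}  # content digest -> bytes, sent once
    refs = []   # (path, content digest, xattrs), one entry per on-disk copy
    for path, content, xattrs in objects:
        digest = hashlib.sha256(content).hexdigest()
        store.setdefault(digest, content)
        refs.append((path, digest, xattrs))
    return store, refs


labeled = unpack_with_labels("bash", {"/usr/bin/bash": b"bash-binary"}, xattr_map)

# Three on-disk copies of the same initramfs bytes under different labels.
initramfs = b"initramfs-bytes"
copies = [
    ("/boot/initramfs.img", initramfs, {"security.selinux": "label-a"}),
    ("/usr/lib/ostree-boot/initramfs.img", initramfs, {"security.selinux": "label-b"}),
    ("/usr/lib/modules/initramfs.img", initramfs, {"security.selinux": "label-c"}),
]
store, refs = dedup_for_wire(copies)
```

Keying the wire store on content alone means the three differently labeled copies cost one payload plus three small references.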
Implementation: client side
`emacs-filesystem` and `rootfiles` etc., though we could just import all of them)
`ostree pull` - after this we proceed with the rest of the logic ideally unchanged
Task list