-
Notifications
You must be signed in to change notification settings - Fork 305
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial composefs integration #2640
Conversation
Cool. There's a lot going on here, and my understanding at the high level is still a bit fuzzy. But, if I'm understanding things correctly, this "checkout" is really something more like "translate ostree commit into image", right? If so would it be clearer as its own subcommand, like How many of the checkout options would we really want to use/support for this? |
Yeah, maybe composefs-generate is better. I'm not sure how important all the features are, but we want to support everything that rpm-ostree uses, which is at least the union stuff. You would know better than I here. |
Basically, wherever rpm-ostree now does a checkout, we're likely to want to make a composefs instead, using the same features. |
This is really interesting stuff :). FUSE has a "passthrough" optimisation where userspace deals with permissions, etc. but then can return a file descriptor to the kernel such that all reads/writes don't have any FUSE overhead: https://lwn.net/Articles/843873/ . I'd planned to use this in my ostree-fuse filesystem, but I've mostly lost interested due to the amount of work that would be required to bring the rust FUSE library (fuser) up to scratch.
|
Rebased on top of the latest composefs tree. This version has a nicer integration with the build-system, and now has a This is still WIP though. For this to be more complete we would need:
|
With the latest commit I was able to boot an ostree system build from the latest fedora coreos (built using coreos-assembler) with a bunch of workaround hacks:
Note that the above still runs without fserity enabled for the files. That is gonna need more FCOS plumbing to ensure we get fsverity enabled on the files in the ostree repo. |
Also, with the above rpm-ostreed.service fails with: Status: "error: Couldn't start daemon: Error setting up sysroot: loading sysroot: Unexpected state: /run/ostree-booted found and in / sysroot, but bootloader entry not found" But I believe this is fallout from the disabling of rdcore. |
Nice! I think it is likely indeed that we'd need to adapt |
Its not just that overlayfs can have multiple sources, they in turn can be weird filesystems (like another overlayfs) so you would have to recurse through the whole stack. |
Yep that's fine, the rootmap code already understands recursion because block devices can already be stacked in that way (dm-crypt on raid etc.) |
@alexlarsson: PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@cgwalters How do you currently find stacked block devices like that? We might need to change composefs so that its possible to find the underlying fs easily. For example, by canonicalizing the mount source path (its currenly just what was given). |
WDYT about making this a build time option? I think it'd make merging easier. |
I rebased this work on the latest composefs userspace and switched it to use the new overlay+erofs approach. This means that this code works as-is on a non-modified kernel, although it will require kernel changes to use the fs-verity validating aspects of composefs. For now, this is just the initial work of using composfs at all, there is not work on fs-verity. For that to happen we need further work to:
Additionally we need to consider exactly how/when to enable composefs. Right now the patch just always uses it when built-in, but that may not always work or be wanted. However, we also don't want anyone to be able to just disable composefs if the OS relies on it for security. Another question is about the deploy dir, currently we both deploy and create the composefs image. Maybe we should only deploy the /etc directory if composefs is used? |
I'm currently working on composefs use in ostree: ostreedev/ostree#2640 I'm running into issues with ignition when testing it. Several places look at /sysroot and try to find the backing filesystem for it. This works because /sysroot is typically a bind-mount of some deploy directory in the real sysroot. However, in the composefs case, /sysroot is a composefs mount and doesn't have a real backing block device. This change switches these cases to look at /sysroot/sysroot instead. This is the actual real sysroot after ostree-prepare-root did its work (and before pivot-rooting into it).
I've been testing this with fedora-coreos, but that needs some fixes to work properly: |
49dafee
to
7aaf24c
Compare
The composefs code will need this.
This supports checking out a commit into a tree which is then converted into a composefs image containing fs-verity digests for all the regular files, and payloads that are relative to a the `repo/objects` directory of a bare ostree repo. Some specal files are always created in the image. This ensures that various directories (usr, etc, boot, var, sysroot) exists in the created image, even if they were not in the source commit. These are needed (as bindmount targets) if you want to boot from the image. In the non-composefs case these are just created as needed in the checked out deploydir, but we can't do that here. This is all controlled by the new ex-integrity config section, which has the following layout: ``` [ex-integrity] fsverity=yes/no/maybe composefs=yes/no/maybe composefs-apply-sig=yes/no composefs-add-metadata=yes/no composefs-keyfiile=/a/path composefs-certfile=/a/path ``` The `fsverity` key overrides the old `ex-fsverity` section if specified. The default for all these is for the new behaviour to be disabled. Additionally, enabling composefs implies fsverity defaults to `maybe`, to avoid having to set both.
If `composefs-apply-sig` is enabled (default no) we add an ostree.composefs digest to the commit metadata. This can be verified on deploy. This is a separate option from the generic `composefs` option which controls whether composefs is used during deploy. It is separate because we want to not have to force use of fs-verity, etc during the build. If the `composefs-certfile` and `composefs-keyfile` keys in the ex-integrity group are set, then the commit metadata also gets a ostree.composefs-sig containing the signature of the composefs file.
This can be used as a composefs source for the root fs instead of the checkout by pointing the basedir to /ostree/repo/objects. We only write the file is `composefs` is enabled. We enable ensure_rootfs_dirs when building the image which adds the required root dirs to the image. In particular, this includes /etc which often isn't in ostree commits in use. We also create an (empty) .ostree.mnt directory, where composefs will mount the erofs image that will be used as overlayfs lowerdir for the root overlayfs mount. This way we can find the deploy dir from the root overlayfs mount options. If the commit has composefs digests recorded we verify those with the created file. It also applies the fs-verity signature if it is recorded, unless this is disabled with the ex-integrity.composefs-apply-sign=false option.
In many cases, such as when using osbuild, we are not preparing the final deployment but rather a rootfs tree that will eventually be copied to the final location. In that case we don't want to apply the signature directly but when the deployment is copied in place. To make this situateion workable we also write the signature to a file next to the composefs image file. Then whatever mechanism that does the final copy can apply the signature.
This changes it into read_proc_cmdline_key(), as this will later be used to read additional keys.
This changes ostree-prepare-root to use the .ostree.cfs image as a composefs filesystem, instead of the checkout. By default, composefs is used if support is built in and the .ostree.cfs file exists in the deploy dir, otherwise we fall back to the old method. However, if the ot-composefs kernel option is specified this can be tweaked as per: * off: Never use composefsz * maybe: Use if possible * on: Fail if not possible * signed: Fail if the cfs image is not fs-verity signed with a key in the keyring. * digest=....: Fail if the cfs image does not match the specified digest. The final layout when composefs is active is: / ro overlayfs mount for composefs /sysroot "real" root /etc rw bind mount to $deploydir/etc /var rw bind mount to $vardir We also specify the $deploydir/.ostree-mnt directory as the (internal) mountpoint for the erofs mount for composefs. This can be used to map the root fs back to the deploy id/dir in use, A further note: I didn't test the .usr-ovl-work overlayfs case, but a comment mentions that you can't mount overlayfs on top of a readonly mount. That seems incompatible with composefs. If this is needed we have to merge that with the overlayfs that composefs itself sets up, which is possible with the libcomposefs APIs.
In the case of composefs, we cannot compare the devino of the rootfs and the deploy dir, because the root is the composefs mount, not a bind mount. Instead we check the devino of the etc subdir of the deploy, because this is a bind mount even when using composefs.
When using composefs the root fs will always be read-only, but in this case we should still continue remounting /sysroot. So, we record a /run/ostree-composefs-root.stamp file in ostree-prepare-root if composefs is used, and then react to it in ostree-remount.
Instead of using pkg-config, etc we just include composefs. In the end the library is just 5 c source files, and it is set up to be easy to use as a submodule. For now, composefs support is disabled by default.
This enables --with-composefs on: * Fedora Latest * Debian Testing * Ubuntu Latest These all should have new enough version of dependencies.
I have a rebased version of this almost ready, but i'm finishing up some details. Hopefully done soon. |
We can't safely apply the fs-verity with signature until we have booted with the new initrd, because the public key that matches the signature is loaded from it. So, instead we save the .sig file next to the compoosefs, and on the first boot we detect that it is there, and the composefs file isn't fs-verity, so we apply it. Things get a bit more complex due to having to temporarily make /sysroot read-write for the fsverity operation too.
Ok, I verified this patchset both with FCOS as well as the rhivos manifest, the later including both a signed commit (but not actually a signed initrd) as well as an update cycle. I think this is good to go in as a "version zero" experimental support. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just nits
* image during deploy. */ | ||
char buf[sizeof (struct fsverity_digest) + OSTREE_SHA256_DIGEST_LEN]; | ||
|
||
if (G_IS_UNIX_INPUT_STREAM (input)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should change this to an assertion, and then just error out very early on if we're not operating on a bare mode repository.
* This is the typical case when we're pulled into the target | ||
* system repo with verity on and are recreating the composefs | ||
* image during deploy. */ | ||
char buf[sizeof (struct fsverity_digest) + OSTREE_SHA256_DIGEST_LEN]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a best practice to use FS_VERITY_MAX_DIGEST_SIZE
here instead of OSTREE_SHA256_DIGEST_LEN
?
(We could also add a static assertion that the latter is less than the former)
g_autoptr (GVariant) dirtree = NULL; | ||
g_autoptr (GVariant) dirmeta = NULL; | ||
g_autoptr (GVariant) xattrs = NULL; | ||
struct lcfs_node_s *directory; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here and everywhere - in newer ostree code we prefer C99-style declare-and-initialize.
OstreeRepoFile *source, GCancellable *cancellable, GError **error) | ||
{ | ||
#ifdef HAVE_COMPOSEFS | ||
char *root_dirs[] = { "usr", "etc", "boot", "var", "sysroot" }; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll probably want this to be more configurable in the future.
return TRUE; | ||
#else | ||
g_set_error (error, G_IO_ERROR, G_IO_ERROR_NOT_SUPPORTED, | ||
"Composeefs is not supported in this ostree build"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/composeefs/composefs/
(Also this error is duplicated in a few places, let's add a single helper function to set the error for this case)
return glnx_throw (error, "Expected composefs fs-verity in metadata has the wrong size"); | ||
|
||
expected_digest = g_variant_get_data (metadata_composefs); | ||
if (memcmp (fsverity_digest, expected_digest, OSTREE_SHA256_DIGEST_LEN) != 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may be good to have a helper function for this that centralizes our checksum comparisons.
@@ -799,14 +799,26 @@ parse_deployment (OstreeSysroot *self, const char *boot_link, OstreeDeployment * | |||
if (looking_for_booted_deployment) | |||
{ | |||
struct stat stbuf; | |||
struct stat etc_stbuf = {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the {}
just for this one?
{ | ||
int fd; | ||
|
||
fd = open (path, O_RDONLY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
O_CLOEXEC
too...
@@ -121,4 +129,65 @@ touch_run_ostree (void) | |||
(void)close (fd); | |||
} | |||
|
|||
static inline unsigned char * | |||
read_file (const char *path, size_t *out_len) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm maybe we should just link in glib again here.
if (signature != NULL) | ||
{ | ||
/* If we're read-only we temporarily make it read-write to sign the image */ | ||
if (!sysroot_currently_writable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems a bit ugly. I am not sure I see the value in supporting the IMO weird historical case of "mount rootfs readonly in initramfs, remount rw in real root" if we're just playing games like this by mounting it readonly, then writable, then readonly again, then writable again!
IOW, we can just require rw
on the kernel cmdline.
Test just failed on a single flake. |
@cgwalters: Overrode contexts on behalf of cgwalters: continuous-integration/jenkins/pr-merge In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Yes, agreed. There'll be a lot of followup, but it'd be good to start making it easier to play with "hands on" in git main. |
This pairs with ostreedev/ostree#2640 It's all off by default (to state the obvious). But one can do e.g.: ``` $ cat >> src/config/image.yaml << EOF rootfs: ext4verity composefs: unsigned EOF ``` You can also try out `composefs: signed` and also do: ``` $ mkdir -p secrets $ openssl req -newkey rsa:4096 -nodes -keyout secrets/root-composefs-key.pem -x509 -out secrets/root-composefs-cert.pem ``` (But this is not *yet* a focus) More in ostreedev/ostree#2867
To aid ostreedev/ostree#2640 Signed-off-by: Colin Walters <[email protected]>
This pairs with ostreedev/ostree#2640 It's all off by default (to state the obvious). But one can do e.g.: ``` $ cat >> src/config/image.yaml << EOF rootfs: ext4verity composefs: unsigned EOF ``` You can also try out `composefs: signed` and also do: ``` $ mkdir -p secrets $ openssl req -newkey rsa:4096 -nodes -keyout secrets/root-composefs-key.pem -x509 -out secrets/root-composefs-cert.pem ``` (But this is not *yet* a focus) More in ostreedev/ostree#2867
This pairs with ostreedev/ostree#2640 It's all off by default (to state the obvious). But one can do e.g.: ``` $ cat >> src/config/image.yaml << EOF composefs: true EOF ``` To test out the fsverity bits, you also want `rootfs: ext4verity` More in ostreedev/ostree#2867
fsverity builtin signatures (CONFIG_FS_VERITY_BUILTIN_SIGNATURES) aren't the only way to do signatures with fsverity, and they have some major limitations. Yet, more users have tried to use them, e.g. recently by ostreedev/ostree#2640. In most cases this seems to be because users aren't sufficiently familiar with the limitations of this feature and what the alternatives are. Therefore, make some updates to the documentation to try to clarify the properties of this feature and nudge users in the right direction. Note that the Integrity Policy Enforcement (IPE) LSM, which is not yet upstream, is planned to use the builtin signatures. (This differs from IMA, which uses its own signature mechanism.) For that reason, my earlier patch "fsverity: mark builtin signatures as deprecated" (https://lore.kernel.org/r/[email protected]), which marked builtin signatures as "deprecated", was controversial. This patch therefore stops short of marking the feature as deprecated. I've also revised the language to focus on better explaining the feature and what its alternatives are. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Eric Biggers <[email protected]>
fsverity builtin signatures (CONFIG_FS_VERITY_BUILTIN_SIGNATURES) aren't the only way to do signatures with fsverity, and they have some major limitations. Yet, more users have tried to use them, e.g. recently by ostreedev/ostree#2640. In most cases this seems to be because users aren't sufficiently familiar with the limitations of this feature and what the alternatives are. Therefore, make some updates to the documentation to try to clarify the properties of this feature and nudge users in the right direction. Note that the Integrity Policy Enforcement (IPE) LSM, which is not yet upstream, is planned to use the builtin signatures. (This differs from IMA, which uses its own signature mechanism.) For that reason, my earlier patch "fsverity: mark builtin signatures as deprecated" (https://lore.kernel.org/r/[email protected]), which marked builtin signatures as "deprecated", was controversial. This patch therefore stops short of marking the feature as deprecated. I've also revised the language to focus on better explaining the feature and what its alternatives are. Link: https://lore.kernel.org/r/[email protected] Signed-off-by: Eric Biggers <[email protected]>
fsverity builtin signatures (CONFIG_FS_VERITY_BUILTIN_SIGNATURES) aren't the only way to do signatures with fsverity, and they have some major limitations. Yet, more users have tried to use them, e.g. recently by ostreedev/ostree#2640. In most cases this seems to be because users aren't sufficiently familiar with the limitations of this feature and what the alternatives are. Therefore, make some updates to the documentation to try to clarify the properties of this feature and nudge users in the right direction. Note that the Integrity Policy Enforcement (IPE) LSM, which is not yet upstream, is planned to use the builtin signatures. (This differs from IMA, which uses its own signature mechanism.) For that reason, my earlier patch "fsverity: mark builtin signatures as deprecated" (https://lore.kernel.org/r/[email protected]), which marked builtin signatures as "deprecated", was controversial. This patch therefore stops short of marking the feature as deprecated. I've also revised the language to focus on better explaining the feature and what its alternatives are. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Colin Walters <[email protected]> Signed-off-by: Eric Biggers <[email protected]>
fsverity builtin signatures (CONFIG_FS_VERITY_BUILTIN_SIGNATURES) aren't the only way to do signatures with fsverity, and they have some major limitations. Yet, more users have tried to use them, e.g. recently by ostreedev/ostree#2640. In most cases this seems to be because users aren't sufficiently familiar with the limitations of this feature and what the alternatives are. Therefore, make some updates to the documentation to try to clarify the properties of this feature and nudge users in the right direction. Note that the Integrity Policy Enforcement (IPE) LSM, which is not yet upstream, is planned to use the builtin signatures. (This differs from IMA, which uses its own signature mechanism.) For that reason, my earlier patch "fsverity: mark builtin signatures as deprecated" (https://lore.kernel.org/r/[email protected]), which marked builtin signatures as "deprecated", was controversial. This patch therefore stops short of marking the feature as deprecated. I've also revised the language to focus on better explaining the feature and what its alternatives are. Link: https://lore.kernel.org/r/[email protected] Reviewed-by: Colin Walters <[email protected]> Reviewed-by: Luca Boccassi <[email protected]> Signed-off-by: Eric Biggers <[email protected]>
This pairs with ostreedev/ostree#2640 It's all off by default (to state the obvious). But one can do e.g.: ``` $ cat >> src/config/image.yaml << EOF composefs: true EOF ``` To test out the fsverity bits, you also want `rootfs: ext4verity` More in ostreedev/ostree#2867
This is some work in progress for supporting composefs in ostree.
The initial version just supports checking out a commit to a composefs image.