-
Notifications
You must be signed in to change notification settings - Fork 247
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Additional Layer Store no longer works for layers without TOC Digest annotations #2188
Comments
Cc: @ktock , this is from #1924 . “Truly lazy” pulling (only reading files on-demand) requires basing layer identities on the TOC identity to authenticate the file contents. FWIW c/image needs to know which of the digests is being used by the backend driver, see e.g. https://github.com/containers/image/blob/5263462aa97ae23d3a620b4dcdbbc557665a42c8/storage/storage_dest.go#L90-L110 . I suppose that would significantly affect the design of the filesystem interface. |
Not necessarily. See Below TLDR; about the underlying AuriStor distributed file system and how AuriStorFS "Layer Volumes" are generated AuriStor's LVGS could instead
However, there is still the question of whether it should be possible for an ALS to support images with layers that do not have a TOC (ie not estargz or zstd:chunked) TLDR; Background on the AuriStor distributed file system, AuriStorFS: In AuriStorFS, the unit of policy/management are named "AFS Volumes" correspond to a rooted directory of content (which can be replicated for high availability, for example). Volumes are scoped to a (typically) organizational 'cell'. So volume named 'foo' in the cell 'myorg.com' can be universally addressed via the following path /afs/.@mount/myorg.com/foo which is handled by the AuriStorFS Kernel module. This can leverage AuriStorFS DNS Service records for myorg.com.- TLDR; Background on AuriStorFS ALS:
|
There are two places where the validation could occur
|
Ah, thanks for the explanation. To summarize, the layers are not (necessarily) pulled on-demand at all, the layer is potentially populated in advance, making the TOC not essential to the backend; ordinary tar layers can be supported. I’m not at all authoritative (and probably not even well-informed) in this space. With that disclaimer, it seems possible to me to support both, by querying the ALS filesystem both by the compressed digest (as in the old version), and by the TOC digest (as in the new version); the backend could then support one or the other, depending on how it authenticates the layer. This would probably have to be another change to the ALS filesystem conventions, so that the backend can tell whether the value is a layer digest or a TOC digest (they can be ambiguous, at least for not very strict readers). That’s not ideal, but probably unavoidable — just reverting the change (is not at really an option and) would also be a change to the ALS filesystem conventions. That’s the limit of my expertise, I’ll defer to @ktock and @giuseppe for anything more detailed, and figuring out coordination / migration for other backends like stargz-snapshotter. |
That all depends on your definition of "on-demand". The "demand" for the AuriStorFS case is triggered by regular OS open/read/write/etc and is handled for the AFS distributed file sytem on the kernel level (perhaps with a bit of optimistic read-ahead). As an FYI: AuriStorFS manages its own pre-allocated cache on the host machine. The AFS cache has some nice features including surviving reboots and cache consistency File Server's callbacks if/when files have changed. Since AuriStorFS maintains its own cache, this also begs the question of what the AuriStor ALS should report as the uncompressed size (in .../info)? Arguably, if the uncompressed size is only used for things like helping OpenShift make image garbage collection decisions, then AuriStor ALS should always report the uncompressed size as ZERO.
Yes, as I said earlier, we could support both by creating two target paths (corresponding to compressed digest and TOC digest), though it feels a little hacky. Regardless, this goes back to the question of validation. For example there is always the chance that a malicious actor could poison the container runtime cache of "read-only" diff layers. My understanding is that composeFS+fsVerity would address, but borrowing your words "I’m not at all authoritative (and probably not even well-informed)" @cgwalters and Jeffrey Altman (the real AuriStorFS expert) discussed this in the Spring at RedHat Summit.
Perhaps it could just be additive rather than a change. For example ALS capabilities could be added to the additionallayerstore option in storage.conf, something like:
with these dangling off of the additionalLayerStore struct and realized by AdditionalLayer interface functionality (along with appropriate logic in storage_dest.go, etc) That way the end-user can configure whether or not they will allow the ALS endpoints that support compressed digests for ALS targets. If something like this makes sense, I think it could be surgically done on the existing codebase without changing too many things. Of course further design discussions and feedback would be necessary before proceeding. Finally, the ability to support 'legacy' ALS is important because our customer has a huge number of containers stored in their container registries, none of which were built with ```podman push -compress-format=zstd:chunked ...". It is unrealistic to require them to rebuild or pull/re-repush their existing images to leverage ALS |
(No, it is used to create a ~valid manifest for pushes from c/storage, and it must be exactly the size of the blob returned by Lines 138 to 145 in ebc19d3
That rather depends on who the “malicious actor” is. It is the exclusive responsibility of the ALS backend to ensure that the layer returned by an ALS digest query (whether TOC digest or uncompressed digest) precisely matches the requested digest; c/image+c/storage can’t do that itself without triggering an immediate full read of the layer, which would rather defeat the purpose (at least for TOC-based layers which support on-demand pulls of individual files). The user writing to c/storage, or configuring If the “owner” of the AFS filesystem is some other person, the trust relationship between the user deciding to use that AFS filesystem and the owner of that AFS filesystem is invisible to c/image+c/storage, and up to the ALS backend to define. (As one possibility, the user choosing to use the ALS might be, by that choice, delegating trust to the AFS filesystem owner, and then the AFS filesystem owner would be responsible for running, and not interfering with, tooling that manages the layer pulls and verifies the digests. But the tooling must be verifying the digests.)
If the goal is to protect against the AFS filesystem’s owner, that might be possible in the future, if we were verifying image signatures per the local system’s policy, and those image signatures covered the composeFS/verity data. And we would need some way to integrate composefs with the ALS API. Most of this currently does not exist.
I think supporting the in-production customers, and managing the config file / API transition (if any, and possibly both the one that has already happened and one that we might need to do to fix things) is an important conversation, but probably for a different, product-specific, channel. |
Background: The AuriStor's Additional Layer Store (ALS) FUSE driver simply mounts layer 'diff' directories directly from the the AuriStorFS distributed file system (based on image names and layer digests). The AuriStor ALS FUSE driver currently works with unmodified OCI images. It does not require nor does it use any TOC Digest or index file (as with stargz).
The AurIStorFS ALS FUSE driver currently services Layer content relative to the Layer Digest:
First Occurance: As of containers/image 5.31 (Podman 5.1), the ALS Semantics changed such that there is no longer an ALS Code path for images without a TOC Digest Annotation (containerd.io/snapshot/stargz/toc.tocJSONDigestAnnotation or containers/storage/pkg/chunked/internal/compression.ManifestChecksumKey or "containerd.io/snapshot/stargz/toc.digest" / "io.github.containers.zstd-chunked.manifest-checksum" respectively).
image/storage/storage_desc.go, tryReusingBlobAsPending() now requires options.TOCDigest != "" in order to execise/leverage ALS in its code path:
Other changes: The ALS Semantics seems to now be that, for those images with TOC annotations, ALS now expects the FUSE driver to service the layer based on the TOC Digest
Note: I did not participated in any of the design discussions around these changes to ALS, but it appears that ALS is being narrowed towards support of zstd:chunked, composefs, and fs-verity.
Issue: If the intention is to continue allowing ALS plugins (such as the AuriStorFS ALS) to be able to support unmodified OCI Images (ie without TOC annotations), then this is a bug that was introduced in image:5.31 and storage:1.54
Request: Some clarity on the intended ALS paths and path semantics that an ALS Fuse driver should be servicing (going forward) would be appreciated. I could then modify the AuriStorFS ALS Fuse driver to service layers appropriately. - Thanks
Note: While our current AuriStorFS ALS implementation uses a FUSE driver, our plan is to ultimately provide ALS support natively from the AuriStorFS Kernel module (along with also adding fs-verity support to AuriStorFS)
The text was updated successfully, but these errors were encountered: