-
Notifications
You must be signed in to change notification settings - Fork 794
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add new experimental-image-proxy
hidden command
#1476
Conversation
If we agree to do this I'll take a look at things like wiring up https://github.com/cgwalters/container-image-proxy/blob/main/demo.py as a test case, moving over the README.md docs etc. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial reaction is that I’m rather torn.
Pragmatically, because c/image tries hard to keep a stable API, it would probably be cheap enough for us to just carry this as long as you care about it and keep maintaining it — and the size savings are a real benefit, of course.
OTOH, as a matter of principle, this looks very specialized (a single-image HTTP server), and it would be somewhat a bad precedent for Skopeo, making it harder to reject a dozen more of such specialized commands — also considering how little of even Skopeo’s CLI it actually needs to share.
@vrothberg @rhatdan WDYT?
Looking at the code as is:
- The HTTP server thing is probably working well enough, but it’s so minimal and full of traps for the unwary. (To an extent, sure, we don’t need to care.) I haven’t looked into this in detail (
net/http
is pretty big), but it seems that it might be possible to get much closer to a “real” HTTP server with anet/http/server.Serve()
and a specialized single-shotnet.Listener
implementation. - Experimental or not, how do you plan to evolve the interface over time? Version-lock an exact Skopeo version with an exact client version? If not, there needs to be at the very least a “server version” endpoint.
(The individual review comments are probably irrelevant at this stage.)
cmd/skopeo/proxy.go
Outdated
} | ||
w.Header().Add("Manifest-Digest", digest.String()) | ||
|
||
ociManifest, err := manifest.OCI1FromManifest(rawManifest) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is invalid if the original image is not in fact OCI; the code should do at least a MIME type check or manifest.GuessMIMEType
.
(To do a conversion, use types.Image.UpdatedImage
, possibly having to actually compute DiffID values of all layer content if the input is schema1.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, if this doesn’t specifically need an OCI-formatted manifest, types.Image.Inspect
provides some information from an image of any format; in particular that’s a reasonable way to list all layer digests, enabling use of /blobs/
, if that were the only thing necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So...part of my intent here was to allow callers to pretend the world is OCI only.
Now one thing I think we're running up against here is that a lot of the machinery in c/image is (AFAICS) oriented towards copying images (which makes total sense). So I tried making a "dummy" oci
formatted destination and getting the manifest out of that, but I still get the Docker schema mime types in layers
. Haven't debugged why that is.
Now, the more I think about this though the more I think inverting the client/server may be right. I don't yet have a sense for how hard that'd be.
I think if we chose to do so, supporting a flow like a CopyImage
method where the client passes another socketpair()
and acts as a server on that fd or so (while keeping the current code intact) would be doable.
Anyways, so I think this is OK to ship as is, but if you know of the dance I need to do to convert MIME types at least, that'd be good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ll read the linked proposal (and the updated code) later, for now just a very quick first impression:
-
I’m very reluctant to turn all of
ImageDestination
into a stable RPC API. We are actually already starting to hide parts of it from the public type (comparec/image/internal/types
) just because we don’t know how it is going to evolve. Sure, API versioning and capability checks and…, but that’s an extra maintenance that’s basically known to materialize at least once this year again. -
Could you describe what are you building, at a … higher but still somewhat low :) … level, please?
I kinda thought, to be honest without investigating at all, that the “encapsulate host operating system updates” implies specially-built images only usable with a specialized consumer, so it should be possible to to hard-code the assumption that the images are in the OCI (or schema2) format in the producer of the image as well as the consumer. So I was expecting this to just refuse non-OCI, or to be a dumb wrapper around
ImageSource.GetManifest
returning the MIME type + byte array as is (and for the client to then refuse non-OCI).Alternatively, I was thinking that the application doesn’t care about the internals of the manifest format and can just use
types.Image.Inspect
to extract the format-independent aspects of the manifest/config, like the layer list.If neither is true, there are probably some extra requirements I’m not thinking about.
(To be clear about the cost, if this is supposed to accept schema1 images and always provide an OCI-formatted image, the server will need, for the conversion, to read all of the layers and decompress them, and then keep a state of the generated config+manifest. Compare the earlier discussions WRT consuming a “streaming OCI archive” destination — the data must be buffered at some point because non-byte-for-byte-identical image copies can, in general, only produce the manifest last, but the consumers need the manifest first; that’s the case for compression/decompression of the image, as well as format conversions, at least if schema1 is involved.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you describe what are you building, at a … higher but still somewhat low :) … level, please?
I sent a provisional calendar invite for Monday 9:30am EST. (Anyone else following along who is interested, please say if you want to join)
I kinda thought, to be honest without investigating at all, that the “encapsulate host operating system updates” implies specially-built images only usable with a specialized consumer,
Yes, this was the original goal. I call this the "encapsulation" flow. It's like wrapping a letter (OS update) in an envelope (container) - the envelope helps get the letter to the destination, then the envelope is discarded.
That said, even with this "encapsulation" model, it is very useful to be able to run this container as a container - i.e. via podman run
or in Kubernetes etc. mainly for testing and inspection. We don't actually expect it to be an application base image though.
However, we more recently started an effort to support derivation (that's ostreedev/ostree-rs-ext#12 ) - because when you're shipping OS updates in a container, it's pretty natural to support container-style derivation - and this is where the letter/envelope metaphor no longer applies.
And more specifically, it becomes more of a need to support a generic container build tool just reusing our base blob.
That said, even when doing derivation, we would not primarily support people using these derived containers as containerized applications. We have "traditional" OS base images for that.
--
OK now, let's talk about schema1. My instinct says that we do not need to support it. Basically, we need to support pushing/pulling images from docker hub/quay.io etc., the openshift registry, recent versions of upstream docker/distribution, things like Artifactory. If anyone tries to do this with either an old registry or an old version of docker
, I am happy to tell them to upgrade. But, you would probably know better than me - is that plausible? Is there something I'm missing?
I'll investigate changing the code to accept oci or d2s2, and rejecting anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK now, let's talk about schema1. My instinct says that we do not need to support it.
…
But, you would probably know better than me - is that plausible? Is there something I'm missing?
Yes, I think that’s a very plausible path; e.g. recent docker/distribution server implementations reject schema1 by default already.
It might not be entirely costless, e.g. due to some bugs in c/image we were falling back to schema1 on uploads to Quay.io a few years ago, and something similar (as yet undiagnosed) has shown up recently (basically if a registry rejects both OCI and schema2, we do try to upload as schema1; OTOH the recent case didn’t end up with a successful upload anyway, so it is not a known case in the wild).
I am not aware of any specific bugs right now, or any reason why schema1 should have to be supported; basically I’m just saying that the code in this PR should actually check the MIME type and fail somewhat cleanly — and if it ever did fail, the cause would need to be investigated and fixed, but I feel quite confident that it could be fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I pushed an update which now accepts only oci and schema2 (converted to oci) and rejects everything else.
Hmm...have there been any other related requests? EDIT: Also let's be clear this particular one is really only intended to be used by ostree to start. If (as discussed in the other ticket) something like https://github.com/krustlet/krustlet ends up looking at it, we can revisit things.
Well, I've been thinking about the API more, and given the desire for file descriptor passing, I think DBus may be a good fit (but happy to hear other suggestions). Agree on the need for something like a |
Is there a reason this could not be done, just talking to a podman-service rest api? |
Not of this kind, IIRC. (From time to time, there are RFEs for very specialized features that should probably be a site-specific shell script.) It’s quite fair to argue that “a matter of principle” is not a pragmatic reason not to do this.
[I’m not sure file descriptor passing makes that much sense, the Go API is not very strongly tied to OS-level file descriptors, in particular the HTTP stack is fairly happy to build stacks of WRT the disk-space-saving concept of sharing a single Skopeo binary, a HTTP server would actually work better than adding a DBus dependency. But that’s not something I’m too concerned about.
Right, I don’t think it’s a big problem, just something to account for in the design. E.g. keeping the two components in exact lockstep might work perfectly fine, as long as the exact lockstep is actually consciously maintained. |
The REST api works on the granularity of images but doesn't allow for fetching specific manifests, configs and blobs. But we could certainly end endpoints for that. I have a slight preference actually do that in Podman since it's already doing REST-y things with CI in place for that etc. |
I more mean that we have the Go side write to a pipe, and the client read it. And more generally, separating control and data channels is a good pattern. |
Well, you guys need to pick 😉 Though, I do notice podman already vendors godbus whereas skopeo doesn't. |
Bringing @mheon since he blocked that PR. To get his opinion. |
FWIW if this ends up in either of Podman/Skopeo (and does not introduce, vaguely, excessive dependencies), I’m personally fine with it being added to Skopeo rather than Podman. |
I continue to be of the opinion that, feature-wise, Skopeo is a better fit than Podman - this still strikes me as a dedicated image management command, and that's definitely Skopeo's area of specialty. However, given this is only a hidden command, if the maintainers here are strongly against including it, I won't argue too strongly about including in Podman. |
FWIW, I'd vote for keeping this in Skopeo. |
OK, Skopeo it is. |
OK, let’s do this. @cgwalters If this is for exclusive use of Elsewhere you mentioned that you are trying an entirely different interface — do you want to move this PR to a draft status in the meantime, so that it doesn’t get merged in the current form? |
Copying from cgwalters/container-image-proxy#1 (comment) OK I spent about 2 hours trying to do a DBus interface It's not trivial to implement. Then I tried to do the "fake net.Listener" but ran into issues I had trouble debugging in the Go http flow, and it was clearly a hack. And then what I realized is that there's the better option for this case I used for https://github.com/coreos/bootupd/blob/main/src/ipc.rs - SOCK_SEQPACKET where we do something super simple like JSON inside. I've updated this PR to implement that new protocol, and I also have some Rust client code in https://github.com/cgwalters/container-image-proxy/blob/seqpacket/example-rust/src/main.rs Unless anyone has further thoughts/objections I'll work on refining this more and e.g. try to move over the README.md docs and have some basic tests as part of this PR. I also need to go through and address the non-IPC-mechanics part of the PR review so far. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to confirm, this looks good to me.
Up to you when you think this should be merged.
OK ostreedev/ostree-rs-ext#118 is updated to use this.
Thanks! I'm still working through the backlog of your comments, in particular I think the the OCI-upconverting is a must-fix. But I think I should be done with it all by today. |
bd41021
to
11737ee
Compare
786c178
to
16171ce
Compare
I split out the Rust client code I've been working on into a separate sub-crate; I plan to publish to crates.io at some point. See later commit in ostreedev/ostree-rs-ext#118 |
The osx CI failure looks spurious, right? Other than that I think this should be good to go if there are no other comments. Thanks for all the detailed review so far! |
Can you run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, last round, I promise.
This imports the code from https://github.com/cgwalters/container-image-proxy First, assume one is operating on a codebase that isn't Go, but wants to interact with container images - we can't just include the Go containers/image library. The primary intended use case of this is for things like [ostree-containers](ostreedev/ostree-rs-ext#18) where we're using container images to encapsulate host operating system updates, but we don't want to involve the [containers/image](github.com/containers/image/) storage layer. Vendoring the containers/image stack in another project is a large lift; the stripped binary for this proxy standalone weighs in at 16M (I'm sure the lack of LTO and the overall simplicity of the Go compiler is a large factor). Anyways, I'd like to avoid shipping another copy. This command is marked as experimental, and hidden. The goal is just to use it from the ostree stack for now, ideally shipping at least in CentOS 9 Stream relatively soon. We can (and IMO should) change and improve it later. A lot more discussion in cgwalters/container-image-proxy#1
Ah, wow. I read most of that and sure, the C preprocessor has traps, but the Rust style cfg approach seems just better. Anyways, updated for latest comments! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
This code is extracted from https://github.com/ostreedev/ostree-rs-ext/pull/118/files See containers/skopeo#1476 for more information.
This uses a custom JSON-over-SOCK_SEQPACKET with fd passing for data. For more information, see containers/skopeo#1476
OK I've split out the Rust bindings to https://github.com/cgwalters/containers-image-proxy-rs Would it make sense to move this into the containers/ GH organization? I've been thinking about this whole thing more, and I suspect at least krustlet could probably use this too mostly as is. However, another obvious use case occurred to me around e.g. the containrs project. And there's an interesting question there around whether such a project would want to do a similar RPC style thing for containers/storage, or do that on its own. And another interesting sub-thread here is: we could probably significantly slim down skopeo by having it delegate e.g. I've also been thinking about write support...it would seem natural to do, but honestly not urgent. What we do right now in writing an Anyways, there's a lot of bigger picture questions opened by this. We don't need to solve any of them now - simply having this ship in Fedora/C9S in the relatively near future will unblock my immediate needs. But I'm just writing down things here to think about. |
This uses a custom JSON-over-SOCK_SEQPACKET with fd passing for data. For more information, see containers/skopeo#1476
This uses a custom JSON-over-SOCK_SEQPACKET with fd passing for data. For more information, see containers/skopeo#1476
Doesn’t that end up pretty similar to CRI-O, in size and dependency extent? Sure, it doesn’t have quite the same RPC interface, and combining the functionality of CRI-O + Kubelet into a single daemon sounds appealing at least at a first glance. But I’d expect that a
[Well, we want the world of shared libraries again :) and we can’t get that, it seems.] c/storage has been the primary driver for API extensions to
I could similarly argue that all c/storage use should be through Surprisingly uses of |
BTW it is possible to compile-out (most of) the c/storage transport, for system builders that care to that level. ( containers/image#1395 ) |
Oops, that’s of course irrelevant because the RPC would be over the c/image transport concept, not the c/storage API. |
You may interleave invocations of these methods, e.g. one | ||
can also invoke `OpenImage` multiple times, as well as | ||
starting multiple GetBlob requests before calling `FinishPipe` | ||
on them. The server will stream data into the pipefd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW individual ImageSource
objects are not maintained to be re-entrant/thread-safe, unless HasThreadSafeGetBlob()
returns true
, in which case it is OK to have several concurrent GetBlob()
calls outstanding (but not concurrently with anything else like GetManifest
or the calls done by image.FromSource
). E.g. in “initiate GetBlob
; call img.Manifest
; close ImageSource
; finish consuming GetBlob
” the two middle calls are invalid.
(Similarly for destinations and HasThreadSafePutBlob
. It is OK, on a single goroutine, to create multiple ImageSource
s from an ImageReference
, and then use all of the ImageSource
s concurrently — with the full overhead of initializing a separate source, e.g. in the docker://
case a version check, credential lookup, getting a separate token, and the like.)
This uses https://copr.fedorainfracloud.org/coprs/g/CoreOS/experimental/ "Contains builds of coreos/rpm-ostree#3139 and containers/skopeo#1476 for CentOS8, plus dependencies."
This uses https://copr.fedorainfracloud.org/coprs/g/CoreOS/experimental/ "Contains builds of coreos/rpm-ostree#3139 and containers/skopeo#1476 for CentOS8, plus dependencies."
This uses https://copr.fedorainfracloud.org/coprs/g/CoreOS/experimental/ "Contains builds of coreos/rpm-ostree#3139 and containers/skopeo#1476 for CentOS8, plus dependencies."
This imports the code from https://github.com/cgwalters/container-image-proxy
Why ?
First, assume one is operating on a codebase that isn't Go, but wants
to interact with container images - we can't just include the Go containers/image
library.
The primary intended use case of this is for things like
ostree-containers
where we're using container images to encapsulate host operating system
updates, but we don't want to involve the containers/image
storage layer.
Vendoring the containers/image stack in another project is a large lift; the stripped
binary for this proxy standalone weighs in at 16M (I'm sure the lack
of LTO and the overall simplicity of the Go compiler is a large factor).
Anyways, I'd like to avoid shipping another copy.
Immediate future
This command is marked as experimental, and hidden. The goal is
just to use it from the ostree stack for now, ideally shipping at least
in CentOS 9 Stream relatively soon. We can (and IMO should)
change and improve it later.
Longer term
A lot more discussion in cgwalters/container-image-proxy#1