Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: Make auth/pull_manifest public, add API to stream layer #564

Closed
wants to merge 1 commit into from

Conversation

cgwalters
Copy link

For a project I'm working on, I want to stream the layer
content, not collect it into a Vec<u8>.

For a project I'm working on, I want to stream the layer
content, not collect it into a `Vec<u8>`.
@ghost
Copy link

ghost commented Apr 6, 2021

CLA assistant check
Thank you for your submission, we really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.

❌ cgwalters sign now
You have signed the CLA already but the status is still pending? Let us recheck it.

@cgwalters
Copy link
Author

Hi, filing this as draft since I think some discussion is required. I can imagine for the wasm case loading everything into a Vec<u8> is fine but for my project I want to stream the content, since it can be large.

This exposes the raw reqwest::Response type; I think to hide it we'd need to make a new type right? I can do that if you prefer.

Also, AFAICS currently this code doesn't actually verify the blob sha256 - is that intentional?

@thomastaylor312
Copy link
Member

Hi there @cgwalters! Thanks for the PR. I am all for streaming support, but I definitely think we should return our own type or an impl Stream for streaming a layer rather than the underlying Response object.

Also, AFAICS currently this code doesn't actually verify the blob sha256 - is that intentional?

Basically most features that are missing in this crate are not intentional. If you'd like to add this, please feel free to! But it is not needed for this PR by any means.

Also, could you explain the need for making auth and pull_manifest public? I think that makes sense to me but it would be good to capture the justification here just in case

@bacongobbler
Copy link
Collaborator

Also, could you explain the need for making auth and pull_manifest public? I think that makes sense to me but it would be good to capture the justification here just in case

I'd also like to know - these methods were intentionally left private so that credentials could not be injected by any malicious code. The calls to pull_manifest and auth are private because the user first provides the credentials in the client, then just makes calls to pull() or other calls. I'm concerned about leaking internal functions to the public.

@cgwalters
Copy link
Author

Unrelated to this PR, I have stumbled onto the fact there's a variety of crates out there for this, like https://github.com/camallo/dkregistry-rs and when I went to upload one of my crates, https://crates.io/crates/ant_king_image had just been uploaded too.

Over here camallo/dkregistry-rs#200 (comment) I was proposing to move some code into the github.com/containers organization which I believe is intended as a generic home for this type of stuff, although of course it's very Go focused over there.

I definitely think we should return our own type or an impl Stream for streaming a layer rather than the underlying Response object.

Yep, will do.

Also, could you explain the need for making auth and pull_manifest public? I think that makes sense to me but it would be good to capture the justification here just in case

In order to find out which layers to pull, we need the manifest right? You can see how I'm using this code here
https://github.com/ostreedev/ostree-rs-ext/blob/07515091921d871edfba5bd69e9d34fe689366c9/lib/src/container/client.rs#L48
(The high level goal is embedding CoreOS update images in a container image, that's the need for streaming)

I'd also like to know - these methods were intentionally left private so that credentials could not be injected by any malicious code.

I don't understand this concern - if you have malicious code in your address space you've already lost (I mean, absent things like wasm sandboxing). What malicious code are you thinking of and how would it be blocked by Rust method visibility?

@cgwalters
Copy link
Author

Actually I just put two and two together - is krustlet actually just running all the the wasm apps in the same process it uses as a control plane to talk to the kube control plane etc.? That seems...suboptimal if so. WASM is good and all but it seems like it'd be very useful to fork off a worker process that hosts apps and can be much more sandboxed using Linux container features too etc.

@thomastaylor312
Copy link
Member

can be much more sandboxed using Linux container features too etc.

Unfortunately this won't be the case. Krustlet is built explicitly to be cross platform. It can run on Linux, Windows, and macOS, which don't support the same container features

I can't answer your other questions as well as @bacongobbler can, but we have discussed the execution model at length. We chose to run all of the wasm "processes" within the same parent process because those threads can be completely parked and low overhead. We aren't opposed to splitting it out, but each provider can implement how they'd like to do it on their own. We are waiting for the Wasm ecosystem to stabilize some more and address process isolation before we do it for the wasi-provider. If you want to discuss this more though, we should probably do it in an issue as we don't want to clog up this PR that much

@cgwalters
Copy link
Author

For a variety of reasons I'm going to instead work on forking off https://github.com/containers/skopeo as a subprocess instead (support for mirroring, signature verification, etc.). There's some code related to this in: https://github.com/ostreedev/ostree-rs-ext/tree/main/lib/src/container

At some point in the future once skopeo has a saner out-of-process API I may try to extract that code to a crate, at which time perhaps krustlet could consider using it optionally.

@cgwalters
Copy link
Author

(time passes)

OK, https://crates.io/crates/containers-image-proxy is a thing now. This is still listed as a private/experimental API in skopeo (more in containers/skopeo#1476 ), but it may be interesting for you (krustlet) at least optionally down the line.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants