Stream data more granularly #4
Comments
I wonder how granular we want to go 🤔 For example, imagine multiple build-time SBOMs attached to an image -- if the user asks for all Go packages used in the build environments, should we download all of the SBOMs (which are likely to be large 🎉) before returning anything? Or should we be able to start returning data as soon as it's available?

I think we probably want to make use of the Then we could do something like:

    img := loader.Load("moby/buildkit:latest")
    eg.Go(func() error {
        name, err := img.Name(ctx)
        if err != nil {
            return err
        }
        // display name
        return nil
    })
    eg.Go(func() error {
        sbom, err := img.SBOM(ctx)
        if err != nil {
            return err
        }
        // display sbom
        return nil
    })

Alternatively, we could go for something that uses channels, so the client doesn't need to manage a bunch of different goroutines:

    img := loader.Load(ctx, "moby/buildkit:latest")
    select {
    case sbom := <-img.SBOM():
        // display sbom
    case name := <-img.Name():
        // display name
    case err := <-img.Error():
        // handle error
        return err
    }

I think I'd prefer something channel-based like the second, since it requires less thread management from the client and also lets us load packages progressively, so we could return results from multiple SBOMs as they become available:

    case pkg := <-img.Packages():
        // display a package

Any thoughts?
The initial idea for this library was that, given an image source on any transport (registry, content store, OCI layout), it parses the image and returns a structured representation of it. The current type is https://github.com/docker/go-imageinspect/blob/main/types.go#L27 . While doing that, the library may use other APIs, such as the GitHub or Hub API, as additional data sources. The caller doesn't need to know how many APIs are involved or whether a value is loaded from a specific object in the registry. For example, the image description could be loaded from an annotation on the index, the manifest, or the descriptor (2x), or from GitHub or Hub.

If we just add a bunch of helper functions for loading specific items, then this goal is not really achieved. The caller would need to know what separate objects are in the registry and other APIs, ask them individually and then try to combine their results. They would need to set up a bunch of

An alternative is that we define a bunch of data states that can be requested separately. They can be thought of as selectors or capabilities. In principle, they can exist for every field that the image type defines, but we probably don't need this level of granularity.

Whether the user wants properties for a specific platform (the current one in most cases) or for a map of all platforms can also be added as a parameter either to
So, as mentioned by @tonistiigi earlier, we could consider using GraphQL as the API for this. A couple of benefits:
Some potential downsides:
GraphQL doesn't really fix the streaming aspect. The client would still need to know to ask for some things earlier, predicting that this data could be loaded faster. It works for custom requests but not for the case of "give me all data but start sending as soon as you have even some of it". As this is a library, there is also the question of how easy it would be for clients to call. With something like #4 (comment), it would probably be easy to wrap the library with GraphQL for a service that wants to expose that API.
Getting the full result takes a lot of requests to different services.
How can we refactor the library so that basic information can be retrieved quickly and the result is updated as more info becomes available?
Some info is always needed. For example, signatures need to be validated before anything can be shown about the image. On the other hand, the full SBOM does not need to be loaded before more basic data can be shown.
Usually, only one architecture is shown at a time, but for some fields we still need to validate the other architectures as well, so that the user doesn't make assumptions about other architectures if they are very different.