-
Notifications
You must be signed in to change notification settings - Fork 383
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP - support multi-image docker archives #975
Conversation
libpod PR to illustrate how I imagine it to be used -> containers/podman#6811 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a very quick first pass for now.
Signed-off-by: Valentin Rothberg <[email protected]>
@mtrmac, mind taking another look? I've been starring at it too long. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From a quick look, this still seems to lean rather towards inheritance, almost towards monkey-patching rather than restructuring; most importantly I think we should not need two more ImageReference
implementations.
I was thinking:
- For sources:
- Split the part of
tarfile.Source
that contains archive-wide state (tarPath
,removeTarPathOnClose
, maybe some of the layer info) into adocker/internal/tarfile.ReadableArchive
or so (no need to make this public? I might have to think about the public/private structure more. The name can certainly be better.) - Keep the existing public
tarfile.Source
working, on top of that, but also add a way to create it with a caller-suppliedReadableArchive
(in which casetarfile.Source
does not drive deleting temporary files). - Add a
*ReadableArchive
toarchiveReference
; inarchive.newImageSource
, ifref.readableArchive
is set, use it to create atarfile.Source
; if not, use the old code path. - Add support for lookup by tag and ID/index/something to
archiveReference
; it automatically applies both to the single-image and multi-image case. - Then make
MultiImageSource
, which creates aReadableArchive
and provides an API to enumerate / create references.
- Split the part of
And similarly for destinations.
|
||
// Reference returns an ImageReference embedding the MultiImageDestination. | ||
func (m *MultiImageDestination) Reference() types.ImageReference { | ||
ref := &archiveReference{path: m.path} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This does not include the tag
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tags are set via AdditionalTags
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, but not on the reference = not in copy.Image
error messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OTOH there is still the case of saving untagged images, so references don’t always have any extra data anyway.
I’ll read all of this more carefully a bit later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Much appreciated, thanks!
I am on PTO tomorrow but would will go back to this on Monday morning. Ideally, we need to get the feature in next week.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mtrmac, ideas how to proceed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please proceed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’m sorry for not getting back, I did promise I would.
Still, #975 (review) was I think fairly clear to the general direction. See #991 for an unfinished prototype of the read side. Yes, the PR is larger, but it already includes the ability to read any (even untagged) image in an archive using a textual reference, and a lot of the “new” code is actually only moved — tarfile.Source
was just split into two, the only non-trivial net new code at that layer is just chooseManifest
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! I am still not sure how to proceed.
Now we have two PRs and I am worried we are running out of time; in fact, we might not get it into RHEL 8.3 any more.
If you agree, I want to focus on Podman-only needs first. We can still make follow-up cards for a more generally applicable solution, in case that will buy us some time. Certainly, the API shouldn't break.
// Reset the repoTags to prevent them from leaking into a following | ||
// image/manifest. | ||
d.repoTags = []reference.NamedTagged{} | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still stateful, then…
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More detailed comments after a careful read, finally. Conceptually, I’d still prefer for the code paths to be shared as much as possible, instead of a layer on top that partially overrides behavior like the destination here.
return s.loadTarManifest() | ||
if s.manifests != nil { | ||
return s.manifests, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is somewhat contra to the documentation of the function.
// MultiImageSourceItem is a reference to _one_ image in a multi-image archive. | ||
// Note that MultiImageSourceItem implements types.ImageReference. It's a | ||
// long-lived object that can only be closed via it's parent MultiImageSource. | ||
type MultiImageSourceItem struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually, I’m concerned about having two ImageReference
implementations with different behavior but the same string syntax. ref.Transport().ParseReference(ref.StringWithinTransport())
is documented to be “equivalent to” ref
; that’s easiest to do when there is just one implementation, and not the case here (with a docker-archive:$path
-formatted references that successfully access images in multi-image archives, but fail when used from the CLI).
} | ||
|
||
// Manifest returns the tarfile.ManifestItem. | ||
func (m *MultiImageSourceItem) Manifest() (*tarfile.ManifestItem, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing but RepoTags
is really usable by callers, see e.g. the use of Config
in containers/podman#6811 ; so I’m a bit reluctant to commit to this as an API. OTOH, we clearly can keep this stable.
// MultiImageDestinations allows for creating and writing to docker archives | ||
// that include more than one image. | ||
type MultiImageDestination struct { | ||
*archiveImageDestination |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doesn’t this externally expose PutBlob
and everything else?
|
||
// Close is a NOP. Please use Finalize() for committing the archive and | ||
// closing the underlying resources. | ||
func (m *MultiImageDestination) Close() error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Close()
is conceptually a bit different from Finalize()
(or ImageDestination.Commit
); it allows cleaning up the temporary files even on error.
Sure, this is a possible way to structure the API, but it’s a bit inconvenient to use: A typically caller will use something like defer multiDest.Close()
and might not even check for errors if there is already a failure with a more important root cause to preserve, whereas on success the caller really wants to check that multiDest.Finalize()
did succeed. Having “commit” and “deallocate” be the same operation forces every such caller to have a committed
flag that’s checked inside the defer
, or to have a critical part of the process in a defer
.
// Reset the repoTags to prevent them from leaking into a following | ||
// image/manifest. | ||
d.repoTags = []reference.NamedTagged{} | ||
return nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
… it’s also not documented that the caller is supposed to use (only) AddRepoTags
after each PutManifest
AFAICS.
Closing as @mtrmac took over. Thanks again! |
Add a
MultiImageArchive{Reader,Writer}
todocker/archive
to supportdocker archives with more than one image.
To allow the new archive reader/writer to be used for copying images,
add an
Image{Destination,Source}
tocopy.Options
. When set, thedestination/source referenced will be ignored and the specified
Image{Destination,Source}
will be used instead.Signed-off-by: Valentin Rothberg [email protected]