Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support reading any/all images from docker-archive: tarballs #991

Merged
merged 15 commits into from
Aug 20, 2020

Conversation

mtrmac
Copy link
Collaborator

@mtrmac mtrmac commented Jul 25, 2020

  • Add support for choosing an image to read from a docker-archive:, either by reference (using the existing syntax) or by index (docker-archive:/some/path:@0).
  • Add docker/archive.Reader that allows listing all images in a docker-archive-formatted file, and reading any/all of them, only copying/extracting the archive once.

See individual commit messages for details.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the direction and embedding the index into the reference!

docker/internal/tarfile/src.go Outdated Show resolved Hide resolved
docker/archive/multi_source.go Outdated Show resolved Hide resolved
docs/containers-transports.5.md Show resolved Hide resolved
// MultiImageSource manages a single Docker archive, allows listing its contents
// and accessing individual images with less overhead than creating image references
// individually (because the archive is, if necessary, copied or decompressed only once).
type MultiImageSource struct {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m not entirely happy about this name - it’s not an ImageSource. docker/archive.Reader? MultiReader? MultiImageReader?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying docker/archive.Reader — it’s not just for multi-image readers, a caller that wants to open a single-image archive, extract the tag from the manifest, and read the single image, would benefit as well.

)

// ReadableArchive is a ((docker save)-formatted) tar archive that allows random access to any component.
type ReadableArchive struct {
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name feels a bit weird, but so does internal/tarfile.Reader. I’m very much open to suggestions.

OTOH it’s internal, so it’s not as critical.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying tarfile.Reader, with variable name archive. The internal/tarfile.Reader/archive.Reader pair is somewhat symmetrical, but it might be too confusing; I’m not sure.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Jul 28, 2020

@vrothberg This should now be code-complete (full transport tests, leading to fixes in reference parsing), but I still need to rebase to move some changes across commits. And it remains mostly untested, hoping Podman’s tests will handle the basics.

@mtrmac mtrmac force-pushed the tarfile-read branch 4 times, most recently from 4d46560 to dcfdda0 Compare July 29, 2020 17:34
@mtrmac
Copy link
Collaborator Author

mtrmac commented Jul 29, 2020

@vrothberg Ready for review (but still basically untested).

@mtrmac mtrmac marked this pull request as ready for review July 29, 2020 17:37
@mtrmac mtrmac changed the title WIP: Support reading any/all images from docker-archive: tarballs Support reading any/all images from docker-archive: tarballs Jul 29, 2020
@vrothberg
Copy link
Member

Thanks, I'll test #998 and will comment there.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@rhatdan
Copy link
Member

rhatdan commented Aug 11, 2020

Needs a rebase

@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 12, 2020

Rebased, now includes all of #998.

@vrothberg
Copy link
Member

@rhatdan ... merge me :)

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add means to figure out if a given reference refers to a file? Ultimately, if it's a file then we have to create a Reader and query for the reference list.

Currently, I don't find any other way than checking if (ref).StringWithinTransport() is on disk or not. However, that's not ideal as it may very well be a typo or the file really doesn't exit.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 12, 2020

It wouldn’t be unprecedented… but I kind of think that the caller should know that in the first place. IIRC the only really ambiguous case is podman pull docker-archive:$path, where I’d argue “treat the transport:name syntax always as a single image” is closest to a consistent semantics.

Could we add means to figure out if a given reference refers to a file? Ultimately, if it's a file then we have to create a Reader and query for the reference list.

Currently, I don't find any other way than checking if (ref).StringWithinTransport() is on disk or not.

Well, if you want an ugly hack, the syntax is documented to split path:rest on a :. But I’d recommend against that.

@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 12, 2020

Added another commit, “Add docker/archive.NewReaderForReference”. DO NOT MERGE until it is verified it is useful for Podman.

@mtrmac mtrmac changed the title Support reading any/all images from docker-archive: tarballs DO NOT MERGE: Support reading any/all images from docker-archive: tarballs Aug 12, 2020
@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 13, 2020

Also added Reader.ManifestTagsForReference to accommodate Podman’s handling of unqualified short names in the archives (which are only created by projectatomic/docker AFAIK). Ugh.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ... ready to merge from my side

@rhatdan
Copy link
Member

rhatdan commented Aug 19, 2020

Need @mtrmac to remove the "DO NOT Merge:" from the commit message.

@mtrmac mtrmac changed the title DO NOT MERGE: Support reading any/all images from docker-archive: tarballs Support reading any/all images from docker-archive: tarballs Aug 19, 2020
mtrmac added 15 commits August 19, 2020 21:23
It's a possible user error, not a supposedly-unreachable internal
error.

Signed-off-by: Miloslav Trmač <[email protected]>
Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
... so that we can add more functionality to it without making it
public API.

Keep the original docker/tarfile.Source as a compatibility shim.

Similarly, move docker/tarfile/src_test.go as well, to keep it
close to the implementation being tested.

Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
…archive

... instead of the forwarding shims.

Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
…rfile

This is a private API now, so we can just drop it; rename the
...With[System]Context variants to use the shorter names, and update
all callers.

Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
... to eventually allow creating multiple Sources from a single
Reader (= a temporary file containing a seekable/uncompressed
copy of the archive).

Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
At least docker/archive will need to deal with the two
optionally separately, so such a separate constructor must exist;
and the other callers are not much more complex if they separate
it as well.

This will also allow us to add reference lookup without having
to duplicate the API.

As a concession to the simpler callers, add a closeArchive parameter
(currently always true) that allows them not to worry about
the lifetime of the tarfile.Reader.

Should not change behavior.

Signed-off-by: Miloslav Trmač <[email protected]>
Every caller will need that data; this way it can be shared across
several consumers, and we don't need to synchronize access/creation
of the parsed data structure.

Should not change behavior, except that errors are now reported
earlier.

Signed-off-by: Miloslav Trmač <[email protected]>
We can drop the unused error return value now.

Signed-off-by: Miloslav Trmač <[email protected]>
We already accept the syntax for docker-archive: references,
now implement the lookup instead of warning and ignoring the value.

Implement the lookup in tarfile.Reader, not tarfile.Source,
because we will want to provide an API to obtain tags from a
Reader+Reference, without constructing a Source.

Signed-off-by: Miloslav Trmač <[email protected]>
Add support for path:@index (e.g. path:@0, path:@1 ...) reference syntax
to docker-archive.

This will allow reading even untagged images from multi-image archives.

Signed-off-by: Miloslav Trmač <[email protected]>
with only two operations: Close(), and List() which
returns a set of ImageReference objects that allow accessing
the individual images.

For now, use of every reference triggers creation of a new
tarfile.Reader; that will be fixed momentarily.

Signed-off-by: Miloslav Trmač <[email protected]>
In archive.Reader, embed a reference to tarfile.Reader to
the created ImageReference objects, and use them in NewImageSource.

Signed-off-by: Miloslav Trmač <[email protected]>
To allow accessing the same image multiple times, if given
a (probably user-specified) ImageReference.

Signed-off-by: Miloslav Trmač <[email protected]>
to allow Podman default to localhost/$tag for unqualified $tag
values in docker-archive tarnballs.

(Note that only projectatomic/docker creates such tarballs;
almost no-one else should use this.)

Signed-off-by: Miloslav Trmač <[email protected]>
@mtrmac
Copy link
Collaborator Author

mtrmac commented Aug 19, 2020

OK, rebased and “DO NOT MERGE” dropped (although I haven’t had time to fully review the Podman PR)

@rhatdan
Copy link
Member

rhatdan commented Aug 19, 2020

LGTM
Merge once it passes the tests.

Copy link
Member

@vrothberg vrothberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks a lot for the great work and your support, @mtrmac !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants