Rework how we use containers/storage #305
Conversation
Related currently-do-not-merge PRs in consumers:
Force-pushed from 613bb2b to 5114857
@nalind Did the tests freeze on this?
Force-pushed from 04bed1c to 0330049
@rhatdan Travis CI passes, but the code-review/pullapprove check is manual.
Force-pushed from 9e3edf0 to 55adff5
Force-pushed from 3fbfcb4 to 419105c
Overall, I like that ImageSource now produces consistent images; the code is, though, working way too hard to make types.Image fit for these uses, and it is IMHO not worth it—it’s not just the c/i/storage complexity, but also the fragility of c/i/image.Image now having to care about these not-quite-complete images as well. Most of the complexity and dependencies of types.Image is due to the format conversions between different manifest formats, which this code does not need to deal with at all.
The ImageSource path could be much clearer by splitting the relevant part of UpdatedImage into a separate manifest-only (i.e. not “consistent image”) updater, something like

    // in package containers/image/manifest
    type Manifest interface {
        // as in types.Image
        ConfigInfo() types.BlobInfo
        LayerInfos() []types.BlobInfo
        // as in types.ManifestUpdateOptions.LayerInfos
        UpdateLayerInfos([]types.BlobInfo)
        // as in genericManifest.serialize()
        Serialize() []byte
    }

    func ManifestInstanceFromBlob(blob []byte, mimeType string) Manifest

and moving the raw parsers/serializers from c/i/image to c/i/manifest.
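To make the proposed interface concrete, here is a self-contained sketch of what a trivial implementation could look like. BlobInfo and the schema2 struct below are trimmed-down hypothetical stand-ins for types.BlobInfo and a Docker schema 2 manifest, not the real containers/image types:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// BlobInfo is a hypothetical stand-in for types.BlobInfo, trimmed to the
// fields the discussion above touches.
type BlobInfo struct {
	Digest string `json:"digest"`
	Size   int64  `json:"size"`
}

// Manifest is the proposed manifest-only updater interface.
type Manifest interface {
	ConfigInfo() BlobInfo
	LayerInfos() []BlobInfo
	UpdateLayerInfos([]BlobInfo)
	Serialize() []byte
}

// schema2 is a toy schema-2-like manifest; the real parser would live in
// c/i/manifest per the suggestion above.
type schema2 struct {
	Config BlobInfo   `json:"config"`
	Layers []BlobInfo `json:"layers"`
}

func (m *schema2) ConfigInfo() BlobInfo   { return m.Config }
func (m *schema2) LayerInfos() []BlobInfo { return append([]BlobInfo(nil), m.Layers...) }
func (m *schema2) UpdateLayerInfos(infos []BlobInfo) {
	m.Layers = append([]BlobInfo(nil), infos...)
}
func (m *schema2) Serialize() []byte {
	blob, _ := json.Marshal(m) // marshaling a struct of plain fields cannot fail
	return blob
}

func main() {
	var m Manifest = &schema2{
		Config: BlobInfo{Digest: "sha256:cfg", Size: 100},
		Layers: []BlobInfo{{Digest: "sha256:gzipped", Size: 500}},
	}
	// e.g. rewrite compressed layer digests to uncompressed ones, which is
	// all the storage path needs from UpdatedImage:
	m.UpdateLayerInfos([]BlobInfo{{Digest: "sha256:uncompressed", Size: 1500}})
	fmt.Println(string(m.Serialize()))
}
```

Note that this sidesteps all of the cross-format conversion machinery in types.Image: the updater only touches layer entries and re-serializes.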
This would work for ImageDestination.Commit as well, as far as extracting the layer list goes… the creation time is awkward, though. Perhaps that could be done with a specialized “getConfig” callback, which on the storage side would only have to deal with non-layer blobs and thus be much simpler than storageImageUnnamedDestination.GetBlob.
storage/storage_image.go
    Signatures     []byte `json:"-"`               // Signature contents, temporary
    SignatureSizes []int  `json:"signature-sizes"` // List of sizes of each signature slice
    directory      string                          // Temporary directory where we store blobs until Commit() time
    counter        int32                           // A counter that we use for computing filenames to assign to blobs
Could this member have a more descriptive name, nextTempFileID or something?
Sure. Changing it.
storage/storage_image.go
    // Check if the blob corresponds to a diff that was used to initialize any layers, either
    // before or after decompression, since we don't care.
    uncompressedLayers, err := s.transport.store.LayersByUncompressedDigest(info.Digest)
    compressedLayers, err := s.transport.store.LayersByCompressedDigest(info.Digest)
Why does this need to look up the compressed layer? newImage ensures that the manifest can only contain uncompressed digests.
Yes, I suppose that though storageUnnamedImageReader is not dealing with an updated manifest, it's not going to get called to read blobs using the digests from the not-yet-updated manifest that it would return. Reworking it to only look for layers using the index of uncompressed digests.
storage/storage_image.go
    if decompress {
        noCompression := archive.Uncompressed
        diffOptions = &storage.DiffOptions{
            Compression: &noCompression,
AFAICS DiffOptions.Compression is only used in layerStore.Diff’s maybeCompressReadCloser, which implies that the input to this is always uncompressed, and layerStore.Diff compresses it unless told not to, burning CPU. Why would we ever want the compressed blobs instead of the originals here? (For push to a remote repository, the blobs can be compressed in copy.Image, which also takes care of updating the manifests as needed.)
Lately, layerStore.Diff only compresses a layer diff if it was originally applied in compressed form. The files that comprise the layer are stored on disk in uncompressed form, of course, but a reader would historically ask for it using the digest from the unmodified manifest, so it was doing the best that it could by defaulting to using the same type of compression which was originally used for the layer.
A storageImageSource always asks for compression to be skipped, and since we're removing the ability to read layers using compressed digests from storageUnnamedImageReader, we don't need to bother with having it do things differently, so the GetBlob logic can be simplified.
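The behavior being discussed can be modeled with a toy sketch. Compression, DiffOptions, and diff below are hypothetical stand-ins for the pkg/archive constants and layerStore.Diff, not the real containers/storage code:

```go
package main

import "fmt"

// Compression stands in for the archive compression constants.
type Compression int

const (
	Uncompressed Compression = iota
	Gzip
)

// DiffOptions mirrors the shape of storage.DiffOptions: a nil Compression
// pointer means "use whatever compression the layer was applied with".
type DiffOptions struct {
	Compression *Compression
}

// diff sketches the decision layerStore.Diff makes: recompress the (always
// uncompressed) on-disk data unless the caller explicitly asked for
// uncompressed output.
func diff(originallyCompressed bool, opts *DiffOptions) string {
	if opts != nil && opts.Compression != nil && *opts.Compression == Uncompressed {
		return "uncompressed"
	}
	if originallyCompressed {
		return "recompressed" // the CPU-burning path the source now avoids
	}
	return "uncompressed"
}

func main() {
	noCompression := Uncompressed
	// storageImageSource always asks for compression to be skipped:
	fmt.Println(diff(true, &DiffOptions{Compression: &noCompression}))
	// the old default for a layer that was originally applied compressed:
	fmt.Println(diff(true, nil))
}
```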
storage/storage_image.go
    signature, err := s.transport.store.ImageBigData(s.ID, "signatures")
    if err != nil {
        if !os.IsNotExist(errors.Cause(err)) {
            logrus.Debugf("got error %v looking up signatures for image %q", err, s.ID)
Does this ignore the error intentionally? (This doesn’t hurt signature security, of course; it seems to be just a tradeoff between not failing on inability to read signatures for images which don’t need to be signed at all vs. failing at the appropriate place with a correct error message for images which do need to be signed but whose signatures can't be read.)
A file-not-found error would be ignored, but I'll restructure it to not do that, and instead skip the attempt to read a file if we don't expect to find anything.
storage/storage_image.go
    diffOptions = &storage.DiffOptions{
        Compression: &noCompression,
    }
    if layer.UncompressedSize < 0 {
storageUnnamedImageReader.getSize fails if it finds UncompressedSize < 0; can this happen, or is perhaps every layer guaranteed to have a valid UncompressedSize? (I don’t know, maybe it can happen: getSize is walking the image’s Parent list while this is looking up an image by digest, so the two sets of layers are not necessarily equal.)
BTW the documentation of UncompressedSize says “If UncompressedDigest is not set, this should be treated as if it were an uninitialized value.”, but nothing in this file checks for a set UncompressedDigest before accessing UncompressedSize.
Layers that have an UncompressedDigest value will have a valid UncompressedSize. Layers that don't would have it set to an uninitialized 0, but we can assume it's there for anything returned by LayersByUncompressedDigest. That's not an assumption we can make when walking the parent list, so I'll add those checks.
storage/storage_image.go
    // parse the manifest to get a list of which blobs are filesystem layers, leaving any cached
    // files that aren't filesystem layers to be saved as data items.
    if s.image == nil {
        img, err := image.FromSource(&storageImageUnnamedDestination{named: s})
Also quite awkward, if the point is primarily to read the list of blobs.
storage/storage_image.go
    if inspect, err := s.image.Inspect(); err == nil {
        logrus.Debugf("setting image creation date to %s", inspect.Created)
        options = &storage.ImageOptions{
            CreationDate: inspect.Created,
Ouch ouch ouch. This is the only place which requires constructing a types.Image (and actually makes storageImageUnnamedDestination.GetBlob necessary for config.json).
storage/storage_image.go
    // GetManifest reads the manifest that we intend to store. If we haven't been given one (yet?),
    // generate one.
    func (s *storageImageUnnamedDestination) GetManifest() ([]byte, string, error) {
        if len(s.named.manifest) == 0 {
The other transports just fail if the client did not call .PutManifest(); are there users of containers-storage who need the manifest generated?
The pathological case, I guess. If Commit() is called without a manifest having been written, we would fail, but if that's what's expected for that case, it'll probably simplify things.
There have been a few cases suggested to allow adding / modifying signatures in an ImageDestination without doing anything about the blobs/manifests… but that’s not quite relevant to this case.
In theory a caller of PutBlob is expected to also call PutManifest; if buildah or some other caller isn’t doing that currently, I guess I’d rather have them explicitly construct a manifest (using the discussed manifest interface, I guess) than have every destination deal with constructing a manifest from its own data and do the otherwise often unnecessary bookkeeping to track all uploaded / reapplied blobs.
    @@ -30,6 +33,8 @@ func newReference(transport storageTransport, reference, id string, name referen
     		reference: reference,
     		id:        id,
     		name:      name,
    +		tag:       tag,
    +		digest:    digest,
Why aren’t the tag and digest values stored inside the reference.Named? That would eliminate the failure paths in DockerReference which now have to be silently ignored.
(Also reference vs. name+tag+digest seem redundant at a first glance, but I haven’t checked in detail.)
    @@ -101,60 +104,124 @@ func (s *storageTransport) DefaultGIDMap() []idtools.IDMap {
     // relative to the given store, and returns it in a reference object.
     func (s storageTransport) ParseStoreReference(store storage.Store, ref string) (*storageReference, error) {
FIXME: Didn’t really review the reference changes yet.
FWIW I’m working on this in https://github.com/mtrmac/image/tree/manifest-editing .
Okay, I've just repushed with some of the smaller changes discussed above. Is the …
I’d call it a moderately useful proof of concept; it shows that the API basically works, but it’s missing a top-level dispatcher (…). After grafting a stub for the dispatcher it could be useful for checking that the proposed approach handles everything needed for this PR, but it can’t be used to get a complete working implementation yet, I’m afraid.
Switch from using a custom diffID type to digest.Digest, and from local image and rootfs types to manifest.Schema2Image. Signed-off-by: Nalin Dahyabhai <[email protected]>
Switch from using custom image, rootfs, imageHistory, and v1Image types to using Schema2Image, Schema2RootFS, Schema2History, and Schema2V1Image types from the manifest package. Signed-off-by: Nalin Dahyabhai <[email protected]>
The Config field in a Schema2Image is a pointer, so we need to check that it's not nil before attempting to read its Labels field when populating an ImageInspectInfo structure. Signed-off-by: Nalin Dahyabhai <[email protected]>
If the OCI image includes a DockerVersion field, attempt to preserve its value for the ImageInspectData that we're building. Signed-off-by: Nalin Dahyabhai <[email protected]>
Populate the layers list when we inspect a manifest. Signed-off-by: Nalin Dahyabhai <[email protected]>
Replace imageInspectInfo() implementations in the image package with versions that call the manifest Inspect() methods. Signed-off-by: Nalin Dahyabhai <[email protected]>
Move most of the storageImageDestination.computeID() logic into an ImageID() method of manifest types. For Schema1, break it into a two-step process, so that we construct a new configuration blob first, and digest it to produce the ID in a wrapper. Update image/manifestSchema1.convertToManifestSchema2() to use that same configuration blob. Signed-off-by: Nalin Dahyabhai <[email protected]>
When attempting to locate an image, if we have a name+digest (i.e., canonical reference), use the store's new ImagesByDigest() method to search for images that match a given digest, and which have at least one name that matches the specified name, treating the reference as an implicit name so long as it has an explicit name that matches (give or take a tag). Signed-off-by: Nalin Dahyabhai <[email protected]>
Teach storageImageSource ImageSource objects to return their size, too. A consumer that can do the type assertion can avoid some parsing that we do when wrapping an ImageSource to create an Image if it was previously only initializing the Image to get this one piece of information. Signed-off-by: Nalin Dahyabhai <[email protected]>
Now that the storage library lets us set the digest of an image, set it to the digest of a preprocessed copy of the image's manifest, in case it differs from the manifest as we store it. Signed-off-by: Nalin Dahyabhai <[email protected]>
Force-pushed from f2968c9 to 2e783a3
Fixed some issues around parsing references that were possibly-truncated image IDs.
usage example:

    skopeo copy docker-daemon:busybox:latest containers-storage:\[overlay2@/var/lib/containers/storage+/var/run/containers/storage:overlay2.ostree_repo=/var/lib/containers/storage/overlay2/ostree/.repo,overlay2.override_kernel_check=1\]\busybox

Skopeo needs a bugfix from containers/image#305, which is not merged yet. Just apply this patch:

    diff --git a/vendor/github.com/containers/image/storage/storage_image.go b/vendor/github.com/containers/image/storage/storage_image.go
    index 08fa71b..5edccce 100644
    --- a/vendor/github.com/containers/image/storage/storage_image.go
    +++ b/vendor/github.com/containers/image/storage/storage_image.go
    @@ -174,7 +174,7 @@ func (s *storageImageDestination) putBlob(stream io.Reader, blobinfo types.BlobI
     		id = ddigest.Canonical.FromBytes([]byte(parentLayer + "+" + digest.String())).Hex()
     	}
     	// Attempt to create the identified layer and import its contents.
    -	layer, uncompressedSize, err := s.imageRef.transport.store.PutLayer(id, parentLayer, nil, "", true, multi)
    +	layer, uncompressedSize, err := s.imageRef.transport.store.PutLayer(id, parentLayer, nil, "", false, multi)
     	if err != nil && errors.Cause(err) != storage.ErrDuplicateID {
     		logrus.Debugf("error importing layer blob %q as %q: %v", blobinfo.Digest, id, err)
     		return errorBlobInfo, err

Signed-off-by: Giuseppe Scrivano <[email protected]>
CI is still failing, I'll merge this once it's green
Modify storage.newImageSource to not bother attempting to decode an image's metadata if it's empty. At this point, it only contains the list of signature sizes, and for unsigned images, that's empty. Signed-off-by: Nalin Dahyabhai <[email protected]>
CI is green, I'm merging. @nalind please update skopeo and CRI-O (and c/storage if needed)
#305 cleanup 2: Reference handling
using buildah for push until containers/image#305 is fixed/sorted
This is a fairly large patch, but I'm hoping the new version reads better than the old one.
HasBlob() learns to check for a blob in storage using its index-by-digest APIs, so we should be able to avoid pulling layers more often.
It reworks the storageImageDestination implementation to cache blobs in a temporary directory until Commit() is called. This lets us do a couple of useful things at Commit() time:
When combined, the effect of these is that pulling a given image twice results in one image record in the local storage, and the second pull downloads only the manifest, configuration, and signatures.
It reworks the storageImageSource to return an updated version of the image's manifest that presents the layers in uncompressed form, so it can be used by copy.Image() now, and uses the newer DiffOptions to avoid needlessly recompressing things. This turned up a bug in containers/storage, so we bump the version we vendor to one that includes a fix for that.
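The blob-presence check described above can be modeled with a small sketch. blobIndex and hasBlob are hypothetical stand-ins for illustration; the real HasBlob() queries the store's LayersByCompressedDigest/LayersByUncompressedDigest indexes:

```go
package main

import "fmt"

// blobIndex models an index-by-digest lookup: layers are recorded under both
// their compressed and uncompressed digests, mapped to their known sizes.
type blobIndex struct {
	byDigest map[string]int64
}

// hasBlob reports whether a blob is already present so the caller can skip
// re-pulling the layer, and returns the known size when it is.
func (ix *blobIndex) hasBlob(digest string) (bool, int64) {
	size, ok := ix.byDigest[digest]
	if !ok {
		return false, -1
	}
	return true, size
}

func main() {
	ix := &blobIndex{byDigest: map[string]int64{
		"sha256:compressed":   500,
		"sha256:uncompressed": 1500,
	}}
	if ok, size := ix.hasBlob("sha256:uncompressed"); ok {
		// a second pull of the same image can skip this layer entirely
		fmt.Println("already present, size", size)
	}
	if ok, _ := ix.hasBlob("sha256:new"); !ok {
		fmt.Println("must pull")
	}
}
```

This is what makes the second pull of an image download only the manifest, configuration, and signatures.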