Problems trying to load cached images into minikube #655

Closed
afbjorklund opened this issue Apr 22, 2018 · 33 comments

@afbjorklund
Contributor

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

Description

podman load fails to load images and gives no useful error report (only "Failed" and a "dir:" message).

Steps to reproduce the issue:

  1. CRI: try to use "sudo podman load" instead of "docker load" kubernetes/minikube#2757

  2. make out/minikube out/minikube.iso

  3. ./out/minikube --container-runtime=cri-o --iso-url=file://$PWD/out/minikube.iso

Describe the results you received:

$ sudo podman load -i /tmp/busybox_latest 
Getting image source signatures
Copying blob sha256:4febd3792a1fb2153108b4fa50161c6ee5e3d16aa483a63215f936a113a88e9a
 1.30 MB / 706.12 KB [======================================================] 0s
Failed
Failed
error pulling "dir:/tmp/busybox_latest": unable to pull dir:/tmp/busybox_latest

Describe the results you expected:

Successfully being able to load all the images from the cache, just like docker load

Additional information you deem important (e.g. issue happens only occasionally):

Podman seems to pull the images correctly from the network; only loading them from disk fails.

Output of podman version:

Version:       0.4.3
Go Version:    go1.9
Git Commit:    "9dd76697ccd2bac65a78fd7687899e1c9ca14465"
Built:         Sat Apr 21 07:27:19 2018
OS/Arch:       linux/amd64

Output of podman info:

$ podman info
could not get runtime: mkdir /var/run/containers/storage: permission denied
$ sudo podman info
host:
  MemFree: 22269952
  MemTotal: 2097205248
  SwapFree: 983404544
  SwapTotal: 1048571904
  arch: amd64
  cpus: 2
  hostname: minikube
  kernel: 4.9.64
  os: linux
  uptime: 4h 10m 21.66s (Approximately 0.17 days)
insecure registries:
  registries: []
registries:
  registries:
  - docker.io
store:
  ContainerStore:
    number: 27
  GraphDriverName: overlay
  GraphOptions: null
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: extfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
  ImageStore:
    number: 13
  RunRoot: /var/run/containers/storage

Additional environment details (AWS, VirtualBox, physical, etc.):

VirtualBox

@mheon
Member

mheon commented Apr 23, 2018

@afbjorklund Any chance we can get a reproducer without Minikube? It would be a lot easier if we could pull the image it's trying to import out so we can directly inspect it and try and figure out what's going on.

Also, could you add --log-level=debug and see if you get more detailed error information?

@afbjorklund
Contributor Author

Sure, will try to do that (and the ISO?)

Also: pull + save + rmi + load works OK

@baude
Member

baude commented Apr 23, 2018

Does it work outside minikube?

@rhatdan
Member

rhatdan commented Apr 23, 2018

podman is not part of minikube. This example looks like he is prestaging the container with podman and then attempting to run a container with CRI-O. The problem is here.

$ sudo podman load -i /tmp/busybox_latest 
Getting image source signatures
Copying blob sha256:4febd3792a1fb2153108b4fa50161c6ee5e3d16aa483a63215f936a113a88e9a
 1.30 MB / 706.12 KB [======================================================] 0s
Failed
Failed
error pulling "dir:/tmp/busybox_latest": unable to pull dir:/tmp/busybox_latest

How did you create /tmp/busybox_latest?

@rhatdan
Member

rhatdan commented Apr 23, 2018

@umohnani8 PTAL

@afbjorklund
Contributor Author

afbjorklund commented Apr 23, 2018

I think it was created by Docker in the client originally, and then copied from the cache...

I have kubeadm running OK with crio/crictl, so trying to replace the docker load now

@afbjorklund
Contributor Author

The busybox was just an example, the main containers are all cached from kubernetes

@afbjorklund
Contributor Author

@rhatdan : minikube 0.26 still had kpod

@mheon
Member

mheon commented Apr 23, 2018

podman load on an image from docker export should work, though I'm not sure we verify that in our test suite right now. Worth adding if we don't.

@umohnani8
Member

Looks like podman load cannot figure out which transport was used to create /tmp/busybox_latest, so it tries the three we support (docker-archive, oci-archive, and dir) and then fails.
podman load works with docker save:

➜  libpod git:(user) sudo docker save -o busybox.tar busybox
➜  libpod git:(user) ✗ sudo podman load -i busybox.tar 
Getting image source signatures
Skipping fetch of repeat blob sha256:0314be9edf00a925d59f9b88c9d8ccb34447ab677078874d8c14e7a6816e21e1
Copying config sha256:8ac48589692a53a9b8c2d1ceaa6b402665aa7fe667ba51ccc03002300856d8c7
 1.46 KB / 1.46 KB [========================================================] 0s
Writing manifest to image destination
Storing signatures
Loaded image:  docker.io/busybox:latest

Most probably something with the /tmp/busybox_latest cached image.
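
For anyone who wants to check what kind of file a cached image is before handing it to podman load, here is a minimal sketch. It is not how podman picks a transport (per the comment above, podman simply tries each one in turn). The function name guess_transport and the heuristics are assumptions for illustration: a docker save tarball normally has manifest.json at its root, an OCI layout has oci-layout and index.json, and a dir: image is a directory containing manifest.json.

```python
import os
import tarfile


def guess_transport(path):
    """Guess which transport a path looks like, or return None.

    Heuristic only (an assumption of this sketch, not podman's logic):
      - directory containing manifest.json        -> "dir"
      - tar containing oci-layout and index.json  -> "oci-archive"
      - tar containing manifest.json              -> "docker-archive"
    """
    if os.path.isdir(path):
        if os.path.isfile(os.path.join(path, "manifest.json")):
            return "dir"
        return None
    if not tarfile.is_tarfile(path):
        return None
    with tarfile.open(path) as tar:
        # normpath turns "./manifest.json" into "manifest.json"
        names = {os.path.normpath(name) for name in tar.getnames()}
    if {"oci-layout", "index.json"} <= names:
        return "oci-archive"
    if "manifest.json" in names:
        return "docker-archive"
    return None
```

Calling guess_transport("/tmp/busybox_latest") on the cached file would at least show whether it has the shape of a docker-archive tarball; it says nothing about whether the blobs inside are consistent with the manifest.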

@afbjorklund
Contributor Author

Here is the image, had to gzip it to attach it: busybox_latest.gz

@afbjorklund
Contributor Author

@mheon
Member

mheon commented Apr 23, 2018

@afbjorklund Thanks, that helps a lot

@afbjorklund
Contributor Author

Here is the entire log, with debugging enabled:

                         _             _            
            _         _ ( )           ( )           
  ___ ___  (_)  ___  (_)| |/')  _   _ | |_      __  
/' _ ` _ `\| |/' _ `\| || , <  ( ) ( )| '_`\  /'__`\
| ( ) ( ) || || ( ) || || |\`\ | (_) || |_) )(  ___/
(_) (_) (_)(_)(_) (_)(_)(_) (_)`\___/'(_,__/'`\____)

$ sudo podman --log-level=debug load -i /tmp/busybox_latest 
DEBU[0000] overlay test mount with multiple lowers succeeded 
DEBU[0000] backingFs=extfs, projectQuotaSupported=false, useNativeDiff=true 
INFO[0000] [graphdriver] using prior storage driver: overlay 
INFO[0000] CNI network rkt.kubernetes.io (type=bridge) is used from /etc/cni/net.d/k8s.conf 
INFO[0000] Initial CNI setting succeeded                
DEBU[0000] parsed reference to refname into "[overlay@/var/lib/containers/storage+/var/run/containers/storage]docker.io/library/busybox:latest" 
DEBU[0000] IsRunningImageAllowed for image docker-archive: 
DEBU[0000]  Using default policy section                
DEBU[0000]  Requirement 0: allowed                      
DEBU[0000] Overall: allowed                             
Getting image source signatures
DEBU[0000] Manifest has MIME type application/vnd.docker.distribution.manifest.v2+json, ordered candidate list [application/vnd.docker.distribution.manifest.v2+json, application/vnd.docker.distribution.manifest.v1+prettyjws, application/vnd.oci.image.manifest.v1+json, application/vnd.docker.distribution.manifest.v1+json] 
DEBU[0000] ... will first try using the original manifest unmodified 
Copying blob sha256:4febd3792a1fb2153108b4fa50161c6ee5e3d16aa483a63215f936a113a88e9a
DEBU[0000] Detected compression format gzip             
DEBU[0000] No compression detected                      
 0 B / 706.12 KB [-------------------------------------------------------------]
DEBU[0000] Using original blob without modification     
 1.30 MB / 706.12 KB [======================================================] 0s
Failed
DEBU[0000] parsed reference to refname into "[overlay@/var/lib/containers/storage+/var/run/containers/storage]docker.io/tmp/busybox_latest:latest" 
Failed
ERRO[0000] error pulling "dir:/tmp/busybox_latest": unable to pull dir:/tmp/busybox_latest 
$ 

@afbjorklund
Contributor Author

afbjorklund commented Apr 23, 2018

Here is the actual list of containers, from the minikube cache:

Attempting to cache image: k8s.gcr.io/k8s-dns-sidecar-amd64:1.14.4 at /home/anders/.minikube/cache/images/k8s.gcr.io/k8s-dns-sidecar-amd64_1.14.4
Attempting to cache image: k8s.gcr.io/kubernetes-dashboard-amd64:v1.8.1 at /home/anders/.minikube/cache/images/k8s.gcr.io/kubernetes-dashboard-amd64_v1.8.1
Attempting to cache image: k8s.gcr.io/kube-scheduler-amd64:v1.10.0 at /home/anders/.minikube/cache/images/k8s.gcr.io/kube-scheduler-amd64_v1.10.0
Attempting to cache image: k8s.gcr.io/etcd-amd64:3.0.17 at /home/anders/.minikube/cache/images/k8s.gcr.io/etcd-amd64_3.0.17
Attempting to cache image: gcr.io/k8s-minikube/storage-provisioner:v1.8.0 at /home/anders/.minikube/cache/images/gcr.io/k8s-minikube/storage-provisioner_v1.8.0
Attempting to cache image: k8s.gcr.io/kube-proxy-amd64:v1.10.0 at /home/anders/.minikube/cache/images/k8s.gcr.io/kube-proxy-amd64_v1.10.0
Attempting to cache image: k8s.gcr.io/pause-amd64:3.0 at /home/anders/.minikube/cache/images/k8s.gcr.io/pause-amd64_3.0
Attempting to cache image: k8s.gcr.io/kube-controller-manager-amd64:v1.10.0 at /home/anders/.minikube/cache/images/k8s.gcr.io/kube-controller-manager-amd64_v1.10.0
Attempting to cache image: k8s.gcr.io/kube-addon-manager:v6.5 at /home/anders/.minikube/cache/images/k8s.gcr.io/kube-addon-manager_v6.5
Attempting to cache image: k8s.gcr.io/kube-apiserver-amd64:v1.10.0 at /home/anders/.minikube/cache/images/k8s.gcr.io/kube-apiserver-amd64_v1.10.0
Attempting to cache image: k8s.gcr.io/k8s-dns-kube-dns-amd64:1.14.4 at /home/anders/.minikube/cache/images/k8s.gcr.io/k8s-dns-kube-dns-amd64_1.14.4
Attempting to cache image: k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64:1.14.4 at /home/anders/.minikube/cache/images/k8s.gcr.io/k8s-dns-dnsmasq-nanny-amd64_1.14.4
Successfully cached all images.

These are all copied over to /tmp in the minikube VM, and then:

Run: sudo podman load -i /tmp/kubernetes-dashboard-amd64_v1.8.1
Run: sudo podman load -i /tmp/storage-provisioner_v1.8.0
Run: sudo podman load -i /tmp/pause-amd64_3.0
Run: sudo podman load -i /tmp/k8s-dns-dnsmasq-nanny-amd64_1.14.4
Run: sudo podman load -i /tmp/k8s-dns-sidecar-amd64_1.14.4
Run: sudo podman load -i /tmp/kube-scheduler-amd64_v1.10.0
Run: sudo podman load -i /tmp/k8s-dns-kube-dns-amd64_1.14.4
Run: sudo podman load -i /tmp/kube-addon-manager_v6.5
Run: sudo podman load -i /tmp/kube-controller-manager-amd64_v1.10.0
Run: sudo podman load -i /tmp/kube-proxy-amd64_v1.10.0
Run: sudo podman load -i /tmp/kube-apiserver-amd64_v1.10.0
Run: sudo podman load -i /tmp/etcd-amd64_3.0.17

All of them seem to fail silently, and then it pulls them all again...
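
To make the silent failures visible, the load loop could check each command's exit status and keep its stderr. The sketch below is not minikube's code; load_images and the injectable run parameter are hypothetical names, and it assumes podman load exits non-zero when it fails, which this thread does not confirm.

```python
import subprocess


def load_images(paths, run=subprocess.run):
    """Run `sudo podman load -i <path>` for each path and collect failures.

    Returns a list of (path, returncode, stderr) for every load that did
    not exit with status 0. `run` is injectable so the loop can be
    exercised without podman installed.
    """
    failures = []
    for path in paths:
        result = run(
            ["sudo", "podman", "load", "-i", path],
            capture_output=True,
            text=True,
        )
        # Assumption: a failed load is signalled by a non-zero exit status.
        if result.returncode != 0:
            failures.append((path, result.returncode, result.stderr.strip()))
    return failures
```

A caller could print the returned list before falling back to pulling from the network, so that the reason for the fallback ends up in the log.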

@mheon
Member

mheon commented Apr 23, 2018

@umohnani8 Can you give this another look now that we have an image to test with?

@umohnani8
Member

@mheon working on it.

@mheon
Member

mheon commented Apr 24, 2018

Thanks!

@afbjorklund
Contributor Author

afbjorklund commented Apr 24, 2018

The actual error is "Error writing blob: blob size mismatch", but it only writes "Failed".

Seems like it is not expecting a mix of uncompressed and compressed blobs, perhaps?

sha256:57310166fe88e0dc63a80ca5c219283a932db0f3969712e2f8a86ada143bf566: gzip compressed data
sha256:5b0d59026729b68570d99bc4f3f7c31a2e4f2a5736435641565d93e7c25bd2c3: ASCII text, with very long lines, with no line terminators
manifest.json:                                                           ASCII text, with no line terminators

Also seen by the progressbar overflowing while copying, 1.30 MB / 706.12 KB

method  crc     date  time           compressed        uncompressed  ratio uncompressed_name
defla 1634ab94 Jan  1 01:00              723070             1360384  46.8% ./sha256:57310166fe88e0dc63a80ca5c219283a932db0f3969712e2f8a86ada143bf566
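
The same check (which blobs are gzip-compressed, and how the stored size compares with the decompressed size) can be scripted. This is a rough sketch that assumes the cached file is a docker save style tarball whose manifest.json lists the layers under a "Layers" key; inspect_layers is a hypothetical helper name, and it reads each layer fully into memory, so it only suits small images.

```python
import gzip
import json
import os
import tarfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of any gzip stream


def inspect_layers(archive_path):
    """List each layer in a docker-save style tarball with its sizes.

    Returns a list of dicts with keys: name, stored, unpacked, gzip.
    """
    report = []
    with tarfile.open(archive_path) as tar:
        # Index members by normalized name so "./x" and "x" both match.
        members = {os.path.normpath(m.name): m for m in tar.getmembers()}
        manifest = json.load(tar.extractfile(members["manifest.json"]))
        for image in manifest:
            for layer in image["Layers"]:
                data = tar.extractfile(members[os.path.normpath(layer)]).read()
                is_gzip = data[:2] == GZIP_MAGIC
                unpacked = len(gzip.decompress(data)) if is_gzip else len(data)
                report.append(
                    {"name": layer, "stored": len(data), "unpacked": unpacked, "gzip": is_gzip}
                )
    return report
```

For a layer like the one above, the expectation would be a stored size near 723070 and an unpacked size near 1360384, matching the overflowing progress bar.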

@mheon
Member

mheon commented Apr 24, 2018

This sounds like it could be a c/image bug

@rhatdan
Member

rhatdan commented Apr 24, 2018

@mtrmac PTAL

@mtrmac
Collaborator

mtrmac commented Apr 24, 2018

For starters, https://github.com/projectatomic/libpod/blob/27107fdac1d75f97caab47cd13efb1d9900cf350/libpod/image/image.go#L141 is swallowing the underlying error report.

The error is, per skopeo --insecure-policy copy docker-archive:$path dir:t

FATA[0000] Error writing blob: Size mismatch when copying sha256:4febd3792a1fb2153108b4fa50161c6ee5e3d16aa483a63215f936a113a88e9a, expected 723070, got 1360384 
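
As an illustration of the general pattern (not the libpod code, which is Go), a fallback loop can keep every underlying error instead of printing only "Failed". The names LoadError and load_with_fallback, and the idea of passing loaders in as callables, are assumptions made for this sketch.

```python
class LoadError(Exception):
    """Raised when no transport could load the image."""


def load_with_fallback(path, loaders):
    """Try each (transport, loader) pair in order; return the first result.

    If every loader raises, raise LoadError listing each transport's
    underlying error instead of reporting only the last failure.
    """
    errors = []
    for transport, loader in loaders:
        try:
            return loader(path)
        except Exception as err:
            # Keep the real cause; this is the detail that was being lost.
            errors.append(f"{transport}:{path}: {err}")
    raise LoadError("unable to load image; tried:\n  " + "\n  ".join(errors))
```

With this shape, the size mismatch from the docker-archive attempt would appear next to the later dir: failure rather than being replaced by it.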

@mheon
Member

mheon commented Apr 24, 2018

So that's where our debug was going... Nice catch @mtrmac
I'll get a patch in to get more detailed logging output from there.

@mtrmac
Collaborator

mtrmac commented Apr 24, 2018

First of all, please rebase the minikube version of c/image to something fresh, maybe master. The archive creation code has been extensively modified, and that includes decompressing the layers when creating docker-archive: tarballs.

(Note: You may want to add gzipping of the created tarball after the update, otherwise the size of the archive may be much larger.)
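
Compressing the finished tarball as a whole is a small streaming step. The following is only a sketch of that idea in Python; minikube itself is written in Go, and gzip_file is a hypothetical helper name.

```python
import gzip
import shutil


def gzip_file(src_path, dst_path, level=6):
    """Stream src_path into a gzip-compressed dst_path; return dst_path."""
    with open(src_path, "rb") as src, gzip.open(dst_path, "wb", compresslevel=level) as dst:
        # copyfileobj works in chunks, so large archives are not read into memory.
        shutil.copyfileobj(src, dst)
    return dst_path
```

Whether the loading side accepts a gzipped archive directly would still need to be checked for the tool and version in use.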

@mtrmac
Collaborator

mtrmac commented Apr 24, 2018

… the flip side, though, is that c/image/docker/tarfile.Source.GetBlob is trying to transparently decompress the blobs, while …/tarfile.Source.prepareLayerData creates a manifest with the compressed sizes.

That can’t work, and is mostly fixed (a bit stalled?) in containers/image#427 .
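
The mismatch can be reproduced in miniature. The sketch below is an assumption-laden model, not the c/image code: copy_verified stands in for a copier that checks the byte count, the manifest size is taken from the compressed blob, and the reader decompresses transparently.

```python
import gzip
import io


def copy_verified(reader, expected_size):
    """Copy all bytes from reader and check the count against expected_size."""
    data = reader.read()
    if len(data) != expected_size:
        raise ValueError(
            f"Size mismatch when copying blob, expected {expected_size}, got {len(data)}"
        )
    return data


layer = bytes(200_000)         # stand-in for an uncompressed layer
stored = gzip.compress(layer)  # what the archive actually stores
manifest_size = len(stored)    # the manifest records the compressed size

# A reader that transparently decompresses yields more bytes than the manifest says.
try:
    copy_verified(gzip.GzipFile(fileobj=io.BytesIO(stored)), manifest_size)
    mismatch = None
except ValueError as err:
    mismatch = str(err)

# Either side can be made consistent: record the decompressed size ...
ok_decompressed = copy_verified(gzip.GzipFile(fileobj=io.BytesIO(stored)), len(layer))
# ... or hand out the stored bytes untouched.
ok_raw = copy_verified(io.BytesIO(stored), manifest_size)
```

This mirrors the "expected 723070, got 1360384" error: the expected number is the compressed size, and the number received is the decompressed size.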

@afbjorklund
Contributor Author

@mtrmac: please rebase the minikube version of c/image to something fresh, maybe master

Hmm, this is starting to have more far-reaching implications than a simple s/docker/podman/ :-)

@mtrmac
Collaborator

mtrmac commented Apr 25, 2018

> please rebase the minikube version of c/image to something fresh, maybe master

> Hmm, this is starting to have more far-reaching implications than a simple s/docker/podman/ :-)

:) IIRC fixing either the minikube or the podman side would be sufficient—but the podman-side fix is not quite available right now.

@afbjorklund
Contributor Author

afbjorklund commented Apr 27, 2018

Seems like minikube is aiming for switching to another library instead: kubernetes/minikube#2730
Not sure if this makes matters better or worse, or just different. Or if it even changes the cache at all.

https://github.com/containers/image -->
https://github.com/google/go-containerregistry

@rhatdan
Member

rhatdan commented Jun 4, 2018

@mheon What is the current state of this?

@mheon
Member

mheon commented Jun 4, 2018

The c/image fix upstream seems to be stalled.

@afbjorklund
Contributor Author

And minikube still has kpod-b85d0fa (Dec 2017)

@rhatdan
Member

rhatdan commented Jul 12, 2018

Still blocked, but I have pinged @cyphar to see if the PR can get some movement.

@afbjorklund
Contributor Author

containers/image#481

Seems to be working now (v0.7.4), thanks.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 24, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 24, 2023