
Podman pull images always error when docker does not #4251

Closed
benyaminl opened this issue Oct 13, 2019 · 35 comments

Labels: do-not-close · kind/bug (Categorizes issue or PR as related to a bug.) · locked - please file new issue/PR (Assist humans wanting to comment on an old issue or PR with locked comments.) · stale-issue

Comments

@benyaminl commented Oct 13, 2019

/kind bug
Hello. I'm quite frustrated with podman when trying to pull images. It takes hours and hours to pull images, but it always fails. This is the podman version:

host:
  BuildahVersion: 1.9.0
  Conmon:
    package: podman-1.4.4-4.el7.centos.x86_64
    path: /usr/libexec/podman/conmon
    version: 'conmon version 0.3.0, commit: unknown'
  Distribution:
    distribution: '"centos"'
    version: "7"
  MemFree: 2073849856
  MemTotal: 3806199808
  OCIRuntime:
    package: containerd.io-1.2.10-3.2.el7.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
      spec: 1.0.1-dev
  SwapFree: 3877470208
  SwapTotal: 3892310016
  arch: amd64
  cpus: 4
  hostname: thinkpad.localdomain
  kernel: 5.3.5-1.el7.elrepo.x86_64
  os: linux
  rootless: true
  uptime: 30h 5m 1.65s (Approximately 1.25 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.centos.org
store:
  ConfigFile: /home/ben/.config/containers/storage.conf
  ContainerStore:
    number: 1
  GraphDriverName: vfs
  GraphOptions: null
  GraphRoot: /home/ben/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 3
  RunRoot: /tmp/1000
  VolumePath: /home/ben/.local/share/containers/storage/volumes

I tried to pull docker.io/benyaminl/lap7-dev:lumen.
With docker it runs okay, but with podman it always errors.

Copying blob 1e50d3b4ec49 done
Copying blob 3f410603f6a7 done
Copying blob 0d163c86f78f done
Copying blob 9375fe5d3314 done
Copying blob 32c10dbc9ed1 done
Copying blob a2badc3a0924 done
Copying blob a017eb278f96 done
Copying blob 9550adab8c09 done
ERRO[0171] Error pulling image ref //benyaminl/lap7-dev:latest: Error reading blob sha256:8988d7a8220b2f6081954001f8a4797c966251201a8629e7d4b92c619c0978b7: Get https://registry-1.docker.io/v2/benyaminl/lap7-dev/blobs/sha256:8988d7a8220b2f6081954001f8a4797c966251201a8629e7d4b92c619c0978b7: net/http: TLS handshake timeout 
Failed
Error: error pulling image "docker.io/benyaminl/lap7-dev": unable to pull docker.io/benyaminl/lap7-dev: unable to pull image: Error reading blob sha256:8988d7a8220b2f6081954001f8a4797c966251201a8629e7d4b92c619c0978b7: Get https://registry-1.docker.io/v2/benyaminl/lap7-dev/blobs/sha256:8988d7a8220b2f6081954001f8a4797c966251201a8629e7d4b92c619c0978b7: net/http: TLS handshake timeout

Can anyone enlighten me? Not even a single person replied to my question on #podman on freenode. I'm really confused. Any help is appreciated. Thanks.

Second, when I try to add or run a container inside a pod, an error like this is generated:

ERRO[0001] error starting some container dependencies   
ERRO[0001] "error from slirp4netns while setting up port redirection: map[desc:bad request: add_hostfwd: slirp_add_hostfwd failed]" 
Error: error starting some containers: internal libpod error

This seems connected to #2964.

With slirp4netns version 1.41, it still can't run in rootless mode.

@benyaminl (Author)

I also just moved to version 1.6.2-dev, but it still can't run. What's the problem anyway? I'm on CentOS 7.7.

@benyaminl (Author)

I did something stupid. But the way podman pull works is still a problem. It doesn't work when the connection is really bad, whereas docker will retry until the image is pulled.

Rootless is working: https://www.reddit.com/r/Fedora/comments/bl100f/problem_with_podman_and_lamp_server/

@openshift-ci-robot openshift-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Oct 14, 2019
@giuseppe (Member)

it seems you are hitting two different problems.

@mtrmac could you please take a look at the TLS handshake timeout error? Is it something that is already tracked in containers/image?

About the slirp4netns error, how are you creating the container? Are you trying to use a port that is already used on the host?

@benyaminl (Author)

> it seems you are hitting two different problems.
>
> @mtrmac could you please take a look at the TLS handshake timeout error? Is it something that is already tracked in containers/image?
>
> About the slirp4netns error, how are you creating the container? Are you trying to use a port that is already used on the host?

I already fixed it. It was my own mistake; I forgot that when I do

podman pod create --name=local -p 80:8080

that means bind port 80 locally (on the host) and port 8080 in the container, for containers in that pod.
It should be

podman pod create --name=local -p 8080:80

8080 on localhost and 80 in the container. I also forgot that in rootless mode I can't bind to a privileged (root) port, but I got an answer like this: rootless-containers/slirp4netns#154 (comment).
I haven't tested it, but it should work.
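
For reference, a minimal sketch of the two points above (the host:container port order, and a commonly suggested sysctl for allowing rootless binds below port 1024; the sysctl may be what the linked slirp4netns comment describes, but that is an assumption here):

# Correct order is -p HOST_PORT:CONTAINER_PORT
podman pod create --name=local -p 8080:80

# Rootless podman cannot bind host ports below 1024 by default.
# A commonly suggested workaround (an assumption here, not taken from the
# linked comment) is to lower the unprivileged port threshold system-wide:
sudo sysctl net.ipv4.ip_unprivileged_port_start=80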

Now the other problem is that the container can't connect to the outside world, so I can't use xdebug in my container. I checked podman network ls and the podman network is there, but when I run ip addr there is no podman bridge network. Is there any way to fix it? Thanks; I'm really desperate to get podman working, as I want to break free from docker as fast as I can and teach podman to my students at my university.

@mtrmac (Collaborator) commented Oct 14, 2019

> package: containerd.io-1.2.10-3.2.el7.x86_64

is that expected to work?

@benyaminl (Author)

> package: containerd.io-1.2.10-3.2.el7.x86_64
>
> is that expected to work?

Now it works, and I can bind both mysql and benyaminl/lap7-dev using a podman pod:
(screenshot attached)

[ben@thinkpad ~]$ podman info
host:
  BuildahVersion: 1.11.3
  CgroupVersion: v1
  Conmon:
    package: Unknown
    path: /usr/local/bin/conmon
    version: 'conmon version 2.0.3-dev, commit: bc758d8bd98a29ac3aa4f62a886575bfec0e39a1'
  Distribution:
    distribution: '"centos"'
    version: "7"
  IDMappings:
    gidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 110000
      size: 65536
    uidmap:
    - container_id: 0
      host_id: 1000
      size: 1
    - container_id: 1
      host_id: 110000
      size: 65536
  MemFree: 356634624
  MemTotal: 3806199808
  OCIRuntime:
    package: containerd.io-1.2.10-3.2.el7.x86_64
    path: /usr/bin/runc
    version: |-
      runc version 1.0.0-rc8+dev
      commit: 3e425f80a8c931f88e6d94a8c831b9d5aa481657
      spec: 1.0.1-dev
  SwapFree: 3881861120
  SwapTotal: 3892310016
  arch: amd64
  cpus: 4
  eventlogger: file
  hostname: thinkpad.localdomain
  kernel: 5.3.5-1.el7.elrepo.x86_64
  os: linux
  rootless: true
  slirp4netns:
    Executable: /usr/bin/slirp4netns
    Package: Unknown
    Version: |-
      slirp4netns version 0.4.1
      commit: 4d38845e2e311b684fc8d1c775c725bfcd5ddc27
  uptime: 2h 36m 25.22s (Approximately 0.08 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.access.redhat.com
  - docker.io
  - registry.fedoraproject.org
  - quay.io
  - registry.centos.org
store:
  ConfigFile: /home/ben/.config/containers/storage.conf
  ContainerStore:
    number: 3
  GraphDriverName: vfs
  GraphOptions: {}
  GraphRoot: /home/ben/.local/share/containers/storage
  GraphStatus: {}
  ImageStore:
    number: 5
  RunRoot: /tmp/1000
  VolumePath: /home/ben/.local/share/containers/storage/volumes

@benyaminl (Author) commented Oct 18, 2019

It happened again. This is even worse: my image got mixed up with the mysql image, causing my image to run mysql even though it doesn't contain it. So I tried to re-pull the image, and now the problem occurs again.

Trying to pull docker.io/benyaminl/lap7-dev...
Getting image source signatures
Copying blob 614cc2da428c done
Copying blob a68329665e32 done
Copying blob 0f1dd1514e73 done
Copying blob 3f410603f6a7 done
Copying blob 1e50d3b4ec49 done
Copying blob 0d163c86f78f done
Copying blob 0d991f0de0c1 done
Copying blob 32c10dbc9ed1 done
Copying blob a2badc3a0924 done
Copying blob cb069c6f8fa4 done
Copying blob a017eb278f96 done
Copying blob 8988d7a8220b done
Copying blob 961dc2dd165b done
Copying blob 961dc2dd165b done
  read tcp 192.168.101.125:47454->104.18.124.25:443: read: connection reset by peer
Error: error pulling image "docker.io/benyaminl/lap7-dev": unable to pull docker.io/benyaminl/lap7-dev: unable to pull image: Error writing blob: error storing blob to file "/var/tmp/storage439069159/4": read tcp 192.168.101.125:47454->104.18.124.25:443: read: connection reset by peer

On the second try, the old problem occurred again:

[ben@thinkpad ~]$ podman pull docker.io/benyaminl/lap7-dev
Trying to pull docker.io/benyaminl/lap7-dev...
Getting image source signatures
Copying blob 1e50d3b4ec49 done
Copying blob 3f410603f6a7 done
Copying blob a68329665e32 done
Copying blob 0f1dd1514e73 done
Copying blob 0d991f0de0c1 done
Copying blob 32c10dbc9ed1 done
Copying blob cb069c6f8fa4 done
Copying blob 0d163c86f78f done
Copying blob 614cc2da428c done
Copying blob a2badc3a0924 done
Copying blob a017eb278f96 done
Copying blob 8988d7a8220b done
Copying blob 961dc2dd165b done
Copying blob 9550adab8c09 done
  Get https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/93/9375fe5d3314d14e8d64c0f915e21f921f4f8b90a115b598a9602bba3bce08aa/data?verify=1571372519-4g9MI%2BDX1gjM%2Bb7H6ClVq%2FtJJVA%3D: net/http: TLS handshake timeout
Error: error pulling image "docker.io/benyaminl/lap7-dev": unable to pull docker.io/benyaminl/lap7-dev: unable to pull image: Error reading blob sha256:9375fe5d3314d14e8d64c0f915e21f921f4f8b90a115b598a9602bba3bce08aa: Get https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/93/9375fe5d3314d14e8d64c0f915e21f921f4f8b90a115b598a9602bba3bce08aa/data?verify=1571372519-4g9MI%2BDX1gjM%2Bb7H6ClVq%2FtJJVA%3D: net/http: TLS handshake timeout

And after some more tries, it ended up like this:

  Get https://registry-1.docker.io/v2/benyaminl/lap7-dev/blobs/sha256:8f91359f1fffbf32b24ca854fb263d88a222371f38e90cf4583c5742cfdc3039: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:35282->[::1]:53: read: connection refused
Error: error pulling image "docker.io/benyaminl/lap7-dev:lumen": unable to pull docker.io/benyaminl/lap7-dev:lumen: unable to pull image: Error reading blob sha256:8f91359f1fffbf32b24ca854fb263d88a222371f38e90cf4583c5742cfdc3039: Get https://registry-1.docker.io/v2/benyaminl/lap7-dev/blobs/sha256:8f91359f1fffbf32b24ca854fb263d88a222371f38e90cf4583c5742cfdc3039: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:35282->[::1]:53: read: connection refused

@github-actions

This issue had no activity for 30 days. In the absence of activity or the "do-not-close" label, the issue will be automatically closed within 7 days.

@mtrmac (Collaborator) commented Nov 18, 2019

(Without actually understanding the specific root cause, given the available information), the fix basically needs to be to improve the reliability of the network connection.

It might, perhaps, be reasonable for the clients to retry on handshake or transfer timeouts (although even in that case it’s unclear whether it is justified, or whether the caller should be informed immediately, so that it can e.g. decide to fall back to a different registry quickly without long pointless retries), but things like

  Get https://registry-1.docker.io/v2/benyaminl/lap7-dev/blobs/sha256:8f91359f1fffbf32b24ca854fb263d88a222371f38e90cf4583c5742cfdc3039: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:35282->[::1]:53: read: connection refused

are indistinguishable from deliberate rejections (because servers are down or because network policy prohibits the connection), and it does not seem warranted to me for the code to retry on such failures.

@benyaminl (Author)

> (Without actually understanding the specific root cause, given the available information), the fix basically needs to be to improve the reliability of the network connection.
>
> It might, perhaps, be reasonable for the clients to retry on handshake or transfer timeouts (although even in that case it’s unclear whether it is justified, or whether the caller should be informed immediately, so that it can e.g. decide to fall back to a different registry quickly without long pointless retries), but things like
>
> Get https://registry-1.docker.io/v2/benyaminl/lap7-dev/blobs/sha256:8f91359f1fffbf32b24ca854fb263d88a222371f38e90cf4583c5742cfdc3039: dial tcp: lookup registry-1.docker.io on [::1]:53: read udp [::1]:35282->[::1]:53: read: connection refused
>
> are indistinguishable from deliberate rejections (because servers are down or because network policy prohibits the connection), and it does not seem warranted to me for the code to retry on such failures.

Um, the problem is that docker keeps retrying and succeeds, but podman tries and fails every time if the image is big, especially on a poor connection.

@rhatdan (Member) commented Nov 19, 2019

@mtrmac Would adding a --retry flag or something to podman make sense? Or, since we cannot distinguish, we can just fail.

On Docker, how does it work if there are multiple registries?

@mtrmac (Collaborator) commented Nov 19, 2019

> @mtrmac Would adding a --retry flag or something to podman make sense? Or, since we cannot distinguish, we can just fail.

Well, we are just failing.


Actually, looking at the error message more closely,

read udp [::1]:35282->[::1]:53

These are both localhost addresses. So, it’s not just that the network is so bad, it’s that (presumably) the network has already brought down the localhost DNS server to the point of complete inoperability. How does anything work in such an environment? curl? wget? A web browser? How is Podman supposed to work in such an environment? Bundle a fallback DNS client inside c/image, contacting the root DNS servers directly instead of whatever is in /etc/resolv.conf?

(Given the limited information so far), it’s just not remotely practical; this must be fixed in the environment external to c/image and Podman.
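
A quick way to sanity-check that environment outside of Podman (assuming dig and curl are installed; if these fail the same way, the problem is in the host's resolver or network, not in c/image):

# Is the resolver listed in /etc/resolv.conf answering at all?
cat /etc/resolv.conf
dig +short registry-1.docker.io

# Can DNS resolution plus a TLS handshake to the registry complete outside of Podman?
# (A 401 response here is fine; it proves DNS and TLS work.)
curl -sv https://registry-1.docker.io/v2/ -o /dev/null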


(BTW, the way pulls work now, we download all blobs to a temporary directory, and only apply them to c/storage when all are successfully downloaded; on failure, the blobs are deleted and when retrying, the download starts anew from the start. Completely incidentally, containers/image#611 , among many other things, changes the process to create a c/storage layer immediately after the necessary blob is downloaded; on failure, the layers stay around and when retrying, only the remaining layers need to be downloaded. So, c/image might eventually behave a bit better when externally induced to retry — OTOH, arguably, leaving orphaned layers around is a bug and we should clean them up on failure. So, this is not a promise that just re-running podman pull will behave better in the future, and, to an extent, it is an argument in favor of adding retries — but still, the network infrastructure must be basically working, more than this one seems to be.)

@benyaminl (Author)

> > @mtrmac Would adding a --retry flag or something to podman make sense? Or, since we cannot distinguish, we can just fail.
>
> Well, we are just failing.
>
> Actually, looking at the error message more closely,
>
> read udp [::1]:35282->[::1]:53
>
> These are both localhost addresses. So, it’s not just that the network is so bad, it’s that (presumably) the network has already brought down the localhost DNS server to the point of complete inoperability. How does anything work in such an environment? curl? wget? A web browser? How is Podman supposed to work in such an environment? Bundle a fallback DNS client inside c/image, contacting the root DNS servers directly instead of whatever is in /etc/resolv.conf?
>
> (Given the limited information so far), it’s just not remotely practical; this must be fixed in the environment external to c/image and Podman.
>
> (BTW, the way pulls work now, we download all blobs to a temporary directory, and only apply them to c/storage when all are successfully downloaded; on failure, the blobs are deleted and when retrying, the download starts anew from the start. Completely incidentally, containers/image#611 , among many other things, changes the process to create a c/storage layer immediately after the necessary blob is downloaded; on failure, the layers stay around and when retrying, only the remaining layers need to be downloaded. So, c/image might eventually behave a bit better when externally induced to retry — OTOH, arguably, leaving orphaned layers around is a bug and we should clean them up on failure. So, this is not a promise that just re-running podman pull will behave better in the future, and, to an extent, it is an argument in favor of adding retries — but still, the network infrastructure must be basically working, more than this one seems to be.)

From some of the conversation on the #podman IRC channel, many people said that RH-based labs or enterprise systems will always use a local or corporate DNS, so I think it's practical to retry a certain number of times before failing, like docker does.
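
In the meantime, a crude user-side workaround is to wrap the pull in a shell retry loop (a minimal sketch; the attempt count and sleep are arbitrary, and this is not built-in Podman behaviour):

# Retry the pull a few times before giving up.
for i in 1 2 3 4 5; do
    podman pull docker.io/benyaminl/lap7-dev:lumen && break
    echo "pull failed (attempt $i), retrying..." >&2
    sleep 10
done

Note that, as described above, each failed attempt currently restarts the download from scratch, so this only helps if a single attempt can eventually complete.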

@vrothberg (Member)

@mtrmac, how shall we proceed?

I suggest we open an issue over at c/image to clean up the temp files in case of an error.

We could move this issue over to c/image to discuss if (and in which cases) a retry might be justified.

WDYT?

@mtrmac (Collaborator) commented Jan 31, 2020

I think we do clean up the temporary files in case of an error. (We won’t remove correctly-created intermediate layers in c/storage if creating the rest of the image fails — but, well, that would actually help avoid redundant downloads in this case.)

As for discussing automatic retries in c/image — sure, that would be more accurate than Podman, I guess. Still, I can’t (yet?) see that it makes sense to retry on localhost DNS failures (#4251 (comment) ), so I’m not sure we can make the reporter happy.

I’d prefer to just close this “won’t fix”, but if someone figures out a clean way to retry in the right cases, sure, I guess… It should be easier with containers/image#703 (but that could help with pulls, not pushes).

@vrothberg (Member)

I trust your gut and agree to close the issue. If someone comes up with a clean way forward, we'd be more than happy to welcome contributions!

@Aruscha commented Jul 6, 2020

Hey,

I use CentOS 8 and have problems with podman as well.

Via WLAN I can download everything with podman, but not via LAN (which is important for a server).

Smaller pulls such as hello-world, nginx, alpine, etc. work. MySQL, Nextcloud, and ownCloud, for example, do not.

I always get connection reset by peer.

I am at Cloudflare and have heard that there are problems in this regard; is that true?

Docker doesn't cause me any problems at all, but I'd like to use podman - any ideas?

@rhatdan (Member) commented Jul 6, 2020

This looks like we need the retry stuff that @QiWang19 was working on in Skopeo?
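
For reference, a hedged sketch of what that looks like from the command line today, assuming a Skopeo build that already includes the --retry-times flag: pull through Skopeo with retries into local container storage, after which the image is visible to Podman.

# Copy with retries into local containers-storage; the image is then usable from podman.
skopeo copy --retry-times 3 \
    docker://docker.io/benyaminl/lap7-dev:lumen \
    containers-storage:docker.io/benyaminl/lap7-dev:lumen
podman images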

@Aruscha commented Jul 6, 2020

(screenshot: IMG_20200706_230058_061)

After using Skopeo, my IP always gets kicked out... which means I have no right to use it.

@QiWang19 (Contributor) commented Jul 6, 2020

It might be helpful to have default retry behavior like buildah's retryCopyImage: https://github.com/containers/buildah/blob/8c807dd1450069ddde8e21bec2bd996dc7a7969a/common.go#L125

@rhatdan (Member) commented Jul 7, 2020

SGTM
@QiWang19 Could you open up a PR to add this to Podman?

@ngdio commented Aug 26, 2020

The patch doesn't fix the issue for me. The problem is that all chunks are redownloaded, even if just one of them fails. The same error occurs again and again on each try, so this isn't helping at all, unfortunately.

@QiWang19 (Contributor)

@ngdio can you provide more information about the errors you get?

@ngdio commented Aug 26, 2020

Basically the same thing.

Trying to pull registry.opensuse.org/homeassistant/home-assistant:stable...
  name unknown
Trying to pull docker.io/homeassistant/home-assistant:stable...
Getting image source signatures
Copying blob df20fa9351a1 skipped: already exists
Copying blob 1f4c023abf28 done
Copying blob 120bcf18546f done
Copying blob 93146756f5f0 done
Copying blob 471877730a1b [=========>----------------------------] 423.0b / 1.5KiB
Copying blob ea9d2faca304 done
Copying blob e10a31dbe020 [======================================] 43.0KiB / 43.6KiB
Copying blob df9e17b1831c [===================================>--] 3.7MiB / 3.8MiB
Copying blob 6b87c9a33ccb [======================================] 5.2MiB / 5.2MiB
Copying blob b414c7868ddc [======================================] 39.2MiB / 39.2MiB
Copying blob 3267a8c43b18 done
Copying blob 6e4b124e92eb done
Copying blob 63cc34c55acb done
Copying blob 907902c8f2df done
Copying blob f9586f677651 done
Copying blob da039c516da1 done
Copying blob 6e37e4627aaa done
Copying blob a4be8ff2809f [======================================] 219.0MiB / 219.0MiB
Copying blob 90e92f5708f8 done
Copying blob c3f55eb68657 done
Getting image source signatures
Copying blob df20fa9351a1 skipped: already exists
Copying blob 1f4c023abf28 done
Copying blob 120bcf18546f [======================================] 47.2MiB / 47.2MiB
Copying blob ea9d2faca304 done
Copying blob 471877730a1b done
Copying blob e10a31dbe020 done
Copying blob 93146756f5f0 [======================================] 21.3MiB / 21.3MiB
Copying blob df9e17b1831c done
Copying blob 6b87c9a33ccb [======================================] 5.2MiB / 5.2MiB
Copying blob b414c7868ddc done
Copying blob 3267a8c43b18 done
Copying blob 63cc34c55acb done
Copying blob 6e4b124e92eb done
Copying blob f9586f677651 done
Copying blob da039c516da1 done
Copying blob 907902c8f2df done
Copying blob 6e37e4627aaa done
Copying blob a4be8ff2809f [======================================] 219.0MiB / 219.0MiB
Copying blob c3f55eb68657 done
Copying blob 90e92f5708f8 done
Getting image source signatures
Copying blob df20fa9351a1 skipped: already exists
Copying blob 120bcf18546f [======================================] 47.2MiB / 47.2MiB
Copying blob ea9d2faca304 done
Copying blob 1f4c023abf28 done
Copying blob 93146756f5f0 [======================================] 21.3MiB / 21.3MiB
Copying blob 471877730a1b done
Copying blob e10a31dbe020 done
Copying blob df9e17b1831c done
Copying blob b414c7868ddc done
Copying blob 6b87c9a33ccb done
Copying blob 3267a8c43b18 done
Copying blob 6e4b124e92eb [======================================] 1.3MiB / 1.3MiB
Copying blob 63cc34c55acb done
Copying blob f9586f677651 [======================================] 527.2KiB / 534.0KiB
Copying blob 907902c8f2df [====================================>-] 411.2KiB / 418.0KiB
Copying blob da039c516da1 done
Copying blob 6e37e4627aaa done
Copying blob a4be8ff2809f [======================================] 219.0MiB / 219.0MiB
Copying blob c3f55eb68657 done
Copying blob 90e92f5708f8 done
Getting image source signatures
Copying blob df20fa9351a1 skipped: already exists
Copying blob e10a31dbe020 done
Copying blob 120bcf18546f [======================================] 47.2MiB / 47.2MiB
Copying blob 471877730a1b done
Copying blob ea9d2faca304 done
Copying blob 1f4c023abf28 done
Copying blob 93146756f5f0 [======================================] 21.3MiB / 21.3MiB
Copying blob 6b87c9a33ccb done
Copying blob df9e17b1831c done
Copying blob b414c7868ddc [======================================] 39.2MiB / 39.2MiB
Copying blob 3267a8c43b18 done
Copying blob 6e4b124e92eb done
Copying blob 63cc34c55acb done
Copying blob f9586f677651 done
Copying blob 907902c8f2df done
Copying blob da039c516da1 done
Copying blob a4be8ff2809f done
Copying blob 6e37e4627aaa [======================================] 4.5MiB / 4.5MiB
Copying blob c3f55eb68657 done
Copying blob 90e92f5708f8 done
  read tcp redacted->104.18.124.25:443: read: connection reset by peer
Error: unable to pull homeassistant/home-assistant:stable: 2 errors occurred:
	* Error initializing source docker://registry.opensuse.org/homeassistant/home-assistant:stable: Error reading manifest stable in registry.opensuse.org/homeassistant/home-assistant: name unknown
	* Error writing blob: error storing blob to file "/var/tmp/storage691889961/6": read tcp redacted->104.18.124.25:443: read: connection reset by peer

Some chunks don't complete and make it fail completely. This is probably due to my internet connection (I've had the same issue on multiple machines) but I experience zero issues anywhere else (downloads, streaming, docker all fine). It might be caused by the parallel downloads.

Version is 2.0.5 with the retry patch applied on top.

@mtrmac (Collaborator) commented Aug 26, 2020

Yeah, if only some layer blobs are downloaded, they are not extracted into c/storage (that currently happens only at ImageDestination.Commit), and they are removed before trying again. Retrying at the high level can't help with cases where, on average, a complete image is never downloaded.

(Valentin was working on c/image changes that would apply layers as they were coming, that would indirectly help here.)

@ngdio commented Aug 26, 2020

If that change would effectively mean the process works just as in Docker (individual layers are redownloaded if necessary), then it would help indeed. Is there any issue/pull request where I can track progress, or do you have an ETA? I can't use Podman in my network at all right now, so that would be helpful.

@abaschen

I have the same problem with Fedora CoreOS.
I've disabled IPv6 and cgroup v2. I'm trying to pull OpenShift images and get the same timeout randomly. I cannot finish downloading 800MB even with a 1GB/s connection. If this issue is closed, I don't see a solution; could someone help?

@Gottox commented Nov 28, 2020

Same issue for me:

~ podman pull octoprint/octoprint
Trying to pull docker.io/octoprint/octoprint...
Getting image source signatures
Copying blob dc97f433b6ed done  
Copying blob cb732bb8dce0 done  
Copying blob bb79b6b2107f done  
Copying blob d8634511c1f0 done  
Copying blob 0065b4712c38 done  
Copying blob e4ab79a0ba11 done  
Copying blob 3d27de0ca1e3 done  
Copying blob 093359127cd2 done  
Copying blob 978d8fd02815 done  
Copying blob b8abb99ca1cc done  
Copying blob e1cde2378a2b done  
Copying blob 1ee9298ab334 done  
Copying blob 35e30c3f3e2b [======================================] 2.6MiB / 2.6MiB
Copying blob 38a288cfa675 [====================================>-] 287.0KiB / 293.3KiB
Copying blob 7fcd1d230ec9 done  
Copying blob 30e377ec29ea done  
Getting image source signatures
Copying blob dc97f433b6ed done  
Copying blob d8634511c1f0 done  
Copying blob cb732bb8dce0 done  
Copying blob bb79b6b2107f done  
Copying blob 3d27de0ca1e3 done  
Copying blob 35e30c3f3e2b done  
Copying blob 0065b4712c38 [======================================] 1.8MiB / 1.8MiB
Copying blob e4ab79a0ba11 done  
Copying blob 978d8fd02815 done  
Copying blob 093359127cd2 [======================================] 11.4MiB / 11.4MiB
Copying blob 1ee9298ab334 done  
Copying blob b8abb99ca1cc [======================================] 1.3MiB / 1.4MiB
Copying blob e1cde2378a2b [=============================>--------] 1.8KiB / 2.2KiB
Copying blob 38a288cfa675 [====================================>-] 288.0KiB / 293.3KiB
Copying blob 7fcd1d230ec9 done  
Copying blob 30e377ec29ea done  
Getting image source signatures
Copying blob bb79b6b2107f done  
Copying blob dc97f433b6ed done  
Copying blob 3d27de0ca1e3 [======================================] 242.1MiB / 242.1MiB
Copying blob 35e30c3f3e2b [======================================] 2.6MiB / 2.6MiB
Copying blob 0065b4712c38 done  
Copying blob cb732bb8dce0 [======================================] 10.1MiB / 10.1MiB
Copying blob d8634511c1f0 done  
Copying blob e4ab79a0ba11 done  
Copying blob 093359127cd2 done  
Copying blob 978d8fd02815 done  
Copying blob 1ee9298ab334 [======================================] 42.3MiB / 42.3MiB
Copying blob b8abb99ca1cc done  
Copying blob e1cde2378a2b done  
Copying blob 38a288cfa675 done  
Copying blob 7fcd1d230ec9 [======================================] 183.9KiB / 184.1KiB
Copying blob 30e377ec29ea done  
Getting image source signatures
Copying blob 35e30c3f3e2b done  
Copying blob 3d27de0ca1e3 done  
Copying blob d8634511c1f0 done  
Copying blob bb79b6b2107f done  
Copying blob dc97f433b6ed done  
Copying blob cb732bb8dce0 [======================================] 10.1MiB / 10.1MiB
Copying blob 0065b4712c38 done  
Copying blob e4ab79a0ba11 [======================================] 1.7MiB / 1.7MiB
Copying blob 093359127cd2 done  
Copying blob 978d8fd02815 done  
Copying blob 1ee9298ab334 done  
Copying blob b8abb99ca1cc [======================================] 1.4MiB / 1.4MiB
Copying blob e1cde2378a2b done  
Copying blob 7fcd1d230ec9 [====================================>-] 179.7KiB / 184.1KiB
Copying blob 30e377ec29ea done  
Copying blob 38a288cfa675 [====================================>-] 287.0KiB / 293.3KiB
  read tcp 192.168.155.178:46000->104.18.123.25:443: read: connection reset by peer
Error: unable to pull octoprint/octoprint: 1 error occurred:
	* Error writing blob: error storing blob to file "/var/tmp/storage241651785/6": read tcp 192.168.155.178:46000->104.18.123.25:443: read: connection reset by peer

@benyaminl (Author)

Hello, it seems a lot of people still face this problem. I think this needs to be reopened, and maybe podman could look at what docker does to work around this kind of broken connection.

Thanks

@Gottox commented Dec 6, 2020

@Nurlan199206 commented Aug 11, 2021

Can someone help me with "copying blob 0 bytes"? Maybe I need to open additional ports on the network device?

Currently TCP 443/80 are open.

podman pull quay.io/openshift-release-dev/ocp-release@sha256:3e59cff6101b0f0732540d9f2cf1fe9c7ea5ab1e8737df82e789eeb129d1a9af
Trying to pull quay.io/openshift-release-dev/ocp-release@sha256:3e59cff6101b0f0732540d9f2cf1fe9c7ea5ab1e8737df82e789eeb129d1a9af...
Getting image source signatures
Copying blob 88623262ec21 skipped: already exists  
Copying blob 64607cc74f9c skipped: already exists  
Copying blob 13897c84ca57 skipped: already exists  
Copying blob 09bec785a242 [--------------------------------------] 0.0b / 0.0b
Copying blob 9785c7b8e46d [--------------------------------------] 0.0b / 0.0b
Copying blob 87307f0f97c7 [--------------------------------------] 0.0b / 0.0b
Copying config e1937394eb done  
Writing manifest to image destination
Storing signatures
e1937394eb3e79a898931ba8aee8a66ccd05eae8b4fdf4cbbbc32ccc3b91e32c

@mtrmac (Collaborator) commented Aug 11, 2021

@Nurlan199206 Please open a separate issue, don’t pile onto this one, unless there’s a clear connection. It’s always easier to merge two conversations than to tease them apart.

And in the new issue please explain what exactly is failing; this looks like the image was pulled correctly, and we are only showing incorrect progress bars (containers/image#1013 ).

@benyaminl (Author)

@mtrmac please lock this conversation, sir, so no more people bump here.

@ravendererugu

Hi all, one of our developers pushed the code to the Azure storage blob account through git, but unfortunately they are missing in GitHub. He was able to recover 2 images, but we don't know the root cause of why it happened. Can anyone help me with this?

@benyaminl (Author)

> Hi all, one of our developers pushed the code to the Azure storage blob account through git, but unfortunately they are missing in GitHub. He was able to recover 2 images, but we don't know the root cause of why it happened. Can anyone help me with this?

Please create a new issue regarding it and post the output, logs, etc., so people can see what's wrong.

This issue has already been solved via another patch, so please don't bump here. Thanks.

@github-actions github-actions bot added the locked - please file new issue/PR Assist humans wanting to comment on an old issue or PR with locked comments. label Sep 20, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 20, 2023