Long running push sometimes fail with "use of closed network connection" #367

Closed
dubo-dubon-duponey opened this issue Aug 27, 2020 · 10 comments

Comments

@dubo-dubon-duponey

This is a follow-up to #324.

I am now using buildkit master, which does properly fix that issue.

Now, the following happens regularly:

failed to solve: rpc error: code = Unknown desc = failed to copy: failed to do request: Put https://registry-1.docker.io/v2/dubodubonduponey/REDACTED/blobs/uploads/REDACTED?_state=XXX&digest=sha256%3AXXXX: write tcp 172.17.0.3:41240->52.1.121.53:443: use of closed network connection

This is with docker4mac. It is somewhat hard to reproduce, but it happens often enough with repeated tries (long-running pushes are the key).

Let me know what kind of information would be helpful to diagnose this.

@dubo-dubon-duponey
Author

I have one image that is especially hard to push to Hub, with which I've hit the above error multiple times.

On that same image, I'm also hitting the following now:

failed to solve: rpc error: code = Unknown desc = rpc error: code = Canceled desc = grpc: the client connection is closing: unknown

@boredazfcuk

I'm seeing this issue too.

I've got about 15 containers that I frequently build using a script which builds the architectures I want for each container and then pushes to Docker hub.

I updated one of the containers this morning (literally added a single character to two scripts, so no major changes) and ran my build script, but it bombs out with the same error you're experiencing.

This container in question is being built for 5 architectures, using two nodes. I must have tried a dozen times now, but it's failed every single time.

I'm also seeing long-running pushes (25 mins!), but I think it's just idling until it times out and then reports the error as a closed connection. Each architecture is only ~40MB, so 210MB total, so there's no chance it should be taking this long.

@andrewufrank

Same problem here: a build which runs quickly (less than a minute) after it failed the first time. Multi-platform build, docker login just before the build, using buildx from master. I hope I covered everything, but it failed twice in a row. The full error message:

(failed to solve: rpc error: code = Unknown desc = failed to copy: failed to do request: Put https://registry-1.docker.io/v2/andrewufrank/x11docker-lmde/blobs/uploads/a59439bc-e144-4b42-a432-f9725c1a4f9d?_state=Leg83mDFqN1rBrmda84C8tOHuzIqg_fkBIRlCHfB8D97Ik5hbWUiOiJhbmRyZXd1ZnJhbmsveDExZG9ja2VyLWxtZGUiLCJVVUlEIjoiYTU5NDM5YmMtZTE0NC00YjQyLWE0MzItZjk3MjVjMWE0ZjlkIiwiT2Zmc2V0IjowLCJTdGFydGVkQXQiOiIyMDIwLTA5LTEwVDEyOjIzOjMzLjQyODI1MjkwNloifQ%3D%3D&digest=sha256%3A7d9beb74056b685831827b5f21351f021873d8ba1a1170b378e5edf5f46d114e: write tcp 172.17.0.2:35736->34.238.187.50:443: use of closed network connection)

Any workaround? Thank you for the improvements (before, it was a different error message)!

@boredazfcuk

I actually fixed my problem by pulling the latest version of the registry container. I had set a static version to 2.7.0 (IIRC) due to htpasswd being removed from latest, and never changed it back.

@wezell

wezell commented Oct 1, 2020

Same is happening to us:

=> [linux/amd64 stage-1 11/11] RUN mkdir -p /data/shared/assets && mkdir -p /data/local/dotsecure/license && chmod -R 660 /data && find /data/ -type d -exec chmod 770 {} ;                                                        0.1s
 => ERROR exporting to image                                                                                                                                                                                                      167.9s
 => => exporting layers                                                                                                                                                                                                            25.7s
 => => exporting manifest sha256:85146d5cbdc8e4b3284e42aabf384a7b076c871b912dc0a01c95da1d064c8a55                                                                                                                                   0.0s
 => => exporting config sha256:f5f7f1a16a6bf9dcebdd2ce20ddb53bdcf1d06d5242fe261887040446db7582f                                                                                                                                     0.0s
 => => exporting manifest sha256:93b5d8e3a9134df50fe4ce2e26390513b1362e79c8b15a28d216f4faa76fa774                                                                                                                                   0.0s
 => => exporting config sha256:f7e87f508634e993c881717b8ea2cfb82cab5e21f5a7f07bea59bd8a35807ce7                                                                                                                                     0.0s
 => => exporting manifest list sha256:aea5e8d14ac08a28a10a9fcc36e562909df9520be3a6191289e079431c68495a                                                                                                                              0.0s
 => => pushing layers                                                                                                                                                                                                             142.2s
------
 > exporting to image:
------
failed to solve: rpc error: code = Unknown desc = failed to copy: failed to do request: Put https://registry-1.docker.io/v2/xxxx/xxxx/blobs/uploads/cd8f1fad-be9c-425e-b90a-096cf54bc9bd?_state=oxMzL8R_TY7JJw1I8xF9L8HKyzcqlfWjROzYXJkMpQN7Ik5hbWUiOiJkb3RjbXMvZG90Y21zIiwiVVVJRCI6ImNkOGYxZmFkLWJlOWMtNDI1ZS1iOTBhLTA5NmNmNTRiYzliZCIsIk9mZnNldCI6MCwiU3RhcnRlZEF0IjoiMjAyMC0xMC0wMVQxODoxMjo0NS4xMzAxNDY1NzJaIn0%3D&digest=sha256%3Ad8de563de8cc0c7a8583b98b3b5f86610b86545a0cbff54980a43b67e42652d2: write tcp 172.17.0.2:58070->34.238.187.50:443: use of closed network connection

It seems like a load balancer or network timeout is breaking the docker push?

Note: we are trying to push from outside the US.

@nollymar

nollymar commented Oct 1, 2020

I upgraded to Docker version 19.03.13, build 4484c46d9d and now it works.

@dubo-dubon-duponey
Author

@wezell I'm in the US and seeing this issue constantly with Docker Hub.
I simply gave up pushing to the Hub entirely.

It does not seem to happen at all with either my local registry or ghcr.io (which I started using recently), which does suggest something specific to Hub (for example, a more aggressive timeout that would kick in with slower connections).

The root of the issue, though, is still containerd's inability to resume uploads (through byte-range push).

@jesdynf

jesdynf commented Oct 27, 2020

I have the same problem arc with Docker Desktop as described above -- multi-arch build hitting the "401 Unauthorized" error when I allowed it to default to :buildx-stable-1, turning into the "use of closed network" error when I made a new builder using :master.

Docker Desktop 2.4.0.0 if it's relevant (Docker 19.03.13).

@carlonluca

I typically get this error at least twice per image push, from different operating systems and different network connections.

@tonistiigi
Member

moby/buildkit#1791
