ghcr.io: push times out unreproducible #251
@junghans Can you change the following step and let me know? Thanks:

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1
  with:
    driver-opts: image=moby/buildkit:buildx-stable-1
    buildkitd-flags: --debug
```
From https://github.com/votca/buildenv/runs/1538301715?check_suite_focus=true
We'll take a look but someone might not get to it until Monday.
@junghans I took a look on our end in the GHCR logs and it looks like your layer was uploaded successfully; GHCR received the request at …. @crazy-max Is there some kind of timeout?
@markphelps Yes, there is a default timeout of …. @junghans Can you try with the latest stable release of buildx please? (v0.5.1 atm):

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1
  with:
    version: latest
    buildkitd-flags: --debug
```
Different image, but still fails: https://github.com/votca/buildenv/runs/1557317848?check_suite_focus=true
@junghans Ok, I think I got it: that's because you can only have a single namespace with GHCR. So …
Huh? It works with build-push-action@v1 and the image shows up here: https://github.com/orgs/votca/packages/container/package/buildenv%2Ffedora
Odd, this should work with GHCR. We don't have a limit on namespaces.
My bad, there is actually no limit, yes!
Any news on this?
@junghans I've run some tests with several options linked to buildx and containerd, and the following combination works for me (…):

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1
  with:
    version: latest
    buildkitd-flags: --debug
```
Still failing: https://github.com/votca/buildenv/actions/runs/430519742. Is there anything else you changed?
@junghans Nothing special that could change the current behavior actually 🤔
Could it be a race condition, as all the jobs run simultaneously?
@junghans I don't think so, but can you set …?
Actually that worked!?
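(For reference: the exact setting suggested above was lost in the excerpt. Below is a minimal sketch of one way to serialize matrix jobs, assuming the suggestion was along the lines of GitHub Actions' `max-parallel`; the job and matrix names are illustrative.)

```yaml
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 1              # run matrix jobs one at a time instead of concurrently
      matrix:
        distro: [fedora, opensuse] # hypothetical variants
    steps:
      - uses: actions/checkout@v2
      # ... the buildx setup and build-push steps from the snippets above
```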
@junghans That's strange; maybe it's abuse prevention on bandwidth usage on GitHub Container Registry. Your images are quite huge (…).
Yeah, the Intel compiler is just huge…
Now it fails even when running sequentially: https://github.com/votca/buildenv/runs/1609439618?check_suite_focus=true
Same issue here with a 2.2GB image. I might be mistaken, but isn't rsync a possible technical solution for the transfer? In a bash world, I would use ….
There shouldn't be an issue with abuse prevention or bandwidth usage, but I will look into this more, and into @koppor's issue as well, tomorrow (just got back from vacation today). @koppor, do you have an example of your push to ghcr failing that I can look at?
I'm building many images with a matrix, and I'm having a similar issue when pushing the exported image to ghcr.io, even with …. My action runs automatically upon commit, and I manually tag the commit to kick off another workflow that pushes the image. Therefore, I'm certain this has to be an issue with the ghcr.io server. Notice that the following three commits ran on an identical Dockerfile, and the timeout issue isn't always reproducible.
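(A minimal sketch of the kind of tag-triggered matrix push workflow described above; the trigger, image names, and tags are illustrative, not taken from that repository.)

```yaml
name: push-images
on:
  push:
    tags: ['v*']               # the manually pushed tag kicks off the push workflow

jobs:
  push:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        image: [base, devel]   # hypothetical image variants
    steps:
      - uses: actions/checkout@v2
      - uses: docker/setup-buildx-action@v1
      # a registry login step (e.g. docker/login-action) would go here
      - uses: docker/build-push-action@v2
        with:
          file: ${{ matrix.image }}/Dockerfile
          push: true
          tags: ghcr.io/${{ github.repository }}/${{ matrix.image }}:${{ github.sha }}
```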
Sorry for the radio silence. We've made some changes in ghcr.io that should hopefully help alleviate these concurrency issues. Would you mind letting me know if your parallel builds succeed?
Hi @markphelps, I ran into the same problem today uploading a rather big, but not huge (~2GB), image to ghcr:
As a stopgap, is there any way to have a retry-on-error functionality for the push?
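(As a stopgap, a hedged sketch of retrying the push from a plain `run` step. It assumes the image has already been built and tagged in the local Docker daemon, e.g. via `load: true`, and the image name is purely illustrative.)

```yaml
      - name: Push with retries
        run: |
          # Try the push up to three times before failing the step.
          for attempt in 1 2 3; do
            if docker push ghcr.io/${{ github.repository_owner }}/myimage:latest; then
              exit 0
            fi
            echo "push attempt ${attempt} failed, retrying in 30s" >&2
            sleep 30
          done
          exit 1  # all attempts failed
```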
Hi @jessfraz, is it about this workflow?
There is a push retry in the latest buildkit: https://github.com/moby/buildkit/blob/master/util/resolver/retryhandler/retry.go#L51. The original "no response" error reported was fixed in containerd/containerd#4724.
The 503 retry handler was added recently. It is in master but not in the latest release yet. 403 is weird though and could be a misbehaving registry; problems with auth should return 401.
Yes, agree with that. @jessfraz Do you use the GITHUB_TOKEN or a PAT?
Upon looking for your request that errored out in your linked workflow run, I see this error:
This still makes me think this is a client-side (buildx) timeout issue. I looked in our load balancer logs for this request to upload this layer, and see that it (our server) did in fact respond with a ….
Note the fields …. So it looks like the server (GHCR) spent 19 seconds before returning the ….
@crazy-max I am using a PAT but can switch to GITHUB_TOKEN since I turned that on as well!
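(For reference, a minimal login step using the repository-scoped GITHUB_TOKEN instead of a PAT; this is a generic sketch, not the configuration from the workflow discussed above.)

```yaml
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
```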
Buildkit has a 30 sec timeout for TCP and 10 for TLS (https://github.com/moby/buildkit/blob/master/util/resolver/resolver.go#L190-L195), which matches the golang recommendation (https://github.com/golang/go/blob/52bf14e0e8bdcd73f1ddfb0c4a1d0200097d3ba2/src/net/http/transport.go#L42-L53). I guess we could increase it a bit if we think it is related, but having some timeout is needed so builds don't just hang, and so that errors are returned when they are valid.
I would have thought the TLS handshake would have already occurred by then, though; which timeout do you think it would be? It's also possible we may not be sending an expected header value for keep-alive?
@markphelps It's not the TLS timeout, as that error is different: https://github.com/golang/go/blob/master/src/net/http/transport.go#L2841
@jessfraz We have been experiencing some network issues with GHCR the past couple of days, but think we have resolved them. Would you please let me know if you are still experiencing intermittent 403/503s?
I think it's related (if not, I can open another issue). For the last few builds I've been seeing a "connection reset by peer" error, and it seems like it just doesn't want to work no matter how many times I restart it.
@lucacome See docker/buildx#834 (comment), that might be related.
I've been experiencing that same problem for about a month now. I've even opened an issue with GitHub support about it. They changed something in an attempt to fix it, but unfortunately it hasn't worked yet.
I have also just started experiencing this. However, AFAIK, the containers I'm working with are not as large as these ones (I could be mistaken, though). Possibly related to containerd/containerd#6242?
From https://github.com/votca/buildenv/runs/1534243887?check_suite_focus=true, I am getting a lot of unreproducible errors when pushing a container:
Never happened with v1.