Action is encountering internal issue and sits in a retry loop(?), chews up action minutes #498

Closed
South-Paw opened this issue Nov 13, 2021 · 7 comments

Comments

South-Paw commented Nov 13, 2021

This was working fine 7 days ago, and there have been no changes to our workflows. However, this action is now hitting some internal issue that leaves it sitting in what appears to be a retry loop, or it has hung. This has cost us a few hours in total of paid action minutes.

Behaviour

Steps to reproduce this issue

  1. Repo has 2 Dockerfiles (let's call them A and B); the CI workflow has 6 jobs.
  2. Workflow (that last passed on Nov 7th)
    1. Code is linted, then tested
    2. If lint/test is successful, container A is built with a dev and a test tag, then pushed
    3. If lint/test is successful, container B is built with a dev and a test tag, then pushed
  3. Note that 2.2 and 2.3 run in parallel in the workflow

Expected behavior

This was originally (as of the 7th of Nov) working fine - the workflow would complete as expected and containers A and B would be published to the GHCR, both tagged dev and test correctly.

Actual behavior

Workflow runs today are now getting stuck in what appears to be a retry loop(?): only one of the two parallel jobs passes, while the other gets stuck in a loop(?) and needs to be cancelled.

When the jobs are cancelled, the job log has the following:

2021-11-13T02:04:53.2663646Z #19 exporting to image
2021-11-13T02:04:53.7166729Z #19 5.856 error: failed to copy: failed to do request: Put "https://ghcr.io/v2/[redacted]/blobs/upload/[uuid]?digest=sha256...": write tcp 172.17.0.2:42834->140.82.112.34:443: write: connection reset by peer
2021-11-13T02:04:53.7170084Z #19 5.856 retrying in 1s
2021-11-13T02:07:44.4163623Z ##[error]The operation was canceled.

Given the timestamps, it's clearly not retrying in one second... or at all, for that matter.

Run examples:

Configuration

  • Repository URL (if public): not public
  • Build URL (if public): not public
YAML workflow file
name: Push CI

on:
  push:
    branches:
      - "*"
    tags-ignore:
      - "*.*.*"

jobs:
  lint:
    name: Lint
    runs-on: ubuntu-latest
    if: github.event_name == 'push'
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Setup Node
        uses: actions/setup-node@v2
        with:
          cache: "npm"

      - name: Install dependencies
        run: npm ci

      - name: Lint
        run: npm run lint

  test:
    name: Test
    runs-on: ubuntu-latest
    if: github.event_name == 'push'
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Setup Node
        uses: actions/setup-node@v2
        with:
          cache: "npm"

      - name: Install dependencies
        run: npm ci

      - name: Test
        run: npm run test
        env:
          CI: true

  build-api-docker-development:
    name: Build and push development API docker image
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
    needs:
      - lint
      - test
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v3
        with:
          images: ghcr.io/[redacted]
          tags: |
            type=raw,value=dev
          flavor: |
            latest=false

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GH_PACKAGES_TOKEN }}

      - name: Docker build and push
        uses: docker/build-push-action@v2
        with:
          context: .
          file: "./api.Dockerfile"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          build-args: |
            APP_VERSION_ENV=dev
            APP_VERSION_SHA=${{ github.sha }}

  build-bootstrapper-docker-development:
    name: Build and push development Bootstrapper docker image
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
    needs:
      - lint
      - test
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v3
        with:
          images: ghcr.io/[redacted]
          tags: |
            type=raw,value=dev
          flavor: |
            latest=false

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GH_PACKAGES_TOKEN }}

      - name: Docker build and push
        uses: docker/build-push-action@v2
        with:
          context: .
          file: "./bootstrapper.Dockerfile"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

  build-api-docker-test:
    name: Build and push test API docker image
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
    needs:
      - build-api-docker-development
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v3
        with:
          images: ghcr.io/[redacted]
          tags: |
            type=raw,value=test
          flavor: |
            latest=false

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GH_PACKAGES_TOKEN }}

      - name: Docker build and push
        uses: docker/build-push-action@v2
        with:
          context: .
          file: "./api.Dockerfile"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          build-args: |
            APP_VERSION_ENV=test
            APP_VERSION_SHA=${{ github.sha }}

  build-bootstrapper-docker-test:
    name: Build and push test Bootstrapper docker image
    runs-on: ubuntu-latest
    if: github.event_name == 'push' && github.ref == 'refs/heads/master'
    needs:
      - build-bootstrapper-docker-development
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v3
        with:
          images: ghcr.io/[redacted]
          tags: |
            type=raw,value=test
          flavor: |
            latest=false

      - name: Set up QEMU
        uses: docker/setup-qemu-action@v1

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.repository_owner }}
          password: ${{ secrets.GH_PACKAGES_TOKEN }}

      - name: Docker build and push
        uses: docker/build-push-action@v2
        with:
          context: .
          file: "./bootstrapper.Dockerfile"
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}

South-Paw changed the title from "Action is encountering internal issue and sits in a retry loop, chews up action minutes" to "Action is encountering internal issue and sits in a retry loop(?), chews up action minutes" on Nov 13, 2021

South-Paw commented Nov 13, 2021

I wanted to see if the workflow was at fault, so I changed it to be fully sequential and received a new error...

#19 exporting to image
#19 5.415 error: failed to copy: failed to do request: Put "https://ghcr.io/v2/[redacted]/blobs/upload/[uuid]?digest=sha256...": write tcp 172.17.0.2:54206->140.82.114.33:443: write: broken pipe
#19 5.415 retrying in 1s
Error: The operation was canceled.

Is this some sort of upstream issue with ghcr.io rather than this action?

IMO, this action has an issue where it's not correctly exiting or retrying after hitting these errors. More importantly, though, I'd like to know what's going wrong and if/how I can resolve it on my end 😄
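
Until the root cause is known, one way to limit how many paid minutes a hung push can burn is a timeout on the job or on the push step. The fragment below is only a sketch: `timeout-minutes` is a standard GitHub Actions key at both job and step level (the job default is 360 minutes), but the values 30 and 15 are assumptions and should be sized to how long a healthy build normally takes.

```yaml
jobs:
  build-api-docker-development:
    runs-on: ubuntu-latest
    # Assumption: 30 minutes is well above a normal run of this job.
    timeout-minutes: 30
    steps:
      - name: Checkout repository
        uses: actions/checkout@v2

      - name: Docker build and push
        uses: docker/build-push-action@v2
        # Assumption: a healthy build and push finishes within 15 minutes.
        timeout-minutes: 15
        with:
          context: .
          file: "./api.Dockerfile"
          push: true
```

This does not fix the failed upload; it only makes the run fail sooner instead of waiting for a manual cancel.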

South-Paw reopened this on Nov 13, 2021

jzucker2 commented Nov 13, 2021

This has been happening to me for days now:

#43 exporting to image
#43 22.47 error: failed to copy: failed to do request: Put "https://ghcr.io/v2/jzucker2/[redacted]/blobs/upload/[redacted]?digest=sha256%3Abc216d83a33dec2f30cb12f7b6b55414c875bbbaec021d7a9449024ccce7e97e": use of closed network connection
#43 22.47 retrying in 1s

This seems to have been an ongoing issue for a while but I haven't tracked down which version of BuildKit is at play here yet: docker/buildx#367

South-Paw commented Nov 13, 2021

Possibly related: containerd/containerd#6242

It appears that the issue is indeed upstream and (unless I'm corrected) unrelated to this action.

crazy-max (Member) commented Nov 13, 2021

> Possibly related: containerd/containerd#6242

Yes, that's it. It is tracked in docker/buildx#834, and a PR has been opened on the BuildKit repo: moby/buildkit#2461. In the meantime, use the solution from docker/buildx#834 (comment).
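
The linked comment is not reproduced in this thread, so the following is only a sketch of what such a workaround could look like, assuming it amounts to pinning the builder to a BuildKit image that predates the regression. `driver-opts` with `image=` is a real input of `docker/setup-buildx-action`; the tag `v0.9.1` is an assumption and should be checked against docker/buildx#834 before use.

```yaml
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v1
  with:
    # Assumption: v0.9.1 is a BuildKit release without the push regression;
    # verify the tag against docker/buildx#834.
    driver-opts: |
      image=moby/buildkit:v0.9.1
```

Each job that pushes an image would need the same change to its "Set up Docker Buildx" step, and the pin should be removed once a fixed BuildKit release is available.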

South-Paw (Author) commented

Cool, thanks @crazy-max

Puts things a bit more at ease having it confirmed to be getting handled upstream :)

txomon commented Apr 9, 2023

Seems like I encountered this in docker/buildx#1728. @South-Paw, do you have confirmation of the fix upstream?

South-Paw (Author) commented

To this day, I'm unsure if it's been fixed upstream 😂 (more because I haven't revisited or kept up with the issues).

I still have @crazy-max's solution in place to avoid it
