ci(buildkit): fix timeout for image job #4491

Closed
wants to merge 4 commits into from

crazy-max
Member

No description provided.

Collaborator

@profnandaa profnandaa left a comment

I see it's still failing with a timeout. Should we try 30 minutes, or what else could be the cause?

.github/workflows/buildkit.yml (outdated review thread)
@crazy-max crazy-max force-pushed the fix-ci-build-timeout branch 3 times, most recently from 207630b to c11cf91 Compare December 16, 2023 12:08
@crazy-max crazy-max marked this pull request as ready for review December 16, 2023 13:02
@crazy-max crazy-max force-pushed the fix-ci-build-timeout branch from c11cf91 to b014472 Compare December 16, 2023 17:41
crazy-max commented Dec 17, 2023

@tonistiigi Added ps aux per your suggestion and got: https://github.com/moby/buildkit/actions/runs/7233373573/job/19718123101?pr=4491#step:7:1109

PID   USER     TIME  COMMAND
    1 root      0:00 /sbin/docker-init -- buildkitd --debug
    7 root      0:30 buildkitd --debug
  163 root      0:00 buildctl dial-stdio
  442 root      0:00 buildkit-runc --log /var/lib/buildkit/runc-overlayfs/executor/runc-log.json --log-format json run --bundle /var/lib/buildkit/runc-overlayfs/executor/rtwgljgknwyvvi9kzrru3pbnb --keep rtwgljgknwyvvi9kzrru3pbnb
  502 root      0:00 /bin/sh -c   set -ex   xx-go build ${GOBUILDFLAGS} -gcflags="${GOGCFLAGS}" -ldflags "$(cat /tmp/.ldflags) -extldflags '-static'" -tags "osusergo netgo static_build seccomp ${BUILDKITD_TAGS}" -o /usr/bin/buildkitd ./cmd/buildkitd   xx-verify ${VERIFYFLAGS} /usr/bin/buildkitd   if [ "$(xx-info os)" = "linux" ]; then /usr/bin/buildkitd --version; fi 
37538 root      0:00 {buildkitd} /usr/bin/qemu-arm /usr/bin/buildkitd /usr/bin/buildkitd --version
37624 root      0:00 {buildkitd} /usr/bin/qemu-arm /usr/bin/buildkitd /usr/bin/buildkitd --version
42689 root      0:00 ps aux

So this seems related to emulation when invoking buildkitd --version (related to #4357). I will add QEMU_STRACE=1 to get more logs.
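
As a rough illustration of the QEMU_STRACE=1 idea, the variable can be scoped to a single command rather than exported for the whole build step. The helper name with_qemu_strace is hypothetical and is not taken from this PR. QEMU_STRACE is read by the qemu-user emulator, so the sketch assumes the binary runs under user-mode emulation; a native binary ignores it.

```shell
#!/bin/sh
# Hypothetical helper: run a command with QEMU_STRACE=1 set only for that command.
# Under qemu-user emulation this asks QEMU to log each emulated syscall;
# a native binary simply ignores the variable.
with_qemu_strace() {
  QEMU_STRACE=1 "$@"
}

# Demonstration with a native command that echoes the variable back.
with_qemu_strace sh -c 'echo "QEMU_STRACE=$QEMU_STRACE"'
```

In the build step shown in the ps listing, the equivalent call would be something like with_qemu_strace /usr/bin/buildkitd --version.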

Signed-off-by: CrazyMax <[email protected]>
Signed-off-by: CrazyMax <[email protected]>
@crazy-max
Member Author

Comparing QEMU logs in https://github.com/moby/buildkit/actions/runs/7238175771/job/19719814734?pr=4491#step:7:4174 that succeeds:

#29 54.03 3905 3905 futex(clock_gettime(0x000000c000076948,FUTEX_PRIVATE_FLAG|FUTEX_WAKE,1,NULL,NULL,0)CLOCK_MONOTONIC,0x000000c000087ee8) = 0 ({tv_sec = 206,tv_nsec = 447380543})
#29 54.03  = 1
#29 54.03 3905 clock_gettime(CLOCK_MONOTONIC,0x000000c000087ee8)3905  = 0futex( ({tv_sec = 206,tv_nsec = 447395170}0x00000000023ccd68, = 0
#29 54.03 FUTEX_PRIVATE_FLAG|FUTEX_WAIT,0,NULL,NULL,0)3905 clock_gettime(CLOCK_MONOTONIC,0x000000c00010fd68) = 0 ({tv_sec = 206,tv_nsec = 447412413})
#29 54.03 3905 futex(0x00000000023ccd68,FUTEX_PRIVATE_FLAG|FUTEX_WAKE,1,NULL,NULL,0) = 1
#29 54.03 3905 futex(0x000000c000076948, = FUTEX_PRIVATE_FLAG|FUTEX_WAIT0,0,
#29 54.03 )

With this one that fails: https://github.com/moby/buildkit/actions/runs/7238175771/job/19719964140?pr=4491#step:7:47491

#50 440.6 3915 futex(0x000000c000286148,FUTEX_PRIVATE_FLAG|FUTEX_WAKE,1,NULL,NULL,0) = 1
#50 440.6 3915 clock_gettime(CLOCK_REALTIME,0x000000c000072608) =  = 00
#50 440.6  ({tv_sec = 1702819771,tv_nsec = 199764230})
#50 440.6 3915 3915 clock_gettime(clock_gettime(CLOCK_MONOTONICCLOCK_MONOTONIC,,0x000000c00046dd680x000000c000072608)) = 0 ( = 0{tv_sec = 546,tv_nsec = 846162178})
#50 440.6  ({tv_sec = 546,tv_nsec = 846162526})
#50 440.6 3915 epoll_pwait(4,824638363080,128,239948,0,0)3915 clock_gettime(CLOCK_MONOTONIC,0x000000c0000726f8) = 0 ({tv_sec = 546,tv_nsec = 846625303})
#50 440.6 3915 write(6,0x72673,1) = 1
#50 440.6  = 1
#50 440.6 3915 read(5,0x46d5a8,16) = 1
#50 440.6 3915 clock_gettime(CLOCK_MONOTONIC,0x000000c00046ddc0) = 0 ({tv_sec = 546,tv_nsec = 846804653})
#50 440.6 3915 clock_gettime(CLOCK_MONOTONIC,0x000000c00046dd68) = 0 ({tv_sec = 546,tv_nsec = 846818488})
#50 440.6 3915 epoll_pwait(4,824638363080,128,49999,0,0)3915 clock_gettime(CLOCK_MONOTONIC,0x000000c000207d68) = 0 ({tv_sec = 546,tv_nsec = 846844494})
#50 440.6 3915 futex(0x000000c0003ae548,FUTEX_PRIVATE_FLAG|FUTEX_WAIT,0,NULL,NULL,0) = 0
#50 440.6 3915 clock_gettime(CLOCK_MONOTONIC,0x000000c000087ee8) = 0 ({tv_sec = 546,tv_nsec = 855425396})
#50 440.6 3915 clock_gettime(CLOCK_MONOTONIC,0x000000c000087e68) = 0 ({tv_sec = 546,tv_nsec = 855438741})
#50 490.6 3915 futex(0x00000000023cd100,FUTEX_PRIVATE_FLAG|FUTEX_WAIT,0,0x000000c000087e68,NULL,0) = -1 errno=110 (Operation timed out)

Does an operation time out and then get retried indefinitely?
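
One way to find these timeouts in a long strace log is to filter for futex calls that returned errno=110 and print the build-log timestamp next to them. This is a minimal sketch: the function name futex_timeouts is hypothetical, and the sample input is two lines copied from the failing log above. It only catches calls printed on a single line, so entries interleaved between threads may be missed.

```shell
#!/bin/sh
# Hypothetical helper: print the build-log timestamp (second column) of every
# futex call that returned errno=110 (Operation timed out) in a log on stdin.
futex_timeouts() {
  awk '/futex\(/ && /errno=110/ { print $2 }'
}

# Two lines copied from the failing log: one successful futex, one timeout.
log='#50 440.6 3915 futex(0x000000c0003ae548,FUTEX_PRIVATE_FLAG|FUTEX_WAIT,0,NULL,NULL,0) = 0
#50 490.6 3915 futex(0x00000000023cd100,FUTEX_PRIVATE_FLAG|FUTEX_WAIT,0,0x000000c000087e68,NULL,0) = -1 errno=110 (Operation timed out)'

printf '%s\n' "$log" | futex_timeouts
```

In the failing log the timestamp jumps from 440.6 to 490.6 right before the timed-out futex, so the printed timestamps give a quick way to spot where the process stalled.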

crazy-max commented Dec 18, 2023

This seems related to https://gitlab.com/qemu-project/qemu/-/issues/456, where PI (priority-inheritance) futexes are not supported in QEMU. They have been supported since QEMU 7.2.0, but at the moment we are using 7.0.0. I'm not sure why I can't reproduce it consistently, though.

The same happens in this project: anonaddy/docker#245 (comment)

@crazy-max crazy-max force-pushed the fix-ci-build-timeout branch from bcc5f3f to 7f9b09f Compare December 18, 2023 10:25
crazy-max commented Dec 18, 2023

Still hangs with QEMU 8.0.4 😕 https://github.com/moby/buildkit/actions/runs/7246753647/job/19741177891?pr=4491#step:7:37760

#51 787.5 3755 futex(0x0221189c,FUTEX_PRIVATE_FLAG|FUTEX_WAIT,0,{tv_sec = 49,tv_nsec = 990587752},NULL,0) = -1 errno=110 (Operation timed out)

@tonistiigi
Member

@crazy-max I guess this should be draft, right?

@crazy-max crazy-max marked this pull request as draft December 19, 2023 09:13
crazy-max added a commit to crazy-max/buildkit that referenced this pull request Dec 19, 2023
"buildkitd --version" might hang when using emulation during
smoke test. Add a timeout to mitigate this issue and avoid blocking
CI. This is currently tracked in moby#4491

Signed-off-by: CrazyMax <[email protected]>
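
The mitigation described in this commit message can be sketched with the coreutils timeout command. This is an illustration only and not the actual change from the referenced commit: the function name smoke_test and the limits used here are assumptions. It relies on GNU timeout exiting with status 124 when it has to kill the command.

```shell
#!/bin/sh
# Hypothetical helper: run a command with an upper bound on wall-clock time.
# GNU `timeout` exits with status 124 when the command ran past the limit.
smoke_test() {
  limit="$1"; shift
  if timeout "$limit" "$@"; then
    echo "ok"
  else
    status=$?
    if [ "$status" -eq 124 ]; then
      echo "timed out after ${limit}s"
    else
      echo "failed with status $status"
    fi
    return "$status"
  fi
}

# Demonstration with a command that finishes well within the limit.
smoke_test 5 true
```

For the smoke test discussed here, the call would look something like smoke_test 30 /usr/bin/buildkitd --version, where 30 seconds is an arbitrary example value. Whether a timeout should fail the build or only be reported is a separate decision that this sketch leaves to the caller through the return status.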
crazy-max added a commit to crazy-max/buildkit that referenced this pull request Dec 20, 2023
"buildkitd --version" might hang when using emulation during
smoke test. Add a timeout to mitigate this issue and avoid blocking
CI. This is currently tracked in moby#4491

Signed-off-by: CrazyMax <[email protected]>
crazy-max added a commit to crazy-max/buildkit that referenced this pull request Dec 21, 2023
"buildkitd --version" might hang when using emulation during
smoke test. Add a timeout to mitigate this issue and avoid blocking
CI. This is currently tracked in moby#4491

Signed-off-by: CrazyMax <[email protected]>
crazy-max added a commit to crazy-max/buildkit that referenced this pull request Dec 23, 2023
"buildkitd --version" might hang when using emulation during
smoke test. Add a timeout to mitigate this issue and avoid blocking
CI. This is currently tracked in moby#4491

Signed-off-by: CrazyMax <[email protected]>
crazy-max added a commit to crazy-max/buildkit that referenced this pull request Jan 4, 2024
"buildkitd --version" might hang when using emulation during
smoke test. Add a timeout to mitigate this issue and avoid blocking
CI. This is currently tracked in moby#4491

Signed-off-by: CrazyMax <[email protected]>
azr pushed a commit to azr/buildkit that referenced this pull request May 7, 2024
"buildkitd --version" might hang when using emulation during
smoke test. Add a timeout to mitigate this issue and avoid blocking
CI. This is currently tracked in moby#4491

Signed-off-by: CrazyMax <[email protected]>
3 participants