More sporadic 1.11 hangs #412
https://buildkite.com/julialang/metal-dot-jl/builds/1187#01919a97-f080-4e37-8f60-7422426ea8ef is curious, as the |
I believe the offending commit is JuliaLang/julia@5f36833 from JuliaLang/julia#50144. I was able to reliably reproduce the hang by running
in two different terminals simultaneously. (Both 1.11) It always hangs at the second Line 317 in 04b5481
I don't know whether the offending Julia commit causes the bug or merely uncovered it, but hopefully this is a good starting point for someone with a better understanding of this part of the codebase to investigate further.
This might also help: I collected a sample of the hung process from Activity Monitor.
MWE:

using Metal, .MTL

dev = first(devices())
cmdq = MTLCommandQueue(dev)
cmdbuf = MTLCommandBuffer(cmdq)

scheduled = Ref(false)
completed = Ref(false)

on_scheduled(cmdbuf) do buf
    GC.gc(true)
    scheduled[] = true
end
on_completed(cmdbuf) do buf
    GC.gc(true)
    completed[] = true
end

@assert scheduled[] == false
@assert completed[] == false
@assert cmdbuf.status == MTL.MTLCommandBufferStatusNotEnqueued

enqueue!(cmdbuf)
@assert cmdbuf.status == MTL.MTLCommandBufferStatusEnqueued

commit!(cmdbuf)
print("start... ")
wait_completed(cmdbuf)
println("stop")

Hangs on 1.10 and 1.12 on the first iteration.
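Since the hang is sporadic, it may help to run the MWE repeatedly in a fresh process with a deadline, so a hung run is reported and killed instead of stalling the terminal. The following is only a sketch under assumptions: `run_with_timeout` and the file name `mwe.jl` are hypothetical names introduced here, not part of Metal.jl, and the 60-second deadline is arbitrary.

```julia
# Hypothetical helper (not part of Metal.jl): run `cmd` in a child process and
# classify the outcome as :ok, :failed, or :hung after `timeout_s` seconds.
function run_with_timeout(cmd::Cmd, timeout_s::Real)
    # wait=false starts the process asynchronously; ignorestatus avoids
    # throwing when the child exits with a nonzero code.
    proc = run(ignorestatus(cmd); wait=false)
    status = timedwait(() -> process_exited(proc), timeout_s; pollint=0.5)
    if status === :timed_out
        # A deadlocked process may not react to SIGTERM, so use SIGKILL.
        kill(proc, Base.SIGKILL)
        wait(proc)
        return :hung
    end
    return success(proc) ? :ok : :failed
end

# Example usage, assuming the MWE above is saved as `mwe.jl` (assumed name):
#
#   cmd = `$(Base.julia_cmd()) --project mwe.jl`
#   for i in 1:20
#       result = run_with_timeout(cmd, 60)
#       println("run $i: $result")
#       result === :hung && break
#   end
```

Running the child in a separate process keeps the watchdog independent of whatever is blocked inside the hung process, which an in-process timer could not guarantee.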
Now that 1.11 compilations are fixed, we're back to sporadic hanging of CI.
See example 1, example 2, example 3.
Bonus local output
Maybe related to #329?