I have code which:
Runs a "server" loop in the main process, which receives requests on a globally known channel
Each request contains a freshly created response channel
The server uses that response channel to send back a single value, and then closes the response channel
All the threads in all the processes start hammering the server with requests
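The request/response idiom described above can be sketched roughly as follows. This is a minimal illustration and not the code from Bug.jl: the names `requests` and `send_request`, the "payload + 1" reply, and the use of two single-threaded local workers sending one request each are all assumptions made for the example, so it is not expected to reproduce the failures.

```julia
using Distributed

# Assumption: two local, single-threaded workers stand in for the cluster.
addprocs(2)

# Globally known request channel, owned by the main process (id 1).
# Hypothetical request format: an Int payload plus the channel to answer on.
requests = RemoteChannel(() -> Channel{Tuple{Int,RemoteChannel{Channel{Int}}}}(32), 1)

# "Server" loop in the main process: answer each request with a single
# value on its response channel, then close that channel.
server = @async while true
    payload, response = take!(requests)
    put!(response, payload + 1)
    close(response)
end

# Client side: create a fresh response channel (owned by the calling
# process), send it along with the request, and wait for the single reply.
@everywhere function send_request(requests, payload)
    response = RemoteChannel(() -> Channel{Int}(1))
    put!(requests, (payload, response))
    return take!(response)
end

# One request per worker; the real test sends many from every thread.
results = [remotecall_fetch(send_request, w, requests, i)
           for (i, w) in enumerate(workers())]
println(results)

rmprocs(workers())
```

The response channel is buffered with capacity 1, so the server's `put!` followed by `close` does not lose the value: a buffered channel that is closed can still be drained by `take!`.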
The motivation is to have one multi-threaded process on each server in a compute cluster, but in the code demonstrating the bug (see below), all processes run on the same (local) machine; the code doesn't make use of this fact (does not use shared memory or atomics, only remote channels).
Leaving aside whether this is a good idiom, the approach is legal and should work(?).
However, running this in Julia 1.6.1 produces nondeterministic failures:
Sometimes it works (not often)
Sometimes it deadlocks (often)
Sometimes (less often) it crashes with an error message complaining about conversion of the data type of the response channel. EDIT: This is clearly "impossible", as the error message indicates that an object of an explicitly created type actually has a different type; not to mention that this works most of the time. If the types were incorrect, the code should have failed the first time it was run. The error message complains that:
nested task error: MethodError: Cannot `convert` an object of type
RemoteChannel{Channel{Any}} to an object of type
RemoteChannel{Channel{Int64}}
Sometimes (rarely) it crashes with error messages involving the GC and concurrency errors
This seems to be a bug, unless the code does something "forbidden" (it doesn't seem to?). It was suggested the GC issues might be related to JuliaLang/julia#38180 but this doesn't seem to cover the concurrency errors in the crash traces.
To run it, type `JULIA_NUM_THREADS=4 julia Bug.jl 4 1000 quiet`. You can play with the number of threads, the number of processes (here, also 4), the number of requests sent by each thread of each process (here, 1000), and whether the code is quiet or verbose (the latter uses println and flush a lot, which will impact the behavior).
The source code and output crash traces are available in https://gist.github.com/orenbenkiki/ac71f348d4915b394805656b142b33fe