-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
support foreign threads, and adding threads at run time #45447
Conversation
92b45dc
to
d635803
Compare
base/threadingconstructs.jl
Outdated
[`LinearAlgebra`](@ref man-linalg) standard library, and `nprocs()` in the | ||
[`Distributed`](@ref man-distributed) standard library. | ||
""" | ||
nthreadpool() = Threads._nthreads_in_pool(Int8(0)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Choose a better name, to be less confusable with nthreadpools
. Possibly sizethreadpool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaultpoolsize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not
nthreads()
-> default pool
nthreads(pool)
nthreadpool
seems confusing IMO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nthreads()
means something different now, so that would probably violate semver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, nthreads()
currently means the total number of threads over all pools, so we should just keep it that way. nthreads(pool)
seems fine to me too. If we want a new function for the number of threads in the default pool we can maybe add that, but it doesn't need to be done in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR requires that function quite often, so it does have to be this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But after rebasing, this seems to be running into some sort of memory issue now, so I am trying to figure out what it is conflicting with
end | ||
return A | ||
end | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would a more object-oriented API be good for replacing this?
struct ThreadLocalData
f::Function
l::Lock
d::Vector{Any}
end
Base.getindex(tld::ThreadLocalData) = ...
e.g. #14135
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
base/threadingconstructs.jl
Outdated
pools is the inclusive upper bound on [`threadid()`](@ref). | ||
Get a lower bound on the number of threads (across all thread pools or within | ||
the specified thread pool) available to the Julia process >= [`threadid()`](@ref), | ||
with atomic-acquire semantics. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it strange that acquire ordering is mentioned without noting what events have release store. Also, this is not an accurate description of the method maxthreadid(pool::Symbol)
since it uses non-atomic load (the first load is OK):
julia/base/threadingconstructs.jl
Lines 38 to 41 in d635803
function _nthreads_in_pool(tpid::Int8) | |
p = unsafe_load(cglobal(:jl_n_threads_per_pool, Ptr{Cint})) | |
return Int(unsafe_load(p, tpid + 1)) | |
end |
Also, my impression is that this ordering is required for runtime authors and not Julia users. For example, monotonic load with max(threadid(), maxthreadid())
is sufficient for supporting DCL thread-local storage initialization pattern.
I think it'd be simpler to not document the ordering in the public API specification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need threadid
there, since you always have the relaxed-order guaranteed that maxthreadid >= threadid. What this is intended to help with is to ensure that even if you monotonic
load a task object, and monotonic
examine the tid
field, then that value will be <= maxthreadid
(e.g. the processor is not allowed to reorder and load the maxthreadid value before the tid
field)
I don't currently intend to permit changing the size of thread pools.
base/threadingconstructs.jl
Outdated
# lower bound on the largest threadid() | ||
""" | ||
Threads.nthreads([:default|:interactive]) -> Int | ||
Threads.maxthreadid([:default|:interactive]) -> Int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How does maxthreadid(pool::Symbol)
work when allowing adding threads to both worker pools? Are thread IDs for each pool non-contiguous now? Or do thread IDs of interactive pool threads change after adding a thread to the default pool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not allowed to add threads to an existing pool
Needs a PkgEval run to evaluate the impact of deleting the |
@nanosoldier |
Your package evaluation job has completed - possible new issues were detected. A full report can be found here. |
Looks like there are a couple more key packages that need fixing from glancing at some parts of that report, including FFTW, HTTP, and Parsers. The HTTP and Parsers code looks like it was probably already very dangerous (looks like it probably may be introducing data races into the code). The FFTW code is just trivially deciding if thread support is enabled, and that call can be removed: https://github.com/JuliaMath/FFTW.jl/blob/17bc81a0fcf9875d777ea4bee2fca70fc23c8a0c/src/providers.jl#L75 |
Additionally, the PR fails to build on |
With dynamic thread counts, we cannot ensure this count is constant after initialization, and we might like to even profile adding and removing threads.
Hook a couple functions (notably cfunction) to handle adopting foreign threads automatically when used. n.b. If returning an object pointer, we do not gc_unsafe_leave afterwards as that would render the pointer invalid. However, this means that it can be a long time before the next safepoint (if ever). We should look into ways of improving this bad situation, such as pinning only that specific object temporarily. n.b. There are some remaining issues to clean up. For example, we may trap pages in the ptls after GC to keep them "warm", and trap other pages in the unwind buffer, etc.
if (jl_profile_init(10000000 * jl_n_threads, 1000000) == -1){ | ||
if (jl_profile_init(10000000, 1000000) == -1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a note in the Profile.init()
docstring that needs updating if this is being reverted. Also the revert should probably be backported to 1.8 otherwise it's the only minor that's going to have that feature
Hook a couple functions (notably cfunction and jl_call*) to handle adding foreign threads automatically.
Fixes #17573