
Merge pull request #35686 from JuliaLang/jb/threadstatus
upgrade threads from experimental to stable-with-caveats
JeffBezanson authored May 5, 2020
2 parents d68243c + c879d1a commit d07fadf
Showing 7 changed files with 131 additions and 23 deletions.
5 changes: 4 additions & 1 deletion NEWS.md
@@ -104,11 +104,14 @@ Command-line option changes

Multi-threading changes
-----------------------

* Parts of the multi-threading API are now considered stable, with caveats.
This includes all documented identifiers from `Base.Threads` except the
`atomic_` operations.
* `@threads` now allows an optional schedule argument. Use `@threads :static ...` to
ensure that the same schedule will be used as in past versions; the default schedule
is likely to change in the future.
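A minimal sketch of the `:static` schedule option described in this entry (assumes Julia is started with several threads, e.g. `JULIA_NUM_THREADS=4`):

```julia
using Base.Threads

# :static divides the iterations into equal blocks, one per thread,
# matching the schedule used by earlier Julia versions.
ids = zeros(Int, 8)
@threads :static for i in 1:8
    ids[i] = threadid()
end
println(ids)
```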


Build system changes
--------------------
* The build system now contains a pure-make caching system for expanding expensive operations at the latest
2 changes: 1 addition & 1 deletion base/threadingconstructs.jl
@@ -155,7 +155,7 @@ constructed underlying closure. This allows you to insert the _value_ of a variable,
isolating the asynchronous code from changes to the variable's value in the current task.
!!! note
This feature is currently considered experimental.
See the manual chapter on threading for important caveats.
!!! compat "Julia 1.3"
This macro is available as of Julia 1.3.
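A short sketch of the value interpolation described in this docstring (assumes Julia 1.4 or later, where `$` interpolation into `@spawn` is supported):

```julia
using Base.Threads

v = 10
t = @spawn $v + 1   # the *value* 10 is copied into the task's closure
v = 1000            # later changes to v do not affect the spawned work
fetch(t)            # returns 11
```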
2 changes: 1 addition & 1 deletion base/threads.jl
@@ -1,7 +1,7 @@
# This file is a part of Julia. License is MIT: https://julialang.org/license

"""
Experimental multithreading support.
Multithreading support.
"""
module Threads

24 changes: 18 additions & 6 deletions doc/src/base/multi-threading.md
@@ -1,15 +1,27 @@
# [Multi-Threading](@id lib-multithreading)

This experimental interface supports Julia's multi-threading capabilities. Types and functions
described here might (and likely will) change in the future.

```@docs
Base.Threads.threadid
Base.Threads.nthreads
Base.Threads.@threads
Base.Threads.@spawn
Base.Threads.threadid
Base.Threads.nthreads
```
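A small usage sketch for the basic queries listed above (the exact output depends on how many threads Julia was started with):

```julia
using Base.Threads

println("running with $(nthreads()) threads")

@threads for i in 1:nthreads()
    println("iteration $i runs on thread $(threadid())")
end
```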

## Synchronization

```@docs
Base.Threads.Condition
Base.Threads.Event
```

See also [Synchronization](@ref lib-task-sync).
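A minimal sketch of `Threads.Event` for one-shot signalling between tasks:

```julia
using Base.Threads

e = Threads.Event()
t = @spawn begin
    wait(e)                        # blocks until the event is notified
    println("woke up on thread $(threadid())")
end
notify(e)                          # releases every task waiting on e
wait(t)
```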

## Atomic operations

!!! warning

The API for atomic operations has not yet been finalized and is likely to change.

```@docs
Base.Threads.Atomic
Base.Threads.atomic_cas!
@@ -31,7 +43,7 @@ Base.Threads.atomic_fence
Base.@threadcall
```
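A brief sketch of the `Atomic` wrapper listed in the atomic-operations section above (per the warning there, this API may still change):

```julia
using Base.Threads

counter = Atomic{Int}(0)
@threads for i in 1:1000
    atomic_add!(counter, 1)   # data-race-free increment across threads
end
counter[]                     # == 1000
```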

# Low-level synchronization primitives
## Low-level synchronization primitives

These building blocks are used to create the regular synchronization objects.
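For example, `Threads.SpinLock` is one such primitive; it is non-reentrant, so hold it only for very short critical sections and do not yield while holding it. A minimal sketch:

```julia
using Base.Threads

const total = Ref(0)
const sl = Threads.SpinLock()

@threads for i in 1:1000
    lock(sl)
    try
        total[] += 1          # keep the critical section as short as possible
    finally
        unlock(sl)
    end
end
total[]                       # == 1000
```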

20 changes: 13 additions & 7 deletions doc/src/base/parallel.md
@@ -4,10 +4,8 @@
Core.Task
Base.@task
Base.@async
Base.@sync
Base.asyncmap
Base.asyncmap!
Base.fetch(t::Task)
Base.current_task
Base.istaskdone
Base.istaskstarted
@@ -17,21 +15,25 @@ Base.task_local_storage(::Any, ::Any)
Base.task_local_storage(::Function, ::Any, ::Any)
```

# Scheduling
## Scheduling

```@docs
Base.yield
Base.yieldto
Base.sleep
Base.schedule
```

## [Synchronization](@id lib-task-sync)

```@docs
Base.@sync
Base.wait
Base.fetch(t::Task)
Base.timedwait
Base.Condition
Base.Threads.Condition
Base.notify
Base.schedule
Base.Threads.Event
Base.Semaphore
Base.acquire
@@ -43,7 +45,11 @@ Base.unlock
Base.trylock
Base.islocked
Base.ReentrantLock
```
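A brief sketch combining `@sync` and `Base.Semaphore` from the list above to cap how many tasks run concurrently:

```julia
sem = Base.Semaphore(2)          # allow at most two tasks in the guarded region

@sync for i in 1:5
    @async begin
        Base.acquire(sem)
        try
            sleep(0.1)           # stand-in for real work
            println("task $i finished")
        finally
            Base.release(sem)
        end
    end
end
```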

## Channels

```@docs
Base.Channel
Base.Channel(::Function)
Base.put!(::Channel, ::Any)
14 changes: 7 additions & 7 deletions doc/src/manual/asynchronous-programming.md
@@ -10,7 +10,7 @@ This sort of scenario falls in the domain of asynchronous programming, sometimes
also referred to as concurrent programming (since, conceptually, multiple things
are happening at once).

To address these scenarios, Julia provides `Task`s (also known by several other
To address these scenarios, Julia provides [`Task`](@ref)s (also known by several other
names, such as symmetric coroutines, lightweight threads, cooperative multitasking,
or one-shot continuations).
When a piece of computing work (in practice, executing a particular function) is designated as
@@ -26,7 +26,7 @@ calls, where the called function must finish executing before control returns to
You can think of a `Task` as a handle to a unit of computational work to be performed.
It has a create-start-run-finish lifecycle.
Tasks are created by calling the `Task` constructor on a 0-argument function to run,
or using the `@task` macro:
or using the [`@task`](@ref) macro:

```
julia> t = @task begin; sleep(5); println("done"); end
@@ -36,7 +36,7 @@ Task (runnable) @0x00007f13a40c0eb0
`@task x` is equivalent to `Task(()->x)`.

This task will wait for five seconds, and then print `done`. However, it has not
started running yet. We can run it whenever we're ready by calling `schedule`:
started running yet. We can run it whenever we're ready by calling [`schedule`](@ref):

```
julia> schedule(t);
@@ -47,12 +47,12 @@ That is because it simply adds `t` to an internal queue of tasks to run.
Then, the REPL will print the next prompt and wait for more input.
Waiting for keyboard input provides an opportunity for other tasks to run,
so at that point `t` will start.
`t` calls `sleep`, which sets a timer and stops execution.
`t` calls [`sleep`](@ref), which sets a timer and stops execution.
If other tasks have been scheduled, they could run then.
After five seconds, the timer fires and restarts `t`, and you will see `done`
printed. `t` is then finished.

The `wait` function blocks the calling task until some other task finishes.
The [`wait`](@ref) function blocks the calling task until some other task finishes.
So for example if you type

```
@@ -63,8 +63,8 @@ instead of only calling `schedule`, you will see a five second pause before
the next input prompt appears. That is because the REPL is waiting for `t`
to finish before proceeding.

It is common to want to create a task and schedule it right away, so a
macro called `@async` is provided for that purpose --- `@async x` is
It is common to want to create a task and schedule it right away, so the
macro [`@async`](@ref) is provided for that purpose --- `@async x` is
equivalent to `schedule(@task x)`.
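A tiny sketch of that pattern, using `fetch` to retrieve the task's result:

```julia
t = @async begin
    sleep(1)
    1 + 1
end
fetch(t)    # blocks until t finishes, then returns 2
```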

## Communicating with Channels
87 changes: 87 additions & 0 deletions doc/src/manual/multi-threading.md
@@ -213,3 +213,90 @@ therefore a blocking call like other Julia APIs.
It is very important that the called function does not call back into Julia, as it will segfault.

`@threadcall` may be removed/changed in future versions of Julia.
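A small sketch of `@threadcall` (assumes a POSIX libc that provides `sleep`); the foreign call runs on a libuv worker thread, so other Julia tasks can keep running while the current task waits:

```julia
# Sleep for two seconds in a libuv worker thread; the current task blocks,
# but the Julia thread itself stays free to run other tasks.
@threadcall(:sleep, Cuint, (Cuint,), 2)
```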

## Caveats

At this time, most operations in the Julia runtime and standard libraries
can be used in a thread-safe manner, if the user code is data-race free.
However, in some areas work on stabilizing thread support is ongoing.
Multi-threaded programming has many inherent difficulties, and if a program
using threads exhibits unusual or undesirable behavior (e.g. crashes or
mysterious results), thread interactions should typically be suspected first.

There are a few specific limitations and warnings to be aware of when using
threads in Julia:

* Base collection types require manual locking if used simultaneously by
multiple threads where at least one thread modifies the collection
(common examples include `push!` on arrays, or inserting
items into a `Dict`); see the locking sketch after this list.
* After a task starts running on a certain thread (e.g. via `@spawn`), it
will always be restarted on the same thread after blocking. In the future
this limitation will be removed, and tasks will migrate between threads.
* `@threads` currently uses a static schedule, using all threads and assigning
equal iteration counts to each. In the future the default schedule is likely
to change to be dynamic.
* The schedule used by `@spawn` is nondeterministic and should not be relied on.
* Compute-bound, non-memory-allocating tasks can prevent garbage collection from
running in other threads that are allocating memory. In these cases it may
be necessary to insert a manual call to `GC.safepoint()` to allow GC to run.
This limitation will be removed in the future.
* Avoid running top-level operations, e.g. `include`, or `eval` of type,
method, and module definitions in parallel.
* Be aware that finalizers registered by a library may break if threads are enabled.
This may require some transitional work across the ecosystem before threading
can be widely adopted with confidence. See the next section for further details.
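A sketch of the locking caveat from the first item above: guard a shared `Vector` with a `ReentrantLock` so that concurrent `push!` calls are serialized.

```julia
using Base.Threads

results = Int[]
lk = ReentrantLock()

@threads for i in 1:100
    x = i^2                  # thread-local work needs no lock
    lock(lk) do
        push!(results, x)    # mutation of the shared collection is serialized
    end
end
length(results)              # == 100
```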

## Safe use of Finalizers

Because finalizers can interrupt any code, they must be very careful in how
they interact with any global state. Unfortunately, the main reason that
finalizers are used is to update global state (a pure function is generally
rather pointless as a finalizer). This leads us to a bit of a conundrum.
There are a few approaches to dealing with this problem:

1. When single-threaded, code could call the internal `jl_gc_enable_finalizers`
C function to prevent finalizers from being scheduled
inside a critical region. Internally, this is used inside some functions (such
as our C locks) to prevent recursion when doing certain operations (incremental
package loading, codegen, etc.). The combination of a lock and this flag
can be used to make finalizers safe.

2. A second strategy, employed by Base in a couple places, is to explicitly
delay a finalizer until it may be able to acquire its lock non-recursively.
The following example demonstrates how this strategy could be applied to
`Distributed.finalize_ref`:

```
function finalize_ref(r::AbstractRemoteRef)
if r.where > 0 # Check if the finalizer is already run
if islocked(client_refs) || !trylock(client_refs)
# delay finalizer for later if we aren't free to acquire the lock
finalizer(finalize_ref, r)
return nothing
end
try # `lock` should always be followed by `try`
if r.where > 0 # Must check again here
# Do actual cleanup here
r.where = 0
end
finally
unlock(client_refs)
end
end
nothing
end
```

3. A related third strategy is to use a yield-free queue. We don't currently
have a lock-free queue implemented in Base, but
`Base.InvasiveLinkedListSynchronized{T}` is suitable. This can frequently be a
good strategy to use for code with event loops. For example, this strategy is
employed by `Gtk.jl` to manage lifetime ref-counting. In this approach, we
don't do any explicit work inside the `finalizer`, and instead add it to a queue
to run at a safer time. In fact, Julia's task scheduler already uses this, so
defining the finalizer as `x -> @spawn do_cleanup(x)` is one example of this
approach. Note however that this doesn't control which thread `do_cleanup`
runs on, so `do_cleanup` would still need to acquire a lock. That
doesn't need to be true if you implement your own queue, as you can explicitly
only drain that queue from your thread.
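A hypothetical sketch of the `@spawn`-based variant of this strategy; `Resource` and `do_cleanup` are illustrative names, and `do_cleanup` must still acquire whatever locks the cleanup requires:

```julia
using Base.Threads

mutable struct Resource
    handle::Int
end

do_cleanup(r::Resource) = println("closing handle $(r.handle)")  # placeholder cleanup

function make_resource(h)
    r = Resource(h)
    # The finalizer does no real work itself; it only schedules a task.
    finalizer(r) do x
        @spawn do_cleanup(x)
    end
    return r
end
```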
