improve the interplay between bounds checking system and effect system

In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where the application of `@inbounds` annotations, which are intended to optimize performance, instead obstruct the program's optimization, thereby preventing us from achieving optimal performance. This PR is designed to resolve this situation. It aims to enhance the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have `@inbounds` annotations. In the following, I'll first explain the reasons that have led to this situation for better understanding, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. It's necessary that we incorporate all of them simultaneously to enhance the situation without introducing any regressions. \## Core of the Problem There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult: 1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution. 1. If the code, from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to undefined behavior due to uncertain memory access. \## Current State The current Julia compiler handles these two problems as follows: \### Current State 1 Regarding the first problem, if a code or method call containing `Expr(:boundscheck)` is within an `@inbounds` context, a concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, if you simply perform concrete evaluation of `foo()`, it wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue: ```julia foo() = (r = 0; @BoundsCheck r += 1; return r) bar() = @inbounds foo() ``` Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because ~~the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a local `@inbounds` context does not need to block concrete evaluation: ```julia function A1_inbounds() r = 0 @inbounds begin @BoundsCheck r += 1 end return r end ``` However, currently, we prohibit the concrete evaluation of such code as well. ~~Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: `https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~ EDIT: It is an expected behavior as pointed out by Jameson. Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value of `Expr(:boundscheck)`. \### Current State 2 Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the `--check-bounds=no` flag is set, which could lead to bounds checking being removed always. For instance, if you perform concrete evaluation for the function call `baz((1,2,3), 4)` in the following example, it may return a value accessed from illicit memory and introduce undefined behaviors into the program: ```julia baz(t::Tuple, i::Int) = @inbounds t[i] baz((1,2,3), 4) ``` However, it's evident that the above code is incorrect and unsafe program and I believe undefined behavior in such programs is deemed, as explicitly stated in the `@inbounds` documentation: > │ Warning > │ > │ Using @inbounds may return incorrect results/crashes/corruption for > │ out-of-bounds indices. The user is responsible for checking it > │ manually. Only use @inbounds when it is certain from the information > │ locally available that all accesses are in bounds. Actually, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I opine that it would be more reasonable to utilize it to alleviate potential errors due to bounds checking within `@inbounds` contexts. To bring up another associated concern, in the current compiler implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset` is a bit risky, and `:nothrow`-ness is assumed when their bounds checking is turned off by call argument. If our intended direction aligns with the removal of bounds checking based on `@inbounds` as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to bounds checking argument or the `--check-bounds` flag appears to be risky, especially considering it's not documented. \## This Commit This commit implements all proposed improvements against the current issues as mentioned above. In summary, the enhancements include: - making `Expr(:boundscheck)` within a local `@inbounds` context not block concrete evaluation - folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allow concrete evaluation) - changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context - removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off - replacing the `@_safeindex` hack with `@inbounds`
JuliaLang · Jun 15, 2023 · 9225ae0 · 9225ae0
1 parent 5db2c27
commit 9225ae0
Show file tree

Hide file tree

Showing 15 changed files with 340 additions and 310 deletions.
diff --git a/base/array.jl b/base/array.jl
@@ -122,46 +122,12 @@ const DenseVecOrMat{T} = Union{DenseVector{T}, DenseMatrix{T}}
 
 using Core: arraysize, arrayset, const_arrayref
 
-"""
-    @_safeindex
-
-This internal macro converts:
-- `getindex(xs::Tuple, )` -> `__inbounds_getindex(args...)`
-- `setindex!(xs::Vector, args...)` -> `__inbounds_setindex!(xs, args...)`
-to tell the compiler that indexing operations within the applied expression are always
-inbounds and do not need to taint `:consistent` and `:nothrow`.
-"""
-macro _safeindex(ex)
-    return esc(_safeindex(__module__, ex))
-end
-function _safeindex(__module__, ex)
-    isa(ex, Expr) || return ex
-    if ex.head === :(=)
-        lhs = arrayref(true, ex.args, 1)
-        if isa(lhs, Expr) && lhs.head === :ref # xs[i] = x
-            rhs = arrayref(true, ex.args, 2)
-            xs = arrayref(true, lhs.args, 1)
-            args = Vector{Any}(undef, length(lhs.args)-1)
-            for i = 2:length(lhs.args)
-                arrayset(true, args, _safeindex(__module__, arrayref(true, lhs.args, i)), i-1)
-            end
-            return Expr(:call, GlobalRef(__module__, :__inbounds_setindex!), xs, _safeindex(__module__, rhs), args...)
-        end
-    elseif ex.head === :ref # xs[i]
-        return Expr(:call, GlobalRef(__module__, :__inbounds_getindex), ex.args...)
-    end
-    args = Vector{Any}(undef, length(ex.args))
-    for i = 1:length(ex.args)
-        arrayset(true, args, _safeindex(__module__, arrayref(true, ex.args, i)), i)
-    end
-    return Expr(ex.head, args...)
-end
-
 vect() = Vector{Any}()
 function vect(X::T...) where T
     @_terminates_locally_meta
-    vec = Vector{T}(undef, length(X))
-    @_safeindex for i = 1:length(X)
+    n = length(X)
+    vec = Vector{T}(undef, n)
+    @inbounds for i = 1:n
         vec[i] = X[i]
     end
     return vec
@@ -443,9 +409,10 @@ julia> getindex(Int8, 1, 2, 3)
 function getindex(::Type{T}, vals...) where T
     @inline
     @_effect_free_terminates_locally_meta
-    a = Vector{T}(undef, length(vals))
+    n = length(vals)
+    a = Vector{T}(undef, n)
     if vals isa NTuple
-        @_safeindex for i in 1:length(vals)
+        @inbounds for i in 1:n
             a[i] = vals[i]
         end
     else
@@ -460,8 +427,9 @@ end
 
 function getindex(::Type{Any}, @nospecialize vals...)
     @_effect_free_terminates_locally_meta
-    a = Vector{Any}(undef, length(vals))
-    @_safeindex for i = 1:length(vals)
+    n = length(vals)
+    a = Vector{Any}(undef, n)
+    @inbounds for i = 1:n
         a[i] = vals[i]
     end
     return a
@@ -1021,11 +989,6 @@ function setindex! end
 @eval setindex!(A::Array{T}, x, i1::Int, i2::Int, I::Int...) where {T} =
     (@inline; arrayset($(Expr(:boundscheck)), A, x isa T ? x : convert(T,x)::T, i1, i2, I...))
 
-__inbounds_setindex!(A::Array{T}, x, i1::Int) where {T} =
-    arrayset(false, A, convert(T,x)::T, i1)
-__inbounds_setindex!(A::Array{T}, x, i1::Int, i2::Int, I::Int...) where {T} =
-    (@inline; arrayset(false, A, convert(T,x)::T, i1, i2, I...))
-
 # This is redundant with the abstract fallbacks but needed and helpful for bootstrap
 function setindex!(A::Array, X::AbstractArray, I::AbstractVector{Int})
     @_propagate_inbounds_meta
@@ -1115,22 +1078,22 @@ function push!(a::Vector{T}, item) where T
     # convert first so we don't grow the array if the assignment won't work
     itemT = item isa T ? item : convert(T, item)::T
     _growend!(a, 1)
-    @_safeindex a[length(a)] = itemT
+    @inbounds a[length(a)] = itemT
     return a
 end
 
 # specialize and optimize the single argument case
 function push!(a::Vector{Any}, @nospecialize x)
     _growend!(a, 1)
-    @_safeindex a[length(a)] = x
+    @inbounds a[length(a)] = x
     return a
 end
 function push!(a::Vector{Any}, @nospecialize x...)
     @_terminates_locally_meta
     na = length(a)
     nx = length(x)
     _growend!(a, nx)
-    @_safeindex for i = 1:nx
+    @inbounds for i = 1:nx
         a[na+i] = x[i]
     end
     return a
@@ -1194,7 +1157,7 @@ function _append!(a::AbstractVector, ::Union{HasLength,HasShape}, iter)
     resize!(a, n+Int(length(iter))::Int)
     for (i, item) in zip(i+1:lastindex(a), iter)
         if isa(a, Vector) # give better effects for builtin vectors
-            @_safeindex a[i] = item
+            @inbounds a[i] = item
         else
             a[i] = item
         end
@@ -1263,7 +1226,7 @@ function _prepend!(a::Vector, ::Union{HasLength,HasShape}, iter)
     _growbeg!(a, n)
     i = 0
     for item in iter
-        @_safeindex a[i += 1] = item
+        @inbounds a[i += 1] = item
     end
     a
 end
@@ -1470,22 +1433,22 @@ julia> pushfirst!([1, 2, 3, 4], 5, 6)
 function pushfirst!(a::Vector{T}, item) where T
     item = item isa T ? item : convert(T, item)::T
     _growbeg!(a, 1)
-    @_safeindex a[1] = item
+    @inbounds a[1] = item
     return a
 end
 
 # specialize and optimize the single argument case
 function pushfirst!(a::Vector{Any}, @nospecialize x)
     _growbeg!(a, 1)
-    @_safeindex a[1] = x
+    @inbounds a[1] = x
     return a
 end
 function pushfirst!(a::Vector{Any}, @nospecialize x...)
     @_terminates_locally_meta
     na = length(a)
     nx = length(x)
     _growbeg!(a, nx)
-    @_safeindex for i = 1:nx
+    @inbounds for i = 1:nx
         a[i] = x[i]
     end
     return a