Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bounds-checking not elided in isassigned for a SlowSubArray despite @inbounds and @inline annotations #53302

Closed
jishnub opened this issue Feb 12, 2024 · 3 comments · Fixed by #53305
Labels
arrays [a, r, r, a, y, s] performance Must go faster

Comments

@jishnub
Copy link
Contributor

jishnub commented Feb 12, 2024

Reposting from https://discourse.julialang.org/t/why-is-the-bounds-checking-not-elided-in-isassigned-for-slowsubarray-s-despite-an-inbounds-annotation/110019 as there were no replies, and this seems like something that may be improved. Please close if this is not actionable.

Defining

julia> function mycopyto!(dest::AbstractArray, src::AbstractArray)
           iterdest, itersrc = eachindex(dest), eachindex(src)
           @inbounds for I in iterdest
               if isassigned(src, I) # this shouldn't be checking bounds
                   dest[I] = src[I]
               else
                   dest[I] = zero(eltype(dest))
               end
           end
           return dest
       end
mycopyto! (generic function with 1 method)

julia> @code_llvm mycopyto!(view(zeros(4,4), 1:4, 1:4), view(zeros(4,4), 1:4, 1:4))

The relevant bit of the output is

;  @ REPL[1]:4 within `mycopyto!`
; ┌ @ multidimensional.jl:1568 within `isassigned` @ subarray.jl:412
   br label %L36

L36:                                              ; preds = %L590, %L36.preheader
   %value_phi25 = phi i64 [ %value_phi276, %L590 ], [ 1, %L36.preheader ]
   %value_phi26 = phi i64 [ %value_phi277, %L590 ], [ 1, %L36.preheader ]
; │┌ @ abstractarray.jl:681 within `checkbounds`
; ││┌ @ abstractarray.jl:725 within `checkbounds_indices`
; │││┌ @ abstractarray.jl:754 within `checkindex`
; ││││┌ @ int.jl:86 within `-`
       %12 = add i64 %value_phi25, -1
; ││││└
; ││││┌ @ int.jl:513 within `<`
       %13 = icmp uge i64 %12, %7
; │││└└
; │││ @ abstractarray.jl:725 within `checkbounds_indices` @ abstractarray.jl:725
; │││┌ @ abstractarray.jl:754 within `checkindex`
; ││││┌ @ int.jl:86 within `-`
       %14 = add i64 %value_phi26, -1
; ││││└
; ││││┌ @ int.jl:513 within `<`
       %15 = icmp uge i64 %14, %9
; │└└└└
   %.not417.not.not = or i1 %13, %15
; └

This is on

julia> versioninfo()
Julia Version 1.11.0-DEV.1517
Commit 667cdde7847 (2024-02-08 12:39 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)
Environment:
  LD_LIBRARY_PATH = :/usr/lib/x86_64-linux-gnu/gtk-3.0/modules
  JULIA_EDITOR = subl

The expectation here is that the bounds-check will be skipped.

Full output
julia> @code_llvm mycopyto!(view(zeros(4,4),1:4, 1:4), view(zeros(4,4), 1:4, 1:4));
; Function Signature: mycopyto!(Base.SubArray{Float64, 2, Array{Float64, 2}, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}, false}, Base.SubArray{Float64, 2, Array{Float64, 2}, Tuple{Base.UnitRange{Int64}, Base.UnitRange{Int64}}, false})
;  @ REPL[1]:1 within `mycopyto!`
define void @"julia_mycopyto!_1916"(ptr noalias nocapture noundef nonnull sret({ ptr, [2 x [2 x i64]], i64, i64 }) align 8 dereferenceable(56) %sret_return, ptr noalias nocapture noundef nonnull align 8 dereferenceable(8) %return_roots, ptr nocapture noundef nonnull readonly align 8 dereferenceable(56) %"dest::SubArray", ptr nocapture noundef nonnull readonly align 8 dereferenceable(56) %"src::SubArray") #0 {
top:
;  @ REPL[1]:2 within `mycopyto!`
; ┌ @ abstractarray.jl:378 within `eachindex` @ multidimensional.jl:406
; │┌ @ subarray.jl:528 within `axes`
; ││┌ @ Base.jl:49 within `getproperty`
     %"dest::SubArray.indices_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"dest::SubArray", i64 0, i32 1
; ││└
    %"dest::SubArray.indices_ptr[2]_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"dest::SubArray", i64 0, i32 1, i64 1
; ││┌ @ subarray.jl:533 within `_indices_sub`
; │││┌ @ abstractarray.jl:98 within `axes`
; ││││┌ @ range.jl:670 within `size`
; │││││┌ @ range.jl:759 within `length`
; ││││││┌ @ range.jl:840 within `last`
; │││││││┌ @ Base.jl:49 within `getproperty`
          %"dest::SubArray.indices_ptr[1]_ptr.stop_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"dest::SubArray", i64 0, i32 1, i64 0, i64 1
; ││││││└└
; ││││││ @ range.jl:762 within `length`
; ││││││┌ @ int.jl:86 within `-`
         %"dest::SubArray.indices_ptr[1]_ptr.stop_ptr.unbox" = load i64, ptr %"dest::SubArray.indices_ptr[1]_ptr.stop_ptr", align 8
         %"dest::SubArray.indices_ptr[1]_ptr.start_ptr.unbox" = load i64, ptr %"dest::SubArray.indices_ptr", align 8
         %0 = sub i64 %"dest::SubArray.indices_ptr[1]_ptr.stop_ptr.unbox", %"dest::SubArray.indices_ptr[1]_ptr.start_ptr.unbox"
; ││││││└
; ││││││┌ @ int.jl:87 within `+`
         %1 = add i64 %0, 1
; │││└└└└
; │││ @ subarray.jl:533 within `_indices_sub` @ subarray.jl:533
; │││┌ @ abstractarray.jl:98 within `axes`
; ││││┌ @ range.jl:670 within `size`
; │││││┌ @ range.jl:759 within `length`
; ││││││┌ @ range.jl:840 within `last`
; │││││││┌ @ Base.jl:49 within `getproperty`
          %"dest::SubArray.indices_ptr[2]_ptr.stop_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"dest::SubArray", i64 0, i32 1, i64 1, i64 1
; ││││││└└
; ││││││ @ range.jl:762 within `length`
; ││││││┌ @ int.jl:86 within `-`
         %"dest::SubArray.indices_ptr[2]_ptr.stop_ptr.unbox" = load i64, ptr %"dest::SubArray.indices_ptr[2]_ptr.stop_ptr", align 8
         %"dest::SubArray.indices_ptr[2]_ptr.start_ptr.unbox" = load i64, ptr %"dest::SubArray.indices_ptr[2]_ptr", align 8
         %2 = sub i64 %"dest::SubArray.indices_ptr[2]_ptr.stop_ptr.unbox", %"dest::SubArray.indices_ptr[2]_ptr.start_ptr.unbox"
; ││││││└
; ││││││┌ @ int.jl:87 within `+`
         %3 = add i64 %2, 1
; └└└└└└└
;  @ REPL[1]:3 within `mycopyto!`
; ┌ @ multidimensional.jl:416 within `iterate`
; │┌ @ tuple.jl:383 within `map`
; ││┌ @ range.jl:1420 within `in`
; │││┌ @ int.jl:514 within `<=`
      %4 = icmp ugt i64 %0, 9223372036854775806
      %5 = icmp ugt i64 %2, 9223372036854775806
; └└└└
  %.not411.not.not.not.not = or i1 %4, %5
  br i1 %.not411.not.not.not.not, label %L623, label %L36.preheader

L36.preheader:                                    ; preds = %top
  %"src::SubArray.indices_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"src::SubArray", i64 0, i32 1
  %"src::SubArray.indices_ptr[2]_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"src::SubArray", i64 0, i32 1, i64 1
  %"src::SubArray.indices_ptr[1]_ptr.stop_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"src::SubArray", i64 0, i32 1, i64 0, i64 1
  %"src::SubArray.indices_ptr[1]_ptr.stop_ptr.unbox" = load i64, ptr %"src::SubArray.indices_ptr[1]_ptr.stop_ptr", align 8
  %"src::SubArray.indices_ptr[1]_ptr.start_ptr.unbox" = load i64, ptr %"src::SubArray.indices_ptr", align 8
  %6 = add i64 %"src::SubArray.indices_ptr[1]_ptr.stop_ptr.unbox", 1
  %7 = sub i64 %6, %"src::SubArray.indices_ptr[1]_ptr.start_ptr.unbox"
  %"src::SubArray.indices_ptr[2]_ptr.stop_ptr" = getelementptr inbounds { ptr, [2 x [2 x i64]], i64, i64 }, ptr %"src::SubArray", i64 0, i32 1, i64 1, i64 1
  %"src::SubArray.indices_ptr[2]_ptr.stop_ptr.unbox" = load i64, ptr %"src::SubArray.indices_ptr[2]_ptr.stop_ptr", align 8
  %"src::SubArray.indices_ptr[2]_ptr.start_ptr.unbox" = load i64, ptr %"src::SubArray.indices_ptr[2]_ptr", align 8
  %8 = add i64 %"src::SubArray.indices_ptr[2]_ptr.stop_ptr.unbox", 1
  %9 = sub i64 %8, %"src::SubArray.indices_ptr[2]_ptr.start_ptr.unbox"
  %"src::SubArray.parent155" = load atomic ptr, ptr %"src::SubArray" unordered, align 8
  %10 = getelementptr inbounds i8, ptr %"src::SubArray.parent155", i64 16
  %"dest::SubArray.parent" = load atomic ptr, ptr %"dest::SubArray" unordered, align 8
  %11 = getelementptr inbounds i8, ptr %"dest::SubArray.parent", i64 16
;  @ REPL[1]:4 within `mycopyto!`
; ┌ @ multidimensional.jl:1568 within `isassigned` @ subarray.jl:412
   br label %L36

L36:                                              ; preds = %L590, %L36.preheader
   %value_phi25 = phi i64 [ %value_phi276, %L590 ], [ 1, %L36.preheader ]
   %value_phi26 = phi i64 [ %value_phi277, %L590 ], [ 1, %L36.preheader ]
; │┌ @ abstractarray.jl:681 within `checkbounds`
; ││┌ @ abstractarray.jl:725 within `checkbounds_indices`
; │││┌ @ abstractarray.jl:754 within `checkindex`
; ││││┌ @ int.jl:86 within `-`
       %12 = add i64 %value_phi25, -1
; ││││└
; ││││┌ @ int.jl:513 within `<`
       %13 = icmp uge i64 %12, %7
; │││└└
; │││ @ abstractarray.jl:725 within `checkbounds_indices` @ abstractarray.jl:725
; │││┌ @ abstractarray.jl:754 within `checkindex`
; ││││┌ @ int.jl:86 within `-`
       %14 = add i64 %value_phi26, -1
; ││││└
; ││││┌ @ int.jl:513 within `<`
       %15 = icmp uge i64 %14, %9
; │└└└└
   %.not417.not.not = or i1 %13, %15
; └
;  @ REPL[1] within `mycopyto!`
  %16 = add i64 %value_phi25, -2
;  @ REPL[1]:4 within `mycopyto!`
  br i1 %.not417.not.not, label %L491, label %L211

L211:                                             ; preds = %L36
;  @ REPL[1]:5 within `mycopyto!`
; ┌ @ abstractarray.jl:1310 within `getindex`
; │┌ @ abstractarray.jl:1356 within `_getindex`
; ││┌ @ subarray.jl:317 within `getindex` @ array.jl:915
; │││┌ @ abstractarray.jl:1345 within `_to_linear_index`
; ││││┌ @ abstractarray.jl:2975 within `_sub2ind`
; │││││┌ @ abstractarray.jl:98 within `axes`
; ││││││┌ @ array.jl:194 within `size`
         %"src::SubArray.parent155.size174.sroa.0.0.copyload" = load i64, ptr %10, align 8
; │││└└└└
; │││ @ subarray.jl:317 within `getindex`
; │││┌ @ subarray.jl:295 within `reindex` @ subarray.jl:295
; ││││┌ @ views.jl:150 within `maybeview`
; │││││┌ @ array.jl:3058 within `getindex`
; ││││││┌ @ range.jl:932 within `_getindex`
; │││││││┌ @ int.jl:87 within `+`
          %17 = add i64 %value_phi26, -2
; │││└└└└└
; │││ @ subarray.jl:317 within `getindex` @ array.jl:915
; │││┌ @ abstractarray.jl:1345 within `_to_linear_index`
; ││││┌ @ abstractarray.jl:2975 within `_sub2ind` @ abstractarray.jl:2991
; │││││┌ @ abstractarray.jl:3007 within `_sub2ind_recurse` @ abstractarray.jl:3007
; ││││││┌ @ abstractarray.jl:3014 within `offsetin`
; │││││││┌ @ int.jl:86 within `-`
          %18 = add i64 %17, %"src::SubArray.indices_ptr[2]_ptr.start_ptr.unbox"
; ││││││└└
; ││││││┌ @ int.jl:88 within `*`
         %19 = mul i64 %"src::SubArray.parent155.size174.sroa.0.0.copyload", %18
; │││└└└└
; │││ @ subarray.jl:317 within `getindex` @ array.jl:915 @ essentials.jl:887
     %20 = load ptr, ptr %"src::SubArray.parent155", align 8
; │││ @ subarray.jl:317 within `getindex` @ array.jl:915
; │││┌ @ abstractarray.jl:1345 within `_to_linear_index`
; ││││┌ @ abstractarray.jl:2975 within `_sub2ind` @ abstractarray.jl:2991
; │││││┌ @ abstractarray.jl:3007 within `_sub2ind_recurse` @ abstractarray.jl:3007
; ││││││┌ @ int.jl:87 within `+`
         %21 = add i64 %16, %"src::SubArray.indices_ptr[1]_ptr.start_ptr.unbox"
; │││└└└└
; │││ @ subarray.jl:317 within `getindex` @ array.jl:915 @ essentials.jl:887
     %22 = add i64 %21, %19
     %23 = getelementptr inbounds double, ptr %20, i64 %22
     %24 = load double, ptr %23, align 8
; └└└
; ┌ @ abstractarray.jl:1411 within `setindex!`
; │┌ @ abstractarray.jl:1441 within `_setindex!`
; ││┌ @ subarray.jl:368 within `setindex!` @ array.jl:979
; │││┌ @ Base.jl:49 within `getproperty`
      %25 = load ptr, ptr %"dest::SubArray.parent", align 8
; │││└
; │││┌ @ abstractarray.jl:1345 within `_to_linear_index`
; ││││┌ @ abstractarray.jl:2975 within `_sub2ind`
; │││││┌ @ abstractarray.jl:98 within `axes`
; ││││││┌ @ array.jl:194 within `size`
         %"dest::SubArray.parent.size263.sroa.0.0.copyload" = load i64, ptr %11, align 8
; │││││└└
; │││││ @ abstractarray.jl:2975 within `_sub2ind` @ abstractarray.jl:2991
; │││││┌ @ abstractarray.jl:3007 within `_sub2ind_recurse` @ abstractarray.jl:3007
; ││││││┌ @ abstractarray.jl:3014 within `offsetin`
; │││││││┌ @ int.jl:86 within `-`
          %26 = add i64 %17, %"dest::SubArray.indices_ptr[2]_ptr.start_ptr.unbox"
; ││││││└└
; ││││││┌ @ int.jl:88 within `*`
         %27 = mul i64 %"dest::SubArray.parent.size263.sroa.0.0.copyload", %26
; ││││││└
; ││││││┌ @ int.jl:87 within `+`
         %28 = add i64 %16, %"dest::SubArray.indices_ptr[1]_ptr.start_ptr.unbox"
; │││└└└└
     %29 = add i64 %28, %27
     %30 = getelementptr inbounds double, ptr %25, i64 %29
     store double %24, ptr %30, align 8
; └└└
  br label %L590

L491:                                             ; preds = %L36
;  @ REPL[1]:7 within `mycopyto!`
; ┌ @ abstractarray.jl:1411 within `setindex!`
; │┌ @ abstractarray.jl:1441 within `_setindex!`
; ││┌ @ subarray.jl:368 within `setindex!` @ array.jl:979
; │││┌ @ Base.jl:49 within `getproperty`
      %31 = load ptr, ptr %"dest::SubArray.parent", align 8
; │││└
; │││┌ @ abstractarray.jl:1345 within `_to_linear_index`
; ││││┌ @ abstractarray.jl:2975 within `_sub2ind`
; │││││┌ @ abstractarray.jl:98 within `axes`
; ││││││┌ @ array.jl:194 within `size`
         %"dest::SubArray.parent366.size386.sroa.0.0.copyload" = load i64, ptr %11, align 8
; │││└└└└
; │││ @ subarray.jl:368 within `setindex!`
; │││┌ @ subarray.jl:295 within `reindex` @ subarray.jl:295
; ││││┌ @ views.jl:150 within `maybeview`
; │││││┌ @ array.jl:3058 within `getindex`
; ││││││┌ @ range.jl:932 within `_getindex`
; │││││││┌ @ int.jl:87 within `+`
          %32 = add i64 %value_phi26, -2
; │││└└└└└
; │││ @ subarray.jl:368 within `setindex!` @ array.jl:979
; │││┌ @ abstractarray.jl:1345 within `_to_linear_index`
; ││││┌ @ abstractarray.jl:2975 within `_sub2ind` @ abstractarray.jl:2991
; │││││┌ @ abstractarray.jl:3007 within `_sub2ind_recurse` @ abstractarray.jl:3007
; ││││││┌ @ abstractarray.jl:3014 within `offsetin`
; │││││││┌ @ int.jl:86 within `-`
          %33 = add i64 %32, %"dest::SubArray.indices_ptr[2]_ptr.start_ptr.unbox"
; ││││││└└
; ││││││┌ @ int.jl:88 within `*`
         %34 = mul i64 %"dest::SubArray.parent366.size386.sroa.0.0.copyload", %33
; ││││││└
; ││││││┌ @ int.jl:87 within `+`
         %35 = add i64 %16, %"dest::SubArray.indices_ptr[1]_ptr.start_ptr.unbox"
; │││└└└└
     %36 = add i64 %35, %34
     %37 = getelementptr inbounds double, ptr %31, i64 %36
     store i64 0, ptr %37, align 8
; │└└
   br label %L590

L590:                                             ; preds = %L491, %L211
; └
;  @ REPL[1]:9 within `mycopyto!`
; ┌ @ multidimensional.jl:422 within `iterate`
; │┌ @ multidimensional.jl:446 within `__inc`
; ││┌ @ int.jl:87 within `+`
     %38 = add i64 %value_phi25, 1
; ││└
; ││ @ multidimensional.jl:447 within `__inc`
; ││┌ @ operators.jl:277 within `!=`
; │││┌ @ promotion.jl:620 within `==`
      %.not = icmp eq i64 %value_phi25, %1
; ││└└
    %39 = icmp eq i64 %value_phi26, %3
    %narrow.not.not = select i1 %.not, i1 %39, i1 false
    %value_phi276 = select i1 %.not, i64 1, i64 %38
    %40 = zext i1 %.not to i64
    %value_phi277 = add i64 %value_phi26, %40
; └└
  br i1 %narrow.not.not, label %L623, label %L36

L623:                                             ; preds = %L590, %top
;  @ REPL[1]:10 within `mycopyto!`
  %41 = load ptr, ptr %"dest::SubArray", align 8
  store ptr %41, ptr %return_roots, align 8
  call void @llvm.memcpy.p0.p0.i64(ptr noundef nonnull align 8 dereferenceable(56) %sret_return, ptr noundef nonnull align 8 dereferenceable(56) %"dest::SubArray", i64 56, i1 false)
  ret void
}
@jishnub jishnub added performance Must go faster arrays [a, r, r, a, y, s] labels Feb 12, 2024
@vtjnash
Copy link
Member

vtjnash commented Feb 12, 2024

It seems correct there though (or equivalently, it is incorrect to specify removing bounds), as you are indexing src with the bounds of dst, which is indeed unsound if it removed boundschecking

@mbauman
Copy link
Member

mbauman commented Feb 12, 2024

It's surely just a method in need of bounds check propagation:

julia/base/multidimensional.jl

Lines 1568 to 1571 in 81c6526

isassigned(a::AbstractArray, i::CartesianIndex) = isassigned(a, Tuple(i)...)
function isassigned(A::AbstractArray, i::Union{Integer, CartesianIndex}...)
isa(i, Tuple{Vararg{Int}}) || return isassigned(A, CartesianIndex(to_indices(A, i)))
@boundscheck checkbounds(Bool, A, i...) || return false

That's why it's only appearing for a "slow" array — eachindex gives you CartesianIndexes, which then hit that intermediate function.

@vtjnash
Copy link
Member

vtjnash commented Feb 12, 2024

Yes, that is true of isassigned usages in general. I just wanted to also note that the usage in the OP is not valid, so that someone does not try to make a test from it, as it may lead to some data corruption, etc, due to also being wrong to specify there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
arrays [a, r, r, a, y, s] performance Must go faster
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants