Dot operation between a tuple and a scalar variable generates type instability #21291
Comments
The literal constant in the first example infers correctly. Note that type inference also works fine for the array case. cc @pabloferz
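For reference, a minimal sketch (illustrative, not from the original comment) of the array case, which infers a concrete type even when the scalar is bound to a variable:

function dotadd_array_intv()
    o = 1
    a = [1, 0, 1]
    a .+ o            # inferred as Vector{Int64}, i.e. a concrete type
end
# @code_warntype dotadd_array_intv() reports a concrete return type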
Also.
I have a half-baked solution for this. Will submit a PR soon.
Closed by #21331
Thanks! This fix helps a lot. One more question. With this fix, dot-addition between a tuple and a scalar is not much slower than dot-addition between a StaticArray and a scalar:

julia> VERSION
v"0.6.0-pre.beta.86"
julia> using BenchmarkTools
julia> using StaticArrays
julia> function dotadd_tuple_intv()
o = 1
t = (1,0,1)
t = t .+ o
end
dotadd_tuple_intv (generic function with 1 method)
julia> function dotadd_svec_intv()
o = 1
v = @SVector([1,0,1])
v = v .+ o
end
dotadd_svec_intv (generic function with 1 method)
julia> @btime dotadd_tuple_intv()
10.647 ns (0 allocations: 0 bytes)
julia> @btime dotadd_svec_intv()
7.007 ns (0 allocations: 0 bytes)

However, inlining makes the StaticArray version much faster than the tuple version:

julia> @inline function inlined_dotadd_tuple_intv()
o = 1
t = (1,0,1)
t = t .+ o
end
inlined_dotadd_tuple_intv (generic function with 1 method)
julia> @inline function inlined_dotadd_svec_intv()
o = 1
v = @SVector([1,0,1])
v = v .+ o
end
inlined_dotadd_svec_intv (generic function with 1 method)
julia> @btime inlined_dotadd_tuple_intv()
8.219 ns (0 allocations: 0 bytes)
julia> @btime inlined_dotadd_svec_intv()
1.894 ns (0 allocations: 0 bytes)

I wonder if there is a way to make the inlined function with tuple operations as fast as the inlined function with StaticArray operations.
Hi @wsshin. You seem to be benchmarking in global scope. If you define

f() = inlined_dotadd_tuple_intv()
g() = inlined_dotadd_svec_intv()

you should observe the same difference you had with the non-inlined versions.
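A short sketch of how those wrappers might then be timed (assuming the definitions from the earlier comment; no particular timings implied here):

using BenchmarkTools   # for @btime

# With the wrappers defined as above, the @inline bodies can be
# inlined into f and g before timing:
@btime f()
@btime g()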
I see. I knew functions defined in the REPL are defined in global scope, but I thought (without knowing why) I could benchmark functions directly if they do not take any arguments. (Of course, when benchmarking functions with arguments, I interpolated the arguments with $.) Why does this convention not work here? Is this because of the @inline declaration?
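For context, a small sketch of the interpolation convention mentioned in parentheses above (the function and variable names are illustrative):

using BenchmarkTools

add_one(t) = t .+ 1
t = (1, 0, 1)

# $t interpolates the value, so the benchmark does not go through an
# untyped global binding.
@btime add_one($t)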
There doesn't seem to be any "Benchmark in global scope" here. |
@yuyichao, do you mean it is incorrect to say that I was benchmarking in global scope? Then why is wrapping by another function (f() or g()) necessary?
Right.
No it's not. It just disabled inlining.
@yuyichao, thanks for your answer! @pabloferz, then back to my original question: is there a way to make an inlined function that uses tuple operations as fast as an inlined function that uses equivalent StaticArray operations?
The issue seems to be that broadcast over tuples is not inlined.
Oh, right. Sorry, @yuyichao is right.
Hm.
I guess it might be reasonable as long as we put some limit on the tuple length.
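As a rough illustration (not the actual Base implementation discussed here), one way to force a tuple "broadcast"-style operation to inline while capping the tuple length might look like this:

# Hypothetical sketch: inline elementwise addition of a scalar to a
# tuple, but only for short tuples, falling back to map otherwise.
const MAX_INLINE_TUPLE_LEN = 16   # illustrative limit

@inline function tuple_add(t::NTuple{N,T}, x::T) where {N,T}
    if N <= MAX_INLINE_TUPLE_LEN
        return ntuple(i -> t[i] + x, Val(N))   # unrolls for small N
    else
        return map(y -> y + x, t)
    end
end

tuple_add((1, 0, 1), 2)   # (3, 2, 3)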
@Sacha0, @pabloferz, is there a PR on this?
No PR exists specifically for inlining broadcast over tuples yet; would you like to put one together?
@Sacha0, sure. Let me look into this.
PR inlining broadcast over tuples submitted.
For example, when a constant Int64 is dot-added to a Tuple{Int64,Int64,Int64}, the result correctly remains a Tuple{Int64,Int64,Int64}:
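A minimal sketch of this first case (illustrative code standing in for the snippet referenced above):

tup_add_literal() = (1, 0, 1) .+ 1   # scalar is a literal constant
# @code_warntype tup_add_literal() infers Tuple{Int64,Int64,Int64}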
However, if you assign the constant Int64 to a variable and dot-add that variable to a Tuple{Int64,Int64,Int64}, the result is no longer a Tuple{Int64,Int64,Int64}, but just an abstract Tuple:
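A corresponding sketch of the unstable case (illustrative; it matches the behavior described above on the Julia version this issue was reported against):

function tup_add_var()
    o = 1                            # same scalar, but bound to a variable
    (1, 0, 1) .+ o
end
# @code_warntype tup_add_var() infers only Tuple (abstract), not
# Tuple{Int64,Int64,Int64}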
If we directly use broadcast, the same type instability occurs even if we do not assign the constant Int64 to a variable:
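And a sketch of the direct broadcast form (illustrative; per the description above, it shows the same instability even with a literal scalar):

tup_bcast() = broadcast(+, (1, 0, 1), 1)
# @code_warntype tup_bcast() also infers only an abstract Tuple on the
# affected version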