Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suboptimal code with invoke #34900

Closed
MasonProtter opened this issue Feb 27, 2020 · 3 comments · Fixed by #34906
Closed

Suboptimal code with invoke #34900

MasonProtter opened this issue Feb 27, 2020 · 3 comments · Fixed by #34906
Assignees
Labels
performance Must go faster regression Regression in behavior compared to a previous version

Comments

@MasonProtter
Copy link
Contributor

It was my impression that if everything infers, invoke shouldn't carry any performance penalty, but I noticed something with this little example:

julia> f(x::Int, y) = x + y
f (generic function with 1 method)

julia> f(x, y::Int) = x * y
f (generic function with 2 methods)

julia> f(x::Int, y::Int) = invoke(f, Tuple{Int, Any}, x, y)
f (generic function with 3 methods)

julia> @code_typed f(1, 1)
CodeInfo(
1%1 = Main.invoke(Main.f, Tuple{Int64,Any}, x, y)::Int64
└──      return %1
) => Int64

that the llvm code is not a simple integer addition:

julia> @code_llvm f(1, 1)
;  @ REPL[3]:1 within `f'
define i64 @julia_f_17415(i64, i64) {
top:
  %2 = alloca %jl_value_t addrspace(10)*, i32 4
  %gcframe = alloca %jl_value_t addrspace(10)*, i32 4, align 16
  %3 = bitcast %jl_value_t addrspace(10)** %gcframe to i8*
  call void @llvm.memset.p0i8.i32(i8* align 16 %3, i8 0, i32 32, i1 false)
  %thread_ptr = call i8* asm "movq %fs:0, $0", "=r"()
  %ptls_i8 = getelementptr i8, i8* %thread_ptr, i64 -15712
  %ptls = bitcast i8* %ptls_i8 to %jl_value_t***
  %4 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %gcframe, i32 0
  %5 = bitcast %jl_value_t addrspace(10)** %4 to i64*
  store i64 8, i64* %5
  %6 = getelementptr %jl_value_t**, %jl_value_t*** %ptls, i32 0
  %7 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %gcframe, i32 1
  %8 = bitcast %jl_value_t addrspace(10)** %7 to %jl_value_t***
  %9 = load %jl_value_t**, %jl_value_t*** %6
  store %jl_value_t** %9, %jl_value_t*** %8
  %10 = bitcast %jl_value_t*** %6 to %jl_value_t addrspace(10)***
  store %jl_value_t addrspace(10)** %gcframe, %jl_value_t addrspace(10)*** %10
  %11 = call %jl_value_t addrspace(10)* @jl_box_int64(i64 signext %0)
  %12 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %gcframe, i32 3
  store %jl_value_t addrspace(10)* %11, %jl_value_t addrspace(10)** %12
  %13 = call %jl_value_t addrspace(10)* @jl_box_int64(i64 signext %1)
  %14 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %gcframe, i32 2
  store %jl_value_t addrspace(10)* %13, %jl_value_t addrspace(10)** %14
  %15 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %2, i32 0
  store %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 140333499486744 to %jl_value_t*) to %jl_value_t addrspace(10)*), %jl_value_t addrspace(10)** %15
  %16 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %2, i32 1
  store %jl_value_t addrspace(10)* addrspacecast (%jl_value_t* inttoptr (i64 140332530658816 to %jl_value_t*) to %jl_value_t addrspace(10)*), %jl_value_t addrspace(10)** %16
  %17 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %2, i32 2
  store %jl_value_t addrspace(10)* %11, %jl_value_t addrspace(10)** %17
  %18 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %2, i32 3
  store %jl_value_t addrspace(10)* %13, %jl_value_t addrspace(10)** %18
  %19 = call nonnull %jl_value_t addrspace(10)* @jl_f_invoke(%jl_value_t addrspace(10)* addrspacecast (%jl_value_t* null to %jl_value_t addrspace(10)*), %jl_value_t addrspace(10)** %2, i32 4)
  %20 = bitcast %jl_value_t addrspace(10)* %19 to i64 addrspace(10)*
  %21 = load i64, i64 addrspace(10)* %20, align 8
  %22 = getelementptr %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %gcframe, i32 1
  %23 = load %jl_value_t addrspace(10)*, %jl_value_t addrspace(10)** %22
  %24 = getelementptr %jl_value_t**, %jl_value_t*** %ptls, i32 0
  %25 = bitcast %jl_value_t*** %24 to %jl_value_t addrspace(10)**
  store %jl_value_t addrspace(10)* %23, %jl_value_t addrspace(10)** %25
  ret i64 %21
}

Not sure what's going on here or if it's expected, so I figured I'd open an issue.

@yuyichao
Copy link
Contributor

It worked in 1.1

@MasonProtter
Copy link
Contributor Author

I tested on 1.3.1, 1.4.0-rc1 and 1.4.0-r and got the same result for each of them.

@KristofferC
Copy link
Member

Works in 1.2 as well.

@KristofferC KristofferC added performance Must go faster regression Regression in behavior compared to a previous version labels Feb 27, 2020
@Keno Keno self-assigned this Feb 27, 2020
Keno added a commit that referenced this issue Feb 27, 2020
When we added the check to prevent inlining through ambiguous methods,
we failed exclude this check for calls to `invoke` (which skip ambiguous
methods, since the method is explicitly selected). Fixes #34900.
Keno added a commit that referenced this issue Feb 28, 2020
When we added the check to prevent inlining through ambiguous methods,
we failed exclude this check for calls to `invoke` (which skip ambiguous
methods, since the method is explicitly selected). Fixes #34900.
KristofferC pushed a commit that referenced this issue Mar 23, 2020
When we added the check to prevent inlining through ambiguous methods,
we failed exclude this check for calls to `invoke` (which skip ambiguous
methods, since the method is explicitly selected). Fixes #34900.

(cherry picked from commit 6abc852)
ravibitsgoa pushed a commit to ravibitsgoa/julia that referenced this issue Apr 9, 2020
When we added the check to prevent inlining through ambiguous methods,
we failed exclude this check for calls to `invoke` (which skip ambiguous
methods, since the method is explicitly selected). Fixes JuliaLang#34900.
KristofferC pushed a commit that referenced this issue Apr 11, 2020
When we added the check to prevent inlining through ambiguous methods,
we failed exclude this check for calls to `invoke` (which skip ambiguous
methods, since the method is explicitly selected). Fixes #34900.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster regression Regression in behavior compared to a previous version
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants