-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
optimizer: enhance SROA, handle partially-initialized allocations #42834
Conversation
7ccb600
to
a9ae9f2
Compare
a9ae9f2
to
72715ee
Compare
break | ||
for use in du.uses | ||
def = find_def_for_use(ir, domtree, allblocks, du, use)[1] | ||
(def == 0 || def == idx) && @goto skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If def == 0
here, doesn't that mean that this field is truly unitialized? maybe I am misunderstanding
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def
is the index where the object field is defined, so it could be:
- object allocation site:
def >= 1
setfield!
call:def >= 1
- PhiNode (i.e. the object field can be defined in different control flow, and different possibilities are merged at PhiNode):
def == 0
So def == 0
case here corresponds to something like:
let src = code_typed((Bool,)) do cond
r = Ref{Any}()
if cond
r[] = 42
else
r[] = 32
end
return r[] # ::PhiNode(32, 42)
end |> only |> first
@test_broken !any(src.code) do @nospecialize x
Meta.isexpr(x, :new)
end
end
Since traversing such PhiNode could be complicated (I think?), I just leave it as TODO.
Maybe this is worth to be left as comment ?
preserve_uses = IdDict{Int, Vector{Any}}((idx=>Any[] for idx in IdSet{Int}(defuse.ccall_preserve_uses))) | ||
# Everything accounted for. Go field by field and perform idf | ||
for (fidx, du) in pairs(fielddefuse) | ||
for fidx in 1:ndefuse | ||
du = fielddefuse[fidx] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit nitpicky, but is there any difference between the original and this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no difference. If fielddefuse
is type-unstable and it can contain arbitrary type objects (like user-given expressions fielddefuse = x.args
where x::Expr
), then iterators like pairs(fielddefuse)
may produce tuple object whose type is very dynamic (which may cause lots of dispatch with almost no benefit), but for this case we don't need to care that since fielddefuse::Vector{SSADefUse}
.
lgtm --- basically move doing idf back so as to include structs with potentially uninitialized fields, then handle those structs by making sure that any field we think could be uninitialized was defined somewhere |
16b1154
to
3c6dfeb
Compare
Will this break the ability to use |
72715ee
to
a2742c9
Compare
Maybe ? If the definition is available in the analysis scope where |
Yea, the core change is here: julia/base/compiler/ssair/passes.jl Lines 826 to 829 in a2742c9
Previously we give up anytime when allocation contains uninitialized field ( fidx + 1 > length(defexpr.args) ), but this PR tries to check if there is a "definitive" definition there (!(def == 0 || def == idx) ) and then tries scalar replace.
|
Can you post results of |
ah, |
On this PR: julia> @btime (1+1)
0.032 ns (0 allocations: 0 bytes)
2
julia> @btime 1+Ref(1)[]
1.235 ns (0 allocations: 0 bytes)
2 The essential difference is that julia> code_typed() do
a = 1 + Ref(1)[]
b = 1 + 1
a, b
end
1-element Vector{Any}:
CodeInfo(
1 ─ %1 = Base.add_int(1, 1)::Int64
│ %2 = Core.tuple(%1, 2)::Tuple{Int64, Int64}
└── return %2
) => Tuple{Int64, Int64} |
@nanosoldier |
It seems odd those aren't the same. We should fix that. The benchmarks are supposed to interpolate that Ref object to avoid the optimizer removing it. |
Hm I don't see the oddness. Whether or not |
@oscardssmith the correct invocation is |
I don't think that is the key difference, since:
But maybe something with the effect-free computation interacting with no-inline? |
This is what @Roger-luo observed in Yao. Constant propagation occurs during type-inference, but optimizations can form constant expressions, which will then be optimized by LLVM. Which is okay in the normal case, but cause us to write a const-propagation pass on SSAIR for YaoCompiler. |
I think the actual difference in performance stems from: julia> f() = 1+1; g() = 1+Ref(1)[]; f(); g();
julia> Core.Compiler.invoke_api(only(methods(f)).specializations[1].cache) # with use constant calling convention
2
julia> Core.Compiler.invoke_api(only(methods(g)).specializations[1].cache) # just a usual function call
-1 So the question might be what Valentin pointed out: constant-folding at optimization time (rather, re-inference to maximize general optimization opportunities?). Or introduce mutation-analysis into inference and constant-fold more. |
I don't think re-running inference is possible (opens up a whole bag of convergence questions). Doing constant-folding at optimization time is likely beneficial and cheap, and can enable other optimizations.
That might be the way to go, but I am cautious over the cost-benefit here. I would rather enable this as SSAIR optimization so that we can do it after inlining as well. |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
0b5f426
to
2520aa2
Compare
@nanosoldier |
Your benchmark job has completed - possible performance regressions were detected. A full report can be found here. |
During adding more test cases for our SROA pass, I found our SROA doesn't handle allocation sites with uninitialized fields at all. This commit is based on #42833 and tries to handle such "unsafe" allocations, if there are safe `setfield!` definitions. For example, this commit allows the allocation `r = Ref{Int}()` to be eliminated in the following example (adapted from <https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view>): ```julia julia> code_typed() do r = Ref{Int}() r[] = 42 b = sin(r[]) return b end |> only ``` This commit comes with a plenty of basic test cases for our SROA pass also.
2520aa2
to
b228e48
Compare
…liaLang#42834) During adding more test cases for our SROA pass, I found our SROA doesn't handle allocation sites with uninitialized fields at all. This commit is based on JuliaLang#42833 and tries to handle such "unsafe" allocations, if there are safe `setfield!` definitions. For example, this commit allows the allocation `r = Ref{Int}()` to be eliminated in the following example (adapted from <https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view>): ```julia julia> code_typed() do r = Ref{Int}() r[] = 42 b = sin(r[]) return b end |> only ``` This commit comes with a plenty of basic test cases for our SROA pass also.
…liaLang#42834) During adding more test cases for our SROA pass, I found our SROA doesn't handle allocation sites with uninitialized fields at all. This commit is based on JuliaLang#42833 and tries to handle such "unsafe" allocations, if there are safe `setfield!` definitions. For example, this commit allows the allocation `r = Ref{Int}()` to be eliminated in the following example (adapted from <https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view>): ```julia julia> code_typed() do r = Ref{Int}() r[] = 42 b = sin(r[]) return b end |> only ``` This commit comes with a plenty of basic test cases for our SROA pass also.
During adding more test cases for our SROA pass, I found our SROA doesn't
handle allocation sites with uninitialized fields at all.
This commit is built on top of #42833 and tries to handle such "unsafe" allocations,
if there are safe
setfield!
definitions.For example, this commit allows the allocation
r = Ref{Int}()
to beeliminated in the following example (adapted from https://hackmd.io/bZz8k6SHQQuNUW-Vs7rqfw?view):
This commit comes with a plenty of basic test cases for our SROA pass also.
Two test cases depend on #42831 as well.