-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Eliminating allocation of ccall root in simple cases #22734
Conversation
e8eb601
to
15c92f1
Compare
As shown in the test, this PR along should elminate the allocation for |
15c92f1
to
69a0886
Compare
The one merged later in this pr and #22684 should include the following test case. It doesn't currently pass on any of the PR but pass with the two combined. diff --git a/test/codegen.jl b/test/codegen.jl
index 83ceb9b937..fb73e8ae83 100644
--- a/test/codegen.jl
+++ b/test/codegen.jl
@@ -146,6 +146,25 @@ Base.unsafe_convert(::Type{Ptr{BadRef}}, ar::BadRef) = Ptr{BadRef}(pointer_from_
breakpoint_badref(a::MutableStruct) = ccall(:jl_breakpoint, Void, (Ptr{BadRef},), a)
+struct PtrStruct
+ a::Ptr{Void}
+ b::Int
+end
+
+mutable struct RealStruct
+ a::Float64
+ b::Int
+end
+
+function Base.cconvert(::Type{Ref{PtrStruct}}, a::RealStruct)
+ (a, Ref(PtrStruct(pointer_from_objref(a), a.b)))
+end
+Base.unsafe_convert(::Type{Ref{PtrStruct}}, at::Tuple) =
+ Base.unsafe_convert(Ref{PtrStruct}, at[2])
+
+breakpoint_ptrstruct(a::RealStruct) =
+ ccall(:jl_breakpoint, Void, (Ref{PtrStruct},), a)
+
if opt_level > 0
@test !contains(get_llvm(isequal, Tuple{Nullable{BigFloat}, Nullable{BigFloat}}), "%gcframe")
@test !contains(get_llvm(pointer_not_safepoint, Tuple{}), "%gcframe")
@@ -161,6 +180,10 @@ if opt_level > 0
breakpoint_badref_ir = get_llvm(breakpoint_badref, Tuple{MutableStruct})
@test !contains(breakpoint_badref_ir, "%gcframe")
@test !contains(breakpoint_badref_ir, "jl_gc_pool_alloc")
+
+ breakpoint_ptrstruct_ir = get_llvm(breakpoint_ptrstruct, Tuple{RealStruct})
+ @test !contains(breakpoint_ptrstruct_ir, "%gcframe")
+ @test !contains(breakpoint_ptrstruct_ir, "jl_gc_pool_alloc")
end
function two_breakpoint(a::Float64) |
Cool. Getting rid of The allocation that had bothered me previously was when I wanted to My question is - are we closer to removing allocations to local E.g. function f()
x = Ref{Int}() # heap allocation?
ccall((:foo, :bar), Void, Tuple{Ptr{Int}}, x)
return x[]
end |
duh, thanks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this need any docs changes?
base/inference.jl
Outdated
elseif head === :foreigncall | ||
args = e.args | ||
nccallargs = args[5]::Int | ||
# Only arguments escape, GC root arguments do not escape. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
most arguments also don't escape (get copied to the stack or passed in registers). AFAIK, only those with a declared type of Any
can escape?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"escape" here refer to the structure/layout. If it's used as an argument then we do need to allocate memory to make it have the correct layout. If that use doesn't escape the pointer, it'll be the job of llvm-alloc-opt.cpp
to remove the heap allocation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, the reference to gc made this confusing. but also because the main purpose of the gc-root arguments is to indicate what object(s) need(s) to be pinned in memory during the call. Perhaps a clearer comment might be to mention that this function is computing whether the structure layout is observable from the ccall target, but that the gc-root arguments are not passed through to the child. But also, is that actually true?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, adding that comment.
But also, is that actually true?
It is not true just from the ccall itself. E.g. I think it is perfectly valid if one write
function cconvert(....)
a = Ref(b)
(a, pointer_from_objref(a))
end
function unsafe_convert(..., v)
v[2]
end
For the ccall, a
will only appear in the root slot but it cannot be deconstructed since there are escaped use of a
's address. OTOH, if we scan through all the uses and there isn't any escaped use, then I think it is valid that we can elide the allocation. I mean, the user is not even using it.....
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, yes, that sounds correct.
base/inference.jl
Outdated
next_i += 1 | ||
|
||
symequal(args[i], tupname) || continue | ||
splice!(args, i, vals) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment that this splice is removing gc roots
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*replacing gc roots.
Sure.
@@ -1380,7 +1380,7 @@ static std::pair<CallingConv::ID, bool> convert_cconv(jl_sym_t *lhd) | |||
if (lhd == jl_symbol("stdcall")) { | |||
return std::make_pair(CallingConv::X86_StdCall, false); | |||
} | |||
else if (lhd == jl_symbol("cdecl")) { | |||
else if (lhd == jl_symbol("cdecl") || lhd == jl_symbol("ccall")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ccc
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defaultcc
? ccc
will probably just be cdecl
. I just want to make it clear that this is the default one for ccall
. Or I can just use cdecl
if we will never make anything else the default in any platform/target dependent way. That's also why this is intentionally left out as valid calling convention in the parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, ok. mhm, this seems fine as-is
and the foreigncall head in list at https://github.com/JuliaLang/julia/blob/69a0886c3906f8fee1f83b52f5c948d01855147b/doc/src/devdocs/ast.md#expr-types We will want to add an entry to NEWS at some point too, but maybe just open an issue to deprecate |
What do you think should be added there? I don't think this actually change any user visible API for ccall.
This should be the less significant one. If any, news entry should probably go to #22684 |
Apart from the reduction in allocation the only user visible change should be the format of the internal
I'll add a comment about it there... |
Just a reminder / example of correct rooting in ccall and a (additional) warning / example that the argument pointer memory is destroyed immediately when the ccall returns, unless the user made sure to also leak the object elsewhere global. For instance, what can we say about the pointer seen by consecutive ccalls to the same Ref object? I think experience has shown that we can never say enough about how to allocate objects correctly for use in a |
Yep, that's the part I was thinking should have an explicit note :) |
This actually haven't changed. The change in this PR is purely an optimization since it is based on escape info. If the pointer is take for the object (or if the object is used in any way that is not ccall root or getfield (and |
69a0886
to
6739834
Compare
Added more comment for each of the cases above. Also added a commit attempting to make the relation between |
base/docs/helpdb/Base.jl
Outdated
[`unsafe_convert`](@ref) to handle. The result of this function should be kept valid (for the GC) | ||
until the result of [`unsafe_convert`](@ref) is not needed anymore. | ||
This can be used to allocate memory that will be accessed by the `ccall`. | ||
If multiple object needs to be allocated, a tuple of the objects can be used as return value. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multiple objects need to
base/docs/helpdb/Base.jl
Outdated
@@ -881,7 +884,7 @@ trunc | |||
""" | |||
unsafe_convert(T,x) | |||
|
|||
Convert `x` to a value of type `T` | |||
Convert `x` to a C argument of type `T` where `x` is the return value of `cconvert(T, ...)`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does this mean? x
is whatever the user inputs, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No it isn't.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
huh? the signature says unsafe_convert(T,x)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, depend on what do you mean by user input. x
is the input to this function, sure, but it's basically never what the user inputs. It must always be the return value of cconvert
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not very clear what you mean. x
is the input to the unsafe_convert
function in this docstring. So how is cconvert
relevant to that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should elaborate on that a bit in the docstring then, probably with a cross-reference to the full ccall docs which I hope explain this in more detail. Something like "This function should not usually be called directly, it gets called on the output from cconvert
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is currently the full document of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Huh? I'm saying the part that you're adding here is unclear and would be better if phrased differently. "where x
is the ..." doesn't make sense in this context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, I think I've explained what I wanted to say clear enough so I can take any suggestions as long as the first sentense doesn't make it sound like this is doing a simple convertion and make it clear that the x
is the output of cconvert
.
By "the full document of the function" I mean I don't think it is better explained in the manual and I think it makes sense to describe it here in the doc string since it's a property of the function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I see, you're saying there's nothing more detailed to cross-reference to then? My interpretation or suggestion is to explain the intended usage as another sentence after the sort-of-obvious Convert `x` to a C argument of type `T`
. The "where" phrasing doesn't make that restriction very clear to me.
base/inference.jl
Outdated
# GC root arguments do not. | ||
# Also note that only being used in the root slot for this ccall itself | ||
# does **not** mean that the object is not needed during the ccall. | ||
# However, if it's address is never taken |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its address
base/inference.jl
Outdated
# Also note that only being used in the root slot for this ccall itself | ||
# does **not** mean that the object is not needed during the ccall. | ||
# However, if it's address is never taken | ||
# and the object is never used in a way that escapes it's layout, we can be sure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its layout
base/inference.jl
Outdated
next_i += 1 | ||
|
||
symequal(args[i], tupname) || continue | ||
# Replace the gc root arguments with it's fields |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
argument singular or plural? if singular, "root argument with its fields" - if plural, "root arguments with their fields"
496a09d
to
71a09d9
Compare
base/refpointer.jl
Outdated
# If the slot is not leaf type, it could be either isbits or not. | ||
# If it is actually an isbits type and the type inference can infer that | ||
# it can rebox the `b.x` if we simply call `pointer_from_objref(b.x)` on it. | ||
# Instead, explictly load the pointer from the `RefValue` so that the pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
explicitly (missing second i)
71a09d9
to
5bb3b96
Compare
Explicitly specify the number of ccall arguments and put the gc root arguments at the end. This way we can specify variable number of roots, making it easier for typeinf to split the root. into smaller allocations.
Also fix and test an invalid `unsafe_convert` definition in Base. The result of `pointer_from_objref` on isbits types is never valid to use across safepoint.
5bb3b96
to
245868c
Compare
Applied on top of #22684, this should eliminate most if not all of the allocation in
ccall
that was previously only non-allocationg when using the to be deprecated&
syntax.