Use tagged CodeInstances for AbstractInterpreter caching #52233
Force-pushed from 1990bfb to 6a5c024.
I think this is fine, but I do wonder whether this is slightly premature. Just to clarify, the primary motivation here is not the performance of the |
This is motivated by what GPUCompiler needs today. We have a severe latency issue for GPU-based code that is prohibitive for both HPC and application deployment.
Yeah, I would love some additional flexibility with what could be stored in a CodeInstance, but I am planning on doing forward validation for any information based on this.
Ok, I didn't quite get the serialization implications of this change. Regardless, as I said, I don't think there's anything particularly wrong with it. 234a758 went in a similar direction with respect to external interpreter analyses.
// if (ci !in ci->defs->cache)
//     record_field_change((jl_value_t**)&ci->next, NULL);
// Why are we checking that the method/module this originates from is in_image?
// and then disconnect this CI?
if (jl_object_in_image((jl_value_t*)ci->def->def.value)) {
Sorry to be late to the party. It was added in #48751 so @vtjnash is the expert, but I interpret this as a way to truncate traversal. The point is that if you've already cached the method you've brought in everything you should and nothing you shouldn't (e.g.,
Lines 791 to 799 in a28e553
else if (jl_is_method(def) && jl_object_in_image(def)) {
    // we only need 3 specific fields of this (the rest are restored afterward, if valid)
    // in particular, cache is repopulated by jl_mi_cache_insert for all foreign function,
    // so must not be present here
    record_field_change((jl_value_t**)&mi->uninferred, NULL);
    record_field_change((jl_value_t**)&mi->backedges, NULL);
    record_field_change((jl_value_t**)&mi->callbacks, NULL);
    record_field_change((jl_value_t**)&mi->cache, NULL);
}
Force-pushed from 6a5c024 to c093a42.
@aviatesk I think this is ready for a first review.
@@ -86,7 +83,7 @@ struct EscapeCacheInfo
 end

 struct EscapeCache
-    cache::IdDict{MethodInstance,EscapeCacheInfo}
+    cache::IdDict{MethodInstance,EscapeCacheInfo} # Should this be CodeInstance to EscapeCacheInfo?
@aviatesk this is something we should talk about. In my head this should probably be an `IdDict{CodeInstance, EscapeCacheInfo}`, since we are attaching information to a particular inference task and we store that information in the CodeInstance.
Yeah, I agree. I would like to work on that refactoring to see what possibilities this change opens up, if you want.
That would be great!
test/compiler/invalidation.jl
Outdated
@test any(INVALIDATION_TESTER_CACHE.dict) do (mi, ci)
    mi.def.name === :basic_caller

let mi = only(Base.specializations(only(methods(basic_caller))))
We can make this logic simpler after #52176
@nanosoldier
The package evaluation job you requested has completed - possible new issues were detected.
Mostly just needed API changes, but Krotov is worth a closer look.
But this may be unrelated to this PR.
I generally think it's good, particularly the way it simplifies the external `AbstractInterpreter`'s code cache. On the other hand, as Keno pointed out, I'm not completely clear on the serialization support. Am I right in thinking that this PR itself enables the serialization of `CodeInstance` inferred by an external `AbstractInterpreter`?
@@ -315,7 +315,7 @@ jl_datatype_t *jl_mk_builtin_func(jl_datatype_t *dt, const char *name, jl_fptr_a
     jl_atomic_store_relaxed(&m->unspecialized, mi);
     jl_gc_wb(m, mi);

-    jl_code_instance_t *codeinst = jl_new_codeinst(mi,
+    jl_code_instance_t *codeinst = jl_new_codeinst(mi, jl_nothing,
Maybe define
JL_DLLEXPORT jl_code_instance_t *jl_new_codeinst_native(
        jl_method_instance_t *mi,
        jl_value_t *rettype, jl_value_t *exctype,
        jl_value_t *inferred_const, jl_value_t *inferred,
        int32_t const_flags, size_t min_world, size_t max_world,
        uint32_t ipo_effects, uint32_t effects, jl_value_t *analysis_results,
        uint8_t relocatability
        /*, jl_array_t *edges, int absolute_max*/)
{
    return jl_new_codeinst(mi, /*owner*/jl_nothing, rettype, exctype, inferred_const, inferred,
                           const_flags, min_world, max_world, ipo_effects, effects,
                           analysis_results, relocatability
                           /*, edges, absolute_max*/);
}
and make it more explicit where we have native compilation specific code?
The next step would be to create a field in `jl_task_t` and use that to propagate the compilation context, which will remove the `_native` function again.
Implementing it with #50958 e.g. as a reference seems feasible. If that's the case, it might be best to keep the current code unchanged.
// call external callbacks registered with this method_instance
static void invalidate_external(jl_method_instance_t *mi, size_t max_world) {
It's nice we no longer need this, since a tagged `CodeInstance` automatically gains support for invalidation: the `CodeInstance` inferred by each `AbstractInterpreter` is appended to the `next` linked list, and they are all invalidated by manipulating `max_world`, irrespective of their owner.
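For readers following along, here is a minimal C sketch of that invalidation walk. The struct and function names (`code_instance`, `invalidate_cache`) are hypothetical stand-ins, not the runtime's actual definitions, and the real runtime uses atomic loads and stores on these fields, which this sketch omits.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for jl_code_instance_t. */
typedef struct code_instance {
    const void *owner;            /* NULL stands in for `nothing` (native) */
    size_t min_world;             /* first world age in which the entry is valid */
    size_t max_world;             /* last world age in which the entry is valid */
    struct code_instance *next;   /* next entry in the inline cache */
} code_instance;

/* Cap the validity range of every entry in the list at `max_world`.
 * The owner field is never consulted, so native and foreign entries
 * are invalidated by the same walk. Returns the number of entries
 * whose range was actually shortened. */
static size_t invalidate_cache(code_instance *head, size_t max_world)
{
    size_t truncated = 0;
    for (code_instance *ci = head; ci != NULL; ci = ci->next) {
        if (ci->max_world > max_world) {
            ci->max_world = max_world;
            truncated++;
        }
    }
    return truncated;
}
```

The point of the sketch is only that no per-owner callback is needed: one pass over the `next` list covers every interpreter's results.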
@@ -442,7 +442,8 @@ function custom_lookup(mi::MethodInstance, min_world::UInt, max_world::UInt)
             end
         end
     end
-    return CONST_INVOKE_INTERP.code_cache.dict[mi]
+    # XXX: This seems buggy, custom_lookup should probably construct the absint on demand.
It's hacky. We need a way to set a contextual `AbstractInterpreter` globally.
As far as I understand it, AbstractInterpreters are ephemeral, since they capture the world they operate in on construction.
For external abstract interpreters like JET, which propagate custom data interprocedurally, it is essential to inject custom behaviors where cached regular/constant inference results are used. Previously, this was accomplished by overloading both `abstract_call_method` and `get(::WorldView{CustomView}, ...)` (and `const_prop_call` and `cached_lookup`). That approach was admittedly hacky and is probably better avoided. Moreover, after #52233, doing so has become infeasible when the external abstract interpreter uses `InternalCodeCache`. To address this issue, this commit provides an interface named `return_cached_result`. This allows external abstract interpreters to inject custom interprocedural data propagation during abstract interpretation even when they use `InternalCodeCache`.
Leverages JuliaLang/julia#52233 to use the internal code cache that comes with the inherent invalidation support. Still requires:
- JuliaLang/julia#53300 (or JuliaLang/julia#53219)
- JuliaLang/julia#53318
This lacks invalidation support (since this commit still uses the external code cache), but it at least lets JET run on the latest nightly.
It has been possible for external abstract interpreters to keep custom data in `codeinst.inferred` (together with overloading `inlining_policy`). After JuliaLang#52233, when such an external absint uses `InternalCodeCache`, this data is passed to `jl_ir_flag_inferred`, leading to segfaults in assertion builds. This commit resolves the issue by omitting `jl_ir_flag_inferred` checks when the `cache_owner` is external. Nonetheless, a better resolution might be necessary. It suggests that the current design of `code_owner` and `InternalCodeCache` for the external cache system is somewhat flawed. A conceivable approach could involve:
- Adding a layer similar to `inlining_policy` in `CC.get(::WorldView{InternalCodeCache})` to enable safe redirection of custom data to the native interpreter's implementation.
- Prohibiting custom data in the `inferred` field and directing such data to be kept in `analysis_results`.
Two small cleanups to #52233:
- Don't use foreign code-instances for native code caching decision making
- Don't return a non-native codeinst for `jl_method_compiled`
Co-authored-by: Keno Fischer
The `jl_ir_flag_inferred` change (#53300) was also backported (cherry picked from commit 93876c9).
For GPUCompiler we would like to support a native on-disk cache of LLVM IR. One of the longstanding issues has been the cache invalidation of such an on-disk cache. With #52233 we now have an integrated cache for the inference results and we can rely on `CodeInstance` to be stable across sessions. Due to #52119 we can also rely on the `objectid` to be stable. My initial thought was to key the native disk cache in GPUCompiler on the objectid of the corresponding CodeInstance (+ some compilation parameters). While discussing this with @rayegun yesterday we noted that having a CodeInstance with the same objectid might not be enough provenance. E.g. we are not guaranteed that the CodeInstance is from the same build artifact and the same precise source code. For the package images we are tracking this during loading and validate all contents at once, and we explicitly keep track of the provenance chain. This PR adds a lookup table where we map from "external_blobs", e.g. loaded images, to the corresponding top module of each image, and uses this to determine the build_id of the package image.
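As a rough illustration of the keying idea described in this commit message, the following C sketch combines a package image's build_id with the CodeInstance's objectid and a hash of the compilation parameters. Everything here is hypothetical: the `disk_cache_key` layout, the split of the build_id into two 64-bit words, and the choice of FNV-1a as the hash are assumptions for the example, not what GPUCompiler or the runtime does.

```c
#include <stdint.h>

/* Hypothetical on-disk cache key; field layout is an assumption. */
typedef struct disk_cache_key {
    uint64_t build_id_hi;        /* build_id of the package image, high word */
    uint64_t build_id_lo;        /* build_id of the package image, low word */
    uint64_t codeinst_objectid;  /* objectid of the CodeInstance */
    uint64_t params_hash;        /* hash of the compilation parameters */
} disk_cache_key;

/* Fold one 64-bit word into an FNV-1a state, byte by byte. */
static uint64_t fnv1a64_word(uint64_t h, uint64_t word)
{
    for (int i = 0; i < 8; i++) {
        h ^= (word >> (8 * i)) & 0xffu;
        h *= 0x100000001b3ULL;   /* 64-bit FNV prime */
    }
    return h;
}

/* Hash all four fields, e.g. to derive a file name for the cache entry. */
static uint64_t disk_cache_key_hash(const disk_cache_key *k)
{
    uint64_t h = 0xcbf29ce484222325ULL;  /* 64-bit FNV offset basis */
    h = fnv1a64_word(h, k->build_id_hi);
    h = fnv1a64_word(h, k->build_id_lo);
    h = fnv1a64_word(h, k->codeinst_objectid);
    h = fnv1a64_word(h, k->params_hash);
    return h;
}

/* Exact comparison; a hash match alone is not proof of identity. */
static int disk_cache_key_equal(const disk_cache_key *a, const disk_cache_key *b)
{
    return a->build_id_hi == b->build_id_hi
        && a->build_id_lo == b->build_id_lo
        && a->codeinst_objectid == b->codeinst_objectid
        && a->params_hash == b->params_hash;
}
```

Including the build_id in the key is what addresses the provenance concern: two CodeInstances with the same objectid but from different build artifacts produce different keys.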
…reters (#54069) Partially reverts #49391 PrecompileTools uses the timing infrastructure to snoop on the inference process. The reason for #49391 was that this could lead to accidental pollution of the caches with foreign results (timholy/SnoopCompile.jl#338) After #52233 and especially #53336 we now filter results by cache owner and don't try to cache foreign code using the native pipeline. Motivated by JuliaGPU/GPUCompiler.jl#567 which demonstrated that a foreign code instance would not be cached without PrecompileTools.
The PrecompileTools change (#54069) was also backported (cherry picked from commit c0611e8), as was the GPUCompiler build_id lookup change (cherry picked from commit d47cbf6).
Currently external AbstractInterpreters all follow the pattern that `CACHE = IdDict{MethodInstance,CodeInstance}`. The `NativeInterpreter` has the benefit that each `MethodInstance` carries an inline cache of `CodeInstances`. `CodeInstances` pull triple duty: they contain validity information (`min_world`, `max_world`), the inference state cache, and the compilation state cache.
When we currently create a CodeInstance we don't record the owner, and we thus construct detached CodeInstances.
In this PR I introduce the notion of tagging the code instances with the owner (`nothing` meaning native), thus allowing foreign code instances to be stored in the inline cache.
GPUCompiler/CUDA would change its caching strategy from `IdDict{MethodInstance, CodeInstance}` to `IdDict{CodeInstance, LLVM.Module}` or similar. Ideally we would allow polymorphism for the compilation state part of the code instance, but that seems too invasive for now.
This has the benefit that CodeInstances are handled by caching, and it would allow us to cache inference results from foreign abstract interpreters across sessions.
The downside is that the cache walk is now a bit slower since we need to filter on owner. If this turns out to be a large bottleneck we could bifurcate the cache on owner.
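To make the owner-filtered cache walk concrete, here is a minimal C sketch under stated assumptions. The types and functions (`code_instance`, `method_instance`, `mi_cache_insert`, `mi_cache_lookup`) are hypothetical simplifications and not the runtime's definitions. Owners are compared by pointer identity here, with `NULL` standing in for `nothing`; the real runtime may compare owners differently and uses atomics on the list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the runtime structs. */
typedef struct code_instance {
    const void *owner;            /* NULL stands in for `nothing` (native) */
    size_t min_world;             /* first world age in which the entry is valid */
    size_t max_world;             /* last world age in which the entry is valid */
    struct code_instance *next;   /* next entry in the inline cache */
} code_instance;

typedef struct method_instance {
    code_instance *cache;         /* head of the inline cache list */
} method_instance;

/* Prepend a code instance to the method instance's inline cache. */
static void mi_cache_insert(method_instance *mi, code_instance *ci)
{
    assert(mi != NULL && ci != NULL);
    ci->next = mi->cache;
    mi->cache = ci;
}

/* Walk the inline cache and return the first entry that belongs to
 * `owner` and is valid over the whole range [min_world, max_world],
 * or NULL if there is none. Entries tagged with a different owner are
 * skipped, which is the extra filtering cost mentioned above. */
static code_instance *mi_cache_lookup(const method_instance *mi, const void *owner,
                                      size_t min_world, size_t max_world)
{
    assert(mi != NULL);
    for (code_instance *ci = mi->cache; ci != NULL; ci = ci->next) {
        if (ci->owner != owner)
            continue;
        if (ci->min_world <= min_world && max_world <= ci->max_world)
            return ci;
    }
    return NULL;
}
```

With this shape, a foreign interpreter passes its own token as `owner` and never sees native entries, and vice versa. Bifurcating the cache on owner would replace the single list with one list per owner and remove the `continue` branch.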
TODO:
- Replace the `params->lookup(mi, world, world)` with a `jl_rettype_inferred(owner, mi, world, world)` (deferred to follow-up)
Unresolved:
- `REPLInterpreter`, should those go into `mi.cache`?
- `jl_rettype_inferred_native`, which ones should receive an `owner` token?