RFC: Effect Preconditions - or - the future of @inbounds #50641
Conversation
This proposal is an attempt to tie together some of the recent discussion around the future of `@inbounds`, formal interface checking (along the lines of my proposal in https://github.com/Keno/InterfaceSpecs.jl), and `--check-bounds`.

# Reviewing `@inbounds`

## History

The `@inbounds` annotation has been in the language for a decade at this point [1]. It is a critical tool for achieving machine performance in high-performance applications where the overhead of boundschecking is prohibitive [2]. At the same time, as Julia has matured, the dangers of this macro have become more clear. It is essentially impossible to use correctly in generic code, and instances where code (including code in Base from before we were culturally more careful about inbounds) uses `@inbounds` have become a not-uncommon complaint about the language as a whole [3]. In current practice, at least in Base, we have become extremely hesitant about the introduction of new `@inbounds` annotations (see e.g. discussion in #41200). At the same time, the ability to disable bounds checking remains a critical capability for several of our user communities (c.f. #50239 and related discussions). So how do we move forward here? I'm hoping we can find a way forward that achieves the same benefits as `@inbounds`, but in a way where it is possible to re-establish soundness if desired.

## When is inbounds still required?

### LLVM's current capabilities

Let's look at a couple of examples. First:

```
function sum1(A::Vector{Int64})
    a = zero(eltype(A))
    for i in eachindex(A)
        a += A[i]
    end
    return a
end
```

Is inbounds required here in modern Julia? The answer is no - LLVM's range analysis proves that the bounds check is dead and eliminates it. (The same holds true for the `1:length(A)` version, although this was not always true.)

What about the following:

```
function sum2(A::Vector{Int64}, B::Vector{Int64})
    a = zero(eltype(A))
    for i in eachindex(A, B)
        a += A[i] + B[i]
    end
    return a
end
```

Here, LLVM is again able to eliminate the boundscheck, though of course the `eachindex` introduces an additional check that the array shapes are compatible. Even this is still ok-ish:

```
# Noinline here just for illustration, think some big internal function that is
# not inlineable.
@noinline function sum_partial(A::Vector{Int64}, upto::Int)
    a = zero(eltype(A))
    for i in 1:upto
        a += A[i]
    end
    return a
end
sum3(A::Vector{Int64}) = sum_partial(A, length(A))
```

The generated code here runs through the loop. LLVM's vectorizer has support for bounds-like checks and can remove them from the vector body. However, in scalar code (e.g. Float64 without fastmath), LLVM does still perform the bounds check in every loop iteration rather than hoisting it outside. This is probably a bug somewhere. The IRCE pass is supposed to take care of this, so for the present purposes, let's assume that this will also go through eventually.

That said, the takeaway here is that, for simple cases where everything is inlined, LLVM is reasonably powerful at eliminating these checks.

### The effect-separation trick

Let's consider a case like this:

```
function repeat_outer(a::AbstractMatrix, (m,n)::NTuple{2, Any})
    o, p = size(a,1), size(a,2)
    b = similar(a, o*m, p*n)
    for j=1:n
        d = (j-1)*p+1
        R = d:d+p-1
        for i=1:m
            c = (i-1)*o+1
            b[c:c+o-1, R] = a
        end
    end
    return b
end
```

The `setindex!` call in here goes through a few layers of dispatch, but eventually ends up at:

```
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    @boundscheck checkbounds(A, I...)
    _unsafe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
```

This is a very common pattern in our code, where we have an `@inline` function that first checks for the in-boundedness and then performs an unsafe operation that assumes everything is inbounds.

This pattern is quite good. By allowing the boundscheck itself to be inlined, LLVM can eliminate the boundscheck using local context from the calling function (in theory at least - in practice we still have an `@inbounds` there because LLVM isn't strong enough currently). However, the pattern is somewhat hard to use consistently and correctly, so we generally only use it in the low-level indexing code.

# Effect-precondition synthesis

## `@inbounds` motivations

In the previous section I said that the effect separation was good but hard to do consistently and correctly. So can we have the compiler implement this for us? Suppose we had a macro `@hoist_boundschecks` that worked something like the following:

```
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    @hoist_boundschecks _safe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
#= Expands to =#
function _setindex!(l::IndexStyle, A::AbstractArray, x, I::Union{Real, AbstractArray}...)
    @inline
    #= All the boundschecks from _safe_setindex! hoisted =#
    checkbounds(A, I...)
    #= Switched to unsafe_setindex, but now proven correct =#
    _unsafe_setindex!(l, _maybe_reshape(l, A, I...), x, I...)
    A
end
```

Where essentially what the macro would do is go through the `_safe_setindex!` call and synthesize a check that ensures that all calls are inbounds. Of course such a synthesis is not always possible in general, but it's not that hard for simple cases (a trivial way to do it is to duplicate the function, delete all the side effects other than throwing BoundsErrors, and then let LLVM go to town on it).
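To make that "duplicate and strip" idea concrete, here is a hand-written sketch of what such a synthesized predicate might look like for the `sum_partial` example above (the name `sum_partial_nothrow_precondition` is invented for illustration; nothing in this PR emits exactly this code):

```
# Hand-written stand-in for the synthesized check: keep only the control flow
# that can reach a BoundsError and report whether any check would fire.
function sum_partial_nothrow_precondition(A::Vector{Int64}, upto::Int)
    for i in 1:upto
        checkbounds(Bool, A, i) || return false
    end
    return true
end
# Once LLVM optimizes this side-effect-free loop, it collapses to the single
# comparison `upto <= length(A)` (vacuously true for an empty range), which is
# exactly the check we want to hoist.
```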
## Generalizing

There are a few problems with the macro as suggested in the previous section. In particular, it changes the ordering of side effects and errors in the function, which may be desirable, but which I would like to have explicit control over. My proposal is to have a slight generalization of it that works roughly as follows:

```
function foo(A, I)
    @split_effects :nothrow bar(A, I)
end

#= Expands to =#
function foo(A, I)
    if precondition(Val{:nothrow}(), bar, A, I)
        @assume_effects :nothrow bar(A, I)
    else
        bar(A, I)
    end
end
```

Where `precondition`, like the proposed macro above, is some synthesized predicate that ensures that the resulting function does not throw. Of course this brings up many questions, such as the implementation of `precondition` and the statement-position `@assume_effects`, which we currently don't have, but let's explore the implications of this a bit further.

## Precondition inference and composition

The question of coming up with these kinds of preconditions is a well-known problem in the compiler literature. The term to look for is "weakest precondition synthesis". How exactly to do this best is somewhat outside the scope of this writeup, but I do want to highlight the primary property that suggests the idea: composition of functions is boolean composition of preconditions:

```
function foo()
    bar()
    baz()
end
precondition(n::Val{:nothrow}, foo) = precondition(n, bar) && precondition(n, baz)
```

Also, if a function uses the `@split_effects` macro internally, then an outside `@assume_effects` can cause the precondition to be assumed to be true.
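For illustration, a sketch of that interaction using the hypothetical macros from above (`inner`, `outer`, and `bar` are placeholder names):

```
function inner(A, I)
    # Expands to a runtime test of the synthesized precondition, dispatching
    # to a :nothrow-specialized copy of `bar` when the test passes.
    @split_effects :nothrow bar(A, I)
end

function outer(A, I)
    # The caller asserts :nothrow for the whole call. Under that assumption,
    # the compiler may treat `inner`'s precondition as true and drop both the
    # runtime check and the throwing fallback.
    @assume_effects :nothrow inner(A, I)
end
```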
This mirrors the `@inbounds`/`@boundscheck` interaction we have right now, but with an alternative that is provably safe.

# Extensions to `@assume_effects`

So far, we have assumed that these preconditions are synthesized automatically, but getting this to work well of course depends on the synthesis strength of the compiler. To still allow users to take advantage of this mechanism, even if the compiler's synthesis is not strong enough, we can extend `@assume_effects` to allow conditional effects:

```
@assume_effects (checkbounds(false, A, I) && :nothrow) function setindex!(A, I)
    _safe_setindex!(A, I)
end
```

The idea here is that the precondition is overridden by `checkbounds(false, A, I)`, so any `@split_effects` of this function would use the `checkbounds` function for its check, and if this returned true, could `@assume_effects :nothrow` this function, which as described above would allow the use of the unsafe function in the interior.

## Call-site `@assume_effects` and effects-assumption specialization

In the foregoing we have, in a few places, used a call-site `@assume_effects` without defining what it does. The idea here is pretty simple: we add a new specialization axis based on effects assumption. For example, if a function is `@assume_effects :nothrow`, then at codegen time, any path that throws an error (in particular bounds checks) becomes undefined behavior and LLVM is allowed to assume that it does not happen. Of course, this macro is extremely dangerous (as the existing `@assume_effects` and `@inbounds` are). However, one significant benefit of this design is that there is a very clear notion of what the assertion promises. This is not necessarily clear for `@inbounds`, which we did not give semantics beyond its effect on the `@boundscheck` macro. As a result, even an infinitely-powerful prover could not check the correctness or non-correctness of `@inbounds` (as currently used - we could of course consider an alternative `@inbounds` design with stronger semantics). In contrast, the formal meaning of a conditional `@assume_effects` is well defined and could in principle be checked by a tool (#49273).
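As a sketch of how a caller might discharge such a conditional effect by hand (hypothetical surface syntax; `getmid` is an invented example and `checkbounds(Bool, ...)` serves as the runtime check):

```
function getmid(A::Vector{Float64})
    i = (length(A) + 1) ÷ 2
    if checkbounds(Bool, A, i)
        # The precondition is established, so the BoundsError path is asserted
        # unreachable and codegen may use the :nothrow specialization.
        return @assume_effects :nothrow A[i]
    else
        throw(BoundsError(A, i))
    end
end
```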
# Implications for `--check-bounds`

In the whole `--check-bounds` removal discussion, we had assumed that we did not want to keep two copies of all code just for the purposes of `--check-bounds`, which thus required us to disable constant propagation. However, in this proposal, the `@assume_effects` specialization is exactly such a copy set and could thus be used for this purpose. That said, this proposal would also hopefully provide a much stronger system for boundscheck removal that would allow us to make `--check-bounds` much less necessary.

# Other uses

There are a few other places where domain checks can interfere with optimization. For example, consider the following situation:

```
function sin_loop(n)
    for i = 1:n
        # Imagine there was a whole bunch of other code here that used this value,
        # but all that got DCE'd, but we can't in general DCE `sin`, because it
        # may throw.
        sin(Float64(i))
    end
end
```

```
julia> @time sin_loop(1_000_000_000)
 20.383584 seconds
```

With the support in this PR, we can:

```
# The actual condition here is `!isinf`, but we're allowed to overapproximate and
# LLVM can handle `isfinite` better.
# TODO: In a more complete version of this PR, this precondition would be synthesized
@Base.assume_effects (isfinite(x) && :nothrow) @noinline function mysin(x::Float64)
    sin(x)
end

function sin_loop_split(n)
    for i = 1:n
        Core.invoke_split_effects(:nothrow, mysin, Float64(i))
    end
end
```

```
julia> @code_llvm sin_loop_split(1_000_000_000)
;  @ REPL[19]:1 within `sin_loop_split`
define void @julia_sin_loop_split_428(i64 signext %"n::Int64") #0 {
top:
;  @ REPL[19]:4 within `sin_loop_split`
  ret void
}

julia> @time sin_loop_split(1_000_000_000)
  0.000000 seconds
```

# Current status of this PR

This PR contains a complete sketch of this mechanism, including inference support for the new intrinsic, as well as codegen and runtime support for effect-assumption specialization. It also includes an (extremely incomplete) sketch of the supporting macros. It does not implement any precondition synthesis logic. My plan is to support synthesis for some of the simple `@inbounds` cases in Base, but then delegate fancier synthesis support to packages, since that can get arbitrarily complicated.

# Implementation Plan

This PR itself is not suitable for merging, but if people like this direction, I would like to get the basic pieces in place in relatively short order. To that end, I would suggest the following order of implementation as separate PRs once we've agreed on the overall plan:

1. New intrinsics, Method(Instance) fields, inference support
2. `@assume_effects` extensions
3. Codegen and specialization support, Code(Instance) fields
4. Basic synthesis and `@inbounds` removal

[1] Introduced in 66ab577 for Julia 0.2
[2] Note that it's usually not the boundschecking itself that is the problem, but rather that the presence of the boundschecking inhibits other optimizations.
[3] E.g. https://yuri.is/not-julia/
Thank you for the concrete proposal! You mention that any incorrect use of…
They're not.
Yes, which is why it's a big problem when people get it wrong. The hope is that we could leverage tooling from the formal-methods world to figure out when people got the annotations wrong and tell them about it (potentially by using significant amounts of compute, in excess of what we would consider reasonable for the base compiler; see #49273). Also, just to be clear, the hope here is that the automatic synthesis in Base would be sufficient to cover 90% of current uses of `@inbounds`…
How good are the chances that such formal-methods proving techniques can distinguish cases of "the compiler is unable to prove it, but the use of the macro is sound" from "the compiler has proven the contraposition"? I.e., what's our target ratio of false positives? To be clear, I'm personally in the camp of "have the compiler scream loudly at me" whenever it can notice that it thinks I've done something wrong (because the intention is usually to get some static guarantee that the code is relying on for correctness), but then again, I also don't tend to use `@inbounds`…
Thank you for such a well-thought-out proposal. Even if I'm not qualified to understand all the implications of this proposal, it's clear that a macro like `@assume_effects` is useful. However, I'm a bit worried that the macro is even harder to use correctly than `@inbounds`. Hence, I'm a bit worried that if the motivation for this new macro is to make user code more robust by recommending the new macro over `@inbounds`…
Yes, it is unquestionably harder to use, because it is a stronger assertion.
Well, perhaps. The conditional `@assume_effects`… In particular, the replacement for call-site `@inbounds`…
The hope is that the…
With… That aside, I do like the proposal. Since you mention the general literature on preconditions, how does this compare to having generalized precondition and postcondition guards (e.g. pre- and post-dispatch function-call MethodTables), with explicit options to hoist/reorder those? Lastly, what prevents inference of effects from becoming undecidable when you allow the user to hoist effects? For example, with code like this:
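(The snippet in this comment did not survive extraction; the following is a representative reconstruction of the kind of code in question. `mygetindex` and `check_count` are invented names, and the counter stands in for whatever side effect the original had.)

```
check_count = 0  # some observable global state

function mygetindex(A::Vector{Float64}, i::Int)
    @boundscheck begin
        global check_count += 1   # side effect inside the boundscheck block
        checkbounds(A, i)
    end
    return @inbounds A[i]
end
```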
Is it permitted to hoist side-effects from the boundscheck in…
The comment on…
The question is what those preconditions guard. In a static system, you would prevent people from calling functions that don't have their preconditions satisfied. You could imagine a system where the preconditions/postconditions are semantically enforced dynamically and called every time you enter or leave a function, but that would be a pretty different feature. The hope here is that the majority of preconditions would be compiler-enforced.
Preconditions are required to be…
This changes the canonical source of truth for va handling from `Method` to `CodeInfo`. There are multiple goals for this change:

1. This addresses a longstanding complaint about the way that CodeInfo-returning generated functions work. Previously, the va-ness or not of the returned CodeInfo always had to match that of the generator. For Cassette-like transforms that generally have one big generator function that is varargs (while then looking up lowered code that is not varargs), this could become quite annoying. It's possible to work around, but there is really no good reason to tie the two together. As we observed when we implemented OpaqueClosures, the vararg-ness of the signature and the `vararg arguments`->`tuple` transformation are mostly independent concepts. With this PR, generated functions can return CodeInfos with whatever combination of nargs/isva is convenient.
2. This change requires clarifying where the va processing boundary is in inference. #54076 was already moving in that direction for irinterp, and this essentially does much of the same for regular inference. As a consequence, the constprop cache is now using non-va-cooked signatures, which I think is preferable.
3. This further decouples codegen from the presence of a `Method` (which is already not assumed, since the code being generated could be a toplevel thunk, but some codegen features are only available to things that come from Methods). There are a number of upcoming features that will require codegen of things that are not quite method specializations (see design doc linked in #52797 and things like #50641). This helps pave the road for that.
4. I've previously considered expanding the kinds of vararg signatures that can be described (see e.g. #53851), which also requires a decoupling of the signature and AST notions of vararg. This again lays the groundwork for that, although I have no immediate plans to implement this change.

Impact-wise, this adds an internal field, which is not too breaking, but downstream clients vary in how they construct their `CodeInfo`s and the current way they're doing it will likely be incorrect after this change, so they will require a small two-line adjustment. We should perhaps consider pulling out some of the more common patterns into a more stable package, since this interface has changed in most of the last few releases, but that's a separate issue.
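A minimal sketch of what that two-line adjustment presumably looks like, assuming the new `nargs`/`isva` fields on `CodeInfo` (the surrounding generator is schematic, not part of the quoted change):

```
# Inside a Cassette-style generator that returns a CodeInfo `ci`
# for a signature like (f, x, ys...):
ci.nargs = 3     # slots for f, x, and the vararg slot, counting the function itself
ci.isva = true   # the final slot collects trailing arguments as a tuple
return ci
```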