-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve the interplay between bounds checking system and effect system #50167
base: master
Are you sure you want to change the base?
Conversation
The test for this is not wrong, but this claim is. The function getindex(arr::MyArray, idx::Int)
@boundscheck (boundscheck(arr.idsx, idx) && @inbounds(arr[idx.idxs] !== missing)) || throw_boundserror(arr, idx)
return @inbounds arr.data[arr.idxs[idx]]
end |
How do we deal with the "foo" problem from the current state 1 in this change? |
Thanks for pointing this out. After double-checking with https://docs.julialang.org/en/v1/devdocs/boundscheck/#Eliding-bounds-checks, I just agree with you that my claim was incorrect. I'll go ahead and remove that part from the PR.
With this adjustment, |
b07cd82
to
7d6579e
Compare
In the current state of the Julia compiler, bounds checking and its related optimization code, such as `@boundscheck` and `@inbounds`, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where the application of `@inbounds` annotations, which are intended to optimize performance, instead obstruct the program's optimization, thereby preventing us from achieving optimal performance. This PR is designed to resolve this situation. It aims to enhance the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have `@inbounds` annotations. In the following, I'll first explain the reasons that have led to this situation for better understanding, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. It's necessary that we incorporate all of them simultaneously to enhance the situation without introducing any regressions. \## Core of the Problem There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult: 1. The evaluation value of `Expr(:boundscheck)` is influenced by the `@inbounds` macro and the `--check-bounds` flag. Hence, when performing a concrete evaluation of a method containing `Expr(:boundscheck)`, it's crucial to respect the `@inbounds` macro context and the `--check-bounds` settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution. 1. If the code, from which bounds checking has been removed due to `@inbounds` or `--check-bounds=no`, is unsafe, it may lead to undefined behavior due to uncertain memory access. \## Current State The current Julia compiler handles these two problems as follows: \### Current State 1 Regarding the first problem, if a code or method call containing `Expr(:boundscheck)` is within an `@inbounds` context, a concrete evaluation is immediately prohibited. For instance, in the following case, when analyzing `bar()`, if you simply perform concrete evaluation of `foo()`, it wouldn't properly respect the `@inbounds` context present in `bar()`. However, since the concrete evaluation of `foo()` is prohibited, it doesn't pose an issue: ```julia foo() = (r = 0; @BoundsCheck r += 1; return r) bar() = @inbounds foo() ``` Conversely, in the following case, there is _no need_ to prohibit the concrete evaluation of `A1_inbounds` due to the presence of `@inbounds`. This is because ~~the execution of the `@boundscheck` block is determined by the presence of local `@inbounds`~~ `Expr(:boundscheck)` within a local `@inbounds` context does not need to block concrete evaluation: ```julia function A1_inbounds() r = 0 @inbounds begin @BoundsCheck r += 1 end return r end ``` However, currently, we prohibit the concrete evaluation of such code as well. ~~Moreover, we are not handling such local `@inbounds` contexts effectively, which results in incorrect execution of `A1_inbounds()` (even our test is incorrect for this example: `https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34`)~~ EDIT: It is an expected behavior as pointed out by Jameson. Furthermore, there is room for improvement when the `--check-bounds` flag is specified. Specifically, when the `--check-bounds` flag is set, the evaluation value of `Expr(:boundscheck)` is determined irrespective of the `@inbounds` context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value of `Expr(:boundscheck)`. \### Current State 2 Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the `--check-bounds=no` flag is set, which could lead to bounds checking being removed always. For instance, if you perform concrete evaluation for the function call `baz((1,2,3), 4)` in the following example, it may return a value accessed from illicit memory and introduce undefined behaviors into the program: ```julia baz(t::Tuple, i::Int) = @inbounds t[i] baz((1,2,3), 4) ``` However, it's evident that the above code is incorrect and unsafe program and I believe undefined behavior in such programs is deemed, as explicitly stated in the `@inbounds` documentation: > │ Warning > │ > │ Using @inbounds may return incorrect results/crashes/corruption for > │ out-of-bounds indices. The user is responsible for checking it > │ manually. Only use @inbounds when it is certain from the information > │ locally available that all accesses are in bounds. Actually, the `@inbounds` macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I opine that it would be more reasonable to utilize it to alleviate potential errors due to bounds checking within `@inbounds` contexts. To bring up another associated concern, in the current compiler implementation, the `:nothrow` modelings for `getfield`/`arrayref`/`arrayset` is a bit risky, and `:nothrow`-ness is assumed when their bounds checking is turned off by call argument. If our intended direction aligns with the removal of bounds checking based on `@inbounds` as proposed in issue #48245, then assuming `:nothrow`-ness due to `@inbounds` seems reasonable. However, presuming `:nothrow`-ness due to bounds checking argument or the `--check-bounds` flag appears to be risky, especially considering it's not documented. \## This Commit This commit implements all proposed improvements against the current issues as mentioned above. In summary, the enhancements include: - making `Expr(:boundscheck)` within a local `@inbounds` context not block concrete evaluation - folding out `Expr(:boundscheck)` when the `--check-bounds` flag is set (and allow concrete evaluation) - changing the `:nothrow` effect bit to `UInt8` type, and refining `:nothrow` information when in an `@inbounds` context - removing dangerous assumptions of `:nothrow`-ness for built-in functions when bounds checking is turned off - replacing the `@_safeindex` hack with `@inbounds`
7d6579e
to
538a511
Compare
The real problem here is that the compiler may make up values at function boundaries. In order for inbounds to be safe in the presence of concrete evaluation, it needs to be true for all values the compiler could possibly make up. For example, consider the following:
I think this use of inbounds is reasonable, and yet it will blow up when compiling |
I don't understand this point. At the very least, this breaks :consistent, which is a requisite for concrete-eval. |
Isn't that already a bug introduced into the user code with use of that command line flag, and not unique to this though? |
Considering @Keno's example, can't we just unconditionally enable bounds-checking during concrete-eval? If it then hits a bounds error, it can just synthesize throwing that. It the error is never hit at runtime, no-one will notice. If it is hit, we're in UB land, and throwing an exception is allowed to happen just as anything else. A niche problem would be bounds checks with side effects; but I think if the semantics of |
Yes, that has always been an option, but we don't really have the capability to do this, because we'd need to compile two versions of every function, which even if we're ok with the perf overhead, the system isn't set up to do. |
It also is not currently considered a bug for boundschecking to have other side effects, which might be removed by inbounds. That should block concrete eval anyways though. The main issue is divergent results. The semantics might suggest it may be okay to unconditionally run the code with boundschecking on, even if that means the result might not be consistent? (Of course, we can't do that if the user passes the flag, since that flag forces execution to be inconsistent then) |
This PR introduces two main modifications: 1.) enhances the analysis of divergent results (inconsistencies) due to the use of @eval code_typed() do
index_into_array($(Expr(:new, LessThan5, 6)))
end |
The
Given the cache is already divided because of the |
`Core._svec_ref` has accepted `boundscheck`-value as the first argument since it was added in #45062. Nonetheless, `Core._svec_ref` simply calls `jl_svec_ref` in either the interpreter or the codegen, and thus the `boundscheck` value isn't utilized in any optimizations. Rather, even worse, this `boundscheck`-argument negatively influences the effect analysis (xref #50167 for details) and has caused type inference regressions as reported in #50544. For these reasons, this commit simply eliminates the `boundscheck` argument from `Core._svec_ref`. Consequently, `getindex(::SimpleVector, ::Int)` is now being concrete-eval eligible. closes #50544
`Core._svec_ref` has accepted `boundscheck`-value as the first argument since it was added in #45062. Nonetheless, `Core._svec_ref` simply calls `jl_svec_ref` in either the interpreter or the codegen, and thus the `boundscheck` value isn't utilized in any optimizations. Rather, even worse, this `boundscheck`-argument negatively influences the effect analysis (xref #50167 for details) and has caused type inference regressions as reported in #50544. For these reasons, this commit simply eliminates the `boundscheck` argument from `Core._svec_ref`. Consequently, `getindex(::SimpleVector, ::Int)` is now being concrete-eval eligible. closes #50544
`Core._svec_ref` has accepted `boundscheck`-value as the first argument since it was added in #45062. Nonetheless, `Core._svec_ref` simply calls `jl_svec_ref` in either the interpreter or the codegen, and thus the `boundscheck` value isn't utilized in any optimizations. Rather, even worse, this `boundscheck`-argument negatively influences the effect analysis (xref #50167 for details) and has caused type inference regressions as reported in #50544. For these reasons, this commit simply eliminates the `boundscheck` argument from `Core._svec_ref`. Consequently, `getindex(::SimpleVector, ::Int)` is now being concrete-eval eligible. closes #50544
`Core._svec_ref` has accepted `boundscheck`-value as the first argument since it was added in #45062. Nonetheless, `Core._svec_ref` simply calls `jl_svec_ref` in either the interpreter or the codegen, and thus the `boundscheck` value isn't utilized in any optimizations. Rather, even worse, this `boundscheck`-argument negatively influences the effect analysis (xref #50167 for details) and has caused type inference regressions as reported in #50544. For these reasons, this commit simply eliminates the `boundscheck` argument from `Core._svec_ref`. Consequently, `getindex(::SimpleVector, ::Int)` is now being concrete-eval eligible. closes #50544
`Core._svec_ref` has accepted `boundscheck`-value as the first argument since it was added in #45062. Nonetheless, `Core._svec_ref` simply calls `jl_svec_ref` in either the interpreter or the codegen, and thus the `boundscheck` value isn't utilized in any optimizations. Rather, even worse, this `boundscheck`-argument negatively influences the effect analysis (xref #50167 for details) and has caused type inference regressions as reported in #50544. For these reasons, this commit simply eliminates the `boundscheck` argument from `Core._svec_ref`. Consequently, `getindex(::SimpleVector, ::Int)` is now being concrete-eval eligible. closes #50544
In the current state of the Julia compiler, bounds checking and its related optimization code, such as
@boundscheck
and@inbounds
, pose a significant handicap for effect analysis. As a result, we're encountering an ironic situation where the application of@inbounds
annotations, which are intended to optimize performance, instead obstruct the program's optimization, thereby preventing us from achieving optimal performance.This PR is designed to resolve this situation. It aims to enhance the relationship between bounds checking and effect analysis, thereby correctly improving the performance of programs that have
@inbounds
annotations.In the following, I'll first explain the reasons that have led to this situation for better understanding, and then I'll present potential improvements to address these issues. This commit is a collection of various improvement proposals. It's necessary that we incorporate all of them simultaneously to enhance the situation without introducing any regressions.
Core of the Problem
There are fundamentally two reasons why effect analysis of code containing bounds checking is difficult:
Expr(:boundscheck)
is influenced by the@inbounds
macro and the--check-bounds
flag. Hence, when performing a concrete evaluation of a method containingExpr(:boundscheck)
, it's crucial to respect the@inbounds
macro context and the--check-bounds
settings, ensuring the method's behavior is consistent across the compile time concrete evaluation and the runtime execution.@inbounds
or--check-bounds=no
, is unsafe, it may lead to undefined behavior due to uncertain memory access.Current State
The current Julia compiler handles these two problems as follows:
Current State 1
Regarding the first problem, if a code or method call containing
Expr(:boundscheck)
is within an@inbounds
context, a concrete evaluation is immediately prohibited. For instance, in the following case, when analyzingbar()
, if you simply perform concrete evaluation offoo()
, it wouldn't properly respect the@inbounds
context present inbar()
. However, since the concrete evaluation offoo()
is prohibited, it doesn't pose an issue:Conversely, in the following case, there is no need to prohibit the concrete evaluation of
A1_inbounds
due to the presence of@inbounds
. This is becausethe execution of the@boundscheck
block is determined by the presence of local@inbounds
Expr(:boundscheck)
within a local@inbounds
context does not need to block concrete evaluation:However, currently, we prohibit the concrete evaluation of such code as
well.
Moreover, we are not handling such local
@inbounds
contextseffectively, which results in incorrect execution of
A1_inbounds()
(even our test is incorrect for this example:
https://github.com/JuliaLang/julia/blob/834aad4ab409f4ba65cbed2963b9ab6fa2770354/test/boundscheck_exec.jl#L34
)EDIT It is an expected behavior as pointed out by Jameson.
Furthermore, there is room for improvement when the
--check-bounds
flag is specified. Specifically, when the--check-bounds
flag is set, the evaluation value ofExpr(:boundscheck)
is determined irrespective of the@inbounds
context. Hence, there is no need to prohibit concrete evaluations due to inconsistency in the evaluation value ofExpr(:boundscheck)
.Current State 2
Next, we've ensured that concrete evaluation isn't performed when there's potentially unsafe code that may have bounds checking removed, or when the
--check-bounds=no
flag is set, which could lead to bounds checking being removed always.For instance, if you perform concrete evaluation for the function call
baz((1,2,3), 4)
in the following example, it may return a value accessed from illicit memory and introduce undefined behaviors into the program:However, it's evident that the above code is incorrect and unsafe program and I believe undefined behavior in such programs is deemed, as explicitly stated in the
@inbounds
documentation:Actually, the
@inbounds
macro is primarily an annotation to "improve performance by removing bounds checks from safe programs". Therefore, I opine that it would be more reasonable to utilize it to alleviate potential errors due to bounds checking within@inbounds
contexts.To bring up another associated concern, in the current compiler implementation, the
:nothrow
modelings forgetfield
/arrayref
/arrayset
is a bit risky, and:nothrow
-ness is assumed when their bounds checking is turned off by call argument.If our intended direction aligns with the removal of bounds checking based on
@inbounds
as proposed in issue #48245, then assuming:nothrow
-ness due to@inbounds
seems reasonable. However, presuming:nothrow
-ness due to bounds checking argument or the--check-bounds
flag appears to be risky, especially considering it's not documented.This Commit
This commit implements all proposed improvements against the current issues as mentioned above. In summary, the enhancements include:
@inbounds
contextExpr(:boundscheck)
when the--check-bounds
flag is set (and allow concrete evaluation):nothrow
effect bit toUInt8
type, and refining:nothrow
information when in an@inbounds
context:nothrow
-ness for built-in functions when bounds checking is turned off@_safeindex
hack with@inbounds