-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added some inbounds annotations in array.jl #24901
Conversation
base/array.jl
Outdated
@@ -1532,7 +1532,7 @@ julia> findnext(isodd, A, 2) | |||
function findnext(testf::Function, A, start::Integer) | |||
l = endof(A) | |||
i = start | |||
while i <= l | |||
@inbounds while i <= l |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will apply @inbounds
to the user-provided testf
as well.
There are also a few cases like this below.
For the Base it's better to use @inbounds
really "atomically".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will apply @inbounds to the user-provided testf as well.
Is this bad?
For the Base it's better to use @inbounds really "atomically".
Ok, I was not aware of this coding standard (I thought the code looks nicer if we pull out the inbounds as far as possible, instead of replicating it inside).
I'll change to comply. For cases a[i] = b[j]
, is there a preferred way of annotating inbounds separately for the getindex and setindex! ? The obvious way is to write bj=b[j];a[i]=bj
and then separately annotate it; but this looks somewhat ugly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this bad?
It's quite bad, because you are applying @inbounds
to an unknown code. E.g. if it's fresh untested code, it'll crash Julia session.
I thought the code looks nicer if we pull out the inbounds as far as possible
Then it's easy to overlook. If the code within the loop gets modified, the @inbounds
might no longer apply safely.
"Fine-grained" @inbounds
might look pedantic in the user code, but Base has higher standards.
For cases a[i] = b[j], is there a preferred way of annotating inbounds separately for the getindex and setindex! ?
If from the context it follows that both a[i]
and b[j]
accesses are safe, you can just do @inbounds a[i] = b[j]
, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See discussion at #24335.
base/array.jl
Outdated
@@ -1749,7 +1749,7 @@ function find(A) | |||
cnt = 1 | |||
inds = _index_remapper(A) | |||
warned = false | |||
for (i,a) in enumerate(A) | |||
for (i,a) in enumerate(A) #not inbounds in mulithreaded if A is concurrently mutated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you mean by the comment exactly?
A
is read-only here (well, I
is not), and the loop is not multi-threaded IIUC.
If any other thread modifies A
it can lead to any kind of weird effects in any Base and non-Base function anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can safely corrupt memory if users decide to concurrently resize arrays (unless someone wants to go over array.c and check every single function).
On the other hand, if one thread writes to an array (increasing nnz) while another runs find, then we can return nonsense or throw an exception, but are not supposed to corrupt memory. Well, that is the previous behavior. Being conservative, I thought it better not to introduce security vulns in existing buggy-but-unexploitable programs, especially in my first PR to base, and for rather small performance gains.
Since it was not entirely obvious which accesses are inbounds in multithreaded, and I already spent the effort of thinking about it, I thought to better document it.
Or what is the intended behavior? Do you think memory-corrupting in this case is OK?
In practice, the two relevant changes are for pop! and push!; an always-predicted branch is not very expensive compared to the ccall, but it clears up the code_native / code_llvm (for people who like to see what their julia code actually produces), and push!/pop! are used quite extensively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I'm missing is why you think @inbounds while i <= l
above is ok and this enumerate(A)
is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The enumerate is inbounds, the later write I[cnt] = inds[i]
has an inbounds getindex and a possibly out-of-bounds setindex!, if cnt>nnzA (because some other thread mutated A).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the clarification. Personally, I would not add the "concurrent mutation" comment here (and in the other cases below):
- as you see from our conversation, it's not explicit enough: it's not obvious that you are concerned about
I[cnt]
inbounds access (so it's rather should go to that line) - annotating for-loops with inbounds in Base code is a dangerous thing anyway
- concurrent mutation of
A
is possible, but it's quite exotic scenario. It's a nice thing to take into account from security perspective, but we need some core developers opinion whether the code should be systematically annotated with these comments.
OTOH, you can still use @inbounds Aij = A[i,j]
in the loops below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I placed the comments better now. If you think they are superfluous or misleading (because similar comments are possibly missing at other places), or that memory-corrupting in this case is admissible, feel free to modify my commit. I would not have commented / modified if the original version had been missing the bounds-check there, only raised an eyebrow. As I said, this PR tries to be conservative.
A less conservative modifiation in iterators.jl would be to make the getindex in enumerate(A::AbstractArray) inbounds. If the array is length-modified during the lifetime of the iterator, then we produce garbage anyway; the only question is whether we are willing to segfault / infoleak (oob reads typically don't lead to code exec, but this depends on the context and exploitable situations are possible). As you said, that's for some core dev to decide.
I think each of these needs a performance benchmark showing a significant (>25%) improvement, or we shouldn't do that one. |
@@ -1533,7 +1533,8 @@ function findnext(testf::Function, A, start::Integer) | |||
l = endof(A) | |||
i = start | |||
while i <= l | |||
if testf(A[i]) | |||
@inbounds ai = A[i] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this isn't inbounds on the first iteration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i < 1 kills us? Sorry, will revert.
base/array.jl
Outdated
I[cnt] = i | ||
@inbounds aij = A[i,j] | ||
if aij != 0 | ||
I[cnt] = i #not inbounds in mulithreaded if A is concurrently mutated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"multithreaded"
Re performance: The effect of inbounds on performance is mostly negligible (always predicted branch). I think the most relevant benchmarkeable effect is on codesize; a predicted branch is so much cheaper than a ccall. Here I am again not so much worried about instruction-cache / decoding, but rather about legibility of the resulting code_llvm and code_native. Due to the mostly negative reactions, discussions, and me making mistakes, I'd propose a compromise: Just stick to the four most important and most obvious @inbounds (shift!, unshift!, pop!, push!)? |
Regarding legibility of the generated code. I rather we add some sort of filtering to the generated code than putting Do you have any benchmarks for the |
Sure, but take them with a mountain of salt (the variability is gigantic, the test is unrealistic, the functions are usually inlined and everything depends on context). But remember: push!/pop! are used in a lot of places, both in base and in user-code, and are typically inlined. The code looks so weird because I need a lot of calls in order to get the timing at least semi-stable, and sequentially writing a big percentage of my memory rather tests my OS at swapping.
yielding
If you squint long enough you could guess at 1% improvement, tops, but realistically irrelevant (how would you test this reliably? Normally you just measure allocator state and background noise). If you look at the code_native of push_ib! for a vector of Int, you see:
whereas push_nib!:
push! is inline and used often, hence we get a lot of bounds-check code emitted. The offending setindex! is literally one code-line below the ccall that makes it inbounds. |
As far as I gathered concurrent resizes of arrays are forbidden, but concurrent modifications of values are allowed.
I added some obvious inbounds, and commented why some others cannot be safely added.