findall has strange behavior for impure predicates on Bool Arrays #46425
Comments
Potential candidate to tag for 1.10?
Oops! This is broken on 1.9 and not on 1.8, so we probably should have put it on the 1.9 milestone and held back the release until it was fixed. Thanks for the bump.
In commit 4c4c94f, findall(f, A::AbstractArray{Bool}) was optimised by traversing A twice: once to count the number of true elements and once to fill in the resulting vector. However, this could cause problems for arbitrary functions f: for slow f, the approach is ~2x slower; for impure f, calling f twice could cause side effects and strange issues (see issue JuliaLang#46425). With this commit, the optimised version is only dispatched to when f is ! or xor.
In commit 4c4c94f, findall(f, A::AbstractArray{Bool}) was optimised by traversing A twice: once to count the number of true elements and once to fill in the resulting vector. However, this could cause problems for arbitrary functions f: for slow f, the approach is ~2x slower; for impure f, calling f twice could cause side effects and strange issues (see issue JuliaLang#46425). With this commit, the optimised version is only dispatched to when f is ! or identity.
#42202 should be reverted since it has an invalid use of |
To be clear, pure here means |
Note that querying the compiler for effects is permissible to deal with these sorts of cases if the performance is important.
Not returning `3` is bad. Returning `4` could lead to segfaults and is definitely a bug.

We have two implementations of `findall(f::Function, a::AbstractArray)`:

1. Compute `f.(a)` and then get indices from the `BitArray` (julia/base/array.jl, line 2328 in aac466f).
2. Compute `count(f, a)`, preallocate an index array, and iterate through a second time, recomputing `f` on each element (julia/base/array.jl, lines 2375 to 2393 in aac466f).
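
To make the difference concrete, here is a minimal sketch of the two strategies. It is not the code in `base/array.jl`: the names `findall_single_pass`, `findall_two_pass` and `make_alternating` are hypothetical, the two-pass version keeps bounds checks on, and it zero-fills its output purely so the demonstration is deterministic.

```julia
# Strategy 1: call f exactly once per element, then read the indices off the mask.
function findall_single_pass(f, a::AbstractVector)
    mask = Bool[f(x) for x in a]
    return [i for (i, m) in pairs(mask) if m]
end

# Strategy 2: count first, preallocate, then call f again while filling.
# Assumption for this sketch: bounds checks stay on and the buffer is zero-filled.
function findall_two_pass(f, a::AbstractVector)
    n = count(f, a)          # first round of calls to f
    out = zeros(Int, n)
    k = 0
    for (i, x) in pairs(a)
        if f(x)              # second round of calls to f
            k += 1
            out[k] = i       # throws BoundsError if f now says "true" too often
        end
    end
    return out
end

# Hypothetical impure predicate: ignores its argument and alternates its answer.
function make_alternating(first_answer::Bool)
    state = Ref(!first_answer)
    return _ -> (state[] = !state[])
end

a = [true, true, true]

# Pure predicate: both strategies agree.
findall_single_pass(!, [true, false, true])   # [2]
findall_two_pass(!, [true, false, true])      # [2]

# Impure predicate: the single pass is self-consistent, the two-pass version
# leaves a slot unfilled because the counts of the two rounds disagree.
findall_single_pass(make_alternating(true), a)   # [1, 3]
findall_two_pass(make_alternating(true), a)      # [2, 0]
```

With `make_alternating(false)` the second round reports more hits than the first, so the checked version above throws a `BoundsError`. If the bounds checks were elided, the same situation would write past the end of the buffer.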
For simple `f`, 2 is about 1.5x faster according to my rough benchmarks. If the runtime of `f` dominates, then 1 should be 2x faster. If `f` is impure, then 1 behaves how one would expect and 2 can have bizarre consequences.

Ideally, we dispatch to 2 for simple pure `f` and 1 otherwise. Our current heuristic is to dispatch to 2 for `a::AbstractArray{Bool}` and 1 otherwise. This is a bad heuristic. It would be cool to dispatch based on effect analysis, but if that is not an option, my preference is to use `f === identity` as the heuristic (even though this is a performance hit in some cases).

I suspect this was introduced by #42202 (cc @jakobnissen), which fixed #42187.
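
The `f === identity` heuristic could be expressed through dispatch roughly as follows. This is only a sketch under assumptions: `sketch_findall`, `single_pass` and `two_pass` are hypothetical stand-ins, not Base methods, and the real implementation may differ.

```julia
# Hypothetical stand-in for implementation 1: f is called once per element.
single_pass(f, a) = [i for (i, m) in pairs(Bool[f(x) for x in a]) if m]

# Hypothetical stand-in for implementation 2: count, preallocate, fill.
function two_pass(f, a)
    out = Vector{Int}(undef, count(f, a))
    k = 0
    for (i, x) in pairs(a)
        if f(x)
            out[k += 1] = i
        end
    end
    return out
end

# Generic fallback: arbitrary (possibly slow or impure) f takes the single pass.
sketch_findall(f, a::AbstractVector) = single_pass(f, a)

# Only `identity` on a Bool vector takes the two-pass path, where calling the
# predicate twice is known to be harmless.
sketch_findall(::typeof(identity), a::AbstractVector{Bool}) = two_pass(identity, a)

# A predicate with a side effect is now called once per element, even on Bool input.
calls = Ref(0)
counting = x -> (calls[] += 1; x)
sketch_findall(counting, [true, false, true])   # [1, 3]
calls[]                                         # 3
```

The trade-off is the one noted above: simple pure predicates other than `identity` lose the faster path.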