Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trouble with filter(!ismissing, x) #27809

Closed
GunnarFarneback opened this issue Jun 26, 2018 · 11 comments
Closed

trouble with filter(!ismissing, x) #27809

GunnarFarneback opened this issue Jun 26, 2018 · 11 comments
Labels
bug Indicates an unexpected problem or unintended behavior missing data Base.missing and related functionality potential benchmark Could make a good benchmark in BaseBenchmarks

Comments

@GunnarFarneback
Copy link
Contributor

Summary: filter(!ismissing, x) gives wrong results, is exceedingly slow, and occasionally leads to crashes. I would suspect code generation goes wrong.

As an accidental spinoff from #27781 I ran into this, which gives unexpected results and slowness.

   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-beta.36 (2018-06-26 19:43 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit afbd699 (0 days old master)
|__/                   |  x86_64-linux-gnu

julia> y = rand(Float64, 1_000_000); y2 = ifelse.(rand(length(y)) .< 0.9, y, missing);

julia> a = collect(skipmissing(y2));

julia> b = filter(!ismissing, y2);

julia> a == b
missing

julia> sum(skipmissing(a))
450111.2603673207

julia> sum(skipmissing(b))
450101.5954636309

julia> @time collect(skipmissing(y2));
  0.015578 seconds (25 allocations: 9.001 MiB)

julia> @time filter(!ismissing, y2);
  7.770327 seconds (24 allocations: 10.126 MiB, 0.02% gc time)

julia> versioninfo()
Julia Version 0.7.0-beta.36
Commit afbd699 (2018-06-26 19:43 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.0 (ORCJIT, skylake)

Increasing the size from one million to ten million elements increases the running time from 8 seconds to 2700 seconds. Moreover I've had Julia crashing on me twice while experimenting with this, including this complete session:

               _
   _       _ _(_)_     |  A fresh approach to technical computing
  (_)     | (_) (_)    |  Documentation: https://docs.julialang.org
   _ _   _| |_  __ _   |  Type "?" for help, "]?" for Pkg help.
  | | | | | | |/ _` |  |
  | | |_| | | | (_| |  |  Version 0.7.0-beta.36 (2018-06-26 19:43 UTC)
 _/ |\__'_|_|_|\__'_|  |  Commit afbd699 (0 days old master)
|__/                   |  x86_64-linux-gnu

julia> y = rand(Float64, 1_000_000); y2 = ifelse.(rand(length(y)) .< 0.9, y, missing);

julia> 

julia> a = collect(skipmissing(y2));

julia> b = filter(!ismissing, y2);

julia> a == b
Unreachable reached at 0x7f7bc37cff38

signal (4): Illegal instruction
in expression starting at no file:0
iterate at ./array.jl:682 [inlined]
zip_iterate at ./iterators.jl:303 [inlined]
iterate at ./iterators.jl:324 [inlined]
== at ./abstractarray.jl:1679
unknown function (ip: 0x7f7bc37d000c)
jl_fptr_trampoline at /home/gunnar/julia0.7/src/gf.c:1813
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2151
do_call at /home/gunnar/julia0.7/src/interpreter.c:324
eval_value at /home/gunnar/julia0.7/src/interpreter.c:428
eval_stmt_value at /home/gunnar/julia0.7/src/interpreter.c:363 [inlined]
eval_body at /home/gunnar/julia0.7/src/interpreter.c:671
jl_interpret_toplevel_thunk_callback at /home/gunnar/julia0.7/src/interpreter.c:788
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x7f7bc510f78f)
unknown function (ip: 0xffffffffffffffff)
jl_interpret_toplevel_thunk at /home/gunnar/julia0.7/src/interpreter.c:797
jl_toplevel_eval_flex at /home/gunnar/julia0.7/src/toplevel.c:814
jl_toplevel_eval_in at /home/gunnar/julia0.7/src/builtins.c:631
eval at ./boot.jl:319
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2151
eval_user_input at /home/gunnar/julia0.7/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:85
macro expansion at /home/gunnar/julia0.7/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:116 [inlined]
#28 at ./task.jl:257
jl_fptr_trampoline at /home/gunnar/julia0.7/src/gf.c:1813
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2151
jl_apply at /home/gunnar/julia0.7/src/julia.h:1533 [inlined]
start_task at /home/gunnar/julia0.7/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 4333365 (Pool: 4332618; Big: 747); GC: 9
Illegal instruction (core dumped)
@simonbyrne
Copy link
Contributor

To deal with the immediate problem, we could define a special case

Base.filter(::typeof(!ismissing), X::Vector) = collect(skipmissing(X))

@JeffBezanson JeffBezanson added bug Indicates an unexpected problem or unintended behavior performance Must go faster labels Jun 27, 2018
@GunnarFarneback
Copy link
Contributor Author

GunnarFarneback commented Jun 27, 2018

This gives me a fairly reproducible error (not crash), usually in the third iteration, occasionally later.

using Random
srand(13);
y = rand(Float64, 500_000);
y2 = ifelse.(rand(length(y)) .< 0.9, y, missing);
for k = 1:10
    b = filter(!ismissing, y2)
    @show k
    sum(b)
end

The error looks like this:

ERROR: fatal error in type inference (type bound)
Stacktrace:
 [1] macro expansion at ./reduce.jl:84 [inlined]
 [2] macro expansion at ./simdloop.jl:73 [inlined]
 [3] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:193
 [4] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:200
 [5] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:197
 [6] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:200
 [7] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:197
 [8] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:200
 [9] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:197 (repeats 2 times)
 [10] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:200
 [11] mapreduce_impl(::typeof(identity), ::typeof(Base.add_sum), ::Array{Union{Missing, Float64},1}, ::Int64, ::Int64, ::Int64) at ./reduce.jl:197
 [12] mapreduce_impl at ./reduce.jl:207 [inlined]
 [13] _mapreduce(::typeof(identity), ::typeof(Base.add_sum), ::IndexLinear, ::Array{Union{Missing, Float64},1}) at ./reduce.jl:355
 [14] _mapreduce_dim at ./reducedim.jl:280 [inlined]
 [15] #mapreduce#537 at ./reducedim.jl:276 [inlined]
 [16] mapreduce at ./reducedim.jl:276 [inlined]
 [17] _sum at ./reducedim.jl:633 [inlined]
 [18] _sum at ./reducedim.jl:632 [inlined]
 [19] #sum#540 at ./reducedim.jl:628 [inlined]
 [20] sum(::Array{Union{Missing, Float64},1}) at ./reducedim.jl:628
 [21] macro expansion at ./show.jl:549 [inlined]
 [22] top-level scope at ./REPL[5]:3 [inlined]
 [23] top-level scope at ./<missing>:0

@GunnarFarneback
Copy link
Contributor Author

And this minor variation consistently crashes within 2-7 iterations.

using Random
srand(13);
y = rand(Float64, 500_000);
y2 = ifelse.(rand(length(y)) .< 0.9, y, missing);
a = collect(skipmissing(y2));
for k = 1:10
    b = filter(!ismissing, y2)
    @show k
    a == b
end

Basically the same backtrace as before:

Unreachable reached at 0x7f892737b508

signal (4): Illegal instruction
in expression starting at no file:0
iterate at ./array.jl:682 [inlined]
zip_iterate at ./iterators.jl:303 [inlined]
iterate at ./iterators.jl:324 [inlined]
== at ./abstractarray.jl:1679
unknown function (ip: 0x7f892737b5dc)
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2151
macro expansion at ./show.jl:549 [inlined]
top-level scope at ./REPL[6]:3 [inlined]
top-level scope at ./<missing>:0
jl_fptr_trampoline at /home/gunnar/julia0.7/src/gf.c:1813
jl_toplevel_eval_flex at /home/gunnar/julia0.7/src/toplevel.c:808
jl_toplevel_eval_in at /home/gunnar/julia0.7/src/builtins.c:631
eval at ./boot.jl:319
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2151
eval_user_input at /home/gunnar/julia0.7/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:85
macro expansion at /home/gunnar/julia0.7/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:116 [inlined]
#28 at ./task.jl:257
jl_fptr_trampoline at /home/gunnar/julia0.7/src/gf.c:1813
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2151
jl_apply at /home/gunnar/julia0.7/src/julia.h:1533 [inlined]
start_task at /home/gunnar/julia0.7/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 4783965 (Pool: 4783125; Big: 840); GC: 10
Illegal instruction (core dumped)

@nalimilan nalimilan added compiler:inference Type inference missing data Base.missing and related functionality labels Jun 27, 2018
@martinholters
Copy link
Member

julia> b = Vector{Union{Missing, Float64}}()
0-element Array{Union{Missing, Float64},1}

julia> for i in 1:150_000
           push!(b, 0.0)
       end

julia> any(ismissing, b)
true

julia> findfirst(ismissing, b)
131065

Reminds me of #27767, cc @quinnj.

@martinholters
Copy link
Member

This is reproducible here:

julia> b = Vector{Union{Missing, Float64}}()
0-element Array{Union{Missing, Float64},1}

julia> while length(b) < 2^17
           push!(b, 0.0)
       end

julia> any(ismissing, b)
false

julia> push!(b, 0.0);

julia> any(ismissing, b)
true

julia> findall(ismissing, b)
8-element Array{Int64,1}:
 131065
 131066
 131067
 131068
 131069
 131070
 131071
 131072

@quinnj
Copy link
Member

quinnj commented Jun 27, 2018

Taking a look

@JeffBezanson
Copy link
Member

The performance issue is caused by O(n^2) time in memmove at array.c:844.

@vtjnash vtjnash removed performance Must go faster compiler:inference Type inference labels Jun 27, 2018
@vtjnash
Copy link
Member

vtjnash commented Jun 27, 2018

Split off performance improvement to #27825

quinnj added a commit that referenced this issue Jul 2, 2018
…g realloc was causing problems for isbits Union arrays, which have different layout requirements than regular arrays
@GunnarFarneback
Copy link
Contributor Author

GunnarFarneback commented Jul 9, 2018

With

Julia Version 0.7.0-beta.206
Commit b6f9244 (2018-07-08 19:42 UTC)

I can no longer reproduce the Illegal instruction but instead get another ERROR: fatal error in type inference (type bound) from that example.
This still crashes though:

using Random
srand(13);
y = rand(Float64, 500_000);
y2 = ifelse.(rand(length(y)) .< 0.9, y, missing);
a = collect(skipmissing(y2));
for k = 1:10
    global b = filter(!ismissing, y2)
    @show k
    a == b
end
i = findfirst(ismissing, b)
for j = 0:10
    @show i + j
    b[i + j] == 0.0
end

with

signal (11): Segmentation fault
in expression starting at no file:0
jl_typemap_level_assoc_exact at /home/gunnar/julia0.7/src/typemap.c:826
jl_typemap_assoc_exact at /home/gunnar/julia0.7/src/julia_internal.h:882 [inlined]
jl_lookup_generic_ at /home/gunnar/julia0.7/src/gf.c:2110 [inlined]
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2156
macro expansion at ./show.jl:549 [inlined]
top-level scope at ./REPL[10]:2 [inlined]
top-level scope at ./<missing>:0
jl_fptr_trampoline at /home/gunnar/julia0.7/src/gf.c:1821
jl_toplevel_eval_flex at /home/gunnar/julia0.7/src/toplevel.c:815
jl_toplevel_eval_in at /home/gunnar/julia0.7/src/builtins.c:633
eval at ./boot.jl:319
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2159
eval_user_input at /home/gunnar/julia0.7/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:85
macro expansion at /home/gunnar/julia0.7/usr/share/julia/stdlib/v0.7/REPL/src/REPL.jl:116 [inlined]
#28 at ./task.jl:257
jl_fptr_trampoline at /home/gunnar/julia0.7/src/gf.c:1821
jl_apply_generic at /home/gunnar/julia0.7/src/gf.c:2159
jl_apply at /home/gunnar/julia0.7/src/julia.h:1533 [inlined]
start_task at /home/gunnar/julia0.7/src/task.c:268
unknown function (ip: 0xffffffffffffffff)
Allocations: 15029937 (Pool: 15027118; Big: 2819); GC: 34
Segmentation fault (core dumped)

quinnj added a commit that referenced this issue Jul 22, 2018
…r jl_array_t* with potential a->offset and a->maxsize values. Before, any resizing operations on isbits Union arrays (push, pushfirst, insert, append, deleteat, slice, etc.) incurred extra cost by having to constantly move the type tag bytes to be directly after the last array element. By treating them rather as jl_array_t* objects, sharing the a->offset and a->maxsize fields with the parent array, resizing operations now only require moving bytes around if the parent array's data itself must be moved. Fixes #27825 and #27809.
quinnj added a commit that referenced this issue Jul 22, 2018
…r jl_array_t* with potential a->offset and a->maxsize values. Before, any resizing operations on isbits Union arrays (push, pushfirst, insert, append, deleteat, slice, etc.) incurred extra cost by having to constantly move the type tag bytes to be directly after the last array element. By treating them rather as jl_array_t* objects, sharing the a->offset and a->maxsize fields with the parent array, resizing operations now only require moving bytes around if the parent array's data itself must be moved. Fixes #27825 and #27809.
Liozou pushed a commit to Liozou/julia that referenced this issue Jul 23, 2018
…r jl_array_t* with potential a->offset and a->maxsize values. Before, any resizing operations on isbits Union arrays (push, pushfirst, insert, append, deleteat, slice, etc.) incurred extra cost by having to constantly move the type tag bytes to be directly after the last array element. By treating them rather as jl_array_t* objects, sharing the a->offset and a->maxsize fields with the parent array, resizing operations now only require moving bytes around if the parent array's data itself must be moved. Fixes JuliaLang#27825 and JuliaLang#27809.
@JeffBezanson
Copy link
Member

@quinnj It seems this is fixed now?

@quinnj quinnj closed this as completed Jul 24, 2018
@quinnj
Copy link
Member

quinnj commented Jul 24, 2018

Yes, this is fixed now.

@KristofferC KristofferC added the potential benchmark Could make a good benchmark in BaseBenchmarks label Jul 24, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior missing data Base.missing and related functionality potential benchmark Could make a good benchmark in BaseBenchmarks
Projects
None yet
Development

No branches or pull requests

8 participants