package precompile is slow #15048
this is probably due to some code that DataFrames is conditionally including when it detects precompile mode |
Really? |
If you have a dependency chain of N stale modules (module n depends on module n–1 for n=2…N), loading them without compilation is O(N), but precompiling them currently seems to be O(N^2): my understanding is that N precompilation processes are launched, and the n-th process needs to load (from cache) the previous n-1 modules, so that O(N^2) modules are reconstituted from the cache overall. Maybe we can do something like what |
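The O(N) vs. O(N^2) claim above can be made concrete with a toy cost model (plain Python counting module loads, not measuring Julia itself; the function names are made up for illustration):

```python
# Toy cost model for the chain described above: module n depends on
# module n-1 for n = 2..N, and all N modules are stale.

def loads_without_precompile(n_modules):
    # A single process loads each module once: O(N).
    return n_modules

def loads_with_precompile(n_modules):
    # One precompile process is launched per module; the n-th process
    # first reconstitutes modules 1..n-1 from cache before compiling
    # module n, so process n touches n modules in total:
    # 1 + 2 + ... + N = N(N+1)/2, i.e. O(N^2).
    return n_modules * (n_modules + 1) // 2

for n in (5, 10, 50):
    print(n, loads_without_precompile(n), loads_with_precompile(n))
```

Even at N = 50 stale modules, the precompile path reconstitutes 1275 modules versus 50 plain loads, which is why the overhead grows so quickly for deep dependency chains.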
I think part of the problem here is warmup cost. Precompilation assumes that this cost will be very low (because it shouldn't require any actual work), but master is much slower (3x) than v0.4 was. Since startup costs are applied n^2 times, this can quickly become significant:
|
I do see a bit of improvement; precompiling DataFrames now takes 38 sec instead of 47, but this is still far slower than the 11 seconds taken to load it without precompiling. |
hm, ok, no worries, i'm still looking at ways of driving down some of the remaining cost overhead. i had seen a bigger drop on my machine, but i think that might be because i'm using llvm3.3 and the compile-time penalty for the remaining items is still significant against llvm3.7? can you test |
On my machine |
After #15934, I get 35 seconds and 0.25 seconds. |
this cuts off a couple more seconds:
diff --git a/src/gf.c b/src/gf.c
index 7437d66..ea8af0d 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -1782,7 +1782,7 @@ void jl_method_table_insert(jl_methtable_t *mt, jl_tupletype_t *type, jl_tuplety
method_overwrite(newentry, (jl_method_t*)oldvalue);
else
check_ambiguous_matches(mt->defs, newentry);
- invalidate_conflicting(&mt->cache, (jl_value_t*)type, (jl_value_t*)mt);
+ //invalidate_conflicting(&mt->cache, (jl_value_t*)type, (jl_value_t*)mt);
update_max_args(mt, type);
if (jl_newmeth_tracer)
jl_newmeth_tracer(method); |
#16125 gave me another idea on how to cut down this time. we don't really need to issue multiple copies of ambiguous-match warnings, so we can simply skip that step when deserializing a method during incremental compilation:
diff --git a/src/gf.c b/src/gf.c
index ac6c7ea..866d065 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -868,7 +868,7 @@ void jl_method_table_insert(jl_methtable_t *mt, jl_tupletype_t *type, jl_tuplety
type, tvars, simpletype, jl_emptysvec, (jl_value_t*)method, 0, &method_defs, &oldvalue);
if (oldvalue)
method_overwrite(newentry, (jl_method_t*)oldvalue);
- else
+ else if (!jl_options.incremental || !jl_generating_output())
check_ambiguous_matches(mt->defs, newentry);
invalidate_conflicting(&mt->cache, (jl_value_t*)type, (jl_value_t*)mt);
         update_max_args(mt, type);
on my computer, this takes the time from 50s -> 40s (vs. 18s w/o cache) |
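The gating added by the patch can be modeled abstractly (a Python sketch with hypothetical names, mirroring the C condition `!jl_options.incremental || !jl_generating_output()`; this is not Julia's actual code):

```python
# Toy model of when jl_method_table_insert runs the ambiguity check
# after this patch: the check is skipped while deserializing methods
# during incremental precompilation, since the same warnings were
# already issued when the method was first defined.

def should_check_ambiguities(overwrites_existing, incremental, generating_output):
    if overwrites_existing:
        return False  # method_overwrite path: no ambiguity check either way
    # matches `else if (!jl_options.incremental || !jl_generating_output())`
    return (not incremental) or (not generating_output)

# Interactive session: the check still runs.
print(should_check_ambiguities(False, False, False))  # True
# Incremental precompile process: the check is skipped.
print(should_check_ambiguities(False, True, True))    # False
```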
Nice. I guess we could add a |
I'm following up on an interrupted JuliaCon conversation in which some concern was expressed about whether precompilation was helping or hurting. My own perspective is that:
If no serious experiments in this direction have been done, let me know how I can help. I recognize that now may not be the time to add this to julia-0.5, but it should be considered for later development. |
For me the issue is the unpredictability of when precompilation will hit. It always hits me when I need a working Julia instance now, not in 5 minutes (a group of people is in my office and I want to show them something). #16409 would eliminate this problem entirely for me. |
What is the acceptable overhead at which this can be closed? |
0 |
I would say maybe 10%. |
almost getting there :)
|
is this the first time I see llvm 3.7.1 being faster at compiling than llvm 3.3? |
nope, just pasted the numbers backwards (fixed now) |
Slow precompilation is a real killer for user experience. R and Python users are used to loading most packages instantly. |
It takes too long to try some new code and debug. |
No you don't, use Revise.jl. |
Updated datapoint:
# Precompiling
julia> @time using DataFrames
[ Info: Precompiling DataFrames [a93c6f00-e57d-5684-b7b6-d8193f3e46c0]
28.836874 seconds (3.78 M allocations: 231.428 MiB, 0.45% gc time)
# Loading with precompile
julia> @time using DataFrames
1.257824 seconds (2.62 M allocations: 174.873 MiB, 2.72% gc time)
# Loading with precompile disabled
julia> @time using DataFrames
14.189208 seconds (20.71 M allocations: 1.032 GiB, 3.25% gc time) |
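A quick break-even computation on the timings above (the helper function is hypothetical, just to frame the trade-off): precompilation only pays off once the cheap cached loads have amortized the extra up-front cost.

```python
# Break-even point for the timings quoted above: how many cached loads
# are needed before paying the precompile cost beats simply loading
# without a cache every time?

def break_even_loads(precompile_s, cached_load_s, uncached_load_s):
    extra_upfront = precompile_s - uncached_load_s    # one-time penalty
    saved_per_load = uncached_load_s - cached_load_s  # gain on each later load
    return extra_upfront / saved_per_load

# 28.84s to precompile, 1.26s cached load, 14.19s uncached load.
print(round(break_even_loads(28.84, 1.26, 14.19), 2))  # 1.13
```

So with these numbers, precompilation pays for itself after roughly two subsequent loads of the package.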
Actually, long precompile times also affect you when you are running unit tests and have some dependencies. |
The original issue seems adequately addressed and this is good to close imo. |
Leaving it open for someone else to decide whether to close and open now specific issues, or to keep this open. |
+1 for closing this in favor of more specific issues. |
I disagree that this is fixed. Trying it on master, precompiling DataFrames takes 77 seconds (worse than ever) and loading it with --compiled-modules=no takes 35 seconds. |
I'm looking into cycles of the form: low-level function -> build error string -> string formatting and IO code -> low-level function. As one example, the following patch cuts the time to load DataFrames (--compiled-modules=no) from 35 to 21 seconds:
diff --git a/base/array.jl b/base/array.jl
index f13c6ee..24f8745 100644
--- a/base/array.jl
+++ b/base/array.jl
@@ -279,7 +279,7 @@ end
# occurs, see discussion in #27874
function _throw_argerror(n)
@_noinline_meta
- throw(ArgumentError(string("tried to copy n=", n, " elements, but n should be nonnegative")))
+ throw(ArgumentError("oops"))
end
copyto!(dest::Array{T}, src::Array{T}) where {T} = copyto!(dest, 1, src, 1, length(src))
diff --git a/base/strings/io.jl b/base/strings/io.jl
index 71767ce..198635c 100644
--- a/base/strings/io.jl
+++ b/base/strings/io.jl
@@ -104,7 +104,7 @@ function sprint(f::Function, args...; context=nothing, sizehint::Integer=0)
end
tostr_sizehint(x) = 8
-tostr_sizehint(x::AbstractString) = lastindex(x)
+tostr_sizehint(x::String) = sizeof(x)
tostr_sizehint(x::Float64) = 20
tostr_sizehint(x::Float32) = 12 |
JuliaDB is an interesting case:
Another interesting data point: v1.2 passing |
Have you by any chance experimented with what happens if you turn |
I wonder if we can close this. First:
tim@diva:~/src/julia-master/base$ JULIA_DEPOT_PATH=/tmp/pkg2 juliam -q
shell> rm -rf /tmp/pkg2/compiled/v1.8/DataFrames
julia> @time using DataFrames
[ Info: Precompiling DataFrames [a93c6f00-e57d-5684-b7b6-d8193f3e46c0]
2.729798 seconds (1.41 M allocations: 85.187 MiB, 0.38% gc time, 1.37% compilation time)
julia>
tim@diva:~/src/julia-master/base$ JULIA_DEPOT_PATH=/tmp/pkg2 juliam -q --compiled-modules=no
shell> rm -rf /tmp/pkg2/compiled/v1.8/DataFrames
julia> @time using DataFrames
5.334171 seconds (3.79 M allocations: 213.627 MiB, 0.76% gc time, 5.01% compilation time) |
Precompiling DataFrames takes ~47 seconds:
However if I just load it without precompilation, it takes only ~12 seconds: