Parallel example fails on julia 1.1 #119

Open
davidanthoff opened this issue Mar 2, 2019 · 6 comments

Comments

@davidanthoff
Contributor

This is what I get with master:

~\.julia\dev\BlackBoxOptim\examples [master ≡]> julia .\rosenbrock_parallel.jl
Starting optimization with optimizer XNESOpt{Float64,RandomBound{ContinuousRectSearchSpace}}
0.00 secs, 0 evals, 0 steps
sigma=1.0 |trace(ln_B)|=0.0
0.54 secs, 104 evals, 8 steps, fitness=122353.746434742
sigma=0.9982051576798082 |trace(ln_B)|=5.637851296924623e-18
1.07 secs, 247 evals, 19 steps, fitness=32111.868584104
sigma=0.9931794473616752 |trace(ln_B)|=5.204170427930421e-17
1.60 secs, 416 evals, 32 steps, fitness=15074.967363068
sigma=0.9852835141237257 |trace(ln_B)|=6.591949208711867e-17
2.11 secs, 572 evals, 44 steps, fitness=15074.967363068
sigma=0.9784491185053312 |trace(ln_B)|=5.898059818321144e-17
2.64 secs, 728 evals, 56 steps, fitness=12181.224815595
sigma=0.9693998255007409 |trace(ln_B)|=4.5102810375396984e-17
3.18 secs, 897 evals, 69 steps, fitness=11200.377032287
sigma=0.9611088605732498 |trace(ln_B)|=5.551115123125783e-17
3.70 secs, 1053 evals, 81 steps, fitness=11200.377032287
sigma=0.9541631679900614 |trace(ln_B)|=-8.326672684688674e-17
4.21 secs, 1209 evals, 93 steps, fitness=11200.377032287
sigma=0.947030998226623 |trace(ln_B)|=-2.0816681711721685e-17
4.72 secs, 1365 evals, 105 steps, fitness=10060.451657270
sigma=0.9395602410561272 |trace(ln_B)|=-4.163336342344337e-17
5.24 secs, 1521 evals, 117 steps, fitness=10060.451657270
sigma=0.9332730422696129 |trace(ln_B)|=-9.71445146547012e-17
5.77 secs, 1677 evals, 129 steps, fitness=10060.451657270
sigma=0.9258708463911819 |trace(ln_B)|=-9.71445146547012e-17
6.29 secs, 1833 evals, 141 steps, fitness=10060.451657270
sigma=0.9196027642968452 |trace(ln_B)|=-2.0816681711721685e-16
6.79 secs, 1989 evals, 153 steps, fitness=10060.451657270
sigma=0.9133177227106547 |trace(ln_B)|=-1.942890293094024e-16
7.33 secs, 2158 evals, 166 steps, fitness=9373.475368805
sigma=0.9069943095862962 |trace(ln_B)|=-4.163336342344337e-17
7.87 secs, 2327 evals, 179 steps, fitness=9373.475368805
sigma=0.9023805920988747 |trace(ln_B)|=-8.326672684688674e-17
8.38 secs, 2483 evals, 191 steps, fitness=9373.475368805
sigma=0.8954902893404368 |trace(ln_B)|=-1.3877787807814457e-16
8.91 secs, 2652 evals, 204 steps, fitness=8583.999338502
sigma=0.8892181396948056 |trace(ln_B)|=-2.498001805406602e-16
9.45 secs, 2808 evals, 216 steps, fitness=8583.999338502
sigma=0.8820132750980565 |trace(ln_B)|=-1.3877787807814457e-16
9.98 secs, 2977 evals, 229 steps, fitness=8583.999338502
sigma=0.8731894893256401 |trace(ln_B)|=-5.551115123125783e-17
10.50 secs, 3133 evals, 241 steps, fitness=7489.584123589
sigma=0.8679233801696926 |trace(ln_B)|=-2.498001805406602e-16
11.00 secs, 3289 evals, 253 steps, fitness=7489.584123589
sigma=0.8608442649100014 |trace(ln_B)|=-2.498001805406602e-16
11.52 secs, 3445 evals, 265 steps, fitness=7122.457056131
sigma=0.855225980069676 |trace(ln_B)|=-1.3877787807814457e-16
12.02 secs, 3601 evals, 277 steps, fitness=7122.457056131
sigma=0.8480777088289297 |trace(ln_B)|=-1.942890293094024e-16
12.54 secs, 3757 evals, 289 steps, fitness=6015.103831160
sigma=0.8411899889586818 |trace(ln_B)|=-1.942890293094024e-16
13.07 secs, 3913 evals, 301 steps, fitness=6015.103831160
sigma=0.8342904768757995 |trace(ln_B)|=-2.498001805406602e-16
13.61 secs, 4082 evals, 314 steps, fitness=4780.282264771
sigma=0.8269620765403849 |trace(ln_B)|=-2.498001805406602e-16
14.12 secs, 4238 evals, 326 steps, fitness=4780.282264771
sigma=0.8205543861803173 |trace(ln_B)|=-1.3877787807814457e-16
14.63 secs, 4394 evals, 338 steps, fitness=4780.282264771
sigma=0.8130142586751085 |trace(ln_B)|=-2.7755575615628914e-16
15.13 secs, 4550 evals, 350 steps, fitness=4780.282264771
sigma=0.8085303307890455 |trace(ln_B)|=-3.608224830031759e-16
15.64 secs, 4706 evals, 362 steps, fitness=4780.282264771
sigma=0.8027150651112601 |trace(ln_B)|=-1.3877787807814457e-16
16.15 secs, 4862 evals, 374 steps, fitness=4780.282264771
sigma=0.7948282044027354 |trace(ln_B)|=-3.0531133177191805e-16

Optimization stopped after 385 steps and 16.61 seconds
Termination reason: Max number of function evaluations (5000) reached
Steps per second = 23.18
Function evals per second = 301.32
Improvements/step = NaN
Total function evaluations = 5005


Best candidate found: [-0.117691, -0.549153, 1.62825, 0.128071, -0.49382, -0.298501, -0.650996, 0.40877, -0.107746, 0.896346, -0.535518, 0.178234, -1.1461, 1.14015, -0.72947, -0.404426, 0.506873, -0.00404523, 0.141733, -0.0141665, 0.475099, 0.851817, -0.153351, 0.702384, 0.593715, -0.610469, 1.86743, 1.31347, -0.0329012, 0.936947, -0.180541, 0.624921, -0.337358, 0.142574, -0.162826, -0.544692, -0.946747, -0.284673, 0.656164, -0.553401, -0.461865, 0.00229038, 0.0829926, 0.780216, 0.999123, -0.903184, -0.0683573, -0.167501, 0.660591, 1.78841]

Fitness: 4780.282264771

ERROR: LoadError: StackOverflowError:
deserialize(::Distributed.ClusterSerializer{Sockets.TCPSocket}, ::Type{RemoteChannel{Channel{BlackBoxOptim.ParallelEvaluatorWorker{FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}}}}}) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:310 (repeats 100 times)
Stacktrace:
 [1] #remotecall_fetch#149(::Base.Iterators.Pairs{Union{},Union{},Tuple{},NamedTuple{(),Tuple{}}}, ::Function, ::Function, ::Distributed.Worker, ::Function, ::Vararg{Any,N} where N) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:379
 [2] remotecall_fetch(::Function, ::Distributed.Worker, ::Function, ::Vararg{Any,N} where N) at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:371
 [3] #remotecall_fetch#152 at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:406 [inlined]
 [4] remotecall_fetch at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:406 [inlined]
 [5] Type at C:\cygwin\home\Administrator\buildbot\worker\package_win64\build\usr\share\julia\stdlib\v1.1\Distributed\src\remotecall.jl:108 [inlined]
 [6] #48 at .\none:0 [inlined]
 [7] iterate at .\generator.jl:47 [inlined]
 [8] collect(::Base.Generator{Array{Int64,1},getfield(BlackBoxOptim, Symbol("##48#50")){FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing},FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}}}) at .\array.jl:606
 [9] #ParallelEvaluator#47(::Array{Int64,1}, ::Type, ::FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}, ::TopListArchive{Float64,ScalarFitnessScheme{true}}) at C:\Users\david\.julia\dev\BlackBoxOptim\src\parallel_evaluator.jl:126
 [10] (::getfield(Core, Symbol("#kw#Type")))(::NamedTuple{(:pids,),Tuple{Array{Int64,1}}}, ::Type{BlackBoxOptim.ParallelEvaluator}, ::FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}, ::TopListArchive{Float64,ScalarFitnessScheme{true}}) at .\none:0
 [11] make_evaluator(::FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}, ::Nothing, ::DictChain{Symbol,Any}) at C:\Users\david\.julia\dev\BlackBoxOptim\src\opt_controller.jl:12
 [12] BlackBoxOptim.OptRunController(::XNESOpt{Float64,RandomBound{ContinuousRectSearchSpace}}, ::FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}, ::DictChain{Symbol,Any}) at C:\Users\david\.julia\dev\BlackBoxOptim\src\opt_controller.jl:97
 [13] run!(::BlackBoxOptim.OptController{XNESOpt{Float64,RandomBound{ContinuousRectSearchSpace}},FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}}) at C:\Users\david\.julia\dev\BlackBoxOptim\src\opt_controller.jl:435
 [14] #bboptimize#84 at C:\Users\david\.julia\dev\BlackBoxOptim\src\bboptimize.jl:66 [inlined]
 [15] bboptimize(::BlackBoxOptim.OptController{XNESOpt{Float64,RandomBound{ContinuousRectSearchSpace}},FunctionBasedProblem{ScalarFitnessScheme{true},ContinuousRectSearchSpace,Nothing}}) at C:\Users\david\.julia\dev\BlackBoxOptim\src\bboptimize.jl:63
 [16] top-level scope at util.jl:213
 [17] include at .\boot.jl:326 [inlined]
 [18] include_relative(::Module, ::String) at .\loading.jl:1038
 [19] include(::Module, ::String) at .\sysimg.jl:29
 [20] exec_options(::Base.JLOptions) at .\client.jl:267
 [21] _start() at .\client.jl:436
in expression starting at C:\Users\david\.julia\dev\BlackBoxOptim\examples\rosenbrock_parallel.jl:28
@isentropic

@robertfeldt, is anyone working on this? Should we expect a fix anytime soon? One of my projects would be vastly sped up if this worked.

@robertfeldt
Owner

Hmm, parallel/distributed use of Julia is not my main competence so it would probably be more efficient if someone with more experience in this looked at it. Maybe @alyst has some input? If not I'll try to get to it soon and just investigate from scratch. Doesn't look too hard to debug.

@wallacmj

wallacmj commented May 3, 2019

Seems to be related to this issue: JuliaLang/julia#30679

You can get around this by adding the following lines before calling the optimizer.

for i in procs()
    @everywhere struct A end
    @fetchfrom i A()
end
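
A minimal sketch of where this workaround could sit in a script, assuming workers are added with `addprocs` (the worker count of 2 is an assumption for illustration, not something stated in this thread). The loop is unchanged from the snippet above: it defines an empty placeholder type `A` on all processes and fetches one instance back from each process in turn. Why that avoids the error is not established here.

```julia
using Distributed

# Assumption: two local workers; the thread does not say how workers were started.
nprocs() == 1 && addprocs(2)

# Workaround loop exactly as posted above: define an empty placeholder type `A`
# on all processes, then construct one instance on process `i` and fetch it back.
for i in procs()
    @everywhere struct A end
    @fetchfrom i A()
end

# The optimizer call (for example the one in rosenbrock_parallel.jl) would follow here.
```

The loop only needs the `Distributed` standard library, so it can be tried on its own before wiring it into the example script.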

@robertfeldt
Owner

Hmm, I tried both with and without this supposed fix but Julia 1.1 dies with Bus error: 10 on my MB Pro.

@wallacmj

wallacmj commented May 7, 2019

Might not work for 1.1. It works for me on the platform below.

Julia Version 1.0.3
Commit 099e826241 (2018-12-18 01:34 UTC)
Platform Info:
OS: Linux (x86_64-pc-linux-gnu)
CPU: Intel(R) Core(TM) i7-4720HQ CPU @ 2.60GHz
WORD_SIZE: 64
LIBM: libopenlibm
LLVM: libLLVM-6.0.0 (ORCJIT, haswell)

@ayadlin
Contributor

ayadlin commented May 21, 2019

for i in procs()
    @everywhere struct A end
    @fetchfrom i A()
end

Tried this and it worked. Would anyone care to explain what this snippet is doing?
Is it just fetching whatever is in process i and integrating it with the rest?

Thanks,
A
