Crash during many Tasks run #15017
Comments
Update: I found the code that behaves exactly like my "abstract" example above. It crashes... :)
using Manager
orgs = Creature.Organism[]
len = 500
# here, we just create 500 instances of type Creature.Organism
# it contains code::Expr and codeFn::Function, which we will
# change below...
for i = 1:len push!(orgs, Creature.create()) end
function run()
    local i::Int
    while true
        for i = 1:len
            # this code changes orgs[i].code expression in a random way,
            # but with correct julia syntax
            Mutator.mutate(orgs[i])
            orgs[i].codeFn = eval(orgs[i].code)
            try
                orgs[i].codeFn(orgs[i])
            end
        end
    end
end
run()
The code above produces this error:
|
Could this be related: #14113? |
I don't know :) I use |
Guys, is there any solution for this? |
Without the rest of your code, this is hard to reproduce. If your code is available online, can you provide a link to it, or if you're willing to send it to someone privately, that would help too. That being said, I'm eyeing that |
Yes, you're right. This is a genetic-algorithm app, and I have to generate a lot of Julia code using ASTs. This is why I chose Julia ;) So I need to call eval() a huge number of times. Here is the original code that produces this error: https://github.com/tmptrash/jevo/blob/master/tmp/err-crash1.jl Run it from the root folder using this snippet after cloning the repo:
include("src\\ImportFolders.jl")
include("tmp\\err-crash1.jl")
run()
But you have to wait some time before the crash occurs... On my PC it takes ~4 min... I tried this 5 minutes ago and got this:
|
Does this sample work on your side? |
Guys? Any progress with this issue? :) |
I can't reproduce your crash. It works fine for me but just runs for a very long time. Of course I'm on OS X, so maybe there's an upstream LLVM bug or some such issue. |
Okay, I will check this under Ubuntu... |
I don't have a Mac, so I can only test this under Ubuntu 14.04:
julia> run()
signal (11): Segmentation fault
unknown function (ip: 0x7f0be0ad15fd)
jl_method_def at /usr/bin/../lib/x86_64-linux-gnu/julia/libjulia.so (unknown line)
anonymous at no file:0
run at /home/db/projects/jevo/tmp/err-crash1.jl:21
jlcall_run_21463 at (unknown line)
jl_apply_generic at /usr/bin/../lib/x86_64-linux-gnu/julia/libjulia.so (unknown line)
unknown function (ip: 0x7f0be0b2a683)
unknown function (ip: 0x7f0be0b29ac1)
unknown function (ip: 0x7f0be0b3eca8)
jl_toplevel_eval_in at /usr/bin/../lib/x86_64-linux-gnu/julia/libjulia.so (unknown line)
eval_user_input at REPL.jl:62
jlcall_eval_user_input_21180 at (unknown line)
jl_apply_generic at /usr/bin/../lib/x86_64-linux-gnu/julia/libjulia.so (unknown line)
anonymous at REPL.jl:92
unknown function (ip: 0x7f0be0b30bbc)
unknown function (ip: (nil))
The title near |
I've run the test code you provided in your original post –
module Test
type Organism
    code::Expr
    codeFn::Function
end
type OrganismTask
    task::Task
    organism::Organism
end
tasks = OrganismTask[]
function born(o::Organism)
    return function ()
        while true
            produce()
            try
                o.codeFn(o)
            end
        end
    end
end
function run()
    for i=1:500
        # in real app organism's function is more complicated
        org = Organism(:(function (o) return 1 end), function (o) return 1 end)
        task = Task(born(org))
        push!(tasks, OrganismTask(task, org))
    end
    while true
        for i=1:500
            consume(tasks[i].task)
            # here is a code, which modify tasks[i].organism.code
            tasks[i].organism.codeFn = eval(tasks[i].organism.code)
        end
        # here is a code, which remove and add tasks from/to tasks variable
    end
end
end
using Test
Test.run() |
From the backtrace, it looks like one of your organisms tried to escape (create a new method), but didn't have the syntax quite right. If you catch this in gdb, you should be able to walk up the stack to see the method body of the rogue organism. |
Regarding my "original" code, you're right, it doesn't produce an error (but my latest code, err-crash1.jl, does). So I found a way for you to reproduce the crash on your side. I made a small change in my code to serialize the organism just before the crash. From this:
orgs[i].codeFn = eval(orgs[i].code)
try
    orgs[i].codeFn(orgs[i])
catch e
end
to this:
orgs[i].codeFn = eval(orgs[i].code)
try
    Helper.save(orgs[i], "err-code.jevo")
    orgs[i].codeFn(orgs[i])
catch e
end
You can load "err-code.jevo" using this function:
function load(file::ASCIIString)
    local io = null
    local ret = null
    try
        io = open(file)
        ret = deserialize(io)
    catch(e)
        println("Error: $e")
        ret = null
    finally
        if io !== null close(io) end
    end
    ret
end
To reproduce the crash, use this snippet:
org = load("err-code.jevo")
org.codeFn(org)
The last line of the last snippet causes the crash. But if I re-evaluate the code, it doesn't crash. I mean this:
org.codeFn = eval(org.code)
org.codeFn(org) # doesn't produce crash (only exception)
I tried to compare old and new generated ASTs using
P.S. Here is a link to "err-code.jevo". |
I dug deep into the problem and found the exact place of the error:
AST(:($(Expr(:lambda, Any[:(o::(top(getfield))(Creature,:Organism))], Any[Any[Any[:o,:Any,19],Any[:func_2,:Any,3]],Any[],0,Any[]], :(begin
NewvarNode(:func_2)
o = (top(typeassert))(o,Creature.Organism)
$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple),(top(svec))())), AST(:($(Expr(:lambda, Any[], Any[Any[],Any[Any[:func_2,:Any,3]],0,Any[]], :(begin return func_2(97,8397162610081231316,-7678,-33) end))))), false))
$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8),(top(svec))())), Type{LabelNode}, false))
$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8,Int64),(top(svec))())), Int64, false))
$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8,Int64,Int16),(top(svec))())), SymbolNode, false))
$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8,Int64,Int16,Int8),(top(svec))())), Type{GenSym}, false))
(Creature.stepRight)(o)
(Creature.stepRight)(o)
(Creature.stepUp)(o)
(Creature.stepDown)(o)
(Creature.stepDown)(o)
(Creature.stepDown)(o)
return (Creature.stepRight)(o)
end)))))
It looks like the problem is in one of the last four
AST(:($(Expr(:lambda, Any[:(o::(top(getfield))(Creature,:Organism))], Any[Any[Any[:o,:Any,19],Any[:func_2,:Any,3]],Any[],0,Any[]], :(begin
NewvarNode(:func_2)
o = (top(typeassert))(o,Creature.Organism)
$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple),(top(svec))())), AST(:($(Expr(:lambda, Any[], Any[Any[],Any[Any[:func_2,:Any,3]],0,Any[]], :(begin return func_2(97,8397162610081231316,-7678,-33) end))))), false))
#=here=#$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8),(top(svec))())), AST(:($(Expr(:lambda, Any[:var_6], Any[Any[Any[:var_6,:Any,0]],Any[Any[:func_2,:Any,3]],0,Any[]], :(begin return func_2(var_6,8397162610081231316,-7678,-33) end))))), false))
#=here=#$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8,Int64),(top(svec))())), AST(:($(Expr(:lambda, Any[:var_6,:var_7], Any[Any[Any[:var_6,:Any,0],Any[:var_7,:Any,0]],Any[Any[:func_2,:Any,3]],0,Any[]], :(begin return func_2(var_6,var_7,-7678,-33) end))))), false))
#=here=#$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8,Int64,Int16),(top(svec))())), AST(:($(Expr(:lambda, Any[:var_6,:var_7,:var_8], Any[Any[Any[:var_6,:Any,0],Any[:var_7,:Any,0],Any[:var_8,:Any,0]],Any[Any[:fun,:Any,3]],0,Any[]], :(begin return func_2(var_6,var_7,var_8,-33) end))))), false))
#=here=#$(Expr(:method, :func_2, :((top(svec))((top(apply_type))(Main.Tuple,Int8,Int64,Int16,Int8),(top(svec))())), AST(:($(Expr(:lambda, Any[:var_6,:var_7,:var_8,:var_9], Any[Any[Any[:var_6,:Any,0],Any[:var_7,:Any,0],Any[:var_8,:Any,0],Any[:var_9,:Any,0]],Any[Any[:o,:Any,19]],0,Any[]],
:(begin
(Creature.stepRight)(o)
return var_6
end))))), false))
(Creature.stepRight)(o)
(Creature.stepRight)(o)
(Creature.stepUp)(o)
(Creature.stepDown)(o)
(Creature.stepDown)(o)
(Creature.stepDown)(o)
return (Creature.stepRight)(o)
end))))) |
Yeah, I'm not going to eval code I can't look at first. |
duplicate of #14113 (I didn't realize that hadn't been backported to 0.4, or I would have marked this earlier) |
Hello guys! It looks like this issue is not a duplicate of #14113, because it is still reproducible on my PC. I get these crashes very often :( I waited for this fix, and today I found that #14113 has been fixed. It's critical, so I created a workaround: making backups every minute and recovering from the last one if a crash occurs. It really bothers me... So I made a special backup file with buggy code for you guys, but it depends on my modules (depends on The scenario is the same. You have to
include("src/ImportFolders.jl") # we have to be in a root folder of the project
import Manager
org = Helper.load("code-before.jevo")
org.codeFn(org) # this line produces the crash
This is the crash:
The buggy backup file is here. Waiting for your answer. |
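The workaround mentioned in this comment (a backup every minute, recovering from the last one after a crash) could look roughly like the sketch below. This is an illustration under stated assumptions, not the project's real backup code: it targets Julia 1.x with the `Serialization` stdlib, and `backup`, `restore`, `run_with_backups`, `step!`, `interval` and `maxsteps` are all hypothetical names.

```julia
using Serialization  # stdlib in Julia 1.x

# Write to a temporary file first, then rename it, so that a crash in the
# middle of a write does not destroy the previous good backup.
function backup(state, file::AbstractString)
    tmp = file * ".tmp"
    open(tmp, "w") do io
        serialize(io, state)
    end
    mv(tmp, file; force=true)
end

# Return the last backup, or `default` if none exists or it cannot be read.
function restore(file::AbstractString, default)
    isfile(file) || return default
    try
        return open(deserialize, file)
    catch err
        @warn "Could not read backup" file err
        return default
    end
end

# Call `step!(state)` up to `maxsteps` times, saving a backup whenever at
# least `interval` seconds have passed since the previous one.
function run_with_backups(step!, state, file; interval=60.0, maxsteps=typemax(Int))
    last = time()
    for _ in 1:maxsteps
        step!(state)
        if time() - last >= interval
            backup(state, file)
            last = time()
        end
    end
    state
end
```

On startup, something like `state = restore("world.bak", initial_state)` followed by `run_with_backups(step!, state, "world.bak")` would resume from the last saved point. Note that this only limits lost work after a crash; it does not address the crash itself.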
I use Windows 8.1 and Julia 0.4.6. |
I don't believe that was ever backported to 0.4.6 (@tkelman) |
So, where can I find the fixed version to check the issue? |
The commit history is public in the repo. What is "that" referring to, exactly? |
Is it correct that this fix will be ported to 0.4.6 soon? |
We've marked the commit for backporting to what will become 0.4.7. It will need testing against the release-0.4 branch to make sure it isn't breaking there. I'll try to find some time to prepare 0.4.7 over the next few weeks. |
Thanks a lot! I hope this will fix all my crashes... |
By the way, if you need them, I can prepare different dump files with code that crashes the application. I have different stack traces for this error, so maybe we are talking about several errors. Here are some of them:
|
I think this kind of error is related to this issue:
|
Hi. @tkelman, is it possible to prepare 0.4.7 with a fix? :) |
Any updates on this? |
No, sorry I've been occupied with 0.5 release candidates. Can you use them at all? I will be preparing an 0.4.7 release as the next thing on my to-do list after RC4. |
What is the deadline? |
see #18478 - though ref. #14656 (comment), not sure whether that particular fix is directly backportable without harming other functionality |
Happy to hear. Hope it will help... |
Have you tried using 0.5? |
No. Does v0.5 contain this fix also? |
Yes, #14113 has been fixed on 0.5 since March. |
Thanks! I will try. |
After migrating to v0.5, the problem is gone :) Thanks a lot! |
Hello everybody.
It looks like I found a bundle of bugs in the latest version of Julia. I run an application that uses many Tasks. It also uses generated Julia code, which runs inside these tasks all the time. I get these crashes after ~3-20 minutes of running, in random order. I used both debug and release Julia builds for the tests...
This is the first error message:
The second error message looks like this:
Another one:
The last one:
It's hard to provide a buggy code sample, because the application is big and many processes are involved. But I can provide high-level code:
If it's hard to find these errors using my sample, you can run my project on your machine to get the same errors. It doesn't require any special environment to run. I can provide the exact steps to reproduce it.
Thanks a lot. Julia is a great language ;)