propagate errors on wait(::RemoteRef) and remotecall_wait #13744

Merged: 1 commit, Oct 23, 2015

Conversation

amitmurthy
Contributor

closes #13730

With this patch and julia -p 1:

julia> t = 0.0
0.0

julia> @sync @parallel for i=1:10
           t.nonexistent
       end
ERROR: On worker 2:
type Float64 has no field nonexistent
 [inlined code] from none:2
 in anonymous at no file:0
 in anonymous at multi.jl:1348
 in anonymous at multi.jl:899
 in run_work_thunk at multi.jl:651
 in run_work_thunk at multi.jl:660
 in anonymous at task.jl:54
 in remotecall_fetch at multi.jl:737
 [inlined code] from multi.jl:373
 in call_on_owner at multi.jl:783
 in wait at multi.jl:791
 in sync_end at ./task.jl:396

julia> pmap(x->t.nonexistent, [0,1,2])
1-element Array{Any,1}:
 RemoteException(2,CapturedException(UndefVarError(:t),Any[(:anonymous,:none,1,symbol(""),-1,1),(:anonymous,symbol("multi.jl"),902,symbol(""),-1,1),(:run_work_thunk,symbol("multi.jl"),651,symbol(""),-1,1),(:anonymous,symbol("multi.jl"),902,symbol("task.jl"),59,1)]))

julia> pmap(x->t.nonexistent, [0,1,2]; err_stop=true)
1-element Array{Any,1}:
 RemoteException(2,CapturedException(UndefVarError(:t),Any[(:anonymous,:none,1,symbol(""),-1,1),(:anonymous,symbol("multi.jl"),902,symbol(""),-1,1),(:run_work_thunk,symbol("multi.jl"),651,symbol(""),-1,1),(:anonymous,symbol("multi.jl"),902,symbol("task.jl"),59,1)]))

@sync @parallel now propagates the worker's error to the caller, which the REPL prints as shown above.

pmap still returns the exception objects, as before.
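For readers who want to act on these two behaviors, here is a minimal sketch, assuming the development-era API used in the session above (`@parallel`, and a `pmap` that returns exceptions rather than throwing). `split_results` is a hypothetical helper and not part of Base; the field names `pid` and `captured.ex` follow the `RemoteException(pid, CapturedException(ex, bt))` layout visible in the printed output.

```julia
# Sketch only: assumes a session started with `julia -p 1` and the
# development-era API shown in this thread.

t = 0.0

# 1. With this patch the error from the loop body reaches the caller,
#    so it can be caught like any other exception.
try
    @sync @parallel for i = 1:10
        t.nonexistent
    end
catch err
    println("parallel loop failed: ", typeof(err))
end

# 2. pmap still returns exception objects in the result array.
# Hypothetical helper: separate normal values from captured failures.
function split_results(results)
    ok = Any[]
    failed = RemoteException[]
    for r in results
        if isa(r, RemoteException)
            push!(failed, r)
        else
            push!(ok, r)
        end
    end
    return ok, failed
end

results = pmap(x -> t.nonexistent, [0, 1, 2])
ok, failed = split_results(results)
for e in failed
    println("worker ", e.pid, " failed with ", typeof(e.captured.ex))
end
```

Checking the element type with `isa` avoids relying on how many entries `pmap` returns when a worker fails.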

@JeffBezanson
Member

Great, thanks.

I notice that fetch (and wait) don't propagate exceptions if the ref is local:

julia> r = remotecall(error, 1)
RemoteRef{Channel{Any}}(1,1,6)

julia> fetch(r)
RemoteException(1,CapturedException(ErrorException(""),Any[(:error,symbol("error.jl"),22,symbol(""),-1,1),(:run_work_thunk,symbol("multi.jl"),651,symbol(""),-1,1),(:run_work_thunk,symbol("multi.jl"),660,symbol(""),-1,1),(:anonymous,symbol("task.jl"),54,symbol(""),-1,1)]))

Is that intentional? It doesn't seem to me that this should depend on the location of the ref.
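A caller who wants the same behavior whether the ref is local or remote could normalize the result explicitly. The following is a sketch under assumptions: `fetch_or_throw` is a hypothetical helper, not part of Base, and it makes no claim about how the patch itself addresses this.

```julia
# Hypothetical helper (not part of Base): fetch a ref and rethrow any
# captured remote error, regardless of which process owns the ref.
function fetch_or_throw(ref)
    v = fetch(ref)                       # may already throw for some refs
    isa(v, RemoteException) && throw(v)  # otherwise rethrow a returned exception
    return v
end

r = remotecall(error, 1)                 # same call as in the session above
try
    fetch_or_throw(r)
catch err
    isa(err, RemoteException) || rethrow(err)
    println("call on process ", err.pid, " failed: ", err.captured.ex)
end
```

Because the helper handles both a thrown and a returned `RemoteException`, it behaves the same whether or not `fetch` itself propagates the error.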

@amitmurthy
Contributor Author

Good catch. Will fix.

@amitmurthy
Contributor Author

Fixed.

@JeffBezanson
Member

Nice.

JeffBezanson added a commit that referenced this pull request Oct 23, 2015
propagate errors on wait(::RemoteRef) and remotecall_wait
@JeffBezanson JeffBezanson merged commit 18af6f7 into master Oct 23, 2015
@tkelman tkelman deleted the amitm/waitexcp branch October 23, 2015 20:19
amitmurthy added a commit that referenced this pull request Oct 31, 2015
Successfully merging this pull request may close these issues.

Code with errors does not stop when run in parallel in 0.4.0