Profiling code led to segmentation fault and non-terminating state (infinite loop). #4107

Closed
nutsiepully opened this issue Aug 20, 2013 · 9 comments

@nutsiepully
Contributor

Hi,

I was trying to profile the code provided in this OpenBLAS issue.

It was a simple modification that wraps the code inside a function and then calls it

function f()
    tic()
    for ii in 1:1e4
        # BD
        W_bd, S, U = bd_precoding(H, antennas_tx, antennas_rx)
        p_tx_bd = equal_power_assignment(W_bd, antennas_tx, antennas_rx, p_max)
....
        stream_rate_bd = log2(1 + p_tx_bd .* S / sigma_n)
....
        # ZF
        W_zf = zf_precoding(H)
        p_tx_zf = equal_power_assignment(W_zf, antennas_tx, antennas_rx, p_max)
        stream_rate_zf = log2(1 + p_tx_zf * ones(number_of_bs * antennas_rx) / sigma_n)
    end
    toc()
end

@profile f()

However, when I ran this code, the first two runs ended in a segmentation fault 11. After that, the code went into a non-terminating state, and I had to stop it after around 5 minutes of execution.

htop showed the cores in use, so it's possible that the code was still running but was slowed down badly by the profiler, or that it had entered some kind of infinite loop.

I ran the original code a couple of times without the @profile, and it completed comfortably after about 80 seconds.

@timholy
Member

timholy commented Aug 20, 2013

Let me guess, you're running OSX? If so, this is likely a dup of #3971. If not, it would be helpful if you could put the actual code being run into a gist so I can test it.

@nutsiepully
Contributor Author

@timholy - yes, while running OSX. It looks like the same issue. I've pasted the code here. Please let me know if there is anything else that I can do. I can close this if you're certain it's a duplicate. (Also, on some occasions it was going into a non-terminating state.)

@timholy
Member

timholy commented Aug 22, 2013

I appreciate your offer of help. Basically, I don't know what to do here; this looks like a kernel (Darwin) bug to me (instruction pointers off by one). There's a chance we could work around it by replacing the profiler with a Mac-specific version as suggested by @loladiro (I can't quite find the link right now), but I confess it's not a high priority for me (I don't have a Mac, which would make testing pretty difficult, and anyway I'm swamped for the next couple of months with other things).

@nutsiepully
Contributor Author

@timholy - i understand, thanks. if i absolutely have to use it, i'll just use it on a vm. but it might be good just to report it to them

@blakejohnson
Contributor

@nutsiepully It would be great to have more eyeballs on this issue, since I am also swamped with other things. My next step was going to be to build a debug version of Apple's libunwind so that I could run things through gdb and actually step through the segfaulting code.

@quinnj
Member

quinnj commented Aug 21, 2014

Bump, is this still an issue?

@blakejohnson
Contributor

I imagine this is fixed ever since @Keno's patches to libunwind.

@quinnj
Member

quinnj commented Aug 21, 2014

Cool. Closing for now. @nutsiepully, feel free to reopen if you can reproduce. (Ready for school next week?)

@quinnj quinnj closed this as completed Aug 21, 2014
@nutsiepully
Contributor Author

@quinnj - sure, will do. (oh yes!)
