Windows profiler not profiling #3960
Comments
516 backtraces is an awful lot compared to the example. That fact, coupled with the code snippet you showed, suggests that you didn't run it once to get it compiled first? (Since compiling takes time, that's consistent with an abundance of backtraces.) So probably most of those are in the compiler. I'd run it once and then collect the profiling data, or say As far as missing any other lines: you can try I don't have any direct experience with backtraces on Windows, so of course it's also possible that there are issues in terms of completeness. |
Ah yes, I missed that. Given the quality of the Windows platform (esp. the timers and backtrace infrastructure), you will probably need to run your code approximately 100x longer than on Linux to get an equivalent sample. The Windows version of the code for profiling C code is not yet written, so I'm not sure what it will tell you if you try (probably nothing). |
[edit: the below reasoning is somewhat misguided, please refer to the next post] It seems like Here's part of the lidict output from
At least some libraries seem to be at ~0x68000000. I don't think it's directly related to the FIXME that you mention in Could there be an issue with |
Actually, looking closely at the output, the raw data seems to be in reasonable shape after all. There might be some JIT-compiled code that isn't recognized, but most of the information is there. Here's an example dump of a stack trace together with the corresponding LineInfo data:
In this case, all relevant information seems to be present, but still Profile.print() doesn't show myfunc() at all:
I'll have to do some more digging, but I'm guessing that the unidentified lines mess up the profiling logic. |
They shouldn't; I get lots of unidentified lines on Linux, and they don't cause a problem. In the abstract it's hard for me to figure out the problem. If you prefer to debug it yourself (clearly you've already learned a lot about how Profile works), first try
The only thing I've noticed is that there's at least one strange thing about your copy/paste example: do you always get 2 blank lines after Alternatively, if you want me to help debug, can you send me your backtrace data? I've already gotten this set up so that it should be pretty trivial (requires installing HDF5, see installation instructions at https://github.com/timholy/HDF5.jl)
|
Thanks, I'll take a closer look when I have a moment. From a brief look, it seems like information disappears at the tree_aggregate() stage. |
I did some more digging and it seems like the cause is OTOH, maybe it would be cleaner to use the same conventions in the backtrace code between platforms, if possible? @vtjnash? |
Thanks for figuring that out! Can you rebuild julia and test now? |
Seems to work, AFAICS:
Hopefully it doesn't break 64-bit Windows. |
At most, having Thanks for working through this with me! |
LGTM. There aren't two extra frames to insert on Windows, so it's probably OK to have a different value for btskip. Alternatively, we could avoid including those extraneous two, and probably fit a few more backtraces into memory. Note, I added a new macro recently so you could write this more concisely as: |
I thought about that, but I was concerned doing that part in the C code might make it more confusing to debug if platforms differed (which indeed turned out to be the case). However, I don't think it's a major point. Perhaps more relevantly, in practice I think many backtraces have length ~20 or so, so the gain would be modest. |
Would run-length-encoding backtraces like we do when showing errors help? |
It would help a lot, and I considered it from the outset, but I have the impression that you don't want to allocate memory in a signal handler. If I'm wrong about that, then this might be viable. But note we'd have to write a "dict" type in pure C. |
...or use STL's map, if memory allocation in a signal handler is OK. |
It is not possible to allocate more memory in a signal handler. It will (eventually, randomly) lock up the program. But you can pre-allocate as much space as you want, e.g. for a fixed size dictionary. 2/20 = 10% savings isn't too bad. Alternatively, maybe it would be possible to do some really minimal but fast compression by recording how much of the prefix from the previous backtrace is duplicated in the current frame. This could also potentially help speed printing of the frames afterwards. |
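The prefix idea in the comment above can be sketched roughly as follows. This is an illustration in Python, not the profiler's C code: the names `encode_backtraces` and `decode_backtraces` are hypothetical, and the sketch assumes each backtrace is a list of instruction pointers ordered outermost frame first, so that consecutive samples tend to share their leading entries. Each record stores only the number of leading frames shared with the previous backtrace plus the frames that differ.

```python
from typing import List, Tuple

# Assumption for this sketch: a backtrace is a list of instruction
# pointers, outermost frame first.
Backtrace = List[int]
# A record is (number of leading frames shared with the previous
# backtrace, the remaining frames that differ).
Record = Tuple[int, List[int]]


def encode_backtraces(traces: List[Backtrace]) -> List[Record]:
    """Store each backtrace as a shared-prefix length plus its new tail."""
    records: List[Record] = []
    prev: Backtrace = []
    for trace in traces:
        shared = 0
        limit = min(len(prev), len(trace))
        # Count how many leading frames match the previous backtrace.
        while shared < limit and prev[shared] == trace[shared]:
            shared += 1
        records.append((shared, trace[shared:]))
        prev = trace
    return records


def decode_backtraces(records: List[Record]) -> List[Backtrace]:
    """Rebuild the full backtraces from the prefix-compressed records."""
    traces: List[Backtrace] = []
    prev: Backtrace = []
    for shared, tail in records:
        trace = prev[:shared] + tail
        traces.append(trace)
        prev = trace
    return traces


if __name__ == "__main__":
    samples = [[1, 2, 3, 4], [1, 2, 3, 5], [1, 2, 6], [7]]
    encoded = encode_backtraces(samples)
    print(encoded)
    print(decode_backtraces(encoded) == samples)
```

A version meant to run inside a signal handler would presumably write these records into a buffer allocated ahead of time rather than growing lists as this sketch does. The sketch only shows the encoding, which needs nothing beyond a comparison against the previous sample.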
I suspected as much, thanks for clarifying. A third possibility would be to use There's a (rather remote) chance that all that code is allocation-free (i.e., doesn't touch the gc()) and might therefore be callable from the signal handler. That's pretty scary, though; it might be safer to write its analog in C. Also, I think that code uses lookups, rather than instruction pointers, and from a performance perspective I don't think we want the signal handler doing lookups. So we'd probably need a rewrite anyway. Assuming there isn't some technical hurdle, I suspect this idea would lead to really massive compression. It would also be a significant influx of new code into base and add some degree of complexity. I've been enjoying having the simple |
I'm using a 32-bit Windows build (MinGW, source pulled 6-Aug-13) on Windows 8.
I wanted to try out the new built-in profiler so, following the example in the manual, I defined myfunc.jl as:
and ran the following
However, this doesn't look quite right; in particular, all traces except two seem to vanish into thin air after profile.jl.
Any ideas what's happening and/or how to debug it further?