Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Set Function.start_line when symbolizing #823

Closed
prattmic opened this issue Dec 7, 2023 · 5 comments
Closed

Set Function.start_line when symbolizing #823

prattmic opened this issue Dec 7, 2023 · 5 comments

Comments

@prattmic
Copy link
Member

prattmic commented Dec 7, 2023

Please answer these questions before submitting your issue. Thanks!

What version of pprof are you using?

If you are using pprof via go tool pprof, what's your go env output?
If you run pprof from GitHub, what's the Git revision?

a5a03c7

What operating system and processor architecture are you using?

linux-amd64

What did you do?

If possible, provide a recipe for reproducing the error.
Attaching a profile you are trying to analyze is good.

Analyze a perf.data profile, automatically converted via perf_to_profile.

$ pprof -raw perf.data                                                                                                            
Converting perf.data to a profile.proto... (May take a few minutes)                                                                           
[WARNING:src/quipper/perf_reader.cc:1323] Skipping 120 bytes of metadata: HEADER_CPU_TOPOLOGY                                                 
[WARNING:src/quipper/perf_reader.cc:1070] Skipping unsupported event PERF_RECORD_ID_INDEX                                                     
[WARNING:src/quipper/perf_reader.cc:1070] Skipping unsupported event PERF_RECORD_CPU_MAP                                                      
[WARNING:src/quipper/perf_reader.cc:1070] Skipping unsupported event UNKNOWN_EVENT_82                                                         
[INFO:src/quipper/perf_reader.cc:1061] Number of events stored: 2461                                                                          
[INFO:src/quipper/perf_parser.cc:272] Parser processed: 2 MMAP/MMAP2 events, 2 COMM events, 4 FORK events, 5 EXIT events, 2445 SAMPLE events, 2364 of these were mapped, 0 SAMPLE events with a data address, 0 of these were mapped
[INFO:src/perf_data_handler.cc:103] Using the build id found for the file name: [kernel.kallsyms], build id: ca67921314de38c21c74839716f10a65eba6f6ba.
[INFO:src/perf_data_handler.cc:103] Using the build id found for the file name: [kernel.kallsyms]_text, build id: ca67921314de38c21c74839716f10a65eba6f6ba.
[INFO:src/perf_data_handler.cc:103] Using the build id found for the file name: [kernel.kallsyms]_stext, build id: ca67921314de38c21c74839716f10a65eba6f6ba.
[WARNING:src/perf_data_handler.cc:577] stat: missing_callchain_mmap 2647/11713             
Comment: perf-version:6.5.6                                                                                                                   
Comment: perf-command:/usr/bin/perf record -g ./expensive                                                                                     
PeriodType:                                                                                                                                   
Period: 0                                                                                                                                     
Samples:                                                                                                                                      
cycles:Pu_sample/count cycles:Pu_event/count[dflt]
...
Locations                                                                                                                                     
     1: 0x0                                                                                                                                   
     2: 0x455180 M=1 _rt0_amd64 /usr/lib/google-golang/src/runtime/asm_amd64.s:16 s=0
     3: 0x40250c M=1 memeqbody /usr/lib/google-golang/src/internal/bytealg/equal_amd64.s:123 s=0
     4: 0x452e55 M=1 runtime.vdsoauxv /usr/lib/google-golang/src/runtime/vdso_linux.go:280 s=0
     5: 0x42a8cc M=1 runtime.sysauxv /usr/lib/google-golang/src/runtime/os_linux.go:315 s=0
     6: 0x42a624 M=1 runtime.sysargs /usr/lib/google-golang/src/runtime/os_linux.go:246 s=0
     7: 0x43dd5e M=1 runtime.args /usr/lib/google-golang/src/runtime/runtime1.go:69 s=0
     8: 0x457e04 M=1 runtime.args.abi0 ./<autogenerated>:1 s=0
     9: 0x4552b1 M=1 runtime.rt0_go.abi0 /usr/lib/google-golang/src/runtime/asm_amd64.s:347 s=0
    10: 0x41219c M=1 runtime.sysAllocOS /usr/lib/google-golang/src/runtime/mem_linux.go:34 s=0
...

What did you expect to see?

Function start lines are determined during symbolization.

What did you see instead?

No function start lines (s=0) in any of the locations.

None of the symbolizers attempt to look up function start lines, so this isn't too surprising.

However, since LLVM 13, llvm-symbolizer has a JSON output (first mentioned in #606) which provides function start line. I prototyped this in prattmic@e80491e (without the necessary fallback for old versions), and it is pretty straightforward. I am happy to finish this up and turn into a real PR.

My motivation here is using perf.data profiles for Go PGO. perf_to_profile doesn't symbolize profiles, but pprof does, so pprof -proto perf.data > perf.pprof is a convenient way to convert + symbolize. But Go PGO requires function start line, so today this isn't quite sufficient. We could of course write a separate symbolization tool, but IMO it seems reasonable for pprof to try to symbolize all of applicable proto fields.

@aalexand
Copy link
Collaborator

This seems reasonable to support and the change does look simple enough. I was going to ask whether the switch to hardcode llvm-symbolizer-14 means older version support would break but then I saw you mention explicitly that the code for that would be added.

The added benefit is that reading the JSON output is cleaner since this is clearly a machine-readable format. My only worry is whether this is going to be slower on large binaries - e.g. maybe LLVM is for any reason slower to produce JSON and parsing it would be added CPU time and RAM cost on the pprof side as well. It would be good to measure that at least in a couple of manual experiments.

@prattmic
Copy link
Member Author

Sounds good, thanks! I'll be sure to measure the performance.

I was going to ask whether the switch to hardcode llvm-symbolizer-14 means older version support would break but then I saw you mention explicitly that the code for that would be added.

That change is just a hack. My machine didn't have llvm-symbolizer in PATH (only variants with a version suffix), so I just hacked pprof's lookup code. That change won't be in a real PR.

@aalexand
Copy link
Collaborator

Ack, thanks! Will review the PR when you have it (no rush, the pace is up to you).

insilications added a commit to insilications/pprof that referenced this issue Sep 1, 2024
When analyzing a perf.data profile converted automatically via perf_to_profile via pprof -raw perf.data, no function start lines (s=0) are present in any of the locations. With google@813a5fb, this can be easily solved by using the same JSON frame data from llvm-symbolizer to provide StartLine for Function.start_line. This solves google#823.
aalexand added a commit that referenced this issue Sep 29, 2024
* Use llvm-symbolizer's JSON output to provide function start lines

When analyzing a perf.data profile converted automatically via perf_to_profile via pprof -raw perf.data, no function start lines (s=0) are present in any of the locations. With 813a5fb, this can be easily solved by using the same JSON frame data from llvm-symbolizer to provide StartLine for Function.start_line. This solves #823.

* Fixed TestPEFile windows test

* Fixed TestMachoFiles mac tests

* Fix making sure llvm-symbolizer is available in the CI env for Linux tests

Modify the `Fetch dependencies` step in `test-linux` job to include the installation of llvm. Add `Verify llvm-symbolizer installation` step to ensure that llvm-symbolizer is available in the CI environment for the Linux tests. This should resolve the test failures related to the missing llvm-symbolizer tool.

* ci: Improve LLVM installation for Ubuntu 20.04 and 22.04

Enhances the CI workflow to ensure proper installation of LLVM and the `llvm-symbolizer` across both Ubuntu 20.04 and 22.04 environments. These changes address the issue of an outdated llvm-symbolizer on Ubuntu 20.04, while maintaining compatibility with Ubuntu 22.04.

- Add LLVM official repository for Ubuntu 20.04
- Install LLVM 14 on Ubuntu 20.04 to ensure a recent version
- Use update-alternatives to manage llvm-symbolizer versions
- Maintain existing LLVM 14 for Ubuntu 22.04

* ci: Make it clear that our goal is to install llvm-symbolizer

---------

Co-authored-by: Alexey Alexandrov <[email protected]>
@insilications
Copy link
Contributor

@aalexand I guess this has been solved by 255acd7?

@aalexand
Copy link
Collaborator

Ah, yes - thanks. That PR includes " This solves #823" phrase but I didn't realize it was not magical enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants