-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cmd/pprof: disasm not working on darwin #50891
Comments
but if the binary (which is in wd) is explicitly passed as FWIW, I see the same behaviour with go1.17.5 (above tests are from tip). (This is on Linux). |
Agreed, as far as I can tell this has never worked. This works (on linux-amd64):
Removing the release-blocker label. |
I stub my toe on this UX about every other time I use pprof. |
It seems the go command first runs the binary in the tmp directory, collects the profile, then moves it to the current directory. The file path recorded is the one in the tmp directory.
Perhaps the go command can move the file first then run it? Another part is that it seems the file names recorded in the profile are from profile.Mapping, which is from reading /proc/self/maps, which probably works only on Linux. So on macOS we may need to always pass the executable file name to pprof, even if the go command does the mv first. |
Thanks. Indeed, not a release blocker. It should be easy to run the final executable instead of the one from /tmp to fix Linux. For the Mac we should be able to record os.Executable, which might be sufficient. All good things for a future release. |
Getting the same problem on Windows:
$ go version go version go1.17.6 windows/amd64
|
The interesting thing is that if you cross-compile an amd64 version of go on an m1 mac, and then use this go to test and profile, you can successfully execute disasm in pprof. |
Any update on this issue ? Given that gdb and perf tool are not available on m1 mac, pprof has become my first choice for debugging on m1 mac, and the ineffectiveness of disasm has caused great inconvenience. |
It also works after second run of |
I've done a bit of digging into this since I distinctly recall this working circa 2016-17. In short, this did work before a change to pprof that broke things initially for intel (2018 onward) and then for arm when it was introduced in 2020. Fixing pprof is easy and this fixes intel, but not arm macs. For arm, the pc values available within a running process differ from those in the executable's symbol table; this only happens for apple silicon systems and not for other arm (linux/raspberry pi) systems. The symbol table values are "correct" in the sense that objdump displays correct looking assembly code at those locations but are modified, I presume by dyld, when the executable is loaded into memory. In addition, the values change every time the process is run, suggesting that dyld is non-deterministic, which if correct, kinda sucks. For pprof, google/pprof#313 introduced a check for 'textSegment.Addr > start' which will always fail since the text segments start at 0x100000000. The error occurs when testing for the first argument to pprof being an executable at startup (but the error is ignored there) and when attempting to load the executable when disassembly is requested. Again, no error is displayed. Fixing this fixes the problem for intel based macs, but not for arm based ones and explains why I recall this working correctly before 2018 or so. I am puzzled as to how some of the earlier notes on this thread state that the problem was the executable being in an inaccessible location. For the eg.go shown below, here's the output on an arm mac and an intel one: apple silicon - the values change every time it is run.
intel:
arm64 on a linux raspberry pi:
package main
import (
"fmt"
"runtime"
)
func main() {
pcs := make([]uintptr, 100)
n := runtime.Callers(0, pcs)
pcs = pcs[:n]
fmt.Printf("%x\n", pcs)
frames := runtime.CallersFrames(pcs[:n])
for {
frame, more := frames.Next()
if !more {
break
}
fmt.Printf("%x %s\n", frame.PC, frame.Function)
}
} |
oh, it seems that src/runtime/pprof/proto_other.go's implementation of readMapping works only on linux - i.e. it reads /proc/self/maps which doesn't exist on macos. Maybe that should be using vmmap, or the underlying APIs that it uses? |
And sure enough, a quick hack (that runs vmmap) shows that this, at least for one test case, works. Ideally readMappings should use mach_vm_region rather than vmmap. With this in place, another issue with pprof shows up whereby listing source code via the webui (not the cli list command) breaks. This has not worked for so long now that other bugs have crept in. |
See https://go-review.googlesource.com/c/go/+/498375 for a proposed fix. |
@cosnicolaou thanks for the CL! As you noted, it is better to use the |
If you give me some pointers as to how you'd like the new system call added to the runtime I can give it a try. I'm reasonably confident it will work since I already wrote a stand-alone C program to get the appropriate information, assuming you're ok with the assumption that the text section we care about is the first/lowest address? I didn't prototype any means of getting the name of the binary associated with a segment. |
https://cs.opensource.google/go/go/+/master:src/runtime/sys_darwin.go this contains the syscall code. Basically, it needs a Go function like https://cs.opensource.google/go/go/+/master:src/runtime/sys_darwin.go;l=123 , an import pragma like https://cs.opensource.google/go/go/+/master:src/runtime/sys_darwin.go;l=123 , and an assembly trampoline like https://cs.opensource.google/go/go/+/master:src/runtime/sys_darwin_amd64.s;l=415. These are defined in the runtime package. You'll need to export them to the runtime/pprof package. You could do something similar to what we did for crypto/x509: https://cs.opensource.google/go/go/+/master:src/runtime/sys_darwin.go;l=105 and https://cs.opensource.google/go/go/+/master:src/crypto/x509/internal/macos/corefoundation.go;l=62 Yeah, I tried a C program, too, but didn't get a chance to port it to the Go runtime :)
Yes, I think that should be fine. In fact I'm not sure we actually need this assumption. I think you can look for just the executable mappings, matching with the executable segments from the binary. For disassembly we only need executable segments anyway. |
I've spent a little time looking at this, and of course, have some questions!
|
@cosnicolaou I'm sorry to say that in the runtime package it's pretty much hand-editing these days for new constants and types. Getting back to a sane state is #23341. |
@ianlancetaylor thanks for responding so quickly! I'll just hand edit things which is fine, since it's not a lot of code. Do you have any suggestions for 2? Thanks! |
You need to access a global variable defined in C code that is always linked in? Is that correct? And there is no C function that returns the value? I'm not sure that a |
For 2, I think cgo_import_dynamic is the right answer. Currently I think most use cases are for functions, not variables. So it is possible that the linker just doesn't handle variables well. I can take a look. |
The variable is 'mach_task_self_' declared as an extern in
mach/mach_init.h, as 'extern mach_port_t mach_task_self_;' and is
initialized by the OS when a process starts.
I've tried referencing it from go assembly, as in:
TEXT main·mach_task_self(SB),NOSPLIT,$0
//MOVD $100, R0
MOVD mach_task_self_<>+0(SB), R0
MOVD R0, ret+8(SP)
RET
with various attempts of cgo_import_dynamic, none of which work. I believe
the syntax for cgo_import_dynamic is cgo_import_dynamic
'my-name-for-symbol' 'name-of-symbol-in-library'. My C program that
accesses it only depends on libSystem, so I assume that this is the correct
library to use. It appears undefined in the executable as _mach_task_self_
, hence I believe that its name in the library is also _mach_task_self_.
This suggests that the following line, in a go file in the same package as
the above assembly file, should work.
//go:cgo_import_dynamic mach_task_self_ _mach_task_self_
"/usr/lib/libSystem.B.dylib"
I get a linker error of `main.mach_task_self: relocation target
mach_task_self_ not defined`.
Would you expect this to work? Or am I missing something? I've tried many
combinations of names, with/without leading underscores, the Core
foundation libraries also.
I've never used this part of the system before so it's hard for me to know
if I'm messing things up or if it's a bug somewhere.
Cheers, Cos.
…On Wed, Jun 7, 2023 at 12:34 PM cherrymui ***@***.***> wrote:
For 2, I think cgo_import_dynamic is the right answer. Currently I think
most use cases are for functions, not variables. So it is possible that the
linker just doesn't handle variables well. I can take a look.
—
Reply to this email directly, view it on GitHub
<#50891 (comment)>, or
unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACCMUYDVJHCMVLNQ6CWVASTXKDJUZANCNFSM5NBQ3LLQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I don't know if this is the problem, but in this line:
you don't want the
|
I took a look at the linker code and I think we just don't handle cgo_import_dynamic for variables well (because we never needed it before). I'll see if I can make the linker handle this. Thanks. |
Ah yes, Ian's suggestion that definitely helps - I forgot to remove the <> when trying out different data approaches. Now I get the error below: Which digging into the linker shows that a value is never set for the external symbol which could be because either it doesn't exist or the linker doesn't handle variables as well as functions for the cgo pragmas. Sounds like the latter. I extracted the libraries from the shared cache and found that this symbol is defined in the lowest level dyld library rather than libSystem (see below), but making that change to the cgo_import doesn't help, so I think it is the linker. $nm ./libraries/usr/lib/dyld | grep mach_task_self |
Yeah, it misses some case in the linker. The patch below seems to make it work:
(Currently AMD64 internal linking mode only. I still need to do external linking mode (i.e. linking with cgo) and ARM64.) With this patch, this seems to work: in Go, add
and in assembly,
Thanks. |
This should handle external linking. Still need to do ARM64. Thanks. |
ok, I don't have access to an intel mac right now so I tried the above in the arm64/asm.go. Unfortunately, I think a larger change is required. adddynrel is called for the symbol main.get_mach_task_self and a targ of libc_mach_task_self, but the type of the relocation is R_ADDRARM64 which is not caught by any of the switch arms in adddynrel. Note that main.get_mach_task_self is the function that references the variable. Hope this helps! |
@cosnicolaou this is the ARM64 version. It has different relocation types and instructions. Hopefully this will work. Thanks!
I'll clean it up and turn it into a CL. Thanks. |
Yep - it does indeed work for arm64! Thanks! I'll try the intel version when I get back home in a week or so and have access to an intel mbp. |
Change https://go.dev/cl/501855 mentions this issue: |
FYI, I have the arm version of vm_region_info working. What are your plans for merging the linker changes and how should I construct the PRs for my changes - i.e. would you like to see one PR with the mach_vm_region assembly code and a second one with the pprof mapping changes, or would you prefer one? I would assume one, but lmk. |
This is great! Thanks. We are currently in the freeze for Go 1.21 release, so I have to wait for the tree opens (probably late July) then I can submit the linker CL. (In the meantime, feel free to copy that code into your CL, if that's easier for you.) For your change, yeah, a single CL is fine. Thanks. |
Take a look at https://go-review.googlesource.com/c/go/+/503919 when you have a moment, it includes your changes too. |
Change https://go.dev/cl/503919 mentions this issue: |
Currently, on darwin, we only support cgo_dynamic_import for functions, but not variables, as we don't need it before. mach_task_self_ is a variable defined in the system library, which can be used to e.g. access the process's memory mappings via the mach API. The C header defines a macro mach_task_self(), which refers to the variable. To use mach_task_self_ (in pure-Go programs) we need to access it in Go. This CL handles cgo_dynamic_import for variables in the linker, loading its address via the GOT. (Currently only on Darwin, as we only need it there.) For #50891. Change-Id: Idf64fa88ba2f2381443a1ed0b42b14b581843493 Reviewed-on: https://go-review.googlesource.com/c/go/+/501855 Run-TryBot: Cherry Mui <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Reviewed-by: Than McIntosh <[email protected]>
Change https://go.dev/cl/516156 mentions this issue: |
The type machVMRegionBasicInfoData is generated from C type vm_region_basic_info_data_64_t, which is a packed struct with a 64-bit field at offset 20. We cannot use uint64 as the field type in the Go struct, as that will be aligned at offset 24, which does not match the C struct. Change back to [8]byte (which is what the cgo command generates), but keep the name Offset. Updates #61707. Updates #50891. Change-Id: I2932328d7f9dfe9d79cff89752666c794d4d3788 Reviewed-on: https://go-review.googlesource.com/c/go/+/516156 Reviewed-by: Ian Lance Taylor <[email protected]> TryBot-Result: Gopher Robot <[email protected]> Run-TryBot: Cherry Mui <[email protected]>
Change https://go.dev/cl/518315 mentions this issue: |
Change https://go.dev/cl/546475 mentions this issue: |
For #50891. For #61015. For #61422. Change-Id: I30d580814ac02fe9f3fbd1a101b2cc05947a9aaa Reviewed-on: https://go-review.googlesource.com/c/go/+/546475 Reviewed-by: Cherry Mui <[email protected]> Reviewed-by: Michael Pratt <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
For golang#50891. For golang#61015. For golang#61422. Change-Id: I30d580814ac02fe9f3fbd1a101b2cc05947a9aaa Reviewed-on: https://go-review.googlesource.com/c/go/+/546475 Reviewed-by: Cherry Mui <[email protected]> Reviewed-by: Michael Pratt <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
Change https://go.dev/cl/592499 mentions this issue: |
CL 501855 added support for cgo_dynamic_import variables on Darwin. But it didn't support the plugin build mode on amd64, where the assembler turns a direct load (R_PCREL) to a load via GOT (R_GOTPCREL). This CL adds the support. We just need to handle external linking mode, as this can only occur in plugin or shared build mode, which requires external linking. Fixes #67976. Updates #50891. Change-Id: I0f56265d50bfcb36047fa5538ad7a5ec77e7ef96 Reviewed-on: https://go-review.googlesource.com/c/go/+/592499 Reviewed-by: David Chase <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
Change https://go.dev/cl/595175 mentions this issue: |
…Darwin in plugin mode CL 501855 added support for cgo_dynamic_import variables on Darwin. But it didn't support the plugin build mode on amd64, where the assembler turns a direct load (R_PCREL) to a load via GOT (R_GOTPCREL). This CL adds the support. We just need to handle external linking mode, as this can only occur in plugin or shared build mode, which requires external linking. Fixes #68122. Updates #67976. Updates #50891. Change-Id: I0f56265d50bfcb36047fa5538ad7a5ec77e7ef96 Reviewed-on: https://go-review.googlesource.com/c/go/+/592499 Reviewed-by: David Chase <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]> (cherry picked from commit 44f1870) Reviewed-on: https://go-review.googlesource.com/c/go/+/595175 Reviewed-by: Joedian Reid <[email protected]> Reviewed-by: Than McIntosh <[email protected]>
CL 501855 added support for cgo_dynamic_import variables on Darwin. But it didn't support the plugin build mode on amd64, where the assembler turns a direct load (R_PCREL) to a load via GOT (R_GOTPCREL). This CL adds the support. We just need to handle external linking mode, as this can only occur in plugin or shared build mode, which requires external linking. Fixes golang#67976. Updates golang#50891. Change-Id: I0f56265d50bfcb36047fa5538ad7a5ec77e7ef96 Reviewed-on: https://go-review.googlesource.com/c/go/+/592499 Reviewed-by: David Chase <[email protected]> LUCI-TryBot-Result: Go LUCI <[email protected]>
On my Mac, if I run go test -cpuprofile=x.prof anything, and then go tool pprof x.prof, then the disasm command in pprof does not work, for any function at all:
I have not checked whether this is Mac-specific.
The text was updated successfully, but these errors were encountered: