Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: TLS slot index over 64 and crash #59213

Closed
funte opened this issue Mar 24, 2023 · 13 comments
Closed

runtime: TLS slot index over 64 and crash #59213

funte opened this issue Mar 24, 2023 · 13 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Milestone

Comments

@funte
Copy link

funte commented Mar 24, 2023

What version of Go are you using (go version)?

$ go version
 go version go1.20.2 windows/amd64

Does this issue reproduce with the latest release?

yes

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
set GO111MODULE=on
set GOARCH=amd64
set GOBIN=
set GOCACHE=C:\Users\...\AppData\Local\go-build
set GOENV=C:\Users\...\AppData\Roaming\go\env
set GOEXE=.exe
set GOEXPERIMENT=
set GOFLAGS=
set GOHOSTARCH=amd64
set GOHOSTOS=windows
set GOINSECURE=
set GOMODCACHE=...
set GONOPROXY=
set GONOSUMDB=
set GOOS=windows
set GOPATH=...
set GOPRIVATE=
set GOPROXY=https://proxy.golang.org,direct
set GOROOT=...
set GOSUMDB=sum.golang.org
set GOTMPDIR=
set GOTOOLDIR=...
set GOVCS=
set GOVERSION=go1.20.2
set GCCGO=gccgo
set GOAMD64=v1
set AR=ar
set CC=gcc
set CXX=g++
set CGO_ENABLED=1
set GOMOD=NUL
set GOWORK=
set CGO_CFLAGS=-O2 -g
set CGO_CPPFLAGS=
set CGO_CXXFLAGS=-O2 -g
set CGO_FFLAGS=-O2 -g
set CGO_LDFLAGS=-O2 -g
set PKG_CONFIG=pkg-config
set GOGCCFLAGS=-m64 -mthreads -Wl,--no-gc-sections -fmessage-length=0 -fdebug-prefix-map=C:\Users\xxx\AppData\Local\Temp\go-build2366715319=/tmp/go-build -gno-record-gcc-switches

What did you do?

  1. Build a golang dll in windows with command go build -buildmode=c-shared -o="a.dll" "./main.go":
package main

import "C"

func main() {}
  1. Using the windows api CreateRemoteThread inject a.dll to any process.
  2. Wait seconds the injected process crash and ida debugger show this:
    1
    The TlsAlloc returned slot index over 64 and the golang runtime assertion failed.

What did you expect to see?

What did you see instead?

@qmuntal qmuntal added OS-Windows NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Mar 24, 2023
@qmuntal
Copy link
Member

qmuntal commented Mar 24, 2023

Thanks for reporting this issue @funte. Could you share a code sample for the step 2?

TlsAlloc shouldn't return a slot index over 64 unless the target process already has the previous Tls slots reserved, either by itself or by loaded DLLs, which is not a typical situation. That's why it seems strange to me that you can reproduce the crash injecting the Go DLL to any process.

Previous comment aside, Windows does support more than 64 slots (more context here), but the Go runtime does not currently support them because they need some special handling on our side which is not there yet.

@golang/windows @golang/compiler

@seankhliao seankhliao changed the title affected/package: TLS slot index over 64 and crash runtime: TLS slot index over 64 and crash Mar 24, 2023
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Mar 24, 2023
@funte
Copy link
Author

funte commented Mar 24, 2023

Thanks for reporting this issue @funte. Could you share a code sample for the step 2?

TlsAlloc shouldn't return a slot index over 64 unless the target process already has the previous Tls slots reserved, either by itself or by loaded DLLs, which is not a typical situation. That's why it seems strange to me that you can reproduce the crash injecting the Go DLL to any process.

Previous comment aside, Windows does support more than 64 slots (more context here), but the Go runtime does not currently support them because they need some special handling on our side which is not there yet.

@golang/windows @golang/compiler

Reproduce project here https://github.com/funte/59213.git

@qmuntal
Copy link
Member

qmuntal commented Mar 25, 2023

@funte I can't run D3D12RaytracingHelloWorld.exe because I don't have the necessary drivers to support DirectX Raytracing, but looking at the source code of microsoft/DirectX-Graphics-Samples it looks like it makes heavy use of thread local variables, to the point that it can occupy more than 64 TLS slots.

Could you confirm this by debugging your test using windbg, run !tls -1 (see docs) and share the output? That will tell how many slots are in use.

@funte
Copy link
Author

funte commented Mar 26, 2023

@funte I can't run D3D12RaytracingHelloWorld.exe because I don't have the necessary drivers to support DirectX Raytracing, but looking at the source code of microsoft/DirectX-Graphics-Samples it looks like it makes heavy use of thread local variables, to the point that it can occupy more than 64 TLS slots.

Could you confirm this by debugging your test using windbg, run !tls -1 (see docs) and share the output? That will tell how many slots are in use.

Just i mistake the thread id, after change to main thread and run the !tls -1 command, show this, seems its full:
image
Another situtation to note, if i inject the target process immediately after its startup, it will running normaly, and i have change my dll project to c++ and this crash not happen anymore.

@qmuntal
Copy link
Member

qmuntal commented Mar 26, 2023

Thanks the detailed report @funte.

Just i mistake the thread id, after change to main thread and run the !tls -1 command, show this, seems its full

That's the cause of the panic in the Go runtime time. Go does not currently support TLS slot indexes equal or higher than 64, this should be fixed. I'll try to send a CL during this development cycle.

Another situtation to note, if i inject the target process immediately after its startup, it will running normaly, and i have change my dll project to c++ and this crash not happen anymore.

Can you share this project? I doubt it is doing exactly the same as the Go runtime does.

@qmuntal qmuntal self-assigned this Mar 26, 2023
@qmuntal qmuntal added this to the Go1.21 milestone Mar 26, 2023
@qmuntal
Copy link
Member

qmuntal commented Apr 19, 2023

Small recap: Since Go 1.20, the Go runtime allocates the TLS slot in the TEB TLS slots instead of using the TLS arbitrary slot. See CL 431775 for more context.

The problem is that the TEB TLS slots array only has capacity for 64 indices, allocating more requires some complex logic that we don't support yet. Although the Go runtime only allocates one index, a Go DLL can be loaded in a process with more than 64 TLS slots allocated, in which case we abort:

MOVQ runtime·_TlsAlloc(SB), AX
CALL AX
MOVQ 32(SP), SP
MOVQ AX, CX // TLS index
// Assert that slot is less than 64 so we can use _TEB->TlsSlots
CMPQ CX, $64
JB ok
CALL runtime·abort(SB)

I've tried to find an easy solution to support TLS indices higher than 64 that can be backported, but I haven't. It requires more investigation and I won't have bandwidth to get it done before the go1.21 freeze.

On the other hand, a good compromise could be to fallback to use the TLS arbitrary slot (as we were doing before g1.20) when the TEB TLS slots are filled, instead of panicking. Although not ideal, at least the regression would be fixed and we could backport it.

Thoughts @alexbrainman?

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/486816 mentions this issue: runtime: fallback to TEB TLS arbitrary pointer when TLS slots are full

@qmuntal
Copy link
Member

qmuntal commented Apr 20, 2023

@funte @fzwoch could you try if the CL 486816 fixes your issues?

You can do that by running gotip download 486816 (https://pkg.go.dev/golang.org/dl/gotip), which downloads the CL and builds Go from source with the patch applied, and then use gotip build instead of go build.

@funte
Copy link
Author

funte commented Apr 21, 2023

@qmuntal Its work!!

@funte funte closed this as completed Apr 21, 2023
@github-project-automation github-project-automation bot moved this from Todo to Done in Go Compiler / Runtime Apr 21, 2023
@qmuntal
Copy link
Member

qmuntal commented Apr 21, 2023

CL 486816 hasn't been merged yet, reponing the issue till it happens.

@qmuntal qmuntal reopened this Apr 21, 2023
@github-project-automation github-project-automation bot moved this from Done to In Progress in Go Compiler / Runtime Apr 21, 2023
@dmitshur dmitshur added NeedsFix The path to resolution is known, but the work has not been done. and removed NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one. labels Apr 25, 2023
@github-project-automation github-project-automation bot moved this from In Progress to Done in Go Compiler / Runtime Apr 25, 2023
@qmuntal
Copy link
Member

qmuntal commented May 31, 2023

@gopherbot please backport to Go 1.20

@gopherbot
Copy link
Contributor

Backport issue(s) opened: #60535 (for 1.20).

Remember to create the cherry-pick CL(s) as soon as the patch is submitted to master, according to https://go.dev/wiki/MinorReleases.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/504475 mentions this issue: runtime: fallback to TEB arbitrary pointer when TLS slots are full

gopherbot pushed a commit that referenced this issue Jun 22, 2023
…en TLS slots are full

Note: This CL is cherry-picked from CL 486816 omitting the changes
in sys_windows_386.s, which don't apply to go1.20 release branch
because windows/386 started using the TEB TLS slot in go1.21 (CL 454675).

The Go runtime allocates the TLS slot in the TEB TLS slots instead of
using the TEB arbitrary pointer. See CL 431775 for more context.

The problem is that the TEB TLS slots array only has capacity for 64
indices, allocating more requires some complex logic that we don't
support yet.

Although the Go runtime only allocates one index, a Go DLL can be
loaded in a process with more than 64 TLS slots allocated,
in which case it abort.

This CL avoids aborting by falling back to the older behavior, that
is to use the TEB arbitrary pointer.

Fixes #60535
Updates #59213

Change-Id: I39c73286fe2da95aa9c5ec5657ee0979ecbec533
Reviewed-on: https://go-review.googlesource.com/c/go/+/486816
Reviewed-by: Dmitri Shuralyov <[email protected]>
Reviewed-by: Cherry Mui <[email protected]>
Run-TryBot: Quim Muntal <[email protected]>
Reviewed-by: Bryan Mills <[email protected]>
Reviewed-by: Alex Brainman <[email protected]>
TryBot-Result: Gopher Robot <[email protected]>
(cherry picked from commit 715d53c)
Reviewed-on: https://go-review.googlesource.com/c/go/+/504475
Reviewed-by: Dmitri Shuralyov <[email protected]>
Auto-Submit: Dmitri Shuralyov <[email protected]>
@golang golang locked and limited conversation to collaborators Jun 19, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
compiler/runtime Issues related to the Go compiler and/or runtime. FrozenDueToAge NeedsFix The path to resolution is known, but the work has not been done. OS-Windows
Projects
None yet
Development

No branches or pull requests

4 participants