Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Another controller panic under very high load #1900

Closed
deniseli opened this issue Jun 27, 2024 · 5 comments · Fixed by #1904
Closed

Another controller panic under very high load #1900

deniseli opened this issue Jun 27, 2024 · 5 comments · Fixed by #1904
Assignees

Comments

@deniseli
Copy link
Contributor

It is possible to panic the controller somehow when hammering ftl dev with a very high rate of verb calls for over a minute. I managed to hit this while sending 1 request every 10 ms and have not yet been able to reproduce it.

error:runner1: panic in RPC: goroutine 5349430 [running]:
runtime/debug.Stack()
	/Users/dli/Library/Caches/hermit/pkg/go-1.22.4/src/runtime/debug/stack.go:24 +0x64
github.com/TBD54566975/ftl/internal/rpc.handlePanic({0x1016dc658?, 0x14006150520?})
	/Users/dli/Development/ftl/internal/rpc/context.go:127 +0xdc
panic({0x101451580?, 0x102318930?})
	/Users/dli/Library/Caches/hermit/pkg/go-1.22.4/src/runtime/panic.go:770 +0x124
github.com/TBD54566975/ftl/backend/runner.(*Service).Call(0x14000564f68?, {0x1016dbc38?, 0x14003987590?}, 0x140053846e0?)
	/Users/dli/Development/ftl/backend/runner/runner.go:200 +0xac
connectrpc.com/connect.NewUnaryHandler[...].func1({0x1016e6d40, 0x1400148d780})
	/Users/dli/go/pkg/mod/connectrpc.com/[email protected]/handler.go:52 +0x13c
connectrpc.com/otelconnect.(*Interceptor).WrapUnary.func1({0x1016dbc38, 0x14003987560}, {0x1016e6d40, 0x1400148d780})
	/Users/dli/go/pkg/mod/connectrpc.com/[email protected]/interceptor.go:152 +0x10c0
github.com/TBD54566975/ftl/internal/rpc.(*metadataInterceptor).WrapUnary.func1({0x1016dc658, 0x14006150520}, {0x1016e6d40, 0x1400148d780})
	/Users/dli/Development/ftl/internal/rpc/context.go:204 +0x188
github.com/TBD54566975/ftl/internal/rpc.(*panicInterceptor).WrapUnary.func1({0x1016dc658?, 0x14006150520?}, {0x1016e6d40?, 0x1400148d780?})
	/Users/dli/Development/ftl/internal/rpc/context.go:150 +0x70
connectrpc.com/connect.NewUnaryHandler[...].func2({0x102890ca8, 0x14006150540})
	/Users/dli/go/pkg/mod/connectrpc.com/[email protected]/handler.go:70 +0x84
connectrpc.com/connect.(*Handler).ServeHTTP(0x140002f8a80, {0x1016d86b0, 0x1400587fd70}, 0x14000adc480)
	/Users/dli/go/pkg/mod/connectrpc.com/[email protected]/handler.go:238 +0x7f0
github.com/TBD54566975/ftl/backend/protos/xyz/block/ftl/v1/ftlv1connect.NewVerbServiceHandler.func1({0x1016d86b0, 0x1400587fd70}, 0x14000adc480)
	/Users/dli/Development/ftl/backend/protos/xyz/block/ftl/v1/ftlv1connect/ftl.connect.go:300 +0x180
net/http.HandlerFunc.ServeHTTP(0x140006389c0?, {0x1016d86b0?, 0x1400587fd70?}, 0x100af5814?)
	/Users/dli/Library/Caches/hermit/pkg/go-1.22.4/src/net/http/server.go:2166 +0x38
net/http.(*ServeMux).ServeHTTP(0x14000adc360?, {0x1016d86b0, 0x1400587fd70}, 0x14000adc480)
	/Users/dli/Library/Caches/hermit/pkg/go-1.22.4/src/net/http/server.go:2683 +0x1a4
github.com/TBD54566975/ftl/internal/rpc.NewServer.ContextValuesMiddleware.func3({0x1016d86b0, 0x1400587fd70}, 0x14000adc360)
	/Users/dli/Development/ftl/internal/rpc/rpc.go:98 +0xb4
net/http.HandlerFunc.ServeHTTP(0x14004d22180?, {0x1016d86b0?, 0x1400587fd70?}, 0x140043e4e00?)
	/Users/dli/Library/Caches/hermit/pkg/go-1.22.4/src/net/http/server.go:2166 +0x38
golang.org/x/net/http2.(*serverConn).runHandler(0x1023b6800?, 0x0?, 0x0?, 0x0?)
	/Users/dli/go/pkg/mod/golang.org/x/[email protected]/http2/server.go:2414 +0xfc
created by golang.org/x/net/http2.(*serverConn).scheduleHandler in goroutine 3261
	/Users/dli/go/pkg/mod/golang.org/x/[email protected]/http2/server.go:2348 +0x208
: runtime error: invalid memory address or nil pointer dereference

In case the controller code changes significantly from my local version as I hit this, here's a reference to the line in the above call stack:

ftl/backend/runner/runner.go:200:

func (s *Service) Call(ctx context.Context, req *connect.Request[ftlv1.CallRequest]) (*connect.Response[ftlv1.CallResponse], \
error) {
        deployment, ok := s.deployment.Load().Get()
        if !ok {
                return nil, connect.NewError(connect.CodeUnavailable, errors.New("no deployment"))
        }
        response, err := deployment.plugin.Client.Call(ctx, req)
        return connect.NewResponse(response.Msg), err // this line right here
}
@github-actions github-actions bot added the triage Issue needs triaging label Jun 27, 2024
@ftl-robot ftl-robot mentioned this issue Jun 27, 2024
@alecthomas
Copy link
Collaborator

This is an easy one. The code should be:

        response, err := deployment.plugin.Client.Call(ctx, req)
        if err != nil {
                return nil, err
        }
        return connect.NewResponse(response.Msg), nil

@deniseli deniseli self-assigned this Jun 28, 2024
@github-actions github-actions bot removed the triage Issue needs triaging label Jun 28, 2024
@deniseli
Copy link
Contributor Author

Awesome! I'll try to get a quick fix out

@deniseli
Copy link
Contributor Author

@alecthomas did you say there was a lint rule we could turn back on to find all the lines where we have an unchecked err? I wonder if we should just do that now - should be quick enough and it could be hiding more panics

@alecthomas
Copy link
Collaborator

That was a different linter, which is now enabled. I don't think there is one for this situation, but I could be wrong.

@deniseli
Copy link
Contributor Author

Ahh, got it

deniseli added a commit that referenced this issue Jun 28, 2024
Fixes #1900

I'm not totally confident in the error code here - possibly CodeUnknown
would be more appropriate?
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants