client: retry RPC call when no server is available #15140

Merged 4 commits on Nov 4, 2022
Changes from 2 commits
3 changes: 3 additions & 0 deletions .changelog/15140.txt
@@ -0,0 +1,3 @@
```release-note:bug
client: prevent allocations from failing on client reconnect by retrying RPC requests when no servers are available yet
```
43 changes: 24 additions & 19 deletions client/rpc.go
@@ -70,34 +70,39 @@ func (c *Client) RPC(method string, args interface{}, reply interface{}) error {
 	}

 TRY:
+	var rpcErr error
+
 	server := c.servers.FindServer()
 	if server == nil {
-		return noServersErr
+		rpcErr = noServersErr
 	}

-	// Make the request.
-	rpcErr := c.connPool.RPC(c.Region(), server.Addr, method, args, reply)
+	if server != nil {
+		// Make the request.
+		rpcErr = c.connPool.RPC(c.Region(), server.Addr, method, args, reply)

-	if rpcErr == nil {
-		c.fireRpcRetryWatcher()
-		return nil
-	}
+		if rpcErr == nil {
+			c.fireRpcRetryWatcher()
+			return nil
+		}

-	// If shutting down, exit without logging the error
-	select {
-	case <-c.shutdownCh:
-		return nil
-	default:
-	}
+		// If shutting down, exit without logging the error
+		select {
+		case <-c.shutdownCh:
+			return nil
+		default:
+		}

-	// Move off to another server, and see if we can retry.
-	c.rpcLogger.Error("error performing RPC to server", "error", rpcErr, "rpc", method, "server", server.Addr)
-	c.servers.NotifyFailedServer(server)
+		// Move off to another server, and see if we can retry.
+		c.rpcLogger.Error("error performing RPC to server", "error", rpcErr, "rpc", method, "server", server.Addr)
+		c.servers.NotifyFailedServer(server)

-	if !canRetry(args, rpcErr) {
-		c.rpcLogger.Error("error performing RPC to server which is not safe to automatically retry", "error", rpcErr, "rpc", method, "server", server.Addr)
-		return rpcErr
+		if !canRetry(args, rpcErr) {
+			c.rpcLogger.Error("error performing RPC to server which is not safe to automatically retry", "error", rpcErr, "rpc", method, "server", server.Addr)
+			return rpcErr
+		}
 	}

 	if time.Now().After(deadline) {
 		// Blocking queries are tricky. jitters and rpcholdtimes in multiple places can result in our server call taking longer than we wanted it to. For example:
 		// a block time of 5s may easily turn into the server blocking for 10s since it applies its own RPCHoldTime. If the server dies at t=7s we still want to retry