Commit

Merge #39466
39466: distsqlrun: log on distsqlrun.outbox Dial error r=yuzefovich a=asubiotto

It's useful to know when an outbox was unable to dial a node in a flow
since this failure could lead to timeouts.

Release note: None

Co-authored-by: Alfonso Subiotto Marqués <[email protected]>
craig[bot] and asubiotto committed Aug 8, 2019
2 parents 458c0a4 + ebb2667 commit 311f7e4
Showing 1 changed file with 9 additions and 0 deletions.
9 changes: 9 additions & 0 deletions pkg/sql/distsqlrun/outbox.go
@@ -219,6 +219,15 @@ func (m *outbox) mainLoop(ctx context.Context) error {
 	var err error
 	conn, err = m.flowCtx.Cfg.NodeDialer.Dial(ctx, m.nodeID)
 	if err != nil {
+		// Log any Dial errors. This does not have a verbosity check due to being
+		// a critical part of query execution: if this step doesn't work, the
+		// receiving side might end up hanging or timing out.
+		// TODO(asubiotto): On top of ignoring the circuit breaker here (#38602),
+		// we should also retry a failed Dial. Both changes rest on the argument
+		// that the gateway planned this query with the assumption that the
+		// remote node was reachable, the outbox should at least try a bit harder
+		// to make sure that this is in fact not the case.
+		log.Infof(ctx, "outbox: connection dial error: %+v", err)
 		return err
 	}
 	client := distsqlpb.NewDistSQLClient(conn)