Merge #117519
117519: roachtest: fail the latency verifier immediately if latency is too high r=stevendanna a=msbutler

Previously, the latency verifier used by c2c and cdc roachtests would not fail immediately if latency was too high. This patch fails the latency verifier immediately, as the roachtest is going to eventually fail anyway.

In addition, immediately after a latency verifier failure, this patch captures c2c debug zips to better understand the cause of the increased latency.

Fixes #117182

Release note: none

Co-authored-by: Michael Butler <[email protected]>
craig[bot] and msbutler committed Jan 9, 2024
2 parents 0f8d589 + 846ccf9 commit 01da063
Showing 2 changed files with 12 additions and 1 deletion.
9 changes: 8 additions & 1 deletion pkg/cmd/roachtest/tests/cluster_to_cluster.go
@@ -930,7 +930,14 @@ func (rd *replicationDriver) main(ctx context.Context) {

 	latencyMonitor := rd.newMonitor(ctx)
 	latencyMonitor.Go(func(ctx context.Context) error {
-		return lv.pollLatencyUntilJobSucceeds(ctx, rd.setup.dst.db, ingestionJobID, time.Second, workloadDoneCh)
+		if err := lv.pollLatencyUntilJobSucceeds(ctx, rd.setup.dst.db, ingestionJobID, time.Second, workloadDoneCh); err != nil {
+			// The latency poller may have failed because latency got too high. Grab a
+			// debug zip before the replication jobs spin down.
+			rd.fetchDebugZip(ctx, rd.setup.src.nodes, "latency_source_debug.zip")
+			rd.fetchDebugZip(ctx, rd.setup.dst.nodes, "latency_dest_debug.zip")
+			return err
+		}
+		return nil
 	})
 	defer latencyMonitor.Wait()
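
The hunk above wraps the poller so that diagnostics are gathered only on failure and the original error is still returned. A minimal standalone sketch of that pattern follows. It is not the roachtest code: `runWithDiagnostics`, `errLatencyTooHigh`, and `demo` are hypothetical names introduced for illustration, and the collectors only record their names where the real change fetches debug zips.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// errLatencyTooHigh is a hypothetical sentinel standing in for the poller's error.
var errLatencyTooHigh = errors.New("latency too high")

// runWithDiagnostics runs poll and, only if it fails, invokes every collector
// before returning the original error unchanged.
func runWithDiagnostics(
	ctx context.Context,
	poll func(context.Context) error,
	collectors ...func(context.Context),
) error {
	if err := poll(ctx); err != nil {
		for _, collect := range collectors {
			collect(ctx)
		}
		return err
	}
	return nil
}

// demo runs the wrapper with a stub poller and reports which collectors ran.
func demo(fail bool) ([]string, error) {
	var collected []string
	poll := func(context.Context) error {
		if fail {
			return errLatencyTooHigh
		}
		return nil
	}
	record := func(name string) func(context.Context) {
		return func(context.Context) { collected = append(collected, name) }
	}
	err := runWithDiagnostics(context.Background(), poll, record("source"), record("dest"))
	return collected, err
}

func main() {
	collected, err := demo(true)
	fmt.Println(collected, err)
}
```

Collecting before returning matters here because, as the code comment in the hunk notes, the replication jobs spin down once the error propagates.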

4 changes: 4 additions & 0 deletions pkg/cmd/roachtest/tests/latency_verifier.go
@@ -181,6 +181,10 @@ func (lv *latencyVerifier) pollLatencyUntilJobSucceeds(
 			lv.logger.Printf("unexpected status: %s, error: %s", status, info.GetError())
 			return errors.Errorf("unexpected status: %s", status)
 		}
+		if lv.targetSteadyLatency != 0 && lv.maxSeenSteadyLatency > lv.targetSteadyLatency {
+			return errors.Errorf("max latency was more than allowed: %s vs %s",
+				lv.maxSeenSteadyLatency, lv.targetSteadyLatency)
+		}
 	}
 }
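
The added check makes the poll loop return as soon as the maximum observed steady-state latency exceeds a nonzero target. The sketch below isolates that fail-fast rule under simplifying assumptions: `latencyTracker`, `observe`, and `firstFailure` are hypothetical names, every sample is treated as a steady-state sample, and the standard library's `fmt.Errorf` stands in for the `errors.Errorf` call in the diff.

```go
package main

import (
	"fmt"
	"time"
)

// latencyTracker is a hypothetical stand-in for the roachtest latencyVerifier.
type latencyTracker struct {
	target  time.Duration // zero disables the check
	maxSeen time.Duration // largest sample observed so far
}

// observe records one sample and fails as soon as the running maximum
// exceeds a nonzero target.
func (lt *latencyTracker) observe(sample time.Duration) error {
	if sample > lt.maxSeen {
		lt.maxSeen = sample
	}
	if lt.target != 0 && lt.maxSeen > lt.target {
		return fmt.Errorf("max latency was more than allowed: %s vs %s",
			lt.maxSeen, lt.target)
	}
	return nil
}

// firstFailure feeds samples in order and returns the index of the first
// sample that trips the check, or -1 if none does.
func firstFailure(lt *latencyTracker, samples []time.Duration) (int, error) {
	for i, s := range samples {
		if err := lt.observe(s); err != nil {
			return i, err
		}
	}
	return -1, nil
}

func main() {
	lt := &latencyTracker{target: 2 * time.Second}
	samples := []time.Duration{time.Second, 3 * time.Second, time.Second}
	idx, err := firstFailure(lt, samples)
	fmt.Println(idx, err)
}
```

Because the check uses the running maximum rather than the latest sample, a single spike above the target is enough to fail, even if later samples recover.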
