Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip recalculating the rate in MaxReplicationLagModule when it can't be done #12620

Merged
merged 4 commits into from
Apr 14, 2023
Merged
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions go/vt/throttler/max_replication_lag_module.go
Original file line number Diff line number Diff line change
Expand Up @@ -302,6 +302,12 @@ func (m *MaxReplicationLagModule) recalculateRate(lagRecordNow replicationLagRec
if lagRecordNow.isZero() {
panic("rate recalculation was triggered with a zero replication lag record")
}

// Protect against nil stats
if lagRecordNow.Stats == nil {
return
}

now := lagRecordNow.time
lagNow := lagRecordNow.lag()

Expand Down
49 changes: 49 additions & 0 deletions go/vt/throttler/max_replication_lag_module_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -22,6 +22,8 @@ import (
"testing"
"time"

"github.com/stretchr/testify/assert"

"vitess.io/vitess/go/vt/log"

"vitess.io/vitess/go/vt/discovery"
Expand Down Expand Up @@ -83,6 +85,12 @@ func (tf *testFixture) process(lagRecord replicationLagRecord) {
tf.m.processRecord(lagRecord)
}

// recalculateRate does the same thing as MaxReplicationLagModule.recalculateRate() does
// for a new "lagRecord".
func (tf *testFixture) recalculateRate(lagRecord replicationLagRecord) {
tf.m.recalculateRate(lagRecord)
}

func (tf *testFixture) checkState(state state, rate int64, lastRateChange time.Time) error {
if got, want := tf.m.currentState, state; got != want {
return fmt.Errorf("module in wrong state. got = %v, want = %v", got, want)
Expand All @@ -96,6 +104,47 @@ func (tf *testFixture) checkState(state state, rate int64, lastRateChange time.T
return nil
}

func TestNewMaxReplicationLagModule_recalculateRate(t *testing.T) {
testCases := []struct {
name string
lagRecord replicationLagRecord
expectPanic bool
}{
{
name: "Zero lag",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Isn't this zero time, not zero lag?

Copy link
Contributor Author

@ejortegau ejortegau Apr 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I called it Zero lag because it tests the case here trigerring a panic when lagRecordNow.isZero() is True.

lagRecord: replicationLagRecord{
time: time.Time{},
TabletHealth: discovery.TabletHealth{Stats: nil},
},
expectPanic: true,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not in this PR, but we should revisit all the panics in the throttler code base. Better to return a well-known error and exit cleanly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, I was very surprised when I saw that this code just panics

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@deepthi / @ejortegau: panics are cleaned up in this PR: #12901 👍. Reviews appreciated!

},
{
name: "nil lag record stats",
lagRecord: replicationLagRecord{
time: time.Now(),
TabletHealth: discovery.TabletHealth{Stats: nil},
},
expectPanic: false,
},
}

for _, aTestCase := range testCases {
theCase := aTestCase

t.Run(theCase.name, func(t *testing.T) {
t.Parallel()

fixture, err := newTestFixtureWithMaxReplicationLag(5)
assert.NoError(t, err)

if theCase.expectPanic {
assert.Panics(t, func() { fixture.recalculateRate(theCase.lagRecord) })
}
},
)
}
}

func TestMaxReplicationLagModule_RateNotZeroWhenDisabled(t *testing.T) {
tf, err := newTestFixtureWithMaxReplicationLag(ReplicationLagModuleDisabled)
if err != nil {
Expand Down