spanconfig: mark reconciliation job as idle
Fixes #70538.

We have a forever-running background AUTO SPAN CONFIG RECONCILIATION job
on tenant pods. To know when it's safe to wind down pods, we use the
number of currently running jobs as an indicator. Because this job never
finishes, that count never drops to zero, so we need a way to signal
that it's safe to wind down despite the job's presence.

In #74747 we added a thin API to the jobs subsystem to do just that,
with the intent of using it for idle changefeed jobs. We reuse that same
approach here to mark the reconciliation job as always idle.

Release note: None
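
The wind-down reasoning above can be illustrated with a minimal, self-contained sketch. It is not the CockroachDB jobs subsystem: `Registry`, `Start`, `MarkIdle`, and `SafeToWindDown` are hypothetical names chosen for illustration, and the only assumption is that a pod may wind down once every running job has reported itself idle.

```go
package main

import (
	"fmt"
	"sync"
)

// JobID is a hypothetical identifier type used only in this sketch.
type JobID int64

// Registry is a toy stand-in for a jobs registry. It tracks, for each
// running job, whether that job has marked itself idle.
type Registry struct {
	mu      sync.Mutex
	running map[JobID]bool // job ID -> idle flag
}

// NewRegistry returns an empty registry.
func NewRegistry() *Registry {
	return &Registry{running: make(map[JobID]bool)}
}

// Start records a job as running and not idle.
func (r *Registry) Start(id JobID) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.running[id] = false
}

// MarkIdle sets the idle flag of a running job. Unknown IDs are ignored.
func (r *Registry) MarkIdle(id JobID, idle bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if _, ok := r.running[id]; ok {
		r.running[id] = idle
	}
}

// SafeToWindDown reports whether every running job is idle. A registry
// with no running jobs is trivially safe to wind down.
func (r *Registry) SafeToWindDown() bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	for _, idle := range r.running {
		if !idle {
			return false
		}
	}
	return true
}

func main() {
	r := NewRegistry()
	r.Start(1) // stands in for the forever-running reconciliation job
	r.Start(2) // stands in for some other, finite job

	r.MarkIdle(1, true)
	fmt.Println(r.SafeToWindDown()) // job 2 is still busy

	r.MarkIdle(2, true)
	fmt.Println(r.SafeToWindDown()) // every running job is now idle
}
```

In this sketch a job that never terminates would block wind-down forever if only the raw running count were consulted; the idle flag is what lets it be discounted.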
irfansharif committed Feb 10, 2022
1 parent d10188f commit 1f75f55
Showing 2 changed files with 37 additions and 0 deletions.
5 changes: 5 additions & 0 deletions pkg/spanconfig/spanconfigjob/job.go
@@ -45,6 +45,11 @@ func (r *resumer) Resume(ctx context.Context, execCtxI interface{}) error {
	rc := execCtx.SpanConfigReconciler()
	stopper := execCtx.ExecCfg().DistSQLSrv.Stopper

	// The reconciliation job is a forever running background job. It's always
	// safe to wind the SQL pod down whenever it's running -- something we
	// indicate through the job's idle status.
	r.job.MarkIdle(true)

	// Start the protected timestamp reconciler. This will periodically poll the
	// protected timestamp table to cleanup stale records. We take advantage of
	// the fact that there can only be one instance of the spanconfig.Resumer
32 changes: 32 additions & 0 deletions pkg/spanconfig/spanconfigmanager/manager_test.go
@@ -236,3 +236,35 @@ func TestManagerCheckJobConditions(t *testing.T) {
	tdb.Exec(t, `SET CLUSTER SETTING spanconfig.reconciliation_job.check_interval = '25m'`)
	_ = checkInterceptCountGreaterThan(currentCount) // the job check interval setting triggers a check
}

// TestReconciliationJobIsIdle ensures that the reconciliation job, when
// resumed, is marked as idle.
func TestReconciliationJobIsIdle(t *testing.T) {
	defer leaktest.AfterTest(t)()

	var jobID jobspb.JobID
	ctx := context.Background()
	tc := testcluster.StartTestCluster(t, 1, base.TestClusterArgs{
		ServerArgs: base.TestServerArgs{
			Knobs: base.TestingKnobs{
				SpanConfig: &spanconfig.TestingKnobs{
					ManagerCreatedJobInterceptor: func(jobI interface{}) {
						jobID = jobI.(*jobs.Job).ID()
					},
				},
			},
		},
	})
	defer tc.Stopper().Stop(ctx)

	jobRegistry := tc.Server(0).JobRegistry().(*jobs.Registry)
	testutils.SucceedsSoon(t, func() error {
		if jobID == jobspb.JobID(0) {
			return errors.New("waiting for reconciliation job to be started")
		}
		if !jobRegistry.TestingIsJobIdle(jobID) {
			return errors.New("expected reconciliation job to be idle")
		}
		return nil
	})
}
