Skip to content

Commit

Permalink
sql/schemachanger: avoid constructing state for complete jobs
Browse files Browse the repository at this point in the history
Previously, there were scenarios where after all
stages were complete we could attempt to reconstruct
the state either due to transaction retries or
crashes before the job status was marked as succeeded.
This would lead to us attempting to look up descriptors
that may potentially be fully removed when attempting
to resume the job again. To address this, this patch
will use remove all descriptors IDs once the last stage
has been executed from the job.

Release note: None
  • Loading branch information
fqazi committed Sep 28, 2022
1 parent 1a98e78 commit 026b23e
Show file tree
Hide file tree
Showing 178 changed files with 469 additions and 10 deletions.
2 changes: 2 additions & 0 deletions pkg/ccl/schemachangerccl/testdata/end_to_end/create_index
Original file line number Diff line number Diff line change
Expand Up @@ -388,6 +388,8 @@ write *eventpb.FinishSchemaChange to event log
create job #2 (non-cancelable: true): "GC for removed temporary index; CREATE INDEX id1 ON defaultdb.public.t1 (id, name) STORING (money) PARTITION BY LIST (id) (PARTITION p1 VALUES IN (1))"
descriptor IDs: [104]
update progress of schema change job #1: "all stages completed"
set schema change job #1 to non-cancellable
updated schema change job #1 descriptor IDs to []
commit transaction #11
notified job registry to adopt jobs: [2]
# end PostCommitPhase
Original file line number Diff line number Diff line change
Expand Up @@ -256,6 +256,8 @@ write *eventpb.DropDatabase to event log: DROP DATABASE ‹multi_region_test_db
create job #2 (non-cancelable: true): "GC for DROP DATABASE multi_region_test_db CASCADE"
descriptor IDs: [108 104]
update progress of schema change job #1: "all stages completed"
set schema change job #1 to non-cancellable
updated schema change job #1 descriptor IDs to []
commit transaction #3
notified job registry to adopt jobs: [2]
# end PostCommitPhase
Original file line number Diff line number Diff line change
Expand Up @@ -167,6 +167,8 @@ write *eventpb.DropTable to event log: DROP TABLE ‹multi_region_test_db›.‹
create job #2 (non-cancelable: true): "GC for DROP TABLE multi_region_test_db.public.table_regional_by_row"
descriptor IDs: [108]
update progress of schema change job #1: "all stages completed"
set schema change job #1 to non-cancellable
updated schema change job #1 descriptor IDs to []
commit transaction #3
notified job registry to adopt jobs: [2]
# end PostCommitPhase
Original file line number Diff line number Diff line change
Expand Up @@ -139,6 +139,8 @@ write *eventpb.DropTable to event log: DROP TABLE ‹multi_region_test_db›.‹
create job #2 (non-cancelable: true): "GC for DROP TABLE multi_region_test_db.public.table_regional_by_table CASCADE"
descriptor IDs: [108]
update progress of schema change job #1: "all stages completed"
set schema change job #1 to non-cancellable
updated schema change job #1 descriptor IDs to []
commit transaction #3
notified job registry to adopt jobs: [2]
# end PostCommitPhase
Original file line number Diff line number Diff line change
Expand Up @@ -435,6 +435,8 @@ EXPLAIN (ddl, verbose) CREATE INDEX id1
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -146,6 +146,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 1 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 2 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 3 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -170,6 +170,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 4 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 5 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 6 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,8 @@ EXPLAIN (ddl, verbose) rollback at post-commit stage 7 of 7;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -893,6 +893,12 @@ EXPLAIN (ddl, verbose) DROP DATABASE multi_region_test_db CASCADE;
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
- 105
- 106
- 107
- 108
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -538,6 +538,10 @@ EXPLAIN (ddl, verbose) DROP TABLE multi_region_test_db.public.table_regional_by_
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 105
- 107
- 108
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
Original file line number Diff line number Diff line change
Expand Up @@ -410,6 +410,9 @@ EXPLAIN (ddl, verbose) DROP TABLE multi_region_test_db.public.table_regional_by_
│ JobID: 1
└── • UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 105
- 108
IsNonCancelable: true
JobID: 1
RunningStatus: all stages completed
16 changes: 6 additions & 10 deletions pkg/sql/drop_test.go
Original file line number Diff line number Diff line change
Expand Up @@ -212,11 +212,9 @@ INSERT INTO t.kv VALUES ('c', 'e'), ('a', 'c'), ('b', 'd');
// TODO (lucy): Maybe this test API should use an offset starting
// from the most recent job instead.
if err := jobutils.VerifySystemJob(t, sqlRun, 0, jobspb.TypeNewSchemaChange, jobs.StatusSucceeded, jobs.Record{
Username: username.RootUserName(),
Description: "DROP DATABASE t CASCADE",
DescriptorIDs: descpb.IDs{
tbDesc.GetID(), dbDesc.GetID(), dbDesc.GetSchemaID(tree.PublicSchema),
},
Username: username.RootUserName(),
Description: "DROP DATABASE t CASCADE",
DescriptorIDs: nil,
}); err != nil {
t.Fatal(err)
}
Expand Down Expand Up @@ -647,11 +645,9 @@ func TestDropTable(t *testing.T) {
sqlRun := sqlutils.MakeSQLRunner(sqlDB)
if err := jobutils.VerifySystemJob(t, sqlRun, 0,
jobspb.TypeNewSchemaChange, jobs.StatusSucceeded, jobs.Record{
Username: username.RootUserName(),
Description: `DROP TABLE t.public.kv`,
DescriptorIDs: descpb.IDs{
tableDesc.GetID(),
},
Username: username.RootUserName(),
Description: `DROP TABLE t.public.kv`,
DescriptorIDs: nil,
}); err != nil {
t.Fatal(err)
}
Expand Down
5 changes: 5 additions & 0 deletions pkg/sql/schemachanger/scjob/job.go
Original file line number Diff line number Diff line change
Expand Up @@ -97,6 +97,11 @@ func (n *newSchemaChangeResumer) run(ctx context.Context, execCtxI interface{})
execCtx.ExtendedEvalContext().Tracing.KVTracingEnabled(),
)

// If there are no descriptors left, then we can short circuit here.
if len(payload.DescriptorIDs) == 0 {
return nil
}

err := scrun.RunSchemaChangesInJob(
ctx,
execCfg.DeclarativeSchemaChangerTestingKnobs,
Expand Down
4 changes: 4 additions & 0 deletions pkg/sql/schemachanger/scplan/internal/scstage/build.go
Original file line number Diff line number Diff line change
Expand Up @@ -583,6 +583,10 @@ func (bc buildContext) updateJobProgressOp(
var toRemove catalog.DescriptorIDSet
if next != nil {
toRemove = descIDsPresentBefore.Difference(descIDsPresentAfter)
} else {
// If the next stage is nil, simply remove al the descriptors, we are
// done processing
toRemove = descIDsPresentBefore
}
return &scop.UpdateSchemaChangerJob{
JobID: bc.scJobID,
Expand Down
17 changes: 17 additions & 0 deletions pkg/sql/schemachanger/scplan/testdata/alter_table_add_column
Original file line number Diff line number Diff line change
Expand Up @@ -128,6 +128,10 @@ PostCommitPhase stage 2 of 2 with 6 MutationType ops
DescriptorID: 106
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
- 105
- 106
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -414,6 +418,8 @@ PostCommitNonRevertiblePhase stage 3 of 3 with 4 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -780,6 +786,8 @@ PostCommitNonRevertiblePhase stage 3 of 3 with 4 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -1064,6 +1072,8 @@ PostCommitNonRevertiblePhase stage 3 of 3 with 4 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -1247,6 +1257,9 @@ PostCommitPhase stage 2 of 2 with 7 MutationType ops
DescriptorID: 107
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
- 107
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -1567,6 +1580,8 @@ PostCommitNonRevertiblePhase stage 3 of 3 with 4 MutationType ops
DescriptorID: 108
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 108
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -1988,6 +2003,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 8 MutationType ops
DescriptorID: 109
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 109
IsNonCancelable: true
JobID: 1

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -471,6 +471,8 @@ PostCommitNonRevertiblePhase stage 4 of 4 with 6 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down
9 changes: 9 additions & 0 deletions pkg/sql/schemachanger/scplan/testdata/alter_table_drop_column
Original file line number Diff line number Diff line change
Expand Up @@ -516,6 +516,10 @@ PostCommitNonRevertiblePhase stage 3 of 3 with 9 MutationType ops
DescriptorID: 107
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
- 105
- 107
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -1468,6 +1472,11 @@ PostCommitNonRevertiblePhase stage 3 of 3 with 10 MutationType ops
DescriptorID: 107
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
- 105
- 106
- 107
IsNonCancelable: true
JobID: 1

Expand Down
4 changes: 4 additions & 0 deletions pkg/sql/schemachanger/scplan/testdata/create_index
Original file line number Diff line number Diff line change
Expand Up @@ -180,6 +180,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 4 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -478,6 +480,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 4 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down
15 changes: 15 additions & 0 deletions pkg/sql/schemachanger/scplan/testdata/drop_database
Original file line number Diff line number Diff line change
Expand Up @@ -1257,6 +1257,21 @@ PostCommitNonRevertiblePhase stage 1 of 1 with 74 MutationType ops
DescriptorID: 117
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
- 105
- 106
- 107
- 108
- 109
- 110
- 111
- 112
- 113
- 114
- 115
- 116
- 117
IsNonCancelable: true
JobID: 1

Expand Down
8 changes: 8 additions & 0 deletions pkg/sql/schemachanger/scplan/testdata/drop_index
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 5 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -257,6 +259,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 7 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -486,6 +490,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 6 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down Expand Up @@ -765,6 +771,8 @@ PostCommitNonRevertiblePhase stage 2 of 2 with 5 MutationType ops
DescriptorID: 104
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 104
IsNonCancelable: true
JobID: 1

Expand Down
12 changes: 12 additions & 0 deletions pkg/sql/schemachanger/scplan/testdata/drop_owned_by
Original file line number Diff line number Diff line change
Expand Up @@ -794,5 +794,17 @@ PostCommitNonRevertiblePhase stage 1 of 1 with 47 MutationType ops
DescriptorID: 113
JobID: 1
*scop.UpdateSchemaChangerJob
DescriptorIDsToRemove:
- 100
- 104
- 105
- 106
- 107
- 108
- 109
- 110
- 111
- 112
- 113
IsNonCancelable: true
JobID: 1
Loading

0 comments on commit 026b23e

Please sign in to comment.