-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: lock on jobs table held in column backfill for a long time #83202
Labels
A-jobs
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Comments
ajwerner
added
the
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
label
Jun 22, 2022
ajwerner
added a commit
to ajwerner/cockroach
that referenced
this issue
Jun 23, 2022
Backports will address cockroachdb#83202. Release note (bug fix): Due to a bug in transaction lifetime management, a lock could be held for a long period of time when adding a new column to a table (or altering a column type). This contention could make the jobs page non-responsive and job adoption slow. This bug has now been fixed.
ajwerner
added a commit
to ajwerner/cockroach
that referenced
this issue
Jun 23, 2022
Backports will address cockroachdb#83202. Release note (bug fix): Due to a bug in transaction lifetime management, a lock could be held for a long period of time when adding a new column to a table (or altering a column type). This contention could make the jobs page non-responsive and job adoption slow. This bug has now been fixed.
craig bot
pushed a commit
that referenced
this issue
Jun 24, 2022
83257: sql: update progress in column backfiller non-transactionally r=ajwerner a=ajwerner Backports will address #83202. Release note (bug fix): Due to a bug in transaction lifetime management, a lock could be held for a long period of time when adding a new column to a table (or altering a column type). This contention could make the jobs page non-responsive and job adoption slow. This bug has now been fixed. 83276: sql/schemachanger/scexec/backfiller: fix race r=ajwerner a=ajwerner We were missing a lock call. ``` ================== WARNING: DATA RACE Read at 0x00c00b3e4c00 by goroutine 1874: github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller.(*Tracker).collectProgressForCheckpointFlush() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller/tracker.go:402 +0x104 github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller.(*Tracker).FlushCheckpoint() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller/tracker.go:232 +0x57 github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec.BackfillerProgressFlusher.FlushCheckpoint-fm() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/dependencies.go:322 +0x6b github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller.(*periodicProgressFlusher).StartPeriodicUpdates.func1() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller/periodic_progress_flusher.go:80 +0x26b github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller.(*periodicProgressFlusher).StartPeriodicUpdates.func3() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller/periodic_progress_flusher.go:92 +0xf1 golang.org/x/sync/errgroup.(*Group).Go.func1() /go/src/github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup/errgroup.go:74 +0x91 Previous write at 0x00c00b3e4c00 by goroutine 115: github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller.(*Tracker).SetBackfillProgress() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/backfiller/tracker.go:197 +0x2bb github.com/cockroachdb/cockroach/pkg/sql.(*IndexBackfillPlanner).BackfillIndexes.func2() /go/src/github.com/cockroachdb/cockroach/pkg/sql/index_backfiller.go:92 +0x21e github.com/cockroachdb/cockroach/pkg/sql.(*MetadataCallbackWriter).AddMeta() /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:730 +0x65 github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLReceiver).pushMeta() /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:955 +0xd5 github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLReceiver).Push() /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:1084 +0x4cc github.com/cockroachdb/cockroach/pkg/sql/rowflow.(*copyingRowReceiver).Push() /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowflow/row_based_flow.go:455 +0x2d5 github.com/cockroachdb/cockroach/pkg/sql/rowexec.(*indexBackfiller).Run() /go/src/github.com/cockroachdb/cockroach/pkg/sql/rowexec/indexbackfiller.go:374 +0xad1 github.com/cockroachdb/cockroach/pkg/sql/flowinfra.(*FlowBase).Run() /go/src/github.com/cockroachdb/cockroach/pkg/sql/flowinfra/flow.go:475 +0x429 github.com/cockroachdb/cockroach/pkg/sql/rowflow.(*rowBasedFlow).Run() <autogenerated>:1 +0x73 github.com/cockroachdb/cockroach/pkg/sql.(*DistSQLPlanner).Run() /go/src/github.com/cockroachdb/cockroach/pkg/sql/distsql_running.go:596 +0x1234 github.com/cockroachdb/cockroach/pkg/sql.(*IndexBackfillPlanner).plan.func2() /go/src/github.com/cockroachdb/cockroach/pkg/sql/index_backfiller.go:204 +0x449 github.com/cockroachdb/cockroach/pkg/sql.(*IndexBackfillPlanner).BackfillIndexes() /go/src/github.com/cockroachdb/cockroach/pkg/sql/index_backfiller.go:119 +0x71d github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec.runBackfill() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/exec_backfill.go:381 +0x24b github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec.runBackfiller.func2() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/exec_backfill.go:328 +0x22e github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec.forEachProgressConcurrently.func1.1() /go/src/github.com/cockroachdb/cockroach/pkg/sql/schemachanger/scexec/exec_backfill.go:257 +0x155 github.com/cockroachdb/cockroach/pkg/util/ctxgroup.Group.GoCtx.func1() /go/src/github.com/cockroachdb/cockroach/pkg/util/ctxgroup/ctxgroup.go:169 +0x4c golang.org/x/sync/errgroup.(*Group).Go.func1() /go/src/github.com/cockroachdb/cockroach/vendor/golang.org/x/sync/errgroup/errgroup.go:74 +0x91 ``` Release note: None Co-authored-by: Andrew Werner <[email protected]>
blathers-crl bot
pushed a commit
that referenced
this issue
Jun 24, 2022
Backports will address #83202. Release note (bug fix): Due to a bug in transaction lifetime management, a lock could be held for a long period of time when adding a new column to a table (or altering a column type). This contention could make the jobs page non-responsive and job adoption slow. This bug has now been fixed.
ajwerner
added a commit
to ajwerner/cockroach
that referenced
this issue
Jun 27, 2022
Backports will address cockroachdb#83202. Release note (bug fix): Due to a bug in transaction lifetime management, a lock could be held for a long period of time when adding a new column to a table (or altering a column type). This contention could make the jobs page non-responsive and job adoption slow. This bug has now been fixed.
🎉 |
exalate-issue-sync
bot
added
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
and removed
T-sql-schema-deprecated
Use T-sql-foundations instead
T-jobs
labels
May 10, 2023
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
A-jobs
C-bug
Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior.
T-sql-foundations
SQL Foundations Team (formerly SQL Schema + SQL Sessions)
Describe the problem
In
cockroach/pkg/sql/backfill.go
Lines 1356 to 1358 in 5df10fe
Additional context
The transaction in use in that block, more generally, is dubious. It's trying to hold a lease on the descriptor during the backfill, but it's not clear what that accomplishes.
Jira issue: CRDB-16918
The text was updated successfully, but these errors were encountered: