Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: add lock operator #111662

Merged
merged 2 commits into from
Oct 10, 2023
Merged

opt: add lock operator #111662

merged 2 commits into from
Oct 10, 2023

Conversation

michae2
Copy link
Collaborator

@michae2 michae2 commented Oct 3, 2023

execbuilder: change error message to match PostgreSQL

Release note (sql change): Change the following error message:

statement error cannot execute FOR UPDATE in a read-only transaction

to this to match PostgreSQL:

statement error cannot execute SELECT FOR UPDATE in a read-only transaction

opt: add lock operator

Add a new implementation of SELECT FOR UPDATE and SELECT FOR SHARE
statements. Instead of locking during the initial row fetch, this new
implementation constructs a Lock operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both Lock operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
optimizer_use_lock_op_for_serializable is set. If the new
implementation is chosen, Lock operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: #57031, #75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
optimizer_use_lock_op_for_serializable, which when set enables a new
implementation of SELECT FOR UPDATE. This new implementation of
SELECT FOR UPDATE acquires row locks after any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches SELECT FOR UPDATE behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of SELECT FOR UPDATE regardless of the value of
optimizer_use_lock_op_for_serializable to ensure correctness.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments. Feel free to disregard any or all as you keep iterating.

Reviewed 2 of 2 files at r1, 10 of 10 files at r2, 17 of 17 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @michae2)


pkg/sql/opt/exec/execbuilder/builder.go line 318 at r3 (raw file):

	b.allowInsertFastPath = b.allowInsertFastPath && canAutoCommit

	b.allowLocking = b.evalCtx.TxnIsoLevel == isolation.Serializable &&

Do we not want to set this for read committed?


pkg/sql/opt/exec/execbuilder/mutation.go line 1134 at r3 (raw file):

	if !lock.Locking.IsLocking() ||
		!b.evalCtx.Settings.Version.IsActive(b.ctx, clusterversion.V23_2) ||
		(b.evalCtx.TxnIsoLevel == isolation.Serializable &&

[nit] Maybe you could split out the Serializable conditions to make it more clear?


pkg/sql/opt/exec/execbuilder/mutation.go line 1145 at r3 (raw file):

		Input: lock.Input,
		LookupJoinPrivate: memo.LookupJoinPrivate{
			JoinType:              opt.InnerJoinOp,

Could this be a SemiJoin? It looked like we don't project columns from the lookups, and we're joining on a unique key anyway...


pkg/sql/opt/exec/execbuilder/mutation.go line 1158 at r3 (raw file):

	if !b.allowLocking {
		b.allowLocking = true
		defer func() { b.allowLocking = false }()

Can't lock operators be nested?


pkg/sql/opt/ops/README.md line 76 at r3 (raw file):

   expression in the first child. Such expressions must implement a
   `WithBindingID()` method.

Was this intentional?


pkg/sql/opt/optbuilder/join.go line 36 at r3 (raw file):

	join *tree.JoinTableExpr, lockCtx lockingContext, inScope *scope,
) (outScope *scope) {
	// TODO(michae2): poison the lockCtx for the null-extended side(s) if this is

The tracking issue is #97434


pkg/sql/opt/optbuilder/locking.go line 94 at r3 (raw file):

			panic(errors.AssertionFailedf("fetchScope did not have column %d", ord))
		}
		scopeCol := fetchScope.cols[ord]

Retrieving the column like this relies on the scope having the columns in the same order as the locked table. Maybe that's true in practice, but it seems fragile. You can get the column ID directly from the TableMeta/TableID - see an example here:

// getPkCols gets the primary key columns for the given table as a ColList.
func (c *CustomFuncs) getPkCols(tabID opt.TableID) opt.ColList {
tab := c.e.mem.Metadata().Table(tabID)
pkIndex := tab.Index(cat.PrimaryIndex)
pkCols := make(opt.ColList, pkIndex.KeyColumnCount())
for i := range pkCols {
pkCols[i] = tabID.IndexColumnID(pkIndex, i)
}
return pkCols
}
.


pkg/sql/opt/optbuilder/locking.go line 272 at r3 (raw file):

	for i := range lockScope.cols {
		b.addOrderByOrDistinctOnColumn(inScope, projectionsScope, lockScope, &lockScope.cols[i])

I don't think this is necessary, since the locking columns are only ever references to table columns, rather than expressions that need to be built. You could probably combine this step with analyzeLockArgs and just skip the step with lockScope, but I think it's fine either way.


pkg/sql/opt/optbuilder/locking.go line 366 at r3 (raw file):

	}
	//fmt.Println("buildLock, keyCols:", lb.keyCols, "scope.cols", inScope.cols, "scope.extraCols", inScope.extraCols)
	for _, col := range inScope.cols {

You can use colSetWithExtraCols here :)


pkg/sql/opt/optbuilder/locking.go line 381 at r3 (raw file):

		}
	}
	inScope.expr = b.factory.ConstructLock(inScope.expr, private)

Did you find a reason why not to directly construct a LookupJoinExpr here? It seems like doing that could simplify the logic down the line. Especially since we plan to make locking an enforced property in the future, with LockExpr only requiring the property.


pkg/sql/opt/optbuilder/select.go line 160 at r3 (raw file):

			)

			if lockCtx.locking.isSet() {

Should this just be inlined into buildScan?

@michae2 michae2 force-pushed the locking_op branch 4 times, most recently from 00a8a13 to 1ec4ae2 Compare October 9, 2023 22:54
Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comments! I'm still ironing out a few small issues but this is RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball)


pkg/sql/opt/exec/execbuilder/builder.go line 318 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Do we not want to set this for read committed?

I've reversed the logic here, so now we're setting this (newly-named) boolean for read committed.


pkg/sql/opt/exec/execbuilder/mutation.go line 1145 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Could this be a SemiJoin? It looked like we don't project columns from the lookups, and we're joining on a unique key anyway...

Ah, good catch! Done.


pkg/sql/opt/exec/execbuilder/mutation.go line 1158 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Can't lock operators be nested?

Yes, this was broken. Good catch. Now we're marking these opt.Locking structs by setting Forced when they come from here, and keeping onlyForcedLocking the same.


pkg/sql/opt/ops/README.md line 76 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Was this intentional?

Yes (emacs highlights extra trailing newlines, so this was annoying me).


pkg/sql/opt/optbuilder/join.go line 36 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

The tracking issue is #97434

done.


pkg/sql/opt/optbuilder/locking.go line 94 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Retrieving the column like this relies on the scope having the columns in the same order as the locked table. Maybe that's true in practice, but it seems fragile. You can get the column ID directly from the TableMeta/TableID - see an example here:

// getPkCols gets the primary key columns for the given table as a ColList.
func (c *CustomFuncs) getPkCols(tabID opt.TableID) opt.ColList {
tab := c.e.mem.Metadata().Table(tabID)
pkIndex := tab.Index(cat.PrimaryIndex)
pkCols := make(opt.ColList, pkIndex.KeyColumnCount())
for i := range pkCols {
pkCols[i] = tabID.IndexColumnID(pkIndex, i)
}
return pkCols
}
.

Great point, thank you for the example. Changed to get the column ID from the TableMeta.


pkg/sql/opt/optbuilder/locking.go line 272 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

I don't think this is necessary, since the locking columns are only ever references to table columns, rather than expressions that need to be built. You could probably combine this step with analyzeLockArgs and just skip the step with lockScope, but I think it's fine either way.

I was wondering this, thank you. I don't think we can combine with analyzeLockArgs because that needs to run after `analyze


pkg/sql/opt/optbuilder/locking.go line 366 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

You can use colSetWithExtraCols here :)

Gah, I am such a newb. 🙂 Done.

Actually, there might be a small bug here. It should work with colSet instead of colSetWithExtraCols. Stay tuned...


pkg/sql/opt/optbuilder/locking.go line 381 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Did you find a reason why not to directly construct a LookupJoinExpr here? It seems like doing that could simplify the logic down the line. Especially since we plan to make locking an enforced property in the future, with LockExpr only requiring the property.

Two reasons:

  1. Because we want LockExpr to act as an optimization barrier during exploration, and I don't think LookupJoinExpr would.
  2. In 23.2 I'm trying to only use LockExpr under read committed isolation to reduce risk. This means that in optbuilder we don't know for certain which locking implementation we will use (because we could prepare a SELECT FOR UPDATE statement under serializable isolation and then execute it under read committed isolation or vice versa). So we need to wait until execbuilder, when we know for certain, to either convert the LockExpr to LookupJoinExpr or elide it.

This will become simpler when we fully commit to LockExpr and make locking an enforced property. Then execbuilder will go back to simply eliding the LockExpr.


pkg/sql/opt/optbuilder/select.go line 160 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Should this just be inlined into buildScan?

I considered it, but out of over a dozen calls to buildScan only these two calls deal with locking contexts.

@michae2 michae2 marked this pull request as ready for review October 9, 2023 23:28
@michae2 michae2 requested a review from a team as a code owner October 9, 2023 23:28
@michae2 michae2 requested review from DrewKimball, mgartner and msirek and removed request for a team October 9, 2023 23:28
Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 23 files at r4, 5 of 40 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @michae2)


pkg/sql/opt/exec/execbuilder/mutation.go line 1145 at r3 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Ah, good catch! Done.

Just wondering why the join is built in the execbuilder instead of the optbuilder. Is it because you want to make sure the lookup join is never removed by some rule firing? I can imagine it might mess up the stats if we don't recognize the lookup join is essentially an index join. I'm just wondering if placing more logic in the execbuilder will make it more difficult to maintain in the long run, instead of being a straightforward translation of what's in the optimized expression tree.

Copy link
Contributor

@msirek msirek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @michae2)

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking pretty good. My main concern right now is plan regressions for serializable isolation - there are a couple places where we're stopping column-pruning or join-elimination rules from firing when they did before, and I don't think it's necessary:

In 23.2 I'm trying to only use LockExpr under read committed isolation to reduce risk. This means that in optbuilder we don't know for certain which locking implementation we will use (because we could prepare a SELECT FOR UPDATE statement under serializable isolation and then execute it under read committed isolation or vice versa). So we need to wait until execbuilder, when we know for certain, to either convert the LockExpr to LookupJoinExpr or elide it.

We should be invalidating the memo and re-planning when the isolation level changes for prepared statements and regular ones. So, it should be safe to expect the same isolation level during execution as we observe during planning, even for prepared statements. So we should be able to avoid building the lock operator under the default settings. Maybe that will simplify some of your reasoning too?

Reviewed 23 of 23 files at r4, 40 of 40 files at r5, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)


pkg/sql/opt/locking.go line 74 at r5 (raw file):

	// locking is disabled under weaker isolation levels. It is used to
	// distinguish Lock-operator locking from initial-row-fetch locking.
	Forced bool

[nit] Forced was a bit confusing. How would you feel about UseLockOp and ForceUseLockOperator, or something like that? Or maybe UseExplicitLocking


pkg/sql/opt/exec/execbuilder/mutation.go line 1180 at r5 (raw file):

		},
	}
	memo.CopyLockGroupIntoLookupJoin(lock, join)

Probably what you want here is AddLookupJoinToGroup.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update_read_committed line 131 at r5 (raw file):

                  table: supermarket@supermarket_pkey
                  spans: /"matilda"/0
                  locking strength: for update

Why is the initial-scan locking still printed here? I notice it's been removed in some places, but not others.


pkg/sql/opt/xform/testdata/rules/limit line 907 at r5 (raw file):

SELECT * FROM kuv ORDER BY u LIMIT 5 FOR UPDATE
----
sort

We should add the lock operator to the list in opt/ordering/ordering/go, since it should be able to pass through its input ordering. See opt/ordering/lookup_join.go. Maybe that doesn't have to happen in this PR, though.

@michae2 michae2 force-pushed the locking_op branch 2 times, most recently from fde67ce to ef846d1 Compare October 10, 2023 07:32
Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing this out, it did simplify things! The "forced" locking stuff is gone, as are the plan regressions for serializable.

I'm still fighting with the optimizer about a few things, but this is closer now. Also, needs more tests.

First commit is #112082 which can be ignored.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @mgartner, and @msirek)


pkg/sql/opt/exec/execbuilder/mutation.go line 1134 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Maybe you could split out the Serializable conditions to make it more clear?

I changed this to simply call buildLocking (which is what it was emulating).


pkg/sql/opt/exec/execbuilder/mutation.go line 1145 at r3 (raw file):

Previously, msirek (Mark Sirek) wrote…

Just wondering why the join is built in the execbuilder instead of the optbuilder. Is it because you want to make sure the lookup join is never removed by some rule firing? I can imagine it might mess up the stats if we don't recognize the lookup join is essentially an index join. I'm just wondering if placing more logic in the execbuilder will make it more difficult to maintain in the long run, instead of being a straightforward translation of what's in the optimized expression tree.

Yes, the reason is to avoid matching any rules that would push something above or below the lookup join, or remove the lookup join.

I agree that it's a little funny to create the lookup join in execbuilder. Maybe one way to justify it is to consider lookup join as merely the implementation of Lock, and then the call to buildLookupJoin is merely sharing code, and not some sneaky attempt to modify the plan in exec builder. Similar to how you were calling ConstructOrdinality from buildUniqueKey in your prototype. But maybe I'm just arguing semantics.

The long-term plan is to make locking a physical property, make Lock require the physical property, and make lookup join an enforcer for the physical property. I think in that plan we still build Lock during optbuilder, and it still acts as an optimization barrier during exploration. But I didn't quite get there in this PR.


pkg/sql/opt/exec/execbuilder/mutation.go line 1180 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Probably what you want here is AddLookupJoinToGroup.

I've been thinking about this, but I don't actually want to add the lookup join to the memo. Just want to create it to simplify calling buildLookupJoin. Maybe I'm being quixotic.


pkg/sql/opt/exec/execbuilder/testdata/select_for_update_read_committed line 131 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Why is the initial-scan locking still printed here? I notice it's been removed in some places, but not others.

Good eye! This is because forceForUpdateLocking is set for everything below the update. I might have to turn off implicit update locking under read committed for this reason, but I'd rather do it in a follow-up PR.


pkg/sql/opt/xform/testdata/rules/limit line 907 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

We should add the lock operator to the list in opt/ordering/ordering/go, since it should be able to pass through its input ordering. See opt/ordering/lookup_join.go. Maybe that doesn't have to happen in this PR, though.

Done. (Hopefully I did it correctly... I think it's pretty simple, since we just pass through the ordering.)


pkg/sql/opt/locking.go line 74 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

[nit] Forced was a bit confusing. How would you feel about UseLockOp and ForceUseLockOperator, or something like that? Or maybe UseExplicitLocking

Removed it (thanks to your suggestion!)

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "forced" locking stuff is gone, as are the plan regressions for serializable.

Nice! Glad you managed to fix those regressions. I think there might(?) be a bug with propagating key columns for nested locks, but other than that I'm about ready to stamp. BTW, are there logic tests for locking under read committed somewhere?

Reviewed 38 of 38 files at r6, 3 of 3 files at r7, 6 of 27 files at r8.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @mgartner and @michae2)


pkg/sql/opt/exec/execbuilder/mutation.go line 1180 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

I've been thinking about this, but I don't actually want to add the lookup join to the memo. Just want to create it to simplify calling buildLookupJoin. Maybe I'm being quixotic.

Well, it's hacky, but it seems correct. But maybe we could use MemoizeLookupJoin instead - that won't add it to the query plan tree, and there's precedent for using it for throwaway calculation:

// We need to determine the row count of the join before the
// ON conditions are applied.
withoutOn := e.Memo().MemoizeLookupJoin(t.Input, nil /* on */, lookupJoinPrivate)

It would be less efficient, but I don't think it's worth worrying about since it's only once per lock operator. I'd be alright with keeping the current code for now, though, up to you.


pkg/sql/opt/memo/logical_props_builder.go line 1530 at r8 (raw file):

	rel.NotNullCols.IntersectionWith(lock.Cols)
	rel.FuncDeps.CopyFrom(&inputProps.FuncDeps)
	rel.FuncDeps.ProjectCols(rel.OutputCols)

As I mentioned elsewhere, I think we should simplify things as much as possible and have the operator just logically pass through columns (apart from possible filtering for skip locked). These lines wouldn't be incorrect in that case, just unnecessary.


pkg/sql/opt/optbuilder/locking.go line 366 at r3 (raw file):

colSet instead of colSetWithExtraCols

Aren't primary key cols for lock operators potentially propagated as extra cols? What happens if lock operators are nested? I think it might be best to simplify things by just always returning exactly the set of input columns. I think that will "just work" - a projection will be added on top if necessary to remove extra columns after the lock. It won't be totally efficient, but that's probably ok.

If I missed something and the nested-locks case isn't a bug, the above request is just a nit that we can worry about later.


pkg/sql/opt/ordering/mutation.go line 77 at r8 (raw file):

Need to remember the ordering to set in lookupjoin?

This should just work :)


pkg/sql/opt/xform/testdata/rules/limit line 907 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Done. (Hopefully I did it correctly... I think it's pretty simple, since we just pass through the ordering.)

The ordering code looks good to me!

Release note (sql change): Change the following error message:

```
statement error cannot execute FOR UPDATE in a read-only transaction
```

to this to match PostgreSQL:

```
statement error cannot execute SELECT FOR UPDATE in a read-only transaction
```
Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BTW, are there logic tests for locking under read committed somewhere?

There are some in pkg/sql/logictest/testdata/logic_test/read_committed as well as in pkg/sql/opt/exec/execbuilder/testdata/select_for_update_read_committed but I'm working on adding more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/exec/execbuilder/mutation.go line 1180 at r5 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Well, it's hacky, but it seems correct. But maybe we could use MemoizeLookupJoin instead - that won't add it to the query plan tree, and there's precedent for using it for throwaway calculation:

// We need to determine the row count of the join before the
// ON conditions are applied.
withoutOn := e.Memo().MemoizeLookupJoin(t.Input, nil /* on */, lookupJoinPrivate)

It would be less efficient, but I don't think it's worth worrying about since it's only once per lock operator. I'd be alright with keeping the current code for now, though, up to you.

Oh, nice. Done.


pkg/sql/opt/memo/logical_props_builder.go line 1530 at r8 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

As I mentioned elsewhere, I think we should simplify things as much as possible and have the operator just logically pass through columns (apart from possible filtering for skip locked). These lines wouldn't be incorrect in that case, just unnecessary.

Done.


pkg/sql/opt/optbuilder/locking.go line 366 at r3 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

colSet instead of colSetWithExtraCols

Aren't primary key cols for lock operators potentially propagated as extra cols? What happens if lock operators are nested? I think it might be best to simplify things by just always returning exactly the set of input columns. I think that will "just work" - a projection will be added on top if necessary to remove extra columns after the lock. It won't be totally efficient, but that's probably ok.

If I missed something and the nested-locks case isn't a bug, the above request is just a nit that we can worry about later.

You're correct, it wasn't working. This was one of the things I was fighting with. It was also causing problems with ORDER BY when the ordering column was in extra cols. Changed back to colSetWithExtraCols.

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll take another look tomorrow morning, but this :lgtm: pending reversion of the MemoizeLookupJoin change and some interesting tests.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @mgartner and @michae2)


pkg/sql/opt/exec/execbuilder/mutation.go line 1180 at r5 (raw file):

Previously, michae2 (Michael Erickson) wrote…

Oh, nice. Done.

Sorry, it looks like I misled you, building the props is causing NPE's because some fields are initialized. I think your initial code is ok; the group is only used to get the memo (for some reason) anyway.

@michae2
Copy link
Collaborator Author

michae2 commented Oct 10, 2023

I'm going to go for it, and we'll see where things are at when the dust settles tomorrow.

Thank you, Drew!!

bors r=DrewKimball

@craig
Copy link
Contributor

craig bot commented Oct 10, 2023

Build failed:

Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE`
statements. Instead of locking during the initial row fetch, this new
implementation constructs a `Lock` operator on the top of the query plan
which performs the locking phase using a locking semi-join lookup.

During optbuilder we build plans with both `Lock` operators and
initial-row-fetch locking. During execbuilder we decide which
implementation to use based on the isolation level and whether
`optimizer_use_lock_op_for_serializable` is set. If the new
implementation is chosen, `Lock` operators become locking
semi-LookupJoins.

In some cases these new plans will have superfluous lookup joins. A
future PR will optimize away some of these superfluous lookup joins.

Fixes: cockroachdb#57031, cockroachdb#75457

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`optimizer_use_lock_op_for_serializable`, which when set enables a new
implementation of `SELECT FOR UPDATE`. This new implementation of
`SELECT FOR UPDATE` acquires row locks *after* any joins and filtering,
and always acquires row locks on the primary index of the table being
locked. This more closely matches `SELECT FOR UPDATE` behavior in
PostgreSQL, but at the cost of more round trips from gateway node to
replica leaseholder.

Under read committed isolation (and other isolation levels weaker than
serializable) we will always use this new implementation of `SELECT FOR
UPDATE` regardless of the value of
`optimizer_use_lock_op_for_serializable` to ensure correctness.
@michae2
Copy link
Collaborator Author

michae2 commented Oct 10, 2023

Once more, with feeling:

bors r=DrewKimball

@michae2
Copy link
Collaborator Author

michae2 commented Oct 10, 2023

bors r=DrewKimball

Copy link
Collaborator

@DrewKimball DrewKimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 10 of 36 files at r10, 2 of 13 files at r11, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @mgartner and @michae2)


pkg/sql/opt/optbuilder/testdata/select_for_update line 681 at r12 (raw file):


build set=optimizer_use_lock_op_for_serializable=true
SELECT (SELECT a FROM t) FOR UPDATE

Why doesn't this one lock a?

@craig
Copy link
Contributor

craig bot commented Oct 10, 2023

Build failed (retrying...):

Copy link
Collaborator Author

@michae2 michae2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @DrewKimball and @mgartner)


pkg/sql/opt/optbuilder/testdata/select_for_update line 681 at r12 (raw file):

Previously, DrewKimball (Drew Kimball) wrote…

Why doesn't this one lock a?

FOR UPDATE only locks tables in the FROM clause. This is a scalar subquery in the SELECT, so it is not locked by the FOR UPDATE. (This matches PostgreSQL, and also matches what we were doing before.)

@craig
Copy link
Contributor

craig bot commented Oct 10, 2023

Build succeeded:

@craig craig bot merged commit 4447a52 into cockroachdb:master Oct 10, 2023
@michae2
Copy link
Collaborator Author

michae2 commented Oct 10, 2023

blathers backport 23.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: explicit SELECT FOR UPDATE only locks index scanned by query
4 participants