-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: explicit SELECT FOR UPDATE only locks index scanned by query #57031
Comments
Thank you for your reply on slack and the issue, and good luck with cockroachdb |
Alternatively, we could change the behavior to only acquire unreplicated locks on a table's primary index. In most cases, this would be free. However, it would lead to some index-only scans now requiring index joins onto the table's primary key. It may also undermine some of the benefit of |
@nvanbenschoten what's the urgency on this issue? |
Moving this to the backlog for now, @nvanbenschoten please let us know if there is urgency on this. Thanks! |
I'm not aware of any strong urgency, but this is a bug that users have noticed which can lead to surprising behavior. I think #75457 is higher priority. |
Temporarily disallow `SELECT FOR UPDATE` statements under all isolation levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ COMMITTED`). We will allow them again when the following issues are fixed: - cockroachdb#57031 - cockroachdb#75457 - cockroachdb#100193 - cockroachdb#100194 Fixes: cockroachdb#100144 Release note: None
Temporarily disallow `SELECT FOR UPDATE` statements under all isolation levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ COMMITTED`). We will allow them again when the following issues are fixed: - cockroachdb#57031 - cockroachdb#75457 - cockroachdb#100193 - cockroachdb#100194 Fixes: cockroachdb#100144 Release note: None
Temporarily disallow `SELECT FOR UPDATE` statements under all isolation levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ COMMITTED`). We will allow them again when the following issues are fixed: - cockroachdb#57031 - cockroachdb#75457 - cockroachdb#100193 - cockroachdb#100194 Fixes: cockroachdb#100144 Release note: None
103734: opt: disallow SELECT FOR UPDATE under weak isolation levels r=nvanbenschoten,mgartner a=michae2 **querycache: remove unused field from CachedData** Remove the `IsCorrelated` flag from `querycache.CachedData`, which is no longer used. Release note: None --- **sql/opt: add locking durability** In addition to strength and wait policy, we now add a third property to locks: durability. Locks with `LockDurabilityGuaranteed` are guaranteed to be held until commit (if the transaction commits). Durable locks must be used when correctness depends on locking. This is never the case under our `SERIALIZABLE` isolation, but under `SNAPSHOT` and `READ COMMITTED` isolation it will be the case for `SELECT FOR UPDATE` statements, which will be the first users of durable locks. This commit adds locking durability to the optimizer and `EXPLAIN` output, but does not plumb it into KV yet. It will be used in the next commit to temporarily disallow `SELECT FOR UPDATE` statements under `SNAPSHOT` and `READ COMMITTED` isolation. Release note: None --- **opt: disallow SELECT FOR UPDATE under weak isolation levels** Temporarily disallow `SELECT FOR UPDATE` statements under all isolation levels that are not `SERIALIZABLE` (i.e. `SNAPSHOT` and `READ COMMITTED`). We will allow them again when the following issues are fixed: - #57031 - #75457 - #100193 - #100194 Fixes: #100144 Release note: None Co-authored-by: Michael Erickson <[email protected]>
Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of the data sources in a query's FROM list, depending on whether they have targets (FOR UPDATE OF x). Without targets, they always apply within subqueries in the FROM list. With targets, they apply within subqueries if the subquery alias matches the target. Because of this scope-like nature of FOR UPDATE and FOR SHARE, we implement semantic analysis using a stack of locking items that grow as we build inner subqueries deeper in the recursive optbuilder calls. Prior to this change, we only used the stack of locking items during buildScan, at the very bottom of the recursion. Because of this, calls to `lockingSpec.filter` could afford to compress the stack into a single locking item on our way deeper in the recursion. As part of the upcoming fix for cockroachdb#75457, however, we will need to build a new Lock operator when popping off locking items after returning from the recursion. That Lock operator will need some information gathered from buildScan at the bottom of the recursion. To support this, we refactor the stack of locking items to be two stacks: one that tracks all locking items in scope, and a second that tracks which locking items currently apply. This will allow buildScan to associate table information with the correct locking item(s), which can then be used to build Lock operators when popping the locking items. As a bonus, by using only the applied locking item stack in `validateLockingInFrom` we can make validation of SELECT FOR UPDATE queries a little more precise, which allows some queries we were incorrectly disallowing. Informs: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Allow FOR UPDATE on some queries that were previously disallowed. Queries that use the following operations are now allowed to have FOR UPDATE OF as long as the prohibited operation is in a subquery not locked by the FOR UPDATE OF: - UNION - INTERSECT - EXCEPT - DISTINCT - GROUP BY - HAVING - aggregations - window functions For example, the following query is now allowed because the subquery using the prohibited operations is not affected by the FOR UPDATE OF: ``` SELECT * FROM t, (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u FOR UPDATE OF t; ``` This matches PostgreSQL.
Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of the data sources in a query's FROM list, depending on whether they have targets (FOR UPDATE OF x). Without targets, they always apply within subqueries in the FROM list. With targets, they apply within subqueries if the subquery alias matches the target. Because of this scope-like nature of FOR UPDATE and FOR SHARE, we implement semantic analysis using a stack of locking items that grow as we build inner subqueries deeper in the recursive optbuilder calls. Prior to this change, we only used the stack of locking items during buildScan, at the very bottom of the recursion. Because of this, calls to `lockingSpec.filter` could afford to compress the stack into a single locking item on our way deeper in the recursion. As part of the upcoming fix for cockroachdb#75457, however, we will need to build a new Lock operator when popping off locking items after returning from the recursion. That Lock operator will need some information gathered from buildScan at the bottom of the recursion. To support this, we refactor the stack of locking items to be two stacks: one that tracks all locking items in scope, and a second that tracks which locking items currently apply. This will allow buildScan to associate table information with the correct locking item(s), which can then be used to build Lock operators when popping the locking items. As a bonus, by using only the applied locking item stack in `validateLockingInFrom` we can make validation of SELECT FOR UPDATE queries a little more precise, which allows some queries we were incorrectly disallowing. Informs: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Allow FOR UPDATE on some queries that were previously disallowed. Queries that use the following operations are now allowed to have FOR UPDATE OF as long as the prohibited operation is in a subquery not locked by the FOR UPDATE OF: - UNION - INTERSECT - EXCEPT - DISTINCT - GROUP BY - HAVING - aggregations - window functions For example, the following query is now allowed because the subquery using the prohibited operations is not affected by the FOR UPDATE OF: ``` SELECT * FROM t, (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u FOR UPDATE OF t; ``` This matches PostgreSQL.
Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of the data sources in a query's FROM list, depending on whether they have targets (FOR UPDATE OF x). Without targets, they always apply within subqueries in the FROM list. With targets, they apply within subqueries if the subquery alias matches the target. Because of this scope-like nature of FOR UPDATE and FOR SHARE, we implement semantic analysis using a stack of locking items that grow as we build inner subqueries deeper in the recursive optbuilder calls. Prior to this change, we only used the stack of locking items during buildScan, at the very bottom of the recursion. Because of this, calls to `lockingSpec.filter` could afford to compress the stack into a single locking item on our way deeper in the recursion. As part of the upcoming fix for cockroachdb#75457, however, we will need to build a new Lock operator when popping off locking items after returning from the recursion. That Lock operator will need some information gathered from buildScan at the bottom of the recursion. To support this, we refactor the stack of locking items to be two stacks: one that tracks all locking items in scope, and a second that tracks which locking items currently apply. This will allow buildScan to associate table information with the correct locking item(s), which can then be used to build Lock operators when popping the locking items. As a bonus, by using only the applied locking item stack in `validateLockingInFrom` we can make validation of SELECT FOR UPDATE queries a little more precise, which allows some queries we were incorrectly disallowing. Informs: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Allow FOR UPDATE on some queries that were previously disallowed. Queries that use the following operations are now allowed to have FOR UPDATE OF as long as the prohibited operation is in a subquery not locked by the FOR UPDATE OF: - UNION - INTERSECT - EXCEPT - DISTINCT - GROUP BY - HAVING - aggregations - window functions For example, the following query is now allowed because the subquery using the prohibited operations is not affected by the FOR UPDATE OF: ``` SELECT * FROM t, (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u FOR UPDATE OF t; ``` This matches PostgreSQL.
111258: optbuilder: refactor semantic analysis of FOR UPDATE r=DrewKimball,msirek a=michae2 **optbuilder: a few minor tweaks to building of FOR UPDATE** Make a few minor tweaks to semantic analysis of FOR UPDATE locking clauses. 1. Disallow multiple FOR UPDATE clauses on parenthesized queries. We do not currently handle scopes of parenthesized queries correctly. Because of this, we disallow multiple ORDER BY and LIMIT clauses on parenthesized queries. The previous implementation of FOR UPDATE was simple enough that we could work around this, but the upcoming changes make it impossible to support. 2. Allow FOR UPDATE on statements with VALUES in the FROM list (but continue to disallow FOR UPDATE on VALUES directly). This matches Postgres. Release note (sql change): Two minor changes to FOR UPDATE clauses: 1. Multiple FOR UPDATE clauses on fully-parenthesized queries are now disallowed. For example, the following statements are now disallowed: ``` (SELECT 1 FOR UPDATE) FOR UPDATE; SELECT * FROM ((SELECT 1 FOR UPDATE) FOR UPDATE) AS x; ``` whereas statements like the following are still allowed: ``` SELECT * FROM (SELECT 1 FOR UPDATE) AS x FOR UPDATE; SELECT (SELECT 1 FOR UPDATE) FOR UPDATE; ``` This does not match PostgreSQL, which allows all of these, but does match our behavior for ORDER BY and LIMIT. 2. FOR UPDATE is now allowed on statements with VALUES in the FROM list or as a subquery. For example, the following statements are now allowed: ``` SELECT (VALUES (1)) FOR UPDATE; SELECT * FROM (VALUES (1)) AS x FOR UPDATE; ``` Using FOR UPDATE directly on VALUES is still disallowed: ``` VALUES (1) FOR UPDATE; (VALUES (1)) FOR UPDATE; INSERT INTO t VALUES (1) FOR UPDATE; ``` This matches PostgreSQL. **optbuilder: refactor semantic analysis of FOR UPDATE** Locking clauses such as FOR UPDATE and FOR SHARE apply to some or all of the data sources in a query's FROM list, depending on whether they have targets (FOR UPDATE OF x). Without targets, they always apply within subqueries in the FROM list. With targets, they apply within subqueries if the subquery alias matches the target. Because of this scope-like nature of FOR UPDATE and FOR SHARE, we implement semantic analysis using a stack of locking items that grow as we build inner subqueries deeper in the recursive optbuilder calls. Prior to this change, we only used the stack of locking items during buildScan, at the very bottom of the recursion. Because of this, calls to `lockingSpec.filter` could afford to compress the stack into a single locking item on our way deeper in the recursion. As part of the upcoming fix for #75457, however, we will need to build a new Lock operator when popping off locking items after returning from the recursion. That Lock operator will need some information gathered from buildScan at the bottom of the recursion. To support this, we refactor the stack of locking items to be two stacks: one that tracks all locking items in scope, and a second that tracks which locking items currently apply. This will allow buildScan to associate table information with the correct locking item(s), which can then be used to build Lock operators when popping the locking items. As a bonus, by using only the applied locking item stack in `validateLockingInFrom` we can make validation of SELECT FOR UPDATE queries a little more precise, which allows some queries we were incorrectly disallowing. Informs: #57031, #75457 Epic: CRDB-25322 Release note (sql change): Allow FOR UPDATE on some queries that were previously disallowed. Queries that use the following operations are now allowed to have FOR UPDATE OF as long as the prohibited operation is in a subquery not locked by the FOR UPDATE OF: - UNION - INTERSECT - EXCEPT - DISTINCT - GROUP BY - HAVING - aggregations - window functions For example, the following query is now allowed because the subquery using the prohibited operations is not affected by the FOR UPDATE OF: ``` SELECT * FROM t, (SELECT DISTINCT 0, 0 UNION SELECT a, count(*) FROM t GROUP BY a HAVING a > 0) AS u FOR UPDATE OF t; ``` This matches PostgreSQL. Co-authored-by: Michael Erickson <[email protected]>
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE` statements. Instead of locking during the initial row fetch, this new implementation constructs a `Lock` operator on the top of the query plan which performs the locking phase using a locking semi-join lookup. During optbuilder we build plans with both `Lock` operators and initial-row-fetch locking. During execbuilder we decide which implementation to use based on the isolation level and whether `optimizer_use_lock_op_for_serializable` is set. If the new implementation is chosen, `Lock` operators become locking semi-LookupJoins. In some cases these new plans will have superfluous lookup joins. A future PR will optimize away some of these superfluous lookup joins. Fixes: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Add a new session variable, `optimizer_use_lock_op_for_serializable`, which when set enables a new implementation of `SELECT FOR UPDATE`. This new implementation of `SELECT FOR UPDATE` acquires row locks *after* any joins and filtering, and always acquires row locks on the primary index of the table being locked. This more closely matches `SELECT FOR UPDATE` behavior in PostgreSQL, but at the cost of more round trips from gateway node to replica leaseholder. Under read committed isolation (and other isolation levels weaker than serializable) we will always use this new implementation of `SELECT FOR UPDATE` regardless of the value of `optimizer_use_lock_op_for_serializable` to ensure correctness.
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE` statements. Instead of locking during the initial row fetch, this new implementation constructs a `Lock` operator on the top of the query plan which performs the locking phase using a locking semi-join lookup. During optbuilder we build plans with both `Lock` operators and initial-row-fetch locking. During execbuilder we decide which implementation to use based on the isolation level and whether `optimizer_use_lock_op_for_serializable` is set. If the new implementation is chosen, `Lock` operators become locking semi-LookupJoins. In some cases these new plans will have superfluous lookup joins. A future PR will optimize away some of these superfluous lookup joins. Fixes: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Add a new session variable, `optimizer_use_lock_op_for_serializable`, which when set enables a new implementation of `SELECT FOR UPDATE`. This new implementation of `SELECT FOR UPDATE` acquires row locks *after* any joins and filtering, and always acquires row locks on the primary index of the table being locked. This more closely matches `SELECT FOR UPDATE` behavior in PostgreSQL, but at the cost of more round trips from gateway node to replica leaseholder. Under read committed isolation (and other isolation levels weaker than serializable) we will always use this new implementation of `SELECT FOR UPDATE` regardless of the value of `optimizer_use_lock_op_for_serializable` to ensure correctness.
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE` statements. Instead of locking during the initial row fetch, this new implementation constructs a `Lock` operator on the top of the query plan which performs the locking phase using a locking semi-join lookup. During optbuilder we build plans with both `Lock` operators and initial-row-fetch locking. During execbuilder we decide which implementation to use based on the isolation level and whether `optimizer_use_lock_op_for_serializable` is set. If the new implementation is chosen, `Lock` operators become locking semi-LookupJoins. In some cases these new plans will have superfluous lookup joins. A future PR will optimize away some of these superfluous lookup joins. Fixes: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Add a new session variable, `optimizer_use_lock_op_for_serializable`, which when set enables a new implementation of `SELECT FOR UPDATE`. This new implementation of `SELECT FOR UPDATE` acquires row locks *after* any joins and filtering, and always acquires row locks on the primary index of the table being locked. This more closely matches `SELECT FOR UPDATE` behavior in PostgreSQL, but at the cost of more round trips from gateway node to replica leaseholder. Under read committed isolation (and other isolation levels weaker than serializable) we will always use this new implementation of `SELECT FOR UPDATE` regardless of the value of `optimizer_use_lock_op_for_serializable` to ensure correctness.
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE` statements. Instead of locking during the initial row fetch, this new implementation constructs a `Lock` operator on the top of the query plan which performs the locking phase using a locking semi-join lookup. During optbuilder we build plans with both `Lock` operators and initial-row-fetch locking. During execbuilder we decide which implementation to use based on the isolation level and whether `optimizer_use_lock_op_for_serializable` is set. If the new implementation is chosen, `Lock` operators become locking semi-LookupJoins. In some cases these new plans will have superfluous lookup joins. A future PR will optimize away some of these superfluous lookup joins. Fixes: cockroachdb#57031, cockroachdb#75457 Epic: CRDB-25322 Release note (sql change): Add a new session variable, `optimizer_use_lock_op_for_serializable`, which when set enables a new implementation of `SELECT FOR UPDATE`. This new implementation of `SELECT FOR UPDATE` acquires row locks *after* any joins and filtering, and always acquires row locks on the primary index of the table being locked. This more closely matches `SELECT FOR UPDATE` behavior in PostgreSQL, but at the cost of more round trips from gateway node to replica leaseholder. Under read committed isolation (and other isolation levels weaker than serializable) we will always use this new implementation of `SELECT FOR UPDATE` regardless of the value of `optimizer_use_lock_op_for_serializable` to ensure correctness.
111662: opt: add lock operator r=DrewKimball a=michae2 **execbuilder: change error message to match PostgreSQL** Release note (sql change): Change the following error message: ``` statement error cannot execute FOR UPDATE in a read-only transaction ``` to this to match PostgreSQL: ``` statement error cannot execute SELECT FOR UPDATE in a read-only transaction ``` **opt: add lock operator** Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE` statements. Instead of locking during the initial row fetch, this new implementation constructs a `Lock` operator on the top of the query plan which performs the locking phase using a locking semi-join lookup. During optbuilder we build plans with both `Lock` operators and initial-row-fetch locking. During execbuilder we decide which implementation to use based on the isolation level and whether `optimizer_use_lock_op_for_serializable` is set. If the new implementation is chosen, `Lock` operators become locking semi-LookupJoins. In some cases these new plans will have superfluous lookup joins. A future PR will optimize away some of these superfluous lookup joins. Fixes: #57031, #75457 Epic: CRDB-25322 Release note (sql change): Add a new session variable, `optimizer_use_lock_op_for_serializable`, which when set enables a new implementation of `SELECT FOR UPDATE`. This new implementation of `SELECT FOR UPDATE` acquires row locks *after* any joins and filtering, and always acquires row locks on the primary index of the table being locked. This more closely matches `SELECT FOR UPDATE` behavior in PostgreSQL, but at the cost of more round trips from gateway node to replica leaseholder. Under read committed isolation (and other isolation levels weaker than serializable) we will always use this new implementation of `SELECT FOR UPDATE` regardless of the value of `optimizer_use_lock_op_for_serializable` to ensure correctness. 111934: sql: cache role existence checks to fix perf regression r=rafiss a=rafiss In 12ec26f, we started to check for the existence of a role whenever a privilege for the public role was checked. This can hapen multiple times during some pg_catalog queries, so it introduced a performance regression. Now, the role existence is cached so we avoid the penalty. The existence of a role is cached for the duration of the entire transaction, and also gets inherited by any internal executor used to implement a command run by the user's transaction. No release note since this is new in 23.2. informs #20718 fixes #112102 Release note: None Co-authored-by: Michael Erickson <[email protected]> Co-authored-by: Rafi Shamim <[email protected]>
Add a new implementation of `SELECT FOR UPDATE` and `SELECT FOR SHARE` statements. Instead of locking during the initial row fetch, this new implementation constructs a `Lock` operator on the top of the query plan which performs the locking phase using a locking semi-join lookup. During optbuilder we build plans with both `Lock` operators and initial-row-fetch locking. During execbuilder we decide which implementation to use based on the isolation level and whether `optimizer_use_lock_op_for_serializable` is set. If the new implementation is chosen, `Lock` operators become locking semi-LookupJoins. In some cases these new plans will have superfluous lookup joins. A future PR will optimize away some of these superfluous lookup joins. Fixes: #57031, #75457 Epic: CRDB-25322 Release note (sql change): Add a new session variable, `optimizer_use_lock_op_for_serializable`, which when set enables a new implementation of `SELECT FOR UPDATE`. This new implementation of `SELECT FOR UPDATE` acquires row locks *after* any joins and filtering, and always acquires row locks on the primary index of the table being locked. This more closely matches `SELECT FOR UPDATE` behavior in PostgreSQL, but at the cost of more round trips from gateway node to replica leaseholder. Under read committed isolation (and other isolation levels weaker than serializable) we will always use this new implementation of `SELECT FOR UPDATE` regardless of the value of `optimizer_use_lock_op_for_serializable` to ensure correctness.
This is fixed in 23.2 when using the new SFU implementation ( |
Describe the problem
SELECT FOR UPDATE
currently only places an unreplicated lock on the index being scanned by the query. This was an important optimization that allowed implicit SELECT FOR UPDATE (#45511, #50180, #50181) on a mutation's initial row scan to come for free. Without it, grabbing unreplicated locks during a mutation would have been too expensive.However, for explicit
SELECT ... FOR UPDATE
statements, this is surprising and divergent from PG, which acquires a lock on all indexes. We should expand explicitSELECT ... FOR UPDATE
's implementation to acquire an unreplicated lock on all of a table's indexes.To Reproduce
Setup
Example
Expected behavior
txn2's
SELECT
query should block, because txn1 should have acquired anUPGRADE
lock on all of tb02's indexes.Jira issue: CRDB-2879
Epic CRDB-25322
The text was updated successfully, but these errors were encountered: