Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/opt: add implicit SELECT FOR SHARE support for FK checks #80683

Closed
michae2 opened this issue Apr 28, 2022 · 7 comments · Fixed by #105857
Closed

sql/opt: add implicit SELECT FOR SHARE support for FK checks #80683

michae2 opened this issue Apr 28, 2022 · 7 comments · Fixed by #105857
Assignees
Labels
A-read-committed Related to the introduction of Read Committed A-sql-optimizer SQL logical planning and optimizations. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-quick-win Likely to be a quick win for someone experienced. sync-me-8 T-sql-queries SQL Queries Team

Comments

@michae2
Copy link
Collaborator

michae2 commented Apr 28, 2022

Similar to #50180 and #50181 we should add implicit SELECT FOR UPDATE locking to the FK checks (and other constraint checks) performed by mutation statements, in accordance with the enable_implicit_select_for_update session variable.

Here's a demonstration:

CREATE TABLE account (id INT PRIMARY KEY, name STRING);
CREATE TABLE transfer (
  from_id INT,
  to_id INT,
  amount INT,
  FOREIGN KEY (from_id) REFERENCES account (id),
  FOREIGN KEY (to_id) REFERENCES account (id)
);
INSERT INTO account VALUES (1, 'Alice'), (2, 'Bob');

-- connection 1 (single explicit transaction)
BEGIN;
SELECT * FROM transfer FOR UPDATE;

-- connection 2 (two implicit transactions)
UPDATE account SET name = 'Carol' WHERE id = 2;
SELECT * FROM transfer;

-- connection 1 (resuming open transaction)
INSERT INTO transfer VALUES (1, 2, 100);
COMMIT;

The FK checks performed by the INSERT do not use SELECT FOR UPDATE and so are at a timestamp before the update transaction in connection 2, and hence impossible to refresh.

[email protected]:26257/defaultdb  OPEN> COMMIT;
ERROR: restart transaction: TransactionRetryWithProtoRefreshError: TransactionRetryError: retry txn (RETRY_SERIALIZABLE - failed preemptive refresh due to a conflict: committed value on key /Table/104/1/2/0): "sql txn" meta={id=bc8069e8 key=/Table/105/1 pri=0.00422485 epo=0 ts=1651120066.267562000,2 min=1651120005.931353000,0 seq=1} lock=true stat=PENDING rts=1651120005.931353000,0 wto=false gul=1651120006.431353000,0
SQLSTATE: 40001
HINT: See: https://www.cockroachlabs.com/docs/v22.1/transaction-retry-error-reference.html#retry_serializable

Jira issue: CRDB-15550

Epic CRDB-25322

@michae2 michae2 added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. A-sql-optimizer SQL logical planning and optimizations. T-sql-queries SQL Queries Team E-quick-win Likely to be a quick win for someone experienced. labels Apr 28, 2022
@michae2 michae2 added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) and removed C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. labels Apr 28, 2022
@michae2
Copy link
Collaborator Author

michae2 commented Apr 28, 2022

(It might help to SET CLUSTER SETTING kv.closed_timestamp.target_duration = '10 min'; in this repro if you want to be sure that the RETRY_SERIALIZABLE error is due to contention with another transaction rather than hitting the closed timestamp interval.)

@blathers-crl blathers-crl bot added the T-kv KV Team label May 3, 2022
@jlinder jlinder added sync-me and removed sync-me labels May 20, 2022
@nvanbenschoten
Copy link
Member

I'm not sure this is quite the same thing as the other cases where we use implicit SELECT FOR UPDATE. In this case, we aren't planning to immediately return and write to the row that we read, so we probably wouldn't want to grab an exclusive lock during the FK check. Doing so would eliminate some retries, but at the expense of reduced concurrency. I think you could make an argument that we could grab implicit SELECT FOR SHARE locks during the read, but is this any different than other reads in CockroachDB, which defaults to optimistic reads over pessimistic reads?

One key area where FK checks differ from standard reads is that they only check for existence of the referenced row — they don't care about the value of the referenced row. This distinction is exploited in Postgres with the FOR KEY SHARE locking mode, which commutes with FOR NO KEY UPDATE locks. This eliminates conflicts between FK checks and updates to non-PK columns in rows. We discuss this and other related ideas in #52420.

@michae2
Copy link
Collaborator Author

michae2 commented Jun 1, 2022

I'm not sure this is quite the same thing as the other cases where we use implicit SELECT FOR UPDATE. In this case, we aren't planning to immediately return and write to the row that we read, so we probably wouldn't want to grab an exclusive lock during the FK check. Doing so would eliminate some retries, but at the expense of reduced concurrency. I think you could make an argument that we could grab implicit SELECT FOR SHARE locks during the read,

Good point! Yes, seems like SELECT FOR SHARE locks would be good enough. (Or FOR KEY SHARE locks, cool!)

but is this any different than other reads in CockroachDB, which defaults to optimistic reads over pessimistic reads?

I propose doing this for mutation queries when enable_implicit_select_for_update is true. I think in that case the user is choosing (or defaulting into) pessimism over optimism, and we should respect that.

@nvanbenschoten
Copy link
Member

enable_implicit_select_for_update defaults to true, so I don't think it's an indication that they're choosing pessimism over optimism. I think (and hope!) most users aren't even aware that there's a choice to be made — things should just work.

I'm curious whether you think there's a meaningful difference between the reads (e.g. FK checks) performed in a mutation query and the reads performed by other queries in a read-write transaction. Are reads in a mutation query more likely to be contended? Are reads in a mutation query more likely to be precise point reads?

If there's no meaningful difference then I don't see how you can stop at the statement boundary when saying that reads should be more pessimistic. I think you'd need to go all the way to saying that all reads in a read-write transaction should be pessimistic. We can have that discussion, but it's a big one with many implications across the system.

@michae2
Copy link
Collaborator Author

michae2 commented Jun 9, 2022

enable_implicit_select_for_update defaults to true, so I don't think it's an indication that they're choosing pessimism over optimism. I think (and hope!) most users aren't even aware that there's a choice to be made — things should just work.

I'm curious whether you think there's a meaningful difference between the reads (e.g. FK checks) performed in a mutation query and the reads performed by other queries in a read-write transaction.

The only difference I was thinking about was the lack of a way to choose pessimism for these FK checks if they are involved in contention. For reads performed by other select queries in the transaction, there is FOR UPDATE or FOR SHARE. For reads performed by updates, there is enable_implicit_select_for_update. But for these FK checks there is no way to choose pessimism if it is beneficial.

Maybe piggybacking on enable_implicit_select_for_update is the wrong way to go about this. Maybe a separate variable would be better?

@rytaft
Copy link
Collaborator

rytaft commented Jul 21, 2022

Is this a priority for 22.2?

@michae2
Copy link
Collaborator Author

michae2 commented Jul 21, 2022

Is this a priority for 22.2?

I think it could be dropped from 22.2.

@exalate-issue-sync exalate-issue-sync bot removed the T-sql-queries SQL Queries Team label Jan 4, 2023
@exalate-issue-sync exalate-issue-sync bot added T-sql-queries SQL Queries Team and removed T-kv KV Team labels Mar 24, 2023
@nvanbenschoten nvanbenschoten changed the title sql/opt: add implicit SELECT FOR UPDATE support for FK checks sql/opt: add implicit SELECT FOR SHARE support for FK checks Mar 24, 2023
@nvanbenschoten nvanbenschoten added the A-read-committed Related to the introduction of Read Committed label Mar 30, 2023
michae2 added a commit to michae2/cockroach that referenced this issue Jul 19, 2023
Add SELECT FOR SHARE locking to FK parent checks. Under serializable
isolation, this locking is only used when `enable_implicit_fk_locking`
is set. Under weaker isolation levels (snapshot and read committed) this
locking is always used.

Fixes: cockroachdb#80683
Informs: cockroachdb#100156

Epic: CRDB-25322

Release note (sql change): Add a new session setting,
`enable_implicit_fk_locking`, which controls locking during foreign key
checks under serializable isolation. With this set to true, foreign key
checks of the referenced (parent) table, such as those performed during
an INSERT or UPDATE of the referencing (child) table, will lock the
referenced row using SELECT FOR SHARE locking. This is similar to the
existing `enable_implicit_select_for_update` setting.

Under weaker isolation levels such as read committed, this SELECT FOR
SHARE locking will always be used to ensure the database maintains the
foreign key constraint, regardless of the current value of
`enable_implicit_fk_locking`.
michae2 added a commit to michae2/cockroach that referenced this issue Jul 21, 2023
Add SELECT FOR SHARE locking to FK parent checks. Under serializable
isolation, this locking is only used when `enable_implicit_fk_locking`
is set. Under weaker isolation levels (snapshot and read committed) this
locking is always used.

We only need to lock during the insertion-side FK checks, which verify the
existence of a parent row. Deletion-side FK checks verify the non-existence of a
child row, and these do not need to lock. Instead, to prevent concurrent inserts
or updates to the child that would violate the FK constraint, we rely on the
intent(s) created by the deletion conflicting with the FK locking of those
concurrent inserts or updates.

Fixes: cockroachdb#80683
Informs: cockroachdb#100156

Epic: CRDB-25322

Release note (sql change): Add a new session setting,
`enable_implicit_fk_locking`, which controls locking during foreign key
checks under serializable isolation. With this set to true, foreign key
checks of the referenced (parent) table, such as those performed during
an INSERT or UPDATE of the referencing (child) table, will lock the
referenced row using SELECT FOR SHARE locking. This is similar to the
existing `enable_implicit_select_for_update` setting.

Under weaker isolation levels such as read committed, this SELECT FOR
SHARE locking will always be used to ensure the database maintains the
foreign key constraint, regardless of the current value of
`enable_implicit_fk_locking`.
@mgartner mgartner moved this to 23.2 Release in SQL Queries Jul 24, 2023
michae2 added a commit to michae2/cockroach that referenced this issue Jul 26, 2023
Add SELECT FOR SHARE locking to FK parent checks. Under serializable
isolation, this locking is only used when
`enable_implicit_fk_locking_for_serializable` is set. Under weaker
isolation levels (snapshot and read committed) this locking is always
used.

We only need to lock during the insertion-side FK checks, which verify
the existence of a parent row. Deletion-side FK checks verify the
non-existence of a child row, and these do not need to lock. Instead, to
prevent concurrent inserts or updates to the child that would violate
the FK constraint, we rely on the intent(s) created by the deletion
conflicting with the FK locking of those concurrent inserts or updates.

Fixes: cockroachdb#80683
Informs: cockroachdb#100156

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`enable_implicit_fk_locking_for_serializable`, which controls locking
during foreign key checks under serializable isolation. With this set to
true, foreign key checks of the referenced (parent) table, such as those
performed during an INSERT or UPDATE of the referencing (child) table,
will lock the referenced row using SELECT FOR SHARE locking. (This is
somewhat analogous to the existing `enable_implicit_select_for_update`
variable but applies to the foreign key checks of a mutation statement
instead of the initial row fetch.)

Under weaker isolation levels such as read committed, SELECT FOR SHARE
locking will always be used to ensure the database maintains the foreign
key constraint, regardless of the current setting of
`enable_implicit_fk_locking_for_serializable`.
craig bot pushed a commit that referenced this issue Jul 26, 2023
105857: sql: add implicit SELECT FOR SHARE locking to FK checks r=DrewKimball,nvanbenschoten,rytaft,mgartner a=michae2

**explain: add transaction information to EXPLAIN ANALYZE**

Add transaction isolation, priority, and quality-of-service to the
output of `EXPLAIN ANALYZE`.

Release note (sql change): `EXPLAIN ANALYZE` output now includes:
- the isolation level of the statement's transaction
- the priority of the statement's transaction
- the quality of service level of the statement's transaction

---

**opt: do not use LockDurabilityGuaranteed under serializable isolation**

This is a follow-up from #103734.

We do not want to use guaranteed-durable (a.k.a. replicated) locking
under serializable isolation, because it prevents pipelining and other
optimizations, and is unnecessary for correctness. This commit amends
8cbc6d1 to only set durability for
`SELECT FOR UPDATE` locking under weaker isolation levels.

This means that query plans will be slightly different under different
isolation levels, and so we must add isolation level to the optimizer
memo staleness calculation.

Furthermore, this commit changes the error message added by
e633d5e to be about guaranteed-durable
locking rather than `SELECT FOR UPDATE`, because in a later commit this
specific error will also be triggered by foreign key checks under weaker
isolation levels.

Informs: #100144, #100156, #100193, #100194

Release note: None

---

**opt: show locking durability in EXPLAIN (OPT) output**

Because the "guaranteed-durable locking not yet implemented" error
condition is checked in execbuilder, it prevents not only execution but
also `EXPLAIN` of queries using guaranteed-durable locking. Thankfully
`EXPLAIN (OPT)` bypasses execbuilder, and hence still works, so use this
for now to verify that we are enabling durable locking for `SELECT FOR
UPDATE` under read committed isolation.

(Note that we have not yet fixed the `SELECT FOR UPDATE` plans to use
more precise locking, that will come in a later PR.)

Informs: #100194

Release note: None

---

**sql: add implicit SELECT FOR SHARE locking to FK parent checks**

Add SELECT FOR SHARE locking to FK parent checks. Under serializable
isolation, this locking is only used when
`enable_implicit_fk_locking_for_serializable` is set. Under weaker
isolation levels (snapshot and read committed) this locking is always
used.

We only need to lock during the insertion-side FK checks, which verify
the existence of a parent row. Deletion-side FK checks verify the
non-existence of a child row, and these do not need to lock. Instead, to
prevent concurrent inserts or updates to the child that would violate
the FK constraint, we rely on the intent(s) created by the deletion
conflicting with the FK locking of those concurrent inserts or updates.

Fixes: #80683
Informs: #100156

Epic: CRDB-25322

Release note (sql change): Add a new session variable,
`enable_implicit_fk_locking_for_serializable`, which controls locking
during foreign key checks under serializable isolation. With this set to
true, foreign key checks of the referenced (parent) table, such as those
performed during an INSERT or UPDATE of the referencing (child) table,
will lock the referenced row using SELECT FOR SHARE locking. (This is
somewhat analogous to the existing `enable_implicit_select_for_update`
variable but applies to the foreign key checks of a mutation statement
instead of the initial row fetch.)

Under weaker isolation levels such as read committed, SELECT FOR SHARE
locking will always be used to ensure the database maintains the foreign
key constraint, regardless of the current setting of
`enable_implicit_fk_locking_for_serializable`.

107212: ui-e2e-tests: steps to enable cypress tests r=maryliag a=rickystewart

This doesn't get the job fully working yet, but it's an improvement.

Epic: none
Part of #106584
Release note: None

107517: roachtest: add read committed variants of ycsb r=michae2 a=nvanbenschoten

Closes #107112.

This PR adds the following six roachtest variants:
```
ycsb/A/nodes=3/cpu=32/isolation-level=read-committed
ycsb/B/nodes=3/cpu=32/isolation-level=read-committed
ycsb/C/nodes=3/cpu=32/isolation-level=read-committed
ycsb/D/nodes=3/cpu=32/isolation-level=read-committed
ycsb/E/nodes=3/cpu=32/isolation-level=read-committed
ycsb/F/nodes=3/cpu=32/isolation-level=read-committed
```

It does so after adding an `--isolation-level` flag to ycsb, which controls the isolation level to run the workload transactions under. If unset, the workload will run with the default isolation level of the database.

Release note: None

107636: schemachanger: deflake TestConcurrentDeclarativeSchemaChanges r=postamar a=postamar

This commit deflakes this test by checking that the second schema change actually does block because of the first one, rather than checking that it has blocked. The bug was that the latter wasn't always guaranteed to happen because we didn't force the schema changes to run in parallel.

Fixes #106732.

Release note: None

Co-authored-by: Michael Erickson <[email protected]>
Co-authored-by: Ricky Stewart <[email protected]>
Co-authored-by: Nathan VanBenschoten <[email protected]>
Co-authored-by: Marius Posta <[email protected]>
@craig craig bot closed this as completed in bb857aa Jul 26, 2023
@github-project-automation github-project-automation bot moved this from 23.2 Release to Done in SQL Queries Jul 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-read-committed Related to the introduction of Read Committed A-sql-optimizer SQL logical planning and optimizations. C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) E-quick-win Likely to be a quick win for someone experienced. sync-me-8 T-sql-queries SQL Queries Team
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

4 participants