sql: SELECT FOR UPDATE not able to be optimized away #114282

michae2 · 2023-11-11T00:25:21Z

This exact combination of SELECT FOR UPDATE, EXISTS, and a NULL parameter in a prepared statement is optimized to a constant false in 22.2.16 but becomes a full table scan in 23.1.11:

CREATE TABLE a (a INT, INDEX (a));
PREPARE p AS SELECT EXISTS (SELECT NULL FROM a WHERE a = $1 FOR UPDATE);
EXPLAIN ANALYZE EXECUTE p (NULL);

Here's v22.2.16:

[email protected]:26257/defaultdb> CREATE TABLE a (a INT, INDEX (a));
                                PREPARE p AS SELECT EXISTS (SELECT NULL FROM a WHERE a = $1 FOR UPDATE);
                                EXPLAIN ANALYZE EXECUTE p (NULL);
CREATE TABLE


Time: 2ms total (execution 2ms / network 0ms)

PREPARE


Time: 5ms total (execution 4ms / network 0ms)

               info
-----------------------------------
  planning time: 406µs
  execution time: 476µs
  distribution: local
  vectorized: true
  maximum memory usage: 10 KiB
  network usage: 0 B (0 messages)
  regions: us-east1

  • values
    nodes: n1
    regions: us-east1
    actual row count: 1
    size: 1 column, 1 row
(13 rows)


Time: 1ms total (execution 1ms / network 0ms)

Here's v23.1.11:

[email protected]:26257/defaultdb> CREATE TABLE a (a INT, INDEX (a));
                             -> PREPARE p AS SELECT EXISTS (SELECT NULL FROM a WHERE a = $1 FOR UPDATE);
                             -> EXPLAIN ANALYZE EXECUTE p (NULL);
                             ->
CREATE TABLE

Time: 2ms total (execution 2ms / network 0ms)

PREPARE

Time: 8ms total (execution 8ms / network 0ms)

                                info
--------------------------------------------------------------------
  planning time: 124µs
  execution time: 372µs
  distribution: local
  vectorized: true
  cumulative time spent in KV: 101µs
  maximum memory usage: 30 KiB
  network usage: 0 B (0 messages)
  regions: us-east1
  sql cpu time: 39µs

  • root
  │
  ├── • values
  │     nodes: n1
  │     regions: us-east1
  │     actual row count: 1
  │     sql cpu time: 11µs
  │     size: 1 column, 1 row
  │
  └── • subquery
      │ id: @S1
      │ original sql: (SELECT NULL FROM a WHERE a = $1 FOR UPDATE)
      │ exec mode: one row
      │
      └── • render
          │
          └── • filter
              │ nodes: n1
              │ regions: us-east1
              │ actual row count: 0
              │ sql cpu time: 24µs
              │ estimated row count: 0
              │ filter: false
              │
              └── • scan
                    nodes: n1
                    regions: us-east1
                    actual row count: 0
                    KV time: 101µs
                    KV contention time: 0µs
                    KV rows read: 0
                    KV bytes read: 0 B
                    KV gRPC calls: 1
                    estimated max memory allocated: 20 KiB
                    sql cpu time: 4µs
                    missing stats
                    table: a@a_pkey
                    spans: FULL SCAN
                    locking strength: for update
(49 rows)

Time: 1ms total (execution 1ms / network 0ms)

In v23.2.0-alpha.6 the plan is better if we use the new SELECT FOR UPDATE behavior (optimizer_use_lock_op_for_serializable). The full scan is gone, replaced by a locking lookup join over a norows input:

[email protected]:26257/system/defaultdb> SET optimizer_use_lock_op_for_serializable = true;
SET

Time: 0ms total (execution 0ms / network 0ms)

[email protected]:26257/system/defaultdb> EXPLAIN ANALYZE EXECUTE p (NULL);
                                info
--------------------------------------------------------------------
  planning time: 1ms
  execution time: 201µs
  distribution: local
  vectorized: true
  maximum memory usage: 10 KiB
  network usage: 0 B (0 messages)
  regions: us-east1
  isolation level: serializable
  priority: normal
  quality of service: regular

  • root
  │
  ├── • values
  │     nodes: n1
  │     regions: us-east1
  │     actual row count: 1
  │     size: 1 column, 1 row
  │
  └── • subquery
      │ id: @S1
      │ original sql: (SELECT NULL FROM a WHERE a = $1 FOR UPDATE)
      │ exec mode: one row
      │
      └── • render
          │
          └── • lookup join (semi)
              │ nodes: n1
              │ regions: us-east1
              │ actual row count: 0
              │ KV time: 0µs
              │ KV contention time: 0µs
              │ KV rows decoded: 0
              │ KV bytes read: 0 B
              │ KV gRPC calls: 0
              │ estimated max memory allocated: 0 B
              │ estimated row count: 0
              │ table: a@a_pkey
              │ equality: (rowid) = (rowid)
              │ equality cols are key
              │ locking strength: for update
              │
              └── • norows
                    nodes: n1
                    regions: us-east1
                    actual row count: 0
(46 rows)

Time: 2ms total (execution 2ms / network 0ms)

Jira issue: CRDB-33431


DrewKimball · 2023-11-12T09:25:37Z

We don't remove the scan with a normal SQL statement, either:

root@localhost:26257/system/defaultdb> explain (opt, verbose) SELECT NULL FROM a WHERE a = NULL FOR UPDATE;
                            info
-------------------------------------------------------------
  project
   ├── columns: "?column?":5
   ├── cardinality: [0 - 0]
   ├── volatile
   ├── stats: [rows=0]
   ├── cost: 1058.25
   ├── fd: ()-->(5)
   ├── prune: (5)
   ├── select
   │    ├── cardinality: [0 - 0]
   │    ├── volatile
   │    ├── stats: [rows=0]
   │    ├── cost: 1058.24
   │    ├── scan a
   │    │    ├── locking: for-update
   │    │    ├── volatile
   │    │    ├── stats: [rows=1000]
   │    │    └── cost: 1048.22
   │    └── filters
   │         └── false [constraints=(contradiction; tight)]
   └── projections
        └── NULL [as="?column?":5]
(22 rows)

Time: 11ms total (execution 11ms / network 0ms)

We don't fire the rule that removes a zero-cardinality group when the expression isn't leakproof. The scan isn't leakproof here because of the locking. I'm not sure what was removing the scan in 22.2 that's different - it must have been some other rule that doesn't check this.
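
One way to check that the locking clause is what blocks the rule is to plan the same query with and without FOR UPDATE. This is a minimal sketch that assumes table a from the original report already exists. No output is shown because plans may differ between versions.

```sql
-- Same query as above, minus the locking clause.
EXPLAIN (OPT, VERBOSE) SELECT NULL FROM a WHERE a = NULL;

-- With locking, for comparison.
EXPLAIN (OPT, VERBOSE) SELECT NULL FROM a WHERE a = NULL FOR UPDATE;
```

If the first plan has no scan while the second keeps one, that is consistent with the leakproof check being the cause.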

DrewKimball · 2023-11-12T10:06:50Z

We probably shouldn't just remove locking (or any database state changes, like mutations), but in this case the SELECT FOR UPDATE only applies to rows that pass the filter, which is none. So this is potentially an instance of #75457.

Also, we should probably be removing that lookup join too even though it's locking, since it should be guaranteed not to do any lookups.

DrewKimball · 2023-11-13T11:07:39Z

Also, possible dupe of #73074

michae2 · 2023-11-20T20:28:49Z

The lack of optimization of SFU (SELECT FOR UPDATE) does seem like a dupe of #73074, which is improved in 23.2 when using the new SFU implementation, so the only remaining question here is why we were optimizing at all in 22.2. I think there is some other minor difference due to the prepared statement. I'm going to keep this open because of that.
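
A possible way to isolate the prepared-statement difference is to run both forms on the same version and compare the plans. This is a sketch only, assuming a fresh session (so the name p is free) and the table a from the original report.

```sql
-- Prepared form from the original report.
PREPARE p AS SELECT EXISTS (SELECT NULL FROM a WHERE a = $1 FOR UPDATE);
EXPLAIN ANALYZE EXECUTE p (NULL);

-- Same query with the NULL written inline instead of passed as a placeholder.
EXPLAIN ANALYZE SELECT EXISTS (SELECT NULL FROM a WHERE a = NULL FOR UPDATE);
```

If only the prepared form folds to a bare values node on 22.2, the difference likely comes from how the placeholder is handled rather than from the locking itself.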

cockroach-teamcity added this to SQL Queries Nov 11, 2023

github-project-automation bot moved this to Triage in SQL Queries Nov 11, 2023

michae2 mentioned this issue Nov 20, 2023

opt: enable new implementation of SELECT FOR UPDATE under serializable #114737

Open

13 tasks

michae2 moved this from Triage to 24.1 Release in SQL Queries Nov 20, 2023

michae2 added the A-read-committed (Related to the introduction of Read Committed) and P-3 (Issues/test failures with no fix SLA) labels Nov 20, 2023

mgartner moved this from 24.1 Release to New Backlog in SQL Queries Nov 28, 2023

DrewKimball changed the title ~~sql: prepared SELECT FOR UPDATE not able to be optimized away~~ sql: SELECT FOR UPDATE not able to be optimized away Dec 12, 2023
