`COALESCE` appears to unconditionally evaluate all expressions in some cases #82498

bnaecker · 2022-06-07T00:12:39Z

Describe the problem

The documentation for the COALESCE function states:

Arguments to the right of the first non-null argument are not evaluated.

It appears that this is not always the case, and that all arguments may be evaluated in some situations.

To Reproduce

To reproduce this, we can start a basic CockroachDB shell with:

$ cockroach demo

Then at the SQL shell, run:

[email protected]:26257/movr> select coalesce((select 1), (select max(x) from generate_series(1, 10000000) as x));
  coalesce
------------
         1
(1 row)


Time: 1.395s total (execution 1.394s / network 0.001s)

[email protected]:26257/movr>

That query takes significantly longer than one would expect, and appears to be evaluating the entire select max(x) .... argument, despite the fact that the first argument is non-null.

We can see that it is indeed evaluating that subquery by looking at the explain output:

[email protected]:26257/movr> explain analyze select coalesce((select 1), (select max(x) from generate_series(1, 10000000) as x));
                                           info
------------------------------------------------------------------------------------------
  planning time: 357µs
  execution time: 1.2s
  distribution: local
  vectorized: true
  maximum memory usage: 102 KiB
  network usage: 0 B (0 messages)
  regions: us-east1

  • root
  │
  ├── • values
  │     nodes: n1
  │     regions: us-east1
  │     actual row count: 1
  │     size: 1 column, 1 row
  │
  ├── • subquery
  │   │ id: @S1
  │   │ original sql: (SELECT 1)
  │   │ exec mode: one row
  │   │
  │   └── • values
  │         nodes: n1
  │         regions: us-east1
  │         actual row count: 1
  │         size: 1 column, 1 row
  │
  └── • subquery
      │ id: @S2
      │ original sql: (SELECT max(x) FROM ROWS FROM (generate_series(1, 10000000)) AS x)
      │ exec mode: one row
      │
      └── • group (scalar)
          │ nodes: n1
          │ regions: us-east1
          │ actual row count: 1
          │ estimated row count: 1
          │
          └── • project set
              │ nodes: n1
              │ regions: us-east1
              │ actual row count: 10,000,000
              │ estimated row count: 10
              │
              └── • emptyrow
                    nodes: n1
                    regions: us-east1
                    actual row count: 1
(48 rows)


Time: 1.244s total (execution 1.244s / network 0.000s)

[email protected]:26257/movr>

Expected behavior

I would expect the second subquery not to run at all, based on the documentation. It looks like that is indeed true for a "simpler" version of the statement:

[email protected]:26257/movr> explain analyze select coalesce(1, (select max(x) from generate_series(1, 10000000) as x));
               info
-----------------------------------
  planning time: 192µs
  execution time: 370µs
  distribution: local
  vectorized: true
  maximum memory usage: 10 KiB
  network usage: 0 B (0 messages)
  regions: us-east1

  • values
    nodes: n1
    regions: us-east1
    actual row count: 1
    size: 1 column, 1 row
(13 rows)


Time: 1ms total (execution 1ms / network 0ms)

[email protected]:26257/movr>

The only difference here is the first argument to COALESCE, which is SELECT 1 in the first case, and just the literal 1 in the
latter. I would expect this query to run in the same time in both cases, based on the documentation.

On the other hand, if there is a more nuanced description of when such expressions are evaluated, it would be good to update the documentation to reflect that.

Additional data / screenshots

N/A

Environment:

CockroachDB version 21.2.9
Server OS: illumos
Client app is cockroach sql

The full buildinfo for us is:

$ cockroach version
Build Tag:        v21.2.9
Build Time:       2022/04/28 04:02:42
Distribution:     OSS
Platform:         illumos amd64 (x86_64-pc-solaris2.11)
Go Version:       go1.16.10
C Compiler:       gcc 10.3.0
Build Commit ID:  11787edfcfc157a0df951abc34684e4e18b3ef20
Build Type:       release

Additional context

We're using queries with the COALESCE statement to conditionally run certain subqueries. Those queries may be expensive, so we'd like to only run them if required. By placing them in the later arguments to COALESCE, the documentation suggests that should be achievable.

As an additional piece of context, it does appear that such conditional evaluation, even when both arguments are subqueries, does work in other cases:

[email protected]:26257/movr> explain analyze select if(1 = 1, (select 1), (select max(x) from generate_series(1, 10000000) as x));
               info
-----------------------------------
  planning time: 224µs
  execution time: 661µs
  distribution: local
  vectorized: true
  maximum memory usage: 10 KiB
  network usage: 0 B (0 messages)
  regions: us-east1

  • root
  │
  ├── • values
  │     nodes: n1
  │     regions: us-east1
  │     actual row count: 1
  │     size: 1 column, 1 row
  │
  └── • subquery
      │ id: @S1
      │ original sql: (SELECT 1)
      │ exec mode: one row
      │
      └── • values
            nodes: n1
            regions: us-east1
            actual row count: 1
            size: 1 column, 1 row
(26 rows)


Time: 1ms total (execution 1ms / network 0ms)

However, I should note that the IF function also appears to evaluate both arguments, so long as the condition itself is a subquery:

[email protected]:26257/movr> explain analyze select if((select 1 = 1), 1, (select max(x) from generate_series(1, 10000000) as x));
                                           info
------------------------------------------------------------------------------------------
  planning time: 358µs
  execution time: 1.7s
  distribution: local
  vectorized: true
  maximum memory usage: 102 KiB
  network usage: 0 B (0 messages)
  regions: us-east1

  • root
  │
  ├── • values
  │     nodes: n1
  │     regions: us-east1
  │     actual row count: 1
  │     size: 1 column, 1 row
  │
  ├── • subquery
  │   │ id: @S1
  │   │ original sql: (SELECT 1 = 1)
  │   │ exec mode: one row
  │   │
  │   └── • values
  │         nodes: n1
  │         regions: us-east1
  │         actual row count: 1
  │         size: 1 column, 1 row
  │
  └── • subquery
      │ id: @S2
      │ original sql: (SELECT max(x) FROM ROWS FROM (generate_series(1, 10000000)) AS x)
      │ exec mode: one row
      │
      └── • group (scalar)
          │ nodes: n1
          │ regions: us-east1
          │ actual row count: 1
          │ estimated row count: 1
          │
          └── • project set
              │ nodes: n1
              │ regions: us-east1
              │ actual row count: 10,000,000
              │ estimated row count: 10
              │
              └── • emptyrow
                    nodes: n1
                    regions: us-east1
                    actual row count: 1
(48 rows)


Time: 1.749s total (execution 1.749s / network 0.000s)

[email protected]:26257/movr>

Jira issue: CRDB-16422

The text was updated successfully, but these errors were encountered:

blathers-crl · 2022-06-07T00:12:41Z

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

@cockroachdb/sql-queries (found keywords: vectorized,plan)

If we have not gotten back to your issue within a few business days, you can try the following:

Join our community slack channel and ask on #cockroachdb.
Try find someone from here if you know they worked closely on the area and CC them.

_{🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.}

mgartner · 2022-06-07T13:53:35Z

Thanks for the report @bnaecker. I agree that the documentation is misleading. As an immediate step, we'll work to make it more clear.

I'll add some more context to explain the behaviors you're seeing.

In your first example, select coalesce((select 1), (select max(x) from generate_series(1, 10000000) as x));, the inner select clauses are uncorrelated subqueries which do not reference any columns from the outer part of the query. In many cases, the optimizer can inline uncorrelated subqueries into the query plan. When uncorrelated subqueries aren't inlined, they are eagerly evaluated and the results are cached before the execution of the outer query. In this example, select 1 and select max(x) from generate_series(1, 10000000) as x are executed first, then the select coalesce(...) outer query is executed, accessing the cached subquery results when necessary.

In your later example, select if(1 = 1, (select 1), (select max(x) from generate_series(1, 10000000) as x)), the conditional part of the if operator is an expression that can be evaluated before execution-time. The optimizer folds 1 = 1 to true, and then folds the if operator to simply (select 1), eliminating the expensive subquery from the query plan.

When you change the if conditional to a subquery, select if((select 1 = 1), 1, (select max(x) from generate_series(1, 10000000) as x));, the optimizer can no longer fold the conditional to true, so the if and the expensive subquery remain.

As far as I can tell, there is nothing incorrect about this behavior, but it would be great to improve the performance of queries like this. Maybe there's a way to inline these subqueries in a way that ensures the expensive part of the query won't run unless necessary. Alternatively, we could explore evaluating subqueries lazily instead of eagerly.

mgartner · 2022-06-07T13:57:51Z

Alternatively, we could explore evaluating subqueries lazily instead of eagerly.

It appears that Postgres behaves this way:

marcus=# EXPLAIN ANALYZE SELECT max(x) FROM generate_series(1, 10000000) AS x;
                                                                  QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=125000.00..125000.01 rows=1 width=4) (actual time=2626.653..2626.654 rows=1 loops=1)
   ->  Function Scan on generate_series x  (cost=0.00..100000.00 rows=10000000 width=4) (actual time=873.188..1844.573 rows=10000000 loops=1)
 Planning Time: 0.034 ms
 Execution Time: 2633.763 ms
(4 rows)



marcus=# EXPLAIN ANALYZE SELECT COALESCE((SELECT 1), (SELECT max(x) FROM generate_series(1, 10000000) AS x));
                                                   QUERY PLAN
-----------------------------------------------------------------------------------------------------------------
 Result  (cost=125000.02..125000.03 rows=1 width=4) (actual time=0.004..0.004 rows=1 loops=1)
   InitPlan 1 (returns $0)
     ->  Result  (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
   InitPlan 2 (returns $1)
     ->  Aggregate  (cost=125000.00..125000.01 rows=1 width=4) (never executed)
           ->  Function Scan on generate_series x  (cost=0.00..100000.00 rows=10000000 width=4) (never executed)
 Planning Time: 0.053 ms
 Execution Time: 0.025 ms
(8 rows)

bnaecker · 2022-06-07T18:25:43Z

Thanks @mgartner! I agree with you that this doesn't seem to be a correctness issue, and I didn't mean to imply that. I was mostly confused about the documentation, which doesn't match the implementation. Updating that to indicate that non-chosen branches are probably not executed, and also a warning to always verify the final query, would be great.

Thanks also for the context about correlated vs. uncorrelated subqueries, that's very helpful. Would you be able to recommend another way to write this query that's more likely to result in pruned subqueries? It seems like the optimizer I've experimented with both IF and CASE expressions, all of which appear to fully evaluate the subqueries prior to evaluating the actual conditional or branch.

mgartner · 2022-06-07T21:29:49Z

I can rewrite your particular query to be faster:

-- Executes in <1ms.
SELECT * FROM (
  SELECT 1
  UNION ALL
  SELECT max(x) FROM generate_series(1, 10000000) AS x
) AS s(i)
WHERE i IS NOT NULL
LIMIT 1;

-- Executes in ~1.5s.
SELECT * FROM (
  SELECT NULL
  UNION ALL
  SELECT max(x) FROM generate_series(1, 10000000) AS x
) AS s(i)
WHERE i IS NOT NULL
LIMIT 1;

But this approach doesn't generalize to all queries, so I don't think it's really what you are looking for.

bnaecker · 2022-06-07T21:35:57Z

Thanks, that's an interesting approach I hadn't considered. You're right it won't generalize, but that's my problem! Thanks again, and let me know if there's any further context or data you need about the issue at hand. Appreciate your time!

See cockroachdb/cockroach#82498

mgartner · 2022-06-22T16:15:05Z

Docs PR: cockroachdb/docs#14329
Issue tracking lazy evaluation of subqueries: #20298

See cockroachdb#20298 and cockroachdb#82498 for additional context. Release note: None

82703: builtins: remove certain overloads for to_timestamp r=mgartner a=otan Paving the way for this to be backported - remove some overloads that were added to remove ambiguity for the purpose of better backwards compat. Release note: None 83196: opt: clarify exception to conditional evaluation guarantee r=mgartner a=mgartner See #20298 and #82498 for additional context. Release note: None Co-authored-by: Oliver Tan <[email protected]> Co-authored-by: Marcus Gartner <[email protected]>

bnaecker added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jun 7, 2022

blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jun 7, 2022

blathers-crl bot added the T-sql-queries SQL Queries Team label Jun 7, 2022

bnaecker mentioned this issue Jun 7, 2022

Revisit the use of CAST and COALESCE in CockroachDB queries oxidecomputer/omicron#1166

Open

mgartner self-assigned this Jun 7, 2022

mgartner added the docs-todo label Jun 7, 2022

mgartner added a commit to mgartner/docs that referenced this issue Jun 22, 2022

Clarify evaluation of conditional branches

057ac03

See cockroachdb/cockroach#82498

mgartner mentioned this issue Jun 22, 2022

Clarify evaluation of conditional branches cockroachdb/docs#14329

Merged

mgartner closed this as completed Jun 22, 2022

mgartner mentioned this issue Jun 22, 2022

opt: clarify exception to conditional evaluation guarantee #83196

Merged

mgartner added a commit to mgartner/cockroach that referenced this issue Jun 22, 2022

opt: clarify exception to conditional evaluation guarantee

d5b490b

See cockroachdb#20298 and cockroachdb#82498 for additional context. Release note: None

mgartner added this to SQL Queries Jul 24, 2023

mgartner moved this to Done in SQL Queries Jul 24, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

`COALESCE` appears to unconditionally evaluate all expressions in some cases #82498

`COALESCE` appears to unconditionally evaluate all expressions in some cases #82498

bnaecker commented Jun 7, 2022 •

edited by cockroach-jira-scripts

Loading

blathers-crl bot commented Jun 7, 2022

mgartner commented Jun 7, 2022 •

edited

Loading

mgartner commented Jun 7, 2022

bnaecker commented Jun 7, 2022

mgartner commented Jun 7, 2022

bnaecker commented Jun 7, 2022

mgartner commented Jun 22, 2022

COALESCE appears to unconditionally evaluate all expressions in some cases #82498

COALESCE appears to unconditionally evaluate all expressions in some cases #82498

Comments

bnaecker commented Jun 7, 2022 • edited by cockroach-jira-scripts Loading

blathers-crl bot commented Jun 7, 2022

mgartner commented Jun 7, 2022 • edited Loading

mgartner commented Jun 7, 2022

bnaecker commented Jun 7, 2022

mgartner commented Jun 7, 2022

bnaecker commented Jun 7, 2022

mgartner commented Jun 22, 2022

`COALESCE` appears to unconditionally evaluate all expressions in some cases #82498

`COALESCE` appears to unconditionally evaluate all expressions in some cases #82498

bnaecker commented Jun 7, 2022 •

edited by cockroach-jira-scripts

Loading

mgartner commented Jun 7, 2022 •

edited

Loading