COALESCE appears to unconditionally evaluate all expressions in some cases #82498
Comments
Thanks for the report @bnaecker. I agree that the documentation is misleading. As an immediate step, we'll work to make it more clear. I'll add some more context to explain the behaviors you're seeing. In your first example, In your later example, When you change the As far as I can tell, there is nothing incorrect about this behavior, but it would be great to improve the performance of queries like this. Maybe there's a way to inline these subqueries in a way that ensures the expensive part of the query won't run unless necessary. Alternatively, we could explore evaluating subqueries lazily instead of eagerly. |
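To make the eager-versus-lazy distinction mentioned above concrete, here is a toy Python model. It is only a sketch: the names `cheap`, `expensive`, `coalesce_eager`, and `coalesce_lazy` are made up for illustration, each "subquery" is modeled as a plain function, and nothing here reflects how CockroachDB's optimizer or execution engine is actually implemented.

```python
from typing import Callable, List, Optional

# Records which hypothetical "subqueries" actually ran.
calls: List[str] = []


def cheap() -> Optional[int]:
    """Stands in for a trivial subquery such as SELECT 1."""
    calls.append("cheap")
    return 1


def expensive() -> Optional[int]:
    """Stands in for a costly aggregate over a large series."""
    calls.append("expensive")
    return max(range(1, 1001))


def coalesce_eager(*subqueries: Callable[[], Optional[int]]) -> Optional[int]:
    """Run every subquery up front, then pick the first non-null result."""
    results = [sq() for sq in subqueries]
    return next((r for r in results if r is not None), None)


def coalesce_lazy(*subqueries: Callable[[], Optional[int]]) -> Optional[int]:
    """Run subqueries one at a time and stop at the first non-null result."""
    for sq in subqueries:
        result = sq()
        if result is not None:
            return result
    return None
```

Both functions return the same value for the same inputs, which matches the point that this is a performance question rather than a correctness one. In this model the only difference is whether `expensive` runs when `cheap` already produced a non-null value.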
It appears that Postgres behaves this way:
|
Thanks @mgartner! I agree with you that this doesn't seem to be a correctness issue, and I didn't mean to imply that. I was mostly confused about the documentation, which doesn't match the implementation. Updating that to indicate that non-chosen branches are probably not executed, and also a warning to always verify the final query, would be great. Thanks also for the context about correlated vs. uncorrelated subqueries, that's very helpful. Would you be able to recommend another way to write this query that's more likely to result in pruned subqueries? It seems like the optimizer I've experimented with both |
I can rewrite your particular query to be faster:
-- Executes in <1ms.
SELECT * FROM (
SELECT 1
UNION ALL
SELECT max(x) FROM generate_series(1, 10000000) AS x
) AS s(i)
WHERE i IS NOT NULL
LIMIT 1;
-- Executes in ~1.5s.
SELECT * FROM (
SELECT NULL
UNION ALL
SELECT max(x) FROM generate_series(1, 10000000) AS x
) AS s(i)
WHERE i IS NOT NULL
LIMIT 1;
But this approach doesn't generalize to all queries, so I don't think it's really what you are looking for. |
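One way to read the timing difference between the two UNION ALL queries is that LIMIT 1 can stop pulling rows as soon as one non-NULL row has been produced. The Python sketch below models that reading with generators. It assumes the branches are consumed in order and streamed row by row, which the timings suggest but the thread does not state. The helper names `branch_constant`, `branch_max`, and `first_non_null` are hypothetical.

```python
from itertools import chain, islice
from typing import Iterable, Iterator, List, Optional


def branch_constant(value: Optional[int], log: List[str]) -> Iterator[Optional[int]]:
    """Models the first UNION ALL branch (SELECT 1 or SELECT NULL)."""
    log.append("constant")  # runs only when the branch is first pulled from
    yield value


def branch_max(n: int, log: List[str]) -> Iterator[Optional[int]]:
    """Models the expensive branch (max over a generated series)."""
    log.append("max")  # runs only when the branch is first pulled from
    yield max(range(1, n + 1))


def first_non_null(branches: Iterable[Iterator[Optional[int]]]) -> List[int]:
    """Models WHERE i IS NOT NULL LIMIT 1 over the concatenated branches."""
    rows = chain.from_iterable(branches)              # UNION ALL, in order
    non_null = (r for r in rows if r is not None)     # WHERE i IS NOT NULL
    return list(islice(non_null, 1))                  # LIMIT 1
```

When the first branch yields a non-null row, the second generator is never started, which corresponds to the sub-millisecond case. When the first branch yields NULL, the filter discards it and the expensive branch has to run, which corresponds to the roughly 1.5 s case.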
Thanks, that's an interesting approach I hadn't considered. You're right it won't generalize, but that's my problem! Thanks again, and let me know if there's any further context or data you need about the issue at hand. Appreciate your time! |
Docs PR: cockroachdb/docs#14329 |
See cockroachdb#20298 and cockroachdb#82498 for additional context. Release note: None
82703: builtins: remove certain overloads for to_timestamp r=mgartner a=otan
Paving the way for this to be backported - remove some overloads that were added to remove ambiguity for the purpose of better backwards compat.
Release note: None
83196: opt: clarify exception to conditional evaluation guarantee r=mgartner a=mgartner
See #20298 and #82498 for additional context.
Release note: None
Co-authored-by: Oliver Tan
Co-authored-by: Marcus Gartner
Describe the problem
The documentation for the COALESCE function states:
It appears that this is not always the case, and that all arguments may be evaluated in some situations.
To Reproduce
To reproduce this, we can start a basic CockroachDB shell with:
Then at the SQL shell, run:
That query takes significantly longer than one would expect, and appears to be evaluating the entire select max(x) .... argument, despite the fact that the first argument is non-null.
We can see that it is indeed evaluating that subquery by looking at the explain output:
Expected behavior
I would expect the second subquery not to run at all, based on the documentation. It looks like that is indeed true for a "simpler" version of the statement:
The only difference here is the first argument to COALESCE, which is SELECT 1 in the first case, and just the literal 1 in the latter. I would expect this query to run in the same time in both cases, based on the documentation.
On the other hand, if there is a more nuanced description of when such expressions are evaluated, it would be good to update the documentation to reflect that.
Additional data / screenshots
N/A
Environment:
cockroach sql
The full buildinfo for us is:
Additional context
We're using queries with the COALESCE statement to conditionally run certain subqueries. Those queries may be expensive, so we'd like to only run them if required. By placing them in the later arguments to COALESCE, the documentation suggests that should be achievable.
As an additional piece of context, it does appear that such conditional evaluation, even when both arguments are subqueries, does work in other cases:
However, I should note that the IF function also appears to evaluate both arguments, so long as the condition itself is a subquery:
Jira issue: CRDB-16422