-
Notifications
You must be signed in to change notification settings - Fork 3.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Look into IS DISTINCT FROM as an alternative to our current null compensation scheme #29624
Comments
Note that especially for complex/expensive expressions, this has the potential to eliminate the duplicate inherent in null compensation, i.e. instead of |
Hi @roji I was watching your video on 2nd level caching and my understand is that it is only exists because different SQL needs to be generated depending on whether a parameter is NULL or not. If the sql server provider can leverage IS DISTINCT FROM, the generated sql would be the same regardless. Does it mean that the sql server provider could then bypass the 2nd level cache? |
@clement911 unfortunately not - equality is just one of many SQL expression types for which we ned to perform nullability-related rewriting. Take a look at SqlNullabilityProcessor to get an idea of the type of things this involves. IS DISTINCT FROM can still potentially cut down a lot of null compensation, which may significantly improve performance. But note also that we need to carefully check how it performs (especially with regards to index usage) compared to normal equality. |
Modern versions of SQLite have the standard SQL syntax (i.e. MySQL has the Related: #10514 |
@ranma42 I may be missing it, but I can't see mention of index usage specifically for IS DISTINCT FROM in the SQLite docs. I definitely believe IS DISTINCT FROM is a promising improvement, and we can add support for it in relational and allow specific providers to opt into it where that makes sense. But careful investigation needs to happen for each databases to ensure whether indexes are used etc. |
I think indexes are not used for
also the docs states that
which makes me believe they are treated the same way internally.
Yes, a "strict equality" is definitely an interesting option; its support (both in term of syntax and performance) seems to be very different across providers, so that is something each provider might have to decide how to handle (and in some cases different choices even within the same provider, as SQL Server < 2022 does not seem to support Another option worth investigating is CTE, which would be more general, but I am afraid it would also involve more extensive changes. |
https://github.com/ranma42/efcore/tree/is-not-distinct-from is a WIP branch that explores what happens to the test suite baselines when using
|
I missed this original comment - this indeed seems to mean that we should seriously consider switching to IS in SQLite instead of compensating with additional IS NULL checks... For other databases, do you happen to know for sure whether SQL Server uses indexes for IS NOT DISTINCT FROM? I've never actually checked this. Similarly, for PG I checked a very long time ago, but things may very well have changed (see npgsql/efcore.pg#28 from 2016...). One additional thought is that we should consider the same change for the negated case (i.e. translate @ranma42 can you please submit your branch as a draft PR rather than as a branch? This way we can have a conversation around it more easily (it's totally fine if it's still an experiment etc.). |
Looks like IS DISTINCT FROM enables index seeks: https://sqlperformance.com/2022/08/sql-server-2022/additional-t-sql-improvements-in-sql-server-2022 |
Perfect, thanks @ErikEJ! So it looks like IS DISTINCT FROM should be a good general choice replacement to compensation on SQL Server 2022! And as @ranma42 mentioned in a recent conversation, compared to the current compensation it also has the advantage of not duplicating the expressions (which may be expensive scalar subqueries etc.). I'll reach out internally just to confirm that there aren't any gotchas, but otherwise I think we can definitely consider doing this. Maybe, as this is quite a fundamental query change, we'd want to preemptively include some sort of opt-out, just in case there's trouble. FYI I've opened npgsql/efcore.pg#3206 to track this on the EFCore.PG side. |
SQL Server 2022 has support for IS [NOT] DISTINCT FROM (docs); this is an equality check that treats two nulls as equal (similar to how DISTINCT works in SQL queries).
PostgreSQL has had this for a long time (docs), but last time I checked it didn't utilize indexes. If the SQL Server version does, then we can use this as a terser alternative to the equality operator, which doesn't require null compensation.
The text was updated successfully, but these errors were encountered: