Implement physical plan for EXISTS subquery #123

Open
Tracked by #2248
alamb opened this issue Apr 26, 2021 · 2 comments
Labels
datafusion Changes in the datafusion crate

Comments

@alamb
Contributor

alamb commented Apr 26, 2021

Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-10819

Some of the TPC-H queries use the EXISTS predicate, which tests whether a subquery returns at least one row. For example:

and exists (
    select
        *
    from
        lineitem
    where
        l_orderkey = o_orderkey
        and l_commitdate < l_receiptdate
)
@alamb alamb added the datafusion Changes in the datafusion crate label Apr 26, 2021
@alamb
Contributor Author

alamb commented Apr 26, 2021

Comment from Andy Grove (andygrove) @ 2020-12-31T19:42:45.132+0000:

The example given here is a correlated subquery that can be translated into a join.

Here is a Stack Overflow discussion on this for reference (I have not reviewed it):

https://stackoverflow.com/questions/1772609/procedurally-transform-subquery-into-join
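
To illustrate the join translation mentioned above, here is a minimal sketch of how the correlated EXISTS from the TPC-H example could be evaluated as a hash-based left semi join. It uses only the Rust standard library and is not DataFusion code. The `Order` and `LineItem` structs, the function name `semi_join_exists`, and the use of plain integers for dates are all assumptions made for illustration.

```rust
use std::collections::HashSet;

/// Hypothetical, simplified row types holding only the columns the example needs.
#[derive(Debug, Clone, PartialEq)]
struct Order {
    o_orderkey: i64,
}

#[derive(Debug, Clone, PartialEq)]
struct LineItem {
    l_orderkey: i64,
    // Dates are modeled as plain day numbers (an assumption for this sketch).
    l_commitdate: u32,
    l_receiptdate: u32,
}

/// Left semi join equivalent to:
///   ... from orders where exists (
///       select * from lineitem
///       where l_orderkey = o_orderkey and l_commitdate < l_receiptdate)
fn semi_join_exists(orders: &[Order], lineitems: &[LineItem]) -> Vec<Order> {
    // Build side: apply the uncorrelated filter, then collect the join keys.
    let keys: HashSet<i64> = lineitems
        .iter()
        .filter(|l| l.l_commitdate < l.l_receiptdate)
        .map(|l| l.l_orderkey)
        .collect();

    // Probe side: each order is emitted at most once, no matter how many
    // lineitem rows match it. That is what distinguishes a semi join from
    // an inner join.
    orders
        .iter()
        .filter(|o| keys.contains(&o.o_orderkey))
        .cloned()
        .collect()
}
```

With orders 1, 2 and 3, where order 1 has two late line items and order 2 has only an on-time one, the function returns order 1 exactly once. An inner join would have returned it twice.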

@alamb
Contributor Author

alamb commented Oct 21, 2022

In case anyone is curious: we support the correlated versions of these queries (by rewriting them as a join), but we do not support the uncorrelated form (which is not super useful):

❯ create table foo as select * from (values (1), (2), (NULL)) as sql
;
0 rows in set. Query took 0.022 seconds.
3 rows in set. Query took 0.007 seconds.
❯ create table bar as select * from (values (1), (2), (NULL)) as sql;
0 rows in set. Query took 0.000 seconds.
❯ select * from foo where exists (select column1 from bar);
NotImplemented("Physical plan does not support logical expression EXISTS (<subquery>)")
❯ select * from foo where exists (select column1 from bar where foo.column1 = bar.column1);
+---------+
| column1 |
+---------+
| 2       |
| 1       |
+---------+
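
For reference, the expected semantics of both queries can be sketched in plain Rust. This only models standard SQL behavior on the `foo` and `bar` tables above and is not DataFusion's implementation. The function names are hypothetical, and `Option<i32>` stands in for a nullable `column1`.

```rust
use std::collections::HashSet;

/// Uncorrelated: `select * from foo where exists (select column1 from bar)`.
/// The subquery does not reference the outer row, so it can be evaluated
/// once. The result is either every row of `foo` or no rows at all.
fn uncorrelated_exists(foo: &[Option<i32>], bar: &[Option<i32>]) -> Vec<Option<i32>> {
    // EXISTS only asks whether the subquery returns a row; a row whose
    // value is NULL still counts as a row.
    if bar.is_empty() {
        Vec::new()
    } else {
        foo.to_vec()
    }
}

/// Correlated: `... where exists (select column1 from bar
///                                where foo.column1 = bar.column1)`,
/// evaluated as a semi join on `column1`.
fn correlated_exists(foo: &[Option<i32>], bar: &[Option<i32>]) -> Vec<Option<i32>> {
    // NULL = NULL is not true in SQL, so NULL keys never match and are
    // left out of the build side.
    let keys: HashSet<i32> = bar.iter().flatten().copied().collect();
    foo.iter()
        .copied()
        .filter(|v| matches!(v, Some(x) if keys.contains(x)))
        .collect()
}
```

Under these semantics the uncorrelated query would return all three rows of `foo`, including the NULL row, because `bar` is non-empty. The correlated query returns only 1 and 2, which matches the output above up to row order.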
