-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
minor: port some join tests to sqllogictests #5567
Conversation
statement ok | ||
set datafusion.optimizer.repartition_joins = false; | ||
|
||
include ./join.slt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most join tests need run twice with different repartition_joins config(on/off).
This slt file is for tests when the config is off.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL "include
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified the queries run as expected
Running tests/sqllogictests/src/main.rs (/Users/alamb/Software/target-df2/debug/deps/sqllogictests-18a04f4d360aa062)
[join_disable_repartition_joins.slt] Running query: "set datafusion.optimizer.repartition_joins = false;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS students(name TEXT, mark INT) AS VALUES
('Stuart', 28),
('Amina', 89),
('Christen', 50),
('Salma', 77),
('Samantha', 21);"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS grades(grade INT, min INT, max INT) AS VALUES
(1, 0, 14),
(2, 15, 35),
(3, 36, 55),
(4, 56, 79),
(5, 80, 100);"
[join_disable_repartition_joins.slt] Running query: "SELECT s.*, g.grade FROM students s join grades g on s.mark between g.min and g.max WHERE grade > 2 ORDER BY s.mark DESC"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS students;"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS grades;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS test1(a int, b int) as select 1 as a, 2 as b;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS test2(a int, b int) as select 1 as a, 2 as b;"
[join_disable_repartition_joins.slt] Running query: "SELECT * FROM test2 FULL JOIN test1 ON true;"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS test1;"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS test2;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS t1(t1_id INT, t1_name TEXT, t1_int INT) AS VALUES
(11, 'a', 1),
(22, 'b', 2),
(33, 'c', 3),
(44, 'd', 4);"
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great to me -- thank you @ygf11
statement ok | ||
set datafusion.optimizer.repartition_joins = false; | ||
|
||
include ./join.slt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL "include
👍
statement ok | ||
set datafusion.optimizer.repartition_joins = false; | ||
|
||
include ./join.slt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I verified the queries run as expected
Running tests/sqllogictests/src/main.rs (/Users/alamb/Software/target-df2/debug/deps/sqllogictests-18a04f4d360aa062)
[join_disable_repartition_joins.slt] Running query: "set datafusion.optimizer.repartition_joins = false;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS students(name TEXT, mark INT) AS VALUES
('Stuart', 28),
('Amina', 89),
('Christen', 50),
('Salma', 77),
('Samantha', 21);"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS grades(grade INT, min INT, max INT) AS VALUES
(1, 0, 14),
(2, 15, 35),
(3, 36, 55),
(4, 56, 79),
(5, 80, 100);"
[join_disable_repartition_joins.slt] Running query: "SELECT s.*, g.grade FROM students s join grades g on s.mark between g.min and g.max WHERE grade > 2 ORDER BY s.mark DESC"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS students;"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS grades;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS test1(a int, b int) as select 1 as a, 2 as b;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS test2(a int, b int) as select 1 as a, 2 as b;"
[join_disable_repartition_joins.slt] Running query: "SELECT * FROM test2 FULL JOIN test1 ON true;"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS test1;"
[join_disable_repartition_joins.slt] Running query: "drop table IF EXISTS test2;"
[join_disable_repartition_joins.slt] Running query: "CREATE TABLE IF NOT EXISTS t1(t1_id INT, t1_name TEXT, t1_int INT) AS VALUES
(11, 'a', 1),
(22, 'b', 2),
(33, 'c', 3),
(44, 'd', 4);"
...
|
||
# turn on repartition_joins | ||
statement ok | ||
set datafusion.optimizer.repartition_joins = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice. The only thing a bit concerning me, is
- sqllogictest we output the sql, which makes the output noisy and probably not very useful, I was thinking to create a ticket to ouputit file name and indented test name. the similar way as cargo test
- do you guys think we should create a .slt file per use case like disabled repartition join? or reuse join.slt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sqllogictest we output the sql, which makes the output noisy and probably not very useful, I was thinking to create a ticket to ouputit file name and indented test name. the similar way as cargo test
I think this is a good idea
do you guys think we should create a .slt file per use case like disabled repartition join? or reuse join.slt?
I like the idea of reusing join.slt but it will likely make some tests in the future harder if they vary based on the setting (e.g. plan tests)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you guys think we should create a .slt file per use case like disabled repartition join? or reuse join.slt?
I think it's worth creating this file now. I first tried adding everything to join.slt
and found a lot of duplicated code, since most tests need to be run twice.
# query x
statement ok
set datafusion.optimizer.repartition_joins = false;
query ITI rowsort
SELECT xxx
----
a a a
# switch to another config
statement ok
set datafusion.optimizer.repartition_joins = true;
query ITI rowsort
SELECT xxx
----
a a a
Using include reduces a lot of codes, originally the addition was 1100+
, now it is 655
.
For the tests whose setting is complex, we can also add them to join.slt
instead of adding a new slt
file.
Thanks @ygf11 ! |
Which issue does this PR close?
Part of #4495 and #4870
Rationale for this change
What changes are included in this PR?
This pr mainly ports tests whose context is
join_context
tojoin.slt
.Are these changes tested?
Yes.
Are there any user-facing changes?