-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add values list expression #1165
Conversation
bb74f4d
to
1216110
Compare
71c04fc
to
7f2170e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome @jimexist -- it will make creating small examples / demos much easier 🏅 🏅 🏅
The only thing I think that needs to be fixed prior to merge is the contents of from_plan
in utils.rs
where values
is not recreated correctly from the input exprs.
The test cases are very thorough as is the implementation.
I took it for a spin locally:
> SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three'));
Plan("subquery in FROM must have an alias")
>
>
>
> SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t;
+---------+---------+
| column1 | column2 |
+---------+---------+
| 1 | one |
| 2 | two |
| 3 | three |
+---------+---------+
3 rows in set. Query took 0.007 seconds.
>
SELECT * FROM (VALUES (1, 'one'), (2, 'two'), (3, 'three')) as t(num, letter);
+-----+--------+
| num | letter |
+-----+--------+
| 1 | one |
| 2 | two |
| 3 | three |
+-----+--------+
3 rows in set. Query took 0.005 seconds.
Which is very cool (it is working the same as Postgres!)
assert!(plan.is_err()); | ||
} | ||
{ | ||
let sql = "VALUES (1),('2')"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 for testing the negative case
if data.is_empty() { | ||
return Err(DataFusionError::Plan("Values list cannot be empty".into())); | ||
} | ||
// we have this empty batch as a placeholder to satisfy evaluation argument |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder what you think about moving the creation of the actual RecordBatch
from ValuesExec::try_new
to execute
-- the rationale would be to make PhysicalPlan
creation faster and push the actual work into execute
where if can potentially be run concurrently with other parts
Given the size of data in a VALUES
statement, this is not likely to be any real difference so I am fine with leaving the creation in the same place too -- I just wanted to mention it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i can address this in a subsequent PR where more expr types are supported (e.g. CAST
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -77,6 +77,7 @@ impl OptimizerRule for ConstantFolding { | |||
| LogicalPlan::Aggregate { .. } | |||
| LogicalPlan::Repartition { .. } | |||
| LogicalPlan::CreateExternalTable { .. } | |||
| LogicalPlan::Values { .. } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect it is not likely to matter, but constant folding could be applied to the Expr
s in values
. As written this code will not apply constant folding to those expressions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
address in #1170
datafusion/src/optimizer/utils.rs
Outdated
@@ -124,6 +124,10 @@ pub fn from_plan( | |||
schema: schema.clone(), | |||
alias: alias.clone(), | |||
}), | |||
LogicalPlan::Values { schema, values } => Ok(LogicalPlan::Values { | |||
schema: schema.clone(), | |||
values: values.to_vec(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the values here should be derived from expr
- the various optimizers call from_plan
to create a new logical plan after potentially rewriting expressions returned from LogicalPlan::expressions
.
something like (untested)
values : exprs.windows(values[0].len()).map(|w| w.to_vec()).collect()
I'll add some comments to make that clearer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
assert_batches_eq!(expected, &actual); | ||
} | ||
{ | ||
let sql = "VALUES (NULL,'a'),(NULL,'b'),(3,'c')"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LOL every case I could come up with to test you have already covered
It also would be cool to add |
let me track that here |
3ee8066
to
b5d8f04
Compare
8f0dcf7
to
ba42411
Compare
🎉 |
Which issue does this PR close?
Closes #1080
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?