-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fixes for working with functions in dataframes, additional documentation #1430
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to try and catch errors like this in future, as well as provide some extra documentation of intent, I've changed the helper macros to explicitly accept arguments rather than use a fixed arity.
Thank you so much @tobyhede -- this is a great contribution.
Other than figuring out why the tests for regex_expressions
are commented out I think this PR is looking ready to merge.
❤️
@@ -1564,7 +1564,7 @@ pub fn approx_distinct(expr: Expr) -> Expr { | |||
/// Create an convenience function representing a unary scalar function | |||
macro_rules! unary_scalar_expr { | |||
($ENUM:ident, $FUNC:ident) => { | |||
#[doc = "this scalar function is not documented yet"] | |||
#[doc = concat!("Unary scalar function definition for ", stringify!($FUNC) ) ] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
datafusion/src/logical_plan/expr.rs
Outdated
// scalar_expr!(Btrim, btrim, string); | ||
// scalar_expr!(Btrim, btrim_chars, string, characters); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happened to these two functions (as in do you mean to leave them commented out)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@alamb Left these in to demonstrate the alternative approach to handling the differnt function arities with different functions. Will clean up.
binary_scalar_expr!(DateTrunc, date_trunc); | ||
binary_scalar_expr!(Digest, digest); | ||
scalar_expr!(DatePart, date_part, part, date); | ||
scalar_expr!(DateTrunc, date_trunc, part, date); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for other reviewers, digest
was moved above
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, digest was with date functions but is arguably string function or could be a section with some of the other hashes
@@ -2575,6 +2575,19 @@ mod tests { | |||
Int32, | |||
Int32Array | |||
); | |||
// #[cfg(feature = "regex_expressions")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you intend to fix this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this one doesn't work and I have removed.
There is a bug in the regex handling #1429
I might pick up that one next.
// specific language governing permissions and limitations | ||
// under the License. | ||
|
||
use std::sync::Arc; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This module of tests is epic -- thank you @tobyhede
I also kicked off the CI |
@tobyhede there appears to be some small clippy failures: https://github.com/apache/arrow-datafusion/runs/4506011831?check_suite_focus=true |
Co-authored-by: Andrew Lamb <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @tobyhede
I ran |
Which issue does this PR close?
Closes #1364 Closes #1173
Several of the built-in function definitions are not setup correctly and the functions cannot actually be used at all.
Adds a test suite for using most of the functions with a dataframe.
In order to try and catch errors like this in future, as well as provide some extra documentation of intent, I've changed the helper macros to explicitly accept arguments rather than use a fixed arity.
I've played with a couple of options for functions that have a mixed arity.
btrim, as an example has two forms:
At the moment, functions with varied arity expect a Vec
Alternative I played with was using two different definitions:
We could also make these functions macros. Doing that would mean that some functions would be functions, some macros. Felt a bit strange.