-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split binary_expr.rs
into smaller modules
#3026
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even after this change binary.rs is still 2500 lines long, but that is at least a trend in the right direction
binary_expr.rs
into smaller modules
Codecov Report
@@ Coverage Diff @@
## master #3026 +/- ##
=======================================
Coverage 85.84% 85.85%
=======================================
Files 283 286 +3
Lines 51658 51666 +8
=======================================
+ Hits 44347 44356 +9
+ Misses 7311 7310 -1
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
cc @avantgardnerio in case you have made changes to the decimal code as part of the |
An alternative for splitting
|
Thanks @yjshen -- my ideal world is that there are no actual calculation kernels in datafusion (that they are all in arrow) but if we get more kernels into data fusion the split you suggest would be good. Or maybe I can split |
Benchmark runs are scheduled for baseline = 3d27987 and contender = d07a775. d07a775 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
/// Concat lhs and rhs String Array, any `NULL` exists on lhs or rhs will come out result `NULL` | ||
/// 1. 'a' || 'b' || 32 = 'ab32' | ||
/// 2. 'a' || NULL = NULL | ||
pub(crate) fn string_concat(left: ArrayRef, right: ArrayRef) -> Result<ArrayRef> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 it seems like this may now be part of arrow:
https://docs.rs/arrow/19.0.0/arrow/compute/kernels/concat_elements/fn.concat_elements_utf8.html
I'll and switch to use that as a follow on PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NOTE: This change looks large but it is only moving code around
Which issue does this PR close?
re #1474
Rationale for this change
I find it really hard to work on
binary.rs
. Partly this is because what it is doing is complicated however, I think it can be less complicated than currently, and thus I am trying to unravel the Gordian knotPart of why this module is complicated is because it has both dispatch logic (figure out what functions to call based on argument types) as well as kernel implementations (both stuff that should eventually end up in arrow as well as things that are datafusion specific).
I think it will be easier to work with if the file were smaller.
What changes are included in this PR?
Thus, break binary.rs into several smaller modules:
binary.rs
: The main dispatch logicbinary/kernels.rs
: kernels that are unique to datafusionbinary/arrow_kernels.rs
: Kernels that are eventually destined for arrow-rs but are in datafusion until we get around tot hatbinary/adapte.rs
: functions that might change types or names to make them compatible with the main dispatch logicAre there any user-facing changes?
No