Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Minor] Extract interval parsing logic, add unit tests #2984

Merged
merged 2 commits into from
Jul 31, 2022

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jul 29, 2022

Which issue does this PR close?

N/A

Rationale for this change

Something I noticed while working on https://github.com/apache/arrow-datafusion/pull/2981/files was that the parsing of strings like 3 days to interval constants was in the sql parser logic directly which made it somewhat hard to follow

Since apache/datafusion-sqlparser-rs#517 added support for arbitrary Exprs as interval arguments (rather than just strings), I thought refactoring this code into a function would be a step towards supporting generic arguments

It also made it it easier to write tests

I was thinking eventually we could use some of the code from @avantgardnerio from apache/arrow-rs#2031 in Arrow but before I did that I wanted to make sure the existing code had reasonable test coverage (to make sure we didn't break things).

What changes are included in this PR?

  1. Pull interval parsing logic into its own module and function. I tried to not make any logical changes
  2. Add some (very) basic tests
  3. Add functions to easily create ScalarValue::Interval*

Are there any user-facing changes?

@github-actions github-actions bot added the sql SQL Planner label Jul 29, 2022
Copy link
Contributor Author

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ovr I would appreciate a review if you had time


/// Parses a string with an interval like `'0.5 MONTH'` to an
/// appropriately typed [`ScalarValue`]
pub(crate) fn parse_interval(leading_field: &str, value: &str) -> Result<ScalarValue> {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code was moved as closely as possible from parser.rs

let leading_field = leading_field
.as_ref()
.map(|dt| dt.to_string())
.unwrap_or_else(|| "second".to_string());
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use super::*;

#[test]
fn test_parse_ym() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests are new

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thank you for them! Comparing the results of differing implementations is an extremely useful test.

datafusion/sql/src/interval.rs Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

Codecov Report

Merging #2984 (46851dc) into master (2d23860) will increase coverage by 0.05%.
The diff coverage is 86.74%.

@@            Coverage Diff             @@
##           master    #2984      +/-   ##
==========================================
+ Coverage   85.76%   85.81%   +0.05%     
==========================================
  Files         281      282       +1     
  Lines       51515    51531      +16     
==========================================
+ Hits        44182    44222      +40     
+ Misses       7333     7309      -24     
Impacted Files Coverage Δ
datafusion/common/src/scalar.rs 84.88% <66.66%> (+0.02%) ⬆️
datafusion/sql/src/interval.rs 88.57% <88.57%> (ø)
datafusion/sql/src/planner.rs 81.88% <100.00%> (+0.72%) ⬆️
datafusion/expr/src/logical_plan/plan.rs 77.60% <0.00%> (-0.18%) ⬇️
datafusion/core/src/physical_plan/metrics/value.rs 87.43% <0.00%> (+0.50%) ⬆️
datafusion/core/tests/parquet_pruning.rs 99.43% <0.00%> (+1.17%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us.

@@ -618,6 +619,28 @@ impl ScalarValue {
precision, scale
)))
}

/// Returns a [`ScalarValue::IntervalYearMonth`] representing
/// `years` years and `months` months
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These seem very nice to have. I think more convenience functions like this are the key to terser and more readable code.

use super::*;

#[test]
fn test_parse_ym() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And thank you for them! Comparing the results of differing implementations is an extremely useful test.

@alamb alamb merged commit c7fa789 into apache:master Jul 31, 2022
@ursabot
Copy link

ursabot commented Jul 31, 2022

Benchmark runs are scheduled for baseline = 0fa6a93 and contender = c7fa789. c7fa789 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@alamb alamb deleted the alamb/break_out_interval_parsing branch August 8, 2023 20:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sql SQL Planner
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants