Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement current_time scalar function #4054

Merged
merged 5 commits into from
Nov 1, 2022
Merged

Conversation

naosense
Copy link
Contributor

@naosense naosense commented Nov 1, 2022

Which issue does this PR close?

Closes #3982 .

Rationale for this change

DataFusion users need to be able to use the current time to calculate things like "all data in the last 5 hours"

What changes are included in this PR?

Scalar current_time function

Are there any user-facing changes?

@github-actions github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions labels Nov 1, 2022
@waitingkuo
Copy link
Contributor

LGTM, thank you @naosense

@naosense naosense marked this pull request as draft November 1, 2022 06:37
@naosense
Copy link
Contributor Author

naosense commented Nov 1, 2022

I followed @comphead 's current_date test case

let sql = "select case when current_time() = cast(now() as time) then 'OK' else 'FAIL' end result";
    let results = execute_to_batches(&ctx, sql).await;

    let expected = vec![
        "+--------+",
        "| result |",
        "+--------+",
        "| OK     |",
        "+--------+",
    ];

    assert_batches_eq!(expected, &results);

but it reported an error

thread 'sql::timestamp::test_current_time' panicked at 'called `Result::unwrap()` on an `Err` value: "NotImplemented(\"Unsupported CAST from Timestamp(Nanosecond, Some(\\\"UTC\\\")) to Time64(Nanosecond)\") at Creating physical plan for 'select case when current_time() = cast(now() as time) then 'OK' else 'FAIL' end result': Projection: CASE WHEN currenttime() = CAST(now() AS Time64(Nanosecond)) THEN Utf8(\"OK\") ELSE Utf8(\"FAIL\") END AS result\n  EmptyRelation"', datafusion/core/tests/sql/mod.rs:806:10

Is anything i missed?

@mingmwang
Copy link
Contributor

If there are multiple current_time() in a SQL statement, will it be invoked multiple times or just once ?

@waitingkuo
Copy link
Contributor

casting Timestamp to Time isn't supported in arrow-rs
https://github.com/apache/arrow-rs/blob/master/arrow/src/compute/kernels/cast.rs#L236-L265

@naosense naosense marked this pull request as ready for review November 1, 2022 07:42
@waitingkuo
Copy link
Contributor

waitingkuo commented Nov 1, 2022

@naosense
you can do the following for now if you want to add that test case

select (now()::bigint % 86400000000000)::time;
+-------------------------------+
| now() % Int64(86400000000000) |
+-------------------------------+
| 07:41:27.421667               |
+-------------------------------+

i'll be great if you could fire a issue (or even a pr) to arrow-rs to support cast Timestamp to Time

Comment on lines +438 to +440
Arc::new(datetime_expressions::make_current_time(
execution_props.query_execution_start_time,
))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mingmwang all the current_time within a query uses the same query_execution_start_time

@naosense naosense requested a review from waitingkuo November 1, 2022 10:51
Copy link
Contributor

@waitingkuo waitingkuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thank you @naosense

Comment on lines 212 to 214
let nano =
Some(now_ts.timestamp_nanos() % 86400000000000);
move |_arg| Ok(ColumnarValue::Scalar(ScalarValue::Time64(nano)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +1671 to +1680
let sql = "select case when current_time() = (now()::bigint % 86400000000000)::time then 'OK' else 'FAIL' end result";
let results = execute_to_batches(&ctx, sql).await;

let expected = vec![
"+--------+",
"| result |",
"+--------+",
"| OK |",
"+--------+",
];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@naosense
Copy link
Contributor Author

naosense commented Nov 1, 2022

@waitingkuo

i'll be great if you could fire a issue (or even a pr) to arrow-rs to support cast Timestamp to Time

it's a pleasure, I'll investigate how to do that

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @naosense . Thanks @waitingkuo for the review!

@alamb
Copy link
Contributor

alamb commented Nov 1, 2022

Hi @naosense -- it seems this PR needs to have cargo fmt to run on it in order to pass CI. I took the liberty of doing so (and merging up from master) to prevent another 24 hour cycle

@alamb alamb merged commit 97f2e4f into apache:master Nov 1, 2022
@alamb
Copy link
Contributor

alamb commented Nov 1, 2022

Thanks again @naosense and @waitingkuo !

@ursabot
Copy link

ursabot commented Nov 1, 2022

Benchmark runs are scheduled for baseline = 8c26530 and contender = 97f2e4f. 97f2e4f is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

@naosense naosense deleted the current_time branch November 2, 2022 07:43
Dandandan pushed a commit to yuuch/arrow-datafusion that referenced this pull request Nov 5, 2022
* implement `current_time`

* edit test case

* fix nanosecond after midnight

* fix: fmt

Co-authored-by: pingao <[email protected]>
Co-authored-by: Andrew Lamb <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
core Core DataFusion crate logical-expr Logical plan and expressions physical-expr Physical Expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement current_time Function
5 participants