Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

to_timestamp for negative values gives wrong result #7802

Closed
comphead opened this issue Oct 12, 2023 · 7 comments · Fixed by #7844
Closed

to_timestamp for negative values gives wrong result #7802

comphead opened this issue Oct 12, 2023 · 7 comments · Fixed by #7844
Assignees
Labels
bug Something isn't working

Comments

@comphead
Copy link
Contributor

Describe the bug

In Datafusion to_timestamp builtin scalar function gives wrong result for negative values


❯ select to_timestamp(-62125747200);
+-----------------------------------+
| to_timestamp(Int64(-62125747200)) |
+-----------------------------------+
| 1969-12-31T23:58:57.874252800     |
+-----------------------------------+
1 row in set. Query took 0.035 seconds.

PG gives correct result

0001-04-25T00:00:00.000Z

To Reproduce

Run

❯ select to_timestamp(-62125747200);

Expected behavior

Query results should match

Additional context

No response

@comphead comphead added the bug Something isn't working label Oct 12, 2023
@Weijun-H
Copy link
Member

/// to_timestamp SQL function
pub fn to_timestamp(args: &[ColumnarValue]) -> Result<ColumnarValue> {
    handle::<TimestampNanosecondType, _, TimestampNanosecondType>(
        args,
        string_to_timestamp_nanos_shim,
        "to_timestamp",
    )
}

The problem is that the function 'to_timestamp' treats all input as 'nano', which is causing an issue.

@comphead
Copy link
Contributor Author

it might be a bigger problem.....

 select to_timestamp_millis(-621257472);
+----------------------------------------+
| to_timestamp_millis(Int64(-621257472)) |
+----------------------------------------+
| 1969-12-24T19:25:42.528                |
+----------------------------------------+
1 row in set. Query took 0.002 seconds.

❯ select to_timestamp_micros(-621257472);
+----------------------------------------+
| to_timestamp_micros(Int64(-621257472)) |
+----------------------------------------+
| 1969-12-31T23:49:38.742528             |
+----------------------------------------+
1 row in set. Query took 0.005 seconds.

❯ select to_timestamp_micros(-621257472000);
+-------------------------------------------+
| to_timestamp_micros(Int64(-621257472000)) |
+-------------------------------------------+
| 1969-12-24T19:25:42.528                   |
+-------------------------------------------+
1 row in set. Query took 0.002 seconds.

@alamb
Copy link
Contributor

alamb commented Oct 13, 2023

I agree the difference with postgres seems related to the fact that to_timestamp takes an argument in terms of second for postgres, and nanoseconds for datafusion:

https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-TABLE

to_timestamp ( double precision ) → timestamp with time zone

Convert Unix epoch (seconds since 1970-01-01 00:00:00+00) to timestamp with time zone

to_timestamp(1284352323) → 2010-09-13 04:32:03+00

@comphead
Copy link
Contributor Author

comphead commented Oct 13, 2023

@alamb @Weijun-H
I agree

❯ select to_timestamp_seconds(-62125747200);
+-------------------------------------------+
| to_timestamp_seconds(Int64(-62125747200)) |
+-------------------------------------------+
| 0001-04-25T00:00:00                       |
+-------------------------------------------+
1 row in set. Query took 0.002 seconds.

I'll prepare a fix to make to_timestamp to be in sync with PG and make and input as seconds, do you think we need a to_timestamp_nanos instead?

@comphead comphead self-assigned this Oct 13, 2023
@alamb
Copy link
Contributor

alamb commented Oct 13, 2023

I'll prepare a fix to make to_timestamp to be in sync with PG and make and input as seconds, do you think we need a to_timestamp_nanos instead?

Yes I think this is important.

Also I was recently looking for a reference to the docs where we say "datafusion aims to have the postgres dialect" but could not find it 🤔 Do you know where it is?

@comphead
Copy link
Contributor Author

postgres

https://github.com/apache/arrow-datafusion/blob/main/datafusion/sqllogictest/README.md#running-tests-postgres-compatibility
This is the only thing I was able to find.

I plan to renew a bit a Datafusion main page to be more user-friendly and easier to start so we can include the PG compatibility statement there as well.

@alamb
Copy link
Contributor

alamb commented Oct 13, 2023

I plan to renew a bit a Datafusion main page to be more user-friendly and easier to start so we can include the PG compatibility statement there as well.

That would be awesome -- thank you @comphead

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants