Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Serialize INTERVAL to Apache-arrow #1204

Closed
2 tasks done
handstuyennn opened this issue Mar 1, 2023 · 3 comments
Closed
2 tasks done

Serialize INTERVAL to Apache-arrow #1204

handstuyennn opened this issue Mar 1, 2023 · 3 comments

Comments

@handstuyennn
Copy link

handstuyennn commented Mar 1, 2023

What happens?

Hi @Mytherin

When DuckDB serializes an INTERVAL to Arrow it converts it to a fixed-ms duration. This is problematic in that a month is not always 30 days (and the serialization assumes a month is 30 days), and so serializing an interval that contains a MONTH or YEAR date-part is usually inaccurate. The relevant code is in the following places:

I would propose that we keep the information that is currently in the INTERVAL struct when serializing to arrow, that is, {months, days, nanos}.
If you agree to this approach, I would like to implement this fix immediately.

Thank you.

To Reproduce

run query: SELECT INTERVAL 1 YEAR; in duckdb-wasm.

OS:

MacOS

DuckDB Version:

Any

DuckDB Client:

duckdb-wasm

Full Name:

Tuyen Nguyen

Affiliation:

maintainer of the duckdb/geo extension

Have you tried this on the latest master branch?

  • I agree

Have you tried the steps to reproduce? Do they include all relevant data and configuration? Does the issue you report still appear there?

  • I agree
@Mytherin
Copy link
Contributor

Mytherin commented Mar 1, 2023

Thanks for the report!

Yes, that makes sense, as it seems that Arrow has added support for a month,day,ns interval. The reason we used the other interval types was because this type did not exist back when we added the Arrow integration.

@Mytherin Mytherin transferred this issue from duckdb/duckdb Mar 23, 2023
@carlopi
Copy link
Collaborator

carlopi commented Mar 23, 2023

Hi @handstuyennn!

I believe the situation changed from when you opened the query, now Arrow intervals are used, but this gives back a non-sensical result.

I will take a look at this, seems the same problem behind #1203.

@carlopi
Copy link
Collaborator

carlopi commented Jun 17, 2024

Thanks for reporting, this is solved thanks to #1769

@carlopi carlopi closed this as completed Jun 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants