Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to control whether day/week/month/quarter from extract is 0- or 1-indexed #477

Closed
ianmcook opened this issue Mar 22, 2023 · 8 comments

Comments

@ianmcook
Copy link
Contributor

This discussion in the DuckDB repo suggests that we should consider adding an option to the Substrait extract datetime function to allow Substrait producers to specify whether they expect returned day/week/month/quarter integer values to be 0- or 1-indexed.

@ianmcook
Copy link
Contributor Author

@rok do you have any thoughts about this?

@rok
Copy link
Contributor

rok commented Mar 22, 2023

I suppose this is down to API design. Do we want to:

  1. Add an extra parameter to extract
  2. Add new extract components
  3. Keep a minimal set of functions (as is) and force users to do some glueing

@ianmcook
Copy link
Contributor Author

ianmcook commented Mar 22, 2023

I think option 1 or 2 is preferable to option 3. @westonpace do you agree?

I don't have any strong opinion about whether option 1 or 2 is best. Option 1 seems a bit cleaner to me. Option 2 blows up into a lot of new extract components and creates uncertainty about which ones we actually need/want: QUARTER_0, QUARTER_1, MONTH_0, MONTH_1, DAY_0, DAY_1,DAY_OF_YEAR_0, DAY_OF_YEAR_1, MONDAY_DAY_OF_WEEK_0, MONDAY_DAY_OF_WEEK_1, SUNDAY_DAY_OF_WEEK_0, SUNDAY_DAY_OF_WEEK_1, MONDAY_WEEK_0, MONDAY_WEEK_1, SUNDAY_WEEK_0, SUNDAY_WEEK_1, ISO_WEEK_0, ISO_WEEK_1, US_WEEK_0, US_WEEK_1

@rok
Copy link
Contributor

rok commented Mar 22, 2023

I assumed 3. is preferred, but if that's not the case I'd prefer 1 and can make a PR.

There are two more candidates for such a change: YEAR/ISO_YEAR/US_YEAR, MONDAY_WEEK/SUNDAY_WEEK/ISO_WEEK/US_WEEK.

@ianmcook
Copy link
Contributor Author

ianmcook commented Mar 22, 2023

I don't think this affects YEAR, ISO_YEAR, or US_YEAR because year integers are not indices in the same sense. (They don't start at 0 or 1, and no one ever tries to label 2000 as 1999.)

@rok
Copy link
Contributor

rok commented Mar 22, 2023

It's not about indexing indeed :D, but we could collapse three extract functions if we added parameters.

@westonpace
Copy link
Member

For indexing let's make it an option (not an argument).

Let's defer calendar type for a future PR since it's backwards incompatible. I think I'd want to see at least one non-arrow engine and how they chose to implement the feature first.

westonpace pushed a commit that referenced this issue May 10, 2023
BREAKING CHANGE: This adds an option to control indexing of components
as discussed in #477.

---------

Co-authored-by: Ian Cook <[email protected]>
@ianmcook
Copy link
Contributor Author

Closed by #479

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants