Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Minor: Refine doc comments for BuiltinScalarFunction::return_dimension #7045

Merged
merged 1 commit into from
Jul 26, 2023

Conversation

alamb
Copy link
Contributor

@alamb alamb commented Jul 20, 2023

Which issue does this PR close?

Related to #7008

Rationale for this change

I only just now realized what "dimension" meant in the context of ListArray / Arrays so I tried to document it

What changes are included in this PR?

Doc comments

Are these changes tested?

Are there any user-facing changes?

@alamb
Copy link
Contributor Author

alamb commented Jul 26, 2023

@izveigor or @jayzhan211 I wonder if you have some time to review this PR?

Copy link
Contributor

@izveigor izveigor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I agree the fact that we should expand the documentation and come up with a pretty format for array functions.

@alamb alamb merged commit 3542029 into apache:main Jul 26, 2023
@alamb
Copy link
Contributor Author

alamb commented Jul 26, 2023

Thanks @izveigor -- I'll keep chipping away at keeping the codebase readable and documented :)

@alamb alamb deleted the alamb/more_doc_comments branch July 26, 2023 19:41
@jayzhan211
Copy link
Contributor

jayzhan211 commented Jul 26, 2023

@alamb @izveigor
I have different thoughts on this, List() is 1D, List(List()) is 2D, Int64 is 0D, dimension is only meaningful to List, not non-List type,
that is also the reason I left the comment on

PR comment:
https://github.com/apache/arrow-datafusion/pull/7008/files/b0e37747d487f89b365d19089becdf8b1b63cbe7#r1268009827

Main:
https://github.com/apache/arrow-datafusion/blob/354202922889502cf51cabf630b05003fddd2ec7/datafusion/expr/src/built_in_function.rs#L494-L497.

But, If we want a new definition of dimensions, it is also fine.

@alamb
Copy link
Contributor Author

alamb commented Jul 27, 2023

I have different thoughts on this, List() is 1D, List(List()) is 2D, Int64 is 0D, dimension is only meaningful to List, not non-List type,

But, If we want a new definition of dimensions, it is also fine.

My intention was not to change the definition but to capture what this function does. I am pretty sure that given the input int64 this function would return 1

@jayzhan211
Copy link
Contributor

@alamb In your opinion, which one do you prefer? I think both are fine but need to be consistent everywhere

@alamb
Copy link
Contributor Author

alamb commented Jul 27, 2023

@alamb In your opinion, which one do you prefer?

  • Int64 is 0D,
  • List() is 1D
  • List(List()) is 2D,

Makes the most sense to me given my understanding of multi-dimensional array terminology.

I don't know the code well enough to understand how large a code change updating to that definition would be

@jayzhan211
Copy link
Contributor

I don't think there are many to change so changing the code might not be a problem. We just need to have a consensus on the definition of list dimensions, also add them to code comment and documentation

These are some code I know that is related to List dimensions
https://github.com/apache/arrow-datafusion/blob/11b7b5c215012231e5768fc5be3445c0254d0169/datafusion/expr/src/built_in_function.rs#L439-L460

https://github.com/apache/arrow-datafusion/blob/11b7b5c215012231e5768fc5be3445c0254d0169/datafusion/physical-expr/src/array_expressions.rs#L128-L162

https://github.com/apache/arrow-datafusion/blob/11b7b5c215012231e5768fc5be3445c0254d0169/datafusion/expr/src/built_in_function.rs#L494-L517

@izveigor Do you know any code that considers the list dimensions?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants