Document reduce_agg aggregate function #12195
92e00bd to 1d2a324
``combineFunction`` takes two states and returns the new state.
The final state is returned.

``redcue_agg`` supports state with native container type long, double or boolean.
Does it mean that `S` must be one of these? (just curious, why so?)
typo: redcue_agg -> reduce_agg
Could you add an example of usage?
@findepi : Correct, see the comments in code:
Lines 124 to 127 in 79f480b
// State with Slice or Block as native container type is intentionally not supported yet,
// as it may result in excessive JVM memory usage of remembered set.
// See JDK-8017163.
throw new UnsupportedOperationException();
Also see #9553
.. function:: reduce_agg(inputValue T, initialState S, inputFunction(S, T, S), combineFunction(S, S, S)) -> S

Returns a single value reduced from input values. ``inputFunction`` and ``combineFunction`` will be invoked.
``inputFunction`` takes the current state (initially ``initialState``) and the input value, and returns the new state.
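A minimal sketch of the semantics described in this snippet, written in Python rather than SQL so it can be run on its own. The helper name `fold_partition` and the two sample partitions are hypothetical and only illustrate the roles of `initialState`, `inputFunction` and `combineFunction`; this is not how the engine implements the aggregation.

```python
from functools import reduce


def fold_partition(values, initial_state, input_function):
    """Fold one partition's values into a state, starting from initial_state."""
    return reduce(input_function, values, initial_state)


# inputFunction: (state, value) -> new state
def input_function(state, value):
    return state + value


# combineFunction: (state, state) -> new state
def combine_function(left_state, right_state):
    return left_state + right_state


# Two hypothetical partitions of the same group, as two workers might see them.
left = fold_partition([2, 3], 0, input_function)
right = fold_partition([4], 0, input_function)

# The partial states are merged into the final state, which is what is returned.
total = combine_function(left, right)
print(left, right, total)  # 5 4 9
```

Each partition starts from its own copy of the initial state, which is why the sketch passes `0` to both calls.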
Does it mean `reduce_agg` runs on a single node?
The wording suggests so, but the existence of `combineFunction` suggests it is parallel.
No, it runs in parallel. Any thoughts about how to revise the wording? When I read it, I didn't get the impression that it runs on a single node, but I am obviously biased...
@findepi, I don't think the name `combineFunction` implies parallelism.
@dain not the name, but the existence of it.
If it was single-threaded, there would be just `initialState` and `inputFunction` (aka "accumulator" or "accumulatingFunction"). The need for `combineFunction` arises only when you want to merge two branches of processing (i.e. parallel computations).
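To make the point about merging branches concrete, here is a toy Python model that splits the input into a configurable number of partitions, folds each one, and then combines the partial states. The function `reduce_agg_model` and its `partitions` parameter are assumptions made for illustration; the real engine decides how rows are distributed and exposes no such knob.

```python
from functools import reduce


def reduce_agg_model(values, initial_state, input_function, combine_function, partitions):
    """Toy model: split values into chunks, fold each chunk, then combine the partial states."""
    chunks = [values[i::partitions] for i in range(partitions)]
    partials = [reduce(input_function, chunk, initial_state) for chunk in chunks]
    return reduce(combine_function, partials)


def add(a, b):
    return a + b


values = [2, 3, 4]

# With the identity (0) as the initial state, the split does not change the sum.
single = reduce_agg_model(values, 0, add, add, partitions=1)
split = reduce_agg_model(values, 0, add, add, partitions=3)

# With a non-identity initial state, this model's result depends on the split,
# because every partition starts from its own copy of the initial state.
skewed_single = reduce_agg_model(values, 10, add, add, partitions=1)
skewed_split = reduce_agg_model(values, 10, add, add, partitions=3)

print(single, split, skewed_single, skewed_split)  # 9 9 19 39
```

With `partitions=1` the combine step never has two states to merge, which matches the observation that a single-threaded fold needs only the initial state and the accumulator.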
1d2a324 to 49d0764
Thanks @mbasmanova and @findepi for reviewing. I added examples; here is how it looks:
-- (1, 12)
-- (2, 30)

``reduce_agg`` supports state with native container type long, double or boolean.
I don't think any end user will understand what this means. Maybe state this as "`reduce_agg` currently does not support variable width types, container types, or inexact types".
@dain : Maybe I will just say "`reduce_agg` currently supports boolean, integer (TINYINT, SMALLINT, INTEGER, BIGINT) and floating-point (REAL, DOUBLE) types".
It also supports dates, times, timestamps, right?
49d0764 to d111ce9
@findepi @dain @mbasmanova Is there anything else that's needed from this PR? We need to merge this and update the release notes PR (#12194) with a link to this new doc.
d111ce9 to d4de05b
Thanks @electrum for reviewing. Comments addressed.
Extracted from prestodb/presto#12195
@electrum David, is this ready to merge? Could you approve?
Returns a single value reduced from input values. ``inputFunction`` and ``combineFunction`` will be invoked.
``inputFunction`` takes the current state (initially ``initialState``) and the input value, and returns the new state.
``combineFunction`` takes two states and returns the new state.
The value of final state is returned::
typo: there should be only one colon, e.g. "returned: "