Document reduce_agg aggregate function #12195

Merged: 1 commit, Jan 22, 2019

wenleix commented Jan 8, 2019

No description provided.

``combineFunction`` takes two states and returns the new state.
The final state is returned.

``redcue_agg`` supports state with native container type long, double or boolean.
Contributor

does it mean that S must be one of these?

(just curious, why so?)

Contributor

typo: redcue_agg -> reduce_agg

Could you add an example of usage?

Contributor Author

@findepi : Correct, see the comments in code:

// State with Slice or Block as native container type is intentionally not supported yet,
// as it may result in excessive JVM memory usage of remembered set.
// See JDK-8017163.
throw new UnsupportedOperationException();

Also see #9553

.. function:: reduce_agg(inputValue T, initialState S, inputFunction(S, T, S), combineFunction(S, S, S)) -> S

Returns a single value reduced from input values. ``inputFunction`` and ``combineFunction`` will be invoked.
``inputFunction`` takes the current state (initially ``initialState``) and the input value, and returns the new state.
Contributor

Does this mean reduce_agg runs on a single node?
The wording suggests so, but the existence of combineFunction suggests it is parallel.

Contributor Author

@wenleix Jan 9, 2019

No, it runs in parallel. Any thoughts on how to revise the wording? When I read it, I didn't get the impression that it runs on a single node, but I am obviously biased...

Contributor

@findepi, I don't think the name combineFunction implies parallelism.

Contributor

@dain not the name, but the existence of it.
If it was single-threaded, there would be just initialState and inputFunction (aka "accumulator" or "accumulatingFunction").
The need for combineFunction arises only when you want to merge two branches of processing (i.e. parallel computations).
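
The point about merging branches of processing can be sketched in Python. This is only an illustrative model and not how Presto implements the aggregation: the helper name `reduce_agg_model` and the way the values are split into `partitions` are assumptions made for this example.

```python
from functools import reduce


def reduce_agg_model(partitions, initial_state, input_function, combine_function):
    """Two-phase reduction model (hypothetical helper, not Presto code).

    Assumes `partitions` is a non-empty list of value lists.
    """
    # Phase 1: each partition folds its own values, starting from initial_state.
    partial_states = [
        reduce(input_function, values, initial_state) for values in partitions
    ]
    # Phase 2: the partial states of the partitions are merged pairwise.
    return reduce(combine_function, partial_states)


def add(a, b):
    return a + b


# One partition: combine_function is never needed.
single = reduce_agg_model([[3, 4, 5, 6, 7]], 0, add, add)
# Three partitions: combine_function merges the three partial sums.
split = reduce_agg_model([[3, 4], [5], [6, 7]], 0, add, add)
print(single, split)
```

In this model the initial state is applied once per partition, so the result is independent of the partitioning only when the initial state is neutral for the combine step (0 for addition here).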

wenleix commented Jan 9, 2019

Thanks @mbasmanova and @findepi for reviewing. I added examples; here is how it looks:

[Screenshot of the rendered documentation, 2019-01-08]

-- (1, 12)
-- (2, 30)
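
The query that produced these two rows is not visible in this excerpt. As a rough illustration only, the Python sketch below groups a hypothetical set of `(id, value)` rows and sums each group, which is one way to arrive at the same two rows. The input rows and the helper name `grouped_reduce` are assumptions and are not taken from the pull request.

```python
from collections import defaultdict
from functools import reduce

# Hypothetical input rows (id, value); NOT taken from the pull request.
rows = [(1, 3), (1, 4), (1, 5), (2, 10), (2, 20)]


def grouped_reduce(rows, initial_state, input_function):
    """Group rows by id and fold each group's values (illustrative helper)."""
    groups = defaultdict(list)
    for key, value in rows:
        groups[key].append(value)
    # Fold every group from initial_state and return rows ordered by id.
    return sorted(
        (key, reduce(input_function, values, initial_state))
        for key, values in groups.items()
    )


result = grouped_reduce(rows, 0, lambda state, value: state + value)
for row in result:
    print(row)
```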

``reduce_agg`` supports state with native container type long, double or boolean.
Contributor

I don't think any end user will understand what this means. Maybe state this as reduce_agg currently does not support variable width types, container types, or inexact types.

Contributor Author

@dain : Maybe I will just say reduce_agg currently supports boolean, integer (TINYINT, SMALLINT, INTEGER, BIGINT) and floating-point (REAL, DOUBLE) types.

Contributor

It also supports dates, times, timestamps, right?

nezihyigitbasi commented Jan 19, 2019

@findepi @dain @mbasmanova Is there anything else that's needed from this PR? We need to merge this and update the release notes PR (#12194) with a link to this new doc.

wenleix commented Jan 22, 2019

Thanks @electrum for reviewing. Comments addressed.

electrum added a commit to electrum/trino that referenced this pull request Jan 22, 2019
@mbasmanova
Contributor

@electrum David, is this ready to merge? Could you approve?

Returns a single value reduced from input values. ``inputFunction`` and ``combineFunction`` will be invoked.
``inputFunction`` takes the current state (initially ``initialState``) and the input value, and returns the new state.
``combineFunction`` takes two states and returns the new state.
The value of final state is returned::
Contributor

typo: there should be only one colon, e.g. "returned: "

wenleix merged commit 559aaa9 into prestodb:master Jan 22, 2019
wenleix deleted the reduce_agg_doc branch January 22, 2019 22:56
7 participants