[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations #167

dlaboss · 2016-07-15T15:21:51Z

No description provided.

ddebrunner · 2016-07-15T18:25:46Z

api/topology/src/main/java/quarks/topology/TWindow.java

+     * 
+     * @see #aggregate(BiFunction)
+     */
+    <U> TStream<U> timedAggregate(long period, TimeUnit unit, BiFunction<List<T>, K, U> aggregator);


One of the issues I had with this in a previous system was that with many partitions the behaviour was not desired in that if the partition did not change the window still fired, thus wasting cpu cycles to produce the same result. Thus I wonder if it should be more along the lines of:

Aggregation of window partition on any partition change with a minimum period of period between aggregations.

A reasonable question.
In the particular use case that came up, it was OK/desirable that an unchanged partition still yielded an aggregation. e.g., the, less than perfect, interface that was desired between the device and the iothub was to publish events on the "current location" of the device even if it hadn't moved a meaningful distance.
A more efficient, less chatty, device/iothub interface would have been to only publish under that condition.
So maybe a timed-aggregator interface that only supported timed-trigger-if-changed semantics might not be acceptible?

I was wondering if the continue sending updates even if nothing has changed might be better handled by a separate operation, then it could be applied to anything, rather than just a window.

Something like pass any input tuple to the output, but send the last tuple if nothing has been received for the declared period.

... per-partition. Sounds like that separate operation could be a count(1)-timedAggregate-evenIfDidntChange. Does it make sense to force such a user to have to use a timed-trigger-if-changed window followed by this separate operation rather than just use a count(N)-timedAggregate-evenIfDidntChange? (I can imagine it would be ok, just want to be sure)

ddebrunner · 2016-07-25T19:17:05Z

On timed batch:

I think part of my concern is that the api is meant to separate out what defines the window contents (TWindow) from how it is processed (methods on TWindow).

timedBatch seems to be mixing the concepts of what is contained in the window and when it is processed, by say evicting tuples in last(10) window after a second, so that the contents of the window no longer match its definition.

It seems that it's really trying to define a window like:

last(long period, TimeUnit unit, int max)

where the contents of the window is the last period seconds with a maximum of the max most recent tuples.

Then batch is applied to this window as normal.

ddebrunner · 2016-07-25T19:18:26Z

Maybe a better question for timedBatch(3, SECONDS, func), is when would it be used over

last(3, SECONDS).batch(func)

?

[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations

fb57ec7

ddebrunner reviewed Jul 15, 2016
View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations #167

[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations #167

dlaboss commented Jul 15, 2016

ddebrunner Jul 15, 2016

dlaboss Jul 15, 2016

ddebrunner Jul 15, 2016

dlaboss Jul 17, 2016

ddebrunner commented Jul 25, 2016

ddebrunner commented Jul 25, 2016

[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations #167

Are you sure you want to change the base?

[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations #167

Conversation

dlaboss commented Jul 15, 2016

ddebrunner Jul 15, 2016

Choose a reason for hiding this comment

dlaboss Jul 15, 2016

Choose a reason for hiding this comment

ddebrunner Jul 15, 2016

Choose a reason for hiding this comment

dlaboss Jul 17, 2016

Choose a reason for hiding this comment

ddebrunner commented Jul 25, 2016

ddebrunner commented Jul 25, 2016