[WIP] [COMMENTS?] [QUARKS-230] Add timer triggered window aggregations #167
base: master
 *
 * @see #aggregate(BiFunction)
 */
<U> TStream<U> timedAggregate(long period, TimeUnit unit, BiFunction<List<T>, K, U> aggregator);
One issue I had with this in a previous system was that, with many partitions, the window still fired for a partition that had not changed, wasting CPU cycles to produce the same result. So I wonder if the semantics should be more along the lines of:
Aggregate a window partition on any change to that partition, with a minimum of period between aggregations.
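To make the suggested semantics concrete, here is a minimal sketch of the firing decision only. `ChangeThrottledTrigger`, `markChanged`, and `shouldFire` are hypothetical names, not part of the Quarks API, and the caller is assumed to supply the current time so the logic can be checked without a real timer.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/**
 * Hypothetical helper (not part of the Quarks API): decides, per partition key,
 * whether an aggregation should fire. A partition fires only if it has changed
 * since its last firing AND at least minPeriodMillis has elapsed since then.
 */
class ChangeThrottledTrigger<K> {
    private final long minPeriodMillis;
    private final Set<K> changed = new HashSet<>();        // partitions changed since last firing
    private final Map<K, Long> lastFired = new HashMap<>(); // last firing time per partition

    ChangeThrottledTrigger(long minPeriodMillis) {
        this.minPeriodMillis = minPeriodMillis;
    }

    /** Record that the partition's contents changed (tuple inserted or evicted). */
    void markChanged(K key) {
        changed.add(key);
    }

    /** Returns true, and resets the partition's state, if it should be aggregated now. */
    boolean shouldFire(K key, long nowMillis) {
        if (!changed.contains(key)) {
            return false; // unchanged partition: skip, the result would be identical
        }
        Long last = lastFired.get(key);
        if (last != null && nowMillis - last < minPeriodMillis) {
            return false; // changed, but too soon after the previous aggregation
        }
        changed.remove(key);
        lastFired.put(key, nowMillis);
        return true;
    }
}
```

A change that arrives too soon is not lost: the partition stays marked as changed and fires on the first check after the minimum period has elapsed.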
A reasonable question.
In the particular use case that came up, it was OK, even desirable, for an unchanged partition to still yield an aggregation. For example, the less-than-perfect interface desired between the device and the iothub was to publish events on the device's "current location" even if it hadn't moved a meaningful distance.
A more efficient, less chatty device/iothub interface would have published only when the device had moved a meaningful distance.
So maybe a timed-aggregator interface that supported only timed-trigger-if-changed semantics would not be acceptable?
I was wondering whether continuing to send updates even when nothing has changed might be better handled by a separate operation; it could then be applied to anything, rather than just a window.
Something like: pass every input tuple to the output, but resend the last tuple if nothing has been received for the declared period.
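A rough sketch of what such a separate operation might do, ignoring partitioning. `RepeatLastIfIdle`, `accept`, and `tick` are hypothetical names, not Quarks API; the sketch assumes non-null tuples and a caller-driven clock, and it measures idleness from the last output (so a resent tuple restarts the period).

```java
import java.util.Objects;
import java.util.Optional;

/**
 * Hypothetical operator (not part of the Quarks API): passes every input tuple
 * through, and re-emits the most recent tuple when nothing has been output for
 * periodMillis. Time is supplied by the caller rather than read from a clock.
 */
class RepeatLastIfIdle<T> {
    private final long periodMillis;
    private T last;              // most recent input tuple, null until the first arrives
    private long lastEmitMillis; // time of the most recent output (input or repeat)

    RepeatLastIfIdle(long periodMillis) {
        this.periodMillis = periodMillis;
    }

    /** An input tuple arrived: remember it and pass it straight through. */
    T accept(T tuple, long nowMillis) {
        last = Objects.requireNonNull(tuple, "tuple");
        lastEmitMillis = nowMillis;
        return tuple;
    }

    /** Periodic check: returns the last tuple if the stream has been idle long enough. */
    Optional<T> tick(long nowMillis) {
        if (last == null || nowMillis - lastEmitMillis < periodMillis) {
            return Optional.empty();
        }
        lastEmitMillis = nowMillis; // restart the idle period after a repeat
        return Optional.of(last);
    }
}
```

In a real topology the `tick` call would presumably be driven by a scheduled task, and a per-partition variant would keep one such state per key.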
... per-partition. Sounds like that separate operation could be a count(1)-timedAggregate-evenIfDidntChange. Does it make sense to force such a user to use a timed-trigger-if-changed window followed by this separate operation, rather than just a count(N)-timedAggregate-evenIfDidntChange? (I can imagine it would be OK; I just want to be sure.)
On timed batch: I think part of my concern is that the API is meant to separate what defines the window contents (TWindow) from how the window is processed (methods on TWindow). timedBatch seems to mix the concepts of what is contained in the window and when it is processed, for example by evicting tuples in a last(10) window after a second, so that the contents of the window no longer match its definition. It seems that it's really trying to define a window like:
where the contents of the window is the last Then batch is applied to this window as normal.
Maybe a better question for
?