Add watermark generator #959
Signed-off-by: Chen Dai <[email protected]>
a01ab67 to 2ac0fb5
Codecov Report
@@                    Coverage Diff                    @@
##           feature/maximus-m1     #959       +/-     ##
=========================================================
- Coverage               97.96%   62.76%   -35.20%
=========================================================
  Files                     303       10      -293
  Lines                    7805      658     -7147
  Branches                  504      119      -385
=========================================================
- Hits                     7646      413     -7233
- Misses                    158      192       +34
- Partials                    1       53       +52
...java/org/opensearch/sql/planner/streaming/watermark/BoundedOutOfOrderWatermarkGenerator.java
LGTM
Signed-off-by: Chen Dai [email protected]
Description
A watermark is a monotonically increasing timestamp of the oldest work not yet completed. The work can be any grouping operation, such as an aggregate or a join, that accumulates stream events into a table and maintains state. In other words, the watermark is how we reason about the completeness of accumulated window state.
Watermark implementation has several aspects, including watermark generation, emit frequency, and propagation. This PR focuses on watermark generation, which does not depend on how we later integrate with the query plan.
This PR covers the common generation strategy, the Bounded Out-Of-Order Watermark Generator, which allows a fixed delay for out-of-order data.
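For readers unfamiliar with the strategy, the idea can be sketched roughly as follows. This is an illustrative sketch only, not the class added in this PR: the class name `BoundedOutOfOrderWatermarkSketch`, the `generate` method, and the millisecond `long` timestamps are all assumptions made for the example. It tracks the largest event timestamp seen so far and reports a watermark that trails it by the fixed allowed delay, so the watermark never moves backwards even when events arrive out of order.

```java
/**
 * Hypothetical sketch of a bounded out-of-order watermark generator.
 * Names and signatures are assumptions for illustration, not the PR's API.
 */
final class BoundedOutOfOrderWatermarkSketch {

  /** Fixed delay (assumed to be in milliseconds) tolerated for out-of-order events. */
  private final long allowedLatenessMillis;

  /** Largest event timestamp observed so far. */
  private long maxTimestampMillis = Long.MIN_VALUE;

  BoundedOutOfOrderWatermarkSketch(long allowedLatenessMillis) {
    if (allowedLatenessMillis < 0) {
      throw new IllegalArgumentException("allowed lateness must be non-negative");
    }
    this.allowedLatenessMillis = allowedLatenessMillis;
  }

  /**
   * Observes one event timestamp and returns the current watermark.
   * Because only the running maximum is used, a late event cannot
   * move the watermark backwards.
   */
  long generate(long eventTimestampMillis) {
    maxTimestampMillis = Math.max(maxTimestampMillis, eventTimestampMillis);
    return maxTimestampMillis - allowedLatenessMillis;
  }

  public static void main(String[] args) {
    BoundedOutOfOrderWatermarkSketch generator = new BoundedOutOfOrderWatermarkSketch(100L);
    long[] eventTimestamps = {1000L, 1200L, 1100L, 1500L};
    for (long timestamp : eventTimestamps) {
      System.out.println("event=" + timestamp + " watermark=" + generator.generate(timestamp));
    }
  }
}
```

With a delay of 100, the events 1000, 1200, 1100, 1500 yield watermarks 900, 1100, 1100, 1400: the out-of-order event at 1100 leaves the watermark unchanged instead of pulling it back.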
Issues Resolved
#953
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.