Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Respect timestamp OutputTime Windowing Strategy configuration in Lifted CombineFns. #20436

Open
damccorm opened this issue Jun 4, 2022 · 0 comments

Comments

@damccorm
Copy link
Contributor

damccorm commented Jun 4, 2022

The Go SDK currently retains an arbitrary timestamp per key per bundle when performing a lifted combine.
However, depending on the windowing strategy, a prefered time could be specified.

// (Required) The OutputTime specifies, for a grouping transform, how to

The code in question for the Go SDK:
https://github.com/apache/beam/blob/master/sdks/go/pkg/beam/core/runtime/exec/combine.go#L395

At present this implementation is "correct", as the default output time is Unspecified, and there's no user mechanism to configure a windowing strategy to this granularity.

So there are a few parts to this.

  1. Propagate the windowing strategy information to exec.LiftedCombine somehow and implement the correct output. This can be done whether or not 2 is implemented.
  2. Provide a trigger configuration for beam.WindowInto, so this can be configured on the user side. This is significantly more work.

This matters only when using windows that are not the Global Window, and when using a Lifted Combine, which commonly only happens in batch contexts. However, since Beam is a unified model, the windowing features should work correctly in both execution modes of a Go SDK pipeline.

Imported from Jira BEAM-10302. Original Jira may contain additional context.
Reported by: lostluck.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant