StateWatcher watches and reports changed Pipeline State #32040

Merged (2 commits) on Aug 1, 2024

Conversation

damondouglas
Contributor

This PR closes #32032 with a package private StateWatcher. It sends a GetJobStateRequest to the state stream endpoint of the Job Management service and reports any changes in State to listeners, invoking onStateChanged.
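As a rough illustration of the pattern described above, here is a minimal sketch of a watcher that reads states from a stream and calls `onStateChanged` only when the state differs from the previous one. It is not the PR's code: `SketchState`, `SketchStateListener`, and `SketchStateWatcher` are hypothetical names, a plain `Iterator` stands in for the streaming response to `GetJobStateRequest`, and stopping at the first terminal state is an assumption.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical stand-in for PipelineResult.State; only a few values are modeled.
enum SketchState {
  RUNNING,
  DONE,
  FAILED;

  boolean isTerminal() {
    return this == DONE || this == FAILED;
  }
}

// Hypothetical listener interface exposing the onStateChanged callback.
interface SketchStateListener {
  void onStateChanged(SketchState state);
}

// Consumes a stream of states and notifies listeners only on a change.
final class SketchStateWatcher {
  private final List<SketchStateListener> listeners;
  private SketchState last = null;

  SketchStateWatcher(List<SketchStateListener> listeners) {
    this.listeners = new ArrayList<>(listeners);
  }

  // The iterator stands in for the blocking state stream (an assumption).
  void watch(Iterator<SketchState> stream) {
    while (stream.hasNext()) {
      SketchState next = stream.next();
      if (next != last) {
        last = next;
        for (SketchStateListener listener : listeners) {
          listener.onStateChanged(next);
        }
      }
      // Assumed behavior: stop watching once a terminal state is seen.
      if (next.isTerminal()) {
        return;
      }
    }
  }
}
```

Repeated identical states produce a single notification, so a listener registered with this sketch sees each transition once.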



@damondouglas damondouglas marked this pull request as ready for review July 31, 2024 21:45
@damondouglas
Contributor Author

R: @Abacn or @ahmedabu98

@damondouglas damondouglas changed the title StateWatcher watches for changed Pipeline State StateWatcher watches and reports changed Pipeline State Jul 31, 2024

@github-actions github-actions bot removed the java label Jul 31, 2024
Contributor

@Abacn Abacn left a comment


I tend to think a dedicated listener and watcher isn't necessary. One just needs to poll the current state periodically in pipelineResult.waitUntilFinish(), and poll the state once every time pipelineResult.getState() is called. This is what the existing implementations do:

Existing runners (direct, Dataflow) track pipeline state in the following way:

getState() either makes a call to query the state, or returns the terminal state if the pipeline has already reached one.

Direct runner:

Dataflow runner:

waitUntilFinish polls the pipeline state periodically until it reaches a terminal state.

Direct runner

Dataflow runner

BackOffAdapter.toGcpBackOff(STATUS_BACKOFF_FACTORY.withMaxRetries(0).backoff()),

We can simplify much of this PR by handling this inside delegate.waitUntilFinish.
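For comparison, the polling approach described in this comment could look roughly like the sketch below. It is not taken from the direct or Dataflow runner: `PolledState` and `PollingSketch` are hypothetical names, the `Supplier` stands in for a `getState()` call, and the fixed sleep interval and poll limit are assumptions in place of the runners' backoff logic.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for PipelineResult.State.
enum PolledState {
  RUNNING,
  DONE,
  FAILED;

  boolean isTerminal() {
    return this == DONE || this == FAILED;
  }
}

final class PollingSketch {
  private PollingSketch() {}

  // Polls getState until a terminal state is seen or maxPolls is reached.
  // Returns the last observed state, which may be non-terminal.
  static PolledState waitUntilFinish(
      Supplier<PolledState> getState, long pollMillis, int maxPolls) {
    PolledState state = getState.get();
    for (int i = 1; i < maxPolls && !state.isTerminal(); i++) {
      try {
        Thread.sleep(pollMillis);
      } catch (InterruptedException e) {
        // Restore the interrupt flag and report what was last observed.
        Thread.currentThread().interrupt();
        return state;
      }
      state = getState.get();
    }
    return state;
  }
}
```

The trade-off relative to a watcher is that polling only observes the state at each interval, so no listener is needed, but nothing in this loop owns or closes the underlying connection.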

@damondouglas
Contributor Author

@Abacn this is different in that the runner needs to launch an underlying Job Management service written in a different language and shut it down gracefully. The reason for introducing this PR is that the existing tools in Beam do not properly shut down the gRPC channels, and the code is too nested to refactor safely. This has caused errors when relying on proper shutdown, even when calling waitUntilFinish. The result is that the underlying process keeps running without ownership, and the only way to end it is to use, for example, lsof followed by kill.

Please review this assuming it is needed.
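To make the orphaned-process concern concrete, here is a minimal sketch of owning a child process and stopping it on close, using only the JDK `Process` API. It is not the PR's implementation and does not touch gRPC channels: `ManagedProcessSketch`, its `start` helper, and the grace period are hypothetical, and the pattern of asking politely before forcing termination is an assumption about what "gracefully" would mean here.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

// Hypothetical owner of a child process; closing it stops the process.
final class ManagedProcessSketch implements AutoCloseable {
  private final Process process;
  private final long graceMillis;

  ManagedProcessSketch(Process process, long graceMillis) {
    this.process = process;
    this.graceMillis = graceMillis;
  }

  // Convenience helper that wraps the checked IOException from start().
  static ManagedProcessSketch start(long graceMillis, String... command) {
    try {
      return new ManagedProcessSketch(new ProcessBuilder(command).start(), graceMillis);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  boolean isAlive() {
    return process.isAlive();
  }

  @Override
  public void close() {
    // Request normal termination first.
    process.destroy();
    try {
      // If the process does not exit within the grace period, force it.
      if (!process.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
        process.destroyForcibly().waitFor();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      process.destroyForcibly();
    }
  }
}
```

Used in a try-with-resources block, the child process is stopped when the block exits, including on an exception, which avoids the manual lsof and kill cleanup described above.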

@damondouglas damondouglas requested a review from Abacn August 1, 2024 16:51
Contributor

@Abacn Abacn left a comment


Thanks for the explanation. I wasn't aware that the core consideration behind this change is gracefully closing the gRPC channel. The change itself is clear and well organized, as are the tests. This LGTM.

@damondouglas damondouglas merged commit 21009e6 into apache:master Aug 1, 2024
18 checks passed
@damondouglas damondouglas deleted the prism-state-pubsub branch August 1, 2024 18:10
reeba212 pushed a commit to reeba212/beam that referenced this pull request Dec 4, 2024
* StateWatcher watches for changed Pipeline State

* Add Javadoc
Successfully merging this pull request may close these issues.

[Prism][Task]: A State Publisher notifies listeners of changes to PipelineResult.State.
2 participants