Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix deferrable mode for DataflowTemplatedJobStartOperator and DataflowStartFlexTemplateOperator #39018

Merged

Conversation

e-galan
Copy link
Contributor

@e-galan e-galan commented Apr 15, 2024

This fixes the deferrable mode for DataflowTemplatedJobStartOperator and DataflowStartFlexTemplateOperator. Previously the deferrable mode was implemented in a way that made most of the task execution run in the sync mode, and only after the execution was finished, the control was passed to the trigger, where it only checked the job status (that was by this time completed) and then immediately returned.

List of changes:

  • Add new methods to DataflowHook to be used for the deferrable mode. The methods will start a Dataflow job and then exit returning the job data.
    The existing hook methods are left without changes and are still used for the sync mode.
  • Add code to update XCOM with Dataflow job_id in both sync and deferrable modes.
  • Update unit tests
  • Small refactoring of data types and function parameters

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

@e-galan e-galan closed this Apr 15, 2024
@e-galan e-galan reopened this Apr 15, 2024
@e-galan e-galan force-pushed the fix-def-mode-for-dataflow-operators branch 3 times, most recently from 4c1c9ed to 3921122 Compare April 16, 2024 08:43
@e-galan
Copy link
Contributor Author

e-galan commented Apr 18, 2024

Hi @eladkal @Lee-W @potiuk !
Could you take a look at the PR?

@Lee-W Lee-W self-requested a review April 19, 2024 04:34
@potiuk
Copy link
Member

potiuk commented Apr 22, 2024

Could you please rebase and resolve conflict as well @e-galan ? 0

@e-galan e-galan force-pushed the fix-def-mode-for-dataflow-operators branch from 3921122 to c0eb648 Compare April 22, 2024 09:08
@e-galan
Copy link
Contributor Author

e-galan commented Apr 22, 2024

Could you please rebase and resolve conflict as well @e-galan ? 0

Sure @potiuk , PR is now rebased and the conflict resolved.

@potiuk
Copy link
Member

potiuk commented Apr 22, 2024

Don't know all the details but it seems good for review.

@e-galan e-galan force-pushed the fix-def-mode-for-dataflow-operators branch from c0eb648 to 3d59b64 Compare April 25, 2024 08:15
@Lee-W Lee-W self-requested a review April 26, 2024 00:39
Copy link
Member

@Lee-W Lee-W left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me! Left one non-blocking nitpick. I'll keep it open for one or two days so that others would have a chance to take a deeper look. Thanks!

@Lee-W Lee-W merged commit 28a240a into apache:main Apr 29, 2024
41 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants