update pattern for dataflow job id extraction #41794
Merged
The Dataflow job id is extracted from the logged output of the `java` process that starts the Dataflow job, for example in the case of `BeamRunJavaPipelineOperator`.

Currently the job id pattern matches characters until the first `"` or `\n` is encountered, which works for the following case:
[2024-08-27 11:20:22,094] INFO Submitted job: 2024-08-27_04_20_21-7947372725816706151
Extracted job id:
2024-08-27_04_20_21-7947372725816706151
However, if the logger is configured differently, for example so that the line ends with whitespace followed by a suffix carrying additional information, the pattern extracts the id together with the suffix:
[2024-08-27 11:20:22,094] INFO Submitted job: 2024-08-27_04_20_21-7947372725816706151 (org.apache.beam.runners.dataflow.DataflowRunner) (main)
Extracted value:
2024-08-27_04_20_21-7947372725816706151 (org.apache.beam.runners.dataflow.DataflowRunner) (main)
In the previous example, the suffix `(org.apache.beam.runners.dataflow.DataflowRunner) (main)` should not be extracted as part of the job id.

I updated the pattern by adding the whitespace character class `\s` (alongside the existing `"` and `\n`) as a terminator indicating the end of the job id.
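The difference between the two behaviours can be reproduced with a small standalone sketch. The patterns below are simplified assumptions written for illustration (only the `Submitted job:` branch, with hypothetical names `OLD_PATTERN`, `NEW_PATTERN`, and `extract_job_id`); they are not copied from the provider source, which may differ in detail.

```python
import re

# Hypothetical, simplified patterns for illustration only.
# Old behaviour: the id ends at the first double quote or newline.
OLD_PATTERN = re.compile(r'Submitted job: (?P<job_id>[^"\n]*)')
# New behaviour: any whitespace character also ends the id.
NEW_PATTERN = re.compile(r'Submitted job: (?P<job_id>[^"\n\s]*)')


def extract_job_id(pattern, line):
    """Return the captured job id, or None if the line does not match."""
    match = pattern.search(line)
    return match.group("job_id") if match else None


JOB_ID = "2024-08-27_04_20_21-7947372725816706151"
SUFFIX = " (org.apache.beam.runners.dataflow.DataflowRunner) (main)"

# The two log lines quoted in the description above.
plain_line = "[2024-08-27 11:20:22,094] INFO Submitted job: " + JOB_ID
suffixed_line = plain_line + SUFFIX

if __name__ == "__main__":
    for name, pattern in (("old", OLD_PATTERN), ("new", NEW_PATTERN)):
        print(name, "plain:   ", repr(extract_job_id(pattern, plain_line)))
        print(name, "suffixed:", repr(extract_job_id(pattern, suffixed_line)))
```

With these assumed patterns, both variants return the bare id for the plain line, while only the variant that treats whitespace as a terminator returns the bare id for the suffixed line.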
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named `{pr_number}.significant.rst` or `{issue_number}.significant.rst`, in newsfragments.