Skip to content

Commit

Permalink
Disallow using the JRH with Python streaming pipelines
Browse files Browse the repository at this point in the history
This cleans-up a lot of logic related to flag setting with respect to streaming and runner v2 experiments
There is also some dead code removed for configurations not possible any more (e.g. side input override, JRH Create, ...)
  • Loading branch information
lukecwik committed Dec 5, 2022
1 parent 53bcb39 commit 43ed75a
Show file tree
Hide file tree
Showing 14 changed files with 308 additions and 585 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -24,10 +24,6 @@ import static java.util.UUID.randomUUID

def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))

def final JOB_SPECIFIC_SWITCHES = [
'-PwithDataflowWorkerJar="true"'
]

def psio_test = [
title : 'PubsubIO Write Performance Test Python 2GB',
test : 'apache_beam.io.gcp.pubsub_io_perf_test',
Expand Down Expand Up @@ -58,7 +54,7 @@ def executeJob = { scope, testConfig ->
commonJobProperties.setTopLevelMainJobProperties(scope, 'master', 240)

loadTestsBuilder.loadTest(scope, testConfig.title, testConfig.runner,
CommonTestProperties.SDK.PYTHON, testConfig.pipelineOptions, testConfig.test, JOB_SPECIFIC_SWITCHES)
CommonTestProperties.SDK.PYTHON, testConfig.pipelineOptions, testConfig.test)
}

PhraseTriggeringPostCommitBuilder.postCommitJob(
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,6 @@ PostcommitJobBuilder.postCommitJob('beam_PostCommit_Py_VR_Dataflow', 'Run Python
gradle {
rootBuildScriptDir(commonJobProperties.checkoutDir)
tasks(':sdks:python:test-suites:dataflow:validatesRunnerBatchTests')
tasks(':sdks:python:test-suites:dataflow:validatesRunnerStreamingTests')
switches('-PdisableRunnerV2')
commonJobProperties.setGradleSwitches(delegate)
}
Expand Down
34 changes: 34 additions & 0 deletions CHANGES.md
Original file line number Diff line number Diff line change
Expand Up @@ -44,10 +44,44 @@
## Bugfixes
* Fixed X (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).
## Known Issues
* ([#X](https://github.com/apache/beam/issues/X)).
-->
# [2.45.0] - Unreleased

## Highlights

* New highly anticipated feature X added to Python SDK ([#X](https://github.com/apache/beam/issues/X)).
* New highly anticipated feature Y added to Java SDK ([#Y](https://github.com/apache/beam/issues/Y)).

## I/Os

* Support for X source added (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).

## New Features / Improvements

* X feature added (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).

## Breaking Changes

* Python streaming pipelines and portable Python batch pipelines on Dataflow are required to
use Runner V2. The `disable_runner_v2`, `disable_runner_v2_until_2023`, `disable_prime_runner_v2`
experiments will raise an error during pipeline construction. Note that non-portable Python
batch jobs are not impacted. ([#24515](https://github.com/apache/beam/issues/24515))

## Deprecations

* X behavior is deprecated and will be removed in X versions ([#X](https://github.com/apache/beam/issues/X)).

## Bugfixes

* Fixed X (Java/Python) ([#X](https://github.com/apache/beam/issues/X)).

## Known Issues

* ([#X](https://github.com/apache/beam/issues/X)).

# [2.44.0] - Unreleased

Expand Down
1 change: 0 additions & 1 deletion sdks/python/apache_beam/io/gcp/pubsub_io_perf_test.py
Original file line number Diff line number Diff line change
Expand Up @@ -34,7 +34,6 @@
--publish_to_big_query=<OPTIONAL><true/false>
--metrics_dataset=<OPTIONAL>
--metrics_table=<OPTIONAL>
--dataflow_worker_jar=<OPTIONAL>
--input_options='{
\"num_records\": <SIZE_OF_INPUT>
\"key_size\": 1
Expand Down
7 changes: 0 additions & 7 deletions sdks/python/apache_beam/options/pipeline_options.py
Original file line number Diff line number Diff line change
Expand Up @@ -1079,13 +1079,6 @@ def _add_argparse_args(cls, parser):
dest='min_cpu_platform',
type=str,
help='GCE minimum CPU platform. Default is determined by GCP.')
parser.add_argument(
'--dataflow_worker_jar',
dest='dataflow_worker_jar',
type=str,
help='Dataflow worker jar file. If specified, the jar file is staged '
'in GCS, then gets loaded by workers. End users usually '
'should not use this feature.')

def validate(self, validator):
errors = []
Expand Down
Loading

0 comments on commit 43ed75a

Please sign in to comment.