[Failing Test]: Python PostCommit failing due to duplicate AvroSchemaIO autoservice #25601

Closed
1 of 15 tasks
Abacn opened this issue Feb 23, 2023 · 21 comments · Fixed by #25611

Comments

@Abacn
Contributor

Abacn commented Feb 23, 2023

What happened?

The Python PostCommit test jdbcio_xlang_it_test is failing. The Jenkins log shows little detail, but running the test locally reveals the actual error:

ERROR:apache_beam.utils.subprocess_server:Starting job service with ['java', '-jar', '/Users/yathu/.apache_beam/cache/jars/composite-jars/9b6e2cb01bd723cbd87a5e71462f25664f165993697afc98e1c64c34bf814f98.jar', '50709', '--filesToStage=/Users/yathu/dev/virtualenv/py38beam/lib/sdks/java/extensions/schemaio-expansion-service/build/libs/beam-sdks-java-extensions-schemaio-expansion-service-2.47.0-SNAPSHOT.jar,/Users/yathu/.apache_beam/cache/jars/postgresql-42.2.16.jar']
ERROR:apache_beam.utils.subprocess_server:Error bringing up service
Traceback (most recent call last):
  File "/Users/yathu/dev/virtualenv/py38beam/lib/python3.8/site-packages/apache_beam/utils/subprocess_server.py", line 88, in start
    raise RuntimeError(
RuntimeError: Service failed to start up with error 1
Traceback (most recent call last):

This happens because there are two AvroSchemaIOProvider classes annotated with @AutoService (one in core, one in the Avro extension), so the schemaio expansion service fails to start.

The fix could either be similar to the workaround here:

// Avro provider is treated as a special case since two Avro providers may want to be loaded -

Or simply remove AvroSchemaIOProvider in core.
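
For readers unfamiliar with the failure mode, here is a minimal sketch of the two behaviours discussed above. It is not Beam's actual code: `Provider`, `CoreAvroProvider`, `ExtensionAvroProvider`, and `ProviderRegistry` are hypothetical stand-ins, and the rule "prefer one given class when two `avro` providers are found" is an assumption about how a special case could look.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a provider discovered via @AutoService (not Beam's real interface).
interface Provider {
  String identifier();
}

// Hypothetical provider living in "core".
final class CoreAvroProvider implements Provider {
  @Override
  public String identifier() {
    return "avro";
  }
}

// Hypothetical provider living in the "avro extension"; same identifier on purpose.
final class ExtensionAvroProvider implements Provider {
  @Override
  public String identifier() {
    return "avro";
  }
}

final class ProviderRegistry {
  // Strict registration: any identifier clash aborts startup.
  static Map<String, Provider> registerStrict(List<Provider> discovered) {
    Map<String, Provider> byId = new HashMap<>();
    for (Provider p : discovered) {
      Provider previous = byId.putIfAbsent(p.identifier(), p);
      if (previous != null) {
        throw new IllegalStateException(
            "Duplicate provider for identifier '" + p.identifier() + "': "
                + previous.getClass().getSimpleName() + " and "
                + p.getClass().getSimpleName());
      }
    }
    return byId;
  }

  // Assumed special case: two "avro" providers are tolerated and the preferred class wins.
  // Clashes on any other identifier are still fatal.
  static Map<String, Provider> registerWithAvroSpecialCase(
      List<Provider> discovered, Class<? extends Provider> preferredAvro) {
    Map<String, Provider> byId = new HashMap<>();
    for (Provider p : discovered) {
      Provider previous = byId.putIfAbsent(p.identifier(), p);
      if (previous == null) {
        continue; // first provider seen for this identifier
      }
      if (!"avro".equals(p.identifier())) {
        throw new IllegalStateException("Duplicate provider for identifier '" + p.identifier() + "'");
      }
      if (preferredAvro.isInstance(p)) {
        byId.put(p.identifier(), p); // replace the earlier avro provider with the preferred one
      }
    }
    return byId;
  }
}
```

With both hypothetical providers on the classpath, `registerStrict` throws, which mirrors the startup failure described here, while `registerWithAvroSpecialCase` keeps a single `avro` entry.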

Issue Failure

Failure: Test is continually failing

Issue Priority

Priority: 1 (unhealthy code / failing or flaky postcommit so we cannot be sure the product is healthy)

Issue Components

  • Component: Python SDK
  • Component: Java SDK
  • Component: Go SDK
  • Component: Typescript SDK
  • Component: IO connector
  • Component: Beam examples
  • Component: Beam playground
  • Component: Beam katas
  • Component: Website
  • Component: Spark Runner
  • Component: Flink Runner
  • Component: Samza Runner
  • Component: Twister2 Runner
  • Component: Hazelcast Jet Runner
  • Component: Google Cloud Dataflow Runner
@Abacn
Contributor Author

Abacn commented Feb 23, 2023

CC: @aromanenko-dev

@damccorm
Contributor

@aromanenko-dev this is currently the only known release blocker - do you think you'll be able to get a fix in quickly, or should we consider temporarily reverting #25534 (since if I understand correctly, that is the cause of the failures)?

@mosche
Member

mosche commented Feb 23, 2023

@Abacn @aromanenko-dev I was just looking at this as well. I agree with @aromanenko-dev that removing anything from core might be problematic. Though it also turns out that org.apache.beam.sdk.extensions.avro.io.AvroSchemaIOProvider isn't even part of the uber expansion jar; there's a separate one built for testing (:sdks:java:testing:expansion-service).

@Abacn
Contributor Author

Abacn commented Feb 23, 2023

Another quick fix for 2.46.0 would be to remove the SchemaIOProvider AutoService registration in extension-avro, or even just comment out the @AutoService annotation. This would not introduce any breaking change and would also let the expansion service work properly.

@aromanenko-dev
Contributor

@damccorm @Abacn Before any reverts or deletions, I'd like to try the fix proposed by @mosche to see if it works. Let me create a PR for that and check.

@mosche
Member

mosche commented Feb 23, 2023

@aromanenko-dev I think there are two more follow-ups required:

  • Log deprecation warnings when the Avro-related classes from core are used. AvroSchemaIOProvider is a good example where users might never notice such a deprecation, as there's no direct dependency on the class.
  • Investigate / discuss whether the schemaio expansion-service should depend on the avro extension; currently it doesn't. Thinking ahead, the moment the deprecated AvroSchemaIOProvider is removed from core, any x-lang pipeline using it would break.
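
As a rough illustration of the first follow-up, a runtime deprecation warning could be emitted the first time a deprecated class is instantiated. This is only a sketch under assumptions: `LegacyAvroProvider` is a hypothetical class name, the message text is made up, and `java.util.logging` is used solely to keep the example self-contained (Beam itself may use a different logging API).

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.logging.Logger;

// Hypothetical deprecated class; stands in for a class that users never reference directly.
@Deprecated
final class LegacyAvroProvider {
  private static final Logger LOG = Logger.getLogger(LegacyAvroProvider.class.getName());

  // Guards the warning so it is logged at most once per JVM.
  private static final AtomicBoolean WARNED = new AtomicBoolean(false);

  LegacyAvroProvider() {
    warnOnce();
  }

  // Returns true only for the call that actually emitted the warning.
  static boolean warnOnce() {
    if (WARNED.compareAndSet(false, true)) {
      LOG.warning(
          "LegacyAvroProvider is deprecated and may be removed in a future release; "
              + "switch to the replacement in the extension module.");
      return true;
    }
    return false;
  }
}
```

Unlike the compile-time `@Deprecated` warning, this message shows up in the logs of whoever runs the pipeline, even when the class is only loaded indirectly.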

@aromanenko-dev
Contributor

@mosche

  • Good point about logging. Though, since these classes are already annotated with @Deprecated, shouldn't the warnings automatically show up in the build log? Or are you talking mostly about runtime logs?
  • IIUC, AvroSchemaIOProvider should be loaded only if it's used in one of its dependent modules, and it doesn't require a direct extensions/avro dependency?

@mosche
Member

mosche commented Feb 23, 2023

Good point about logging. Though, since these classes are already annotated with @Deprecated, shouldn't the warnings automatically show up in the build log? Or are you talking mostly about runtime logs?

Of course I'm talking about runtime ... Users won't ever directly interact with AvroSchemaIOProvider. Because of that they are not going to notice a deprecation warning at build time.

IIUC, AvroSchemaIOProvider should be loaded only if it's used in one of its dependent modules, and it doesn't require a direct extensions/avro dependency?

This isn't as simple ... AvroSchemaIOProvider exposes Avro sources in Beam SQL / xlang in a rather dynamic way to the user.
And currently this works as Avro is still part of core. Once removed, the behavior will suddenly break for users that have successfully used it before unless the Avro extension is always added to the expansion service jar :/

@Abacn
Contributor Author

Abacn commented Feb 23, 2023

Note that the schemaio expansion-service does not depend on the avro extension, but the uber jar includes it. I unzipped beam-sdks-java-extensions-schemaio-expansion-service-2.47.0-SNAPSHOT.jar and found the sdk/extensions/avro directory and its class files inside.
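
The same check can be done programmatically instead of unzipping by hand. The sketch below lists jar entries whose path contains a given fragment; `JarInspector` and `entriesContaining` are hypothetical names introduced for illustration, and only standard `java.util.zip` calls are used.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

final class JarInspector {
  // Returns the names of all entries in the jar whose path contains the given fragment.
  static List<String> entriesContaining(Path jar, String fragment) throws IOException {
    List<String> matches = new ArrayList<>();
    try (ZipFile zip = new ZipFile(jar.toFile())) {
      Enumeration<? extends ZipEntry> entries = zip.entries();
      while (entries.hasMoreElements()) {
        String name = entries.nextElement().getName();
        if (name.contains(fragment)) {
          matches.add(name);
        }
      }
    }
    return matches;
  }
}
```

Calling it on the expansion-service jar with a fragment such as `sdk/extensions/avro` would show whether the extension's classes were bundled into the uber jar.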

@aromanenko-dev
Contributor

I think this is why we have this issue actually.

@aromanenko-dev
Contributor

@mosche I added logging for provider registration.

@mosche
Member

mosche commented Feb 23, 2023

thx @Abacn, i checked a jar of an earlier version before the extension existed 🤦

@aromanenko-dev
Contributor

@damccorm I see that you already cut a branch for the release. If the fix from #25611 works and there are no other issues, could you cut the branch again to include this fix instead of cherry-picking it?

@damccorm
Contributor

damccorm commented Feb 23, 2023

@aromanenko-dev is there a reason not to cherry pick? Recutting isn't supported by the current scripts and might lead to issues.

Plus, then we're scope creeping extra commits into the release.

@damccorm
Contributor

AFAICT none of the changes since the cut should prevent an easy cherry-pick

@damccorm
Contributor

If it's just a matter of making time to do the CP promptly, I'm happy to help out there.

@aromanenko-dev
Contributor

@damccorm Cherry-pick is ok, I just asked which is easier for you.

@damccorm
Contributor

Ok cool - thanks!

@aromanenko-dev
Contributor

Ok, so let's wait for the tests to pass, and if everything is ok, I'll merge it. Then we can cherry-pick it to the release branch.

@damccorm
Contributor

I merged the PR and put up a cherry-pick, could someone please approve?

#25618

@aromanenko-dev
Contributor

Thanks @damccorm !
