fix schema inference inside parameterized types #32705
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
Force-pushed from 2cff426 to bc6fa83.
R: @ahmedabu98
Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control. If you'd like to restart, comment |
Thanks, left one suggestion
public static CoderRegistry createDefault(@Nullable SchemaRegistry schemaRegistry) {
  return new CoderRegistry(schemaRegistry);
}
This might be a breaking change for users.
Can we keep the old createDefault() method as well and have it return new CoderRegistry(null)?
That would preserve existing use cases and limit the number of files touched in this PR.
Good point as this is a public method (even though it's probably not intended for use outside of core Beam). Added the old createDefault() back.
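The compatibility approach discussed in this thread can be sketched roughly as follows. This is a minimal illustration only: `SketchSchemaRegistry`, `SketchCoderRegistry`, and `hasSchemaRegistry()` are hypothetical stand-ins invented for the example, not Beam's actual classes or the code in this PR.

```java
/** Hypothetical stand-in for a schema registry; not Beam's SchemaRegistry. */
final class SketchSchemaRegistry {}

/** Hypothetical stand-in for a coder registry; not Beam's CoderRegistry. */
final class SketchCoderRegistry {
  // May be null when no schema registry was supplied.
  private final SketchSchemaRegistry schemaRegistry;

  private SketchCoderRegistry(SketchSchemaRegistry schemaRegistry) {
    this.schemaRegistry = schemaRegistry;
  }

  /** Old no-argument factory, kept so that existing callers still compile. */
  static SketchCoderRegistry createDefault() {
    return createDefault(null);
  }

  /** New factory that accepts an optional (nullable) schema registry. */
  static SketchCoderRegistry createDefault(SketchSchemaRegistry schemaRegistry) {
    return new SketchCoderRegistry(schemaRegistry);
  }

  /** Illustrative accessor (an assumption) to observe which factory was used. */
  boolean hasSchemaRegistry() {
    return schemaRegistry != null;
  }
}
```

Keeping the no-argument overload means callers outside the core code base need no changes, while new call sites can pass a registry explicitly.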
…ized types" This reverts commit c243491.
…ized types" (#33133) (#33147) This reverts commit c243491. Co-authored-by: Yi Hu <[email protected]>
Previously, Beam prioritized schemas over coders during inference but did not inspect nested parameterized types for schemas. This led to some sharp edges for users. For example, if Foo had a registered schema, then
PCollection<Foo> foos = readFoo();
would infer the correct SchemaCoder for Foo. However,
PCollection<Iterable<Foo>> allFoos = readAllFoos();
would not search for a schema, and would instead take whatever Coder accepted Foo (possibly SerializableCoder). This caused a lot of confusion for users.
This PR ensures that the schema lookup continues while inspecting type parameters.
Note: this PR touches many files because of a new parameter added to CoderRegistry(); however, the vast majority of those changes are trivial.
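To make the difference in behavior concrete, here is a toy model of the inference described above. It is a sketch under stated assumptions, not Beam's implementation: `TypeNode`, `ToyCoderInference`, and the string "coder names" are all hypothetical, and only `Iterable` is handled as a parameterized type.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical, simplified model of a possibly parameterized type. */
final class TypeNode {
  final Class<?> raw;
  final List<TypeNode> args;

  TypeNode(Class<?> raw, TypeNode... args) {
    this.raw = raw;
    this.args = Arrays.asList(args);
  }
}

/** Toy inference; coders are represented as descriptive strings only. */
final class ToyCoderInference {
  private final Set<Class<?>> typesWithSchema = new HashSet<>();

  void registerSchema(Class<?> type) {
    typesWithSchema.add(type);
  }

  /** Old behavior: the schema check is applied to the outermost type only. */
  String inferTopLevelOnly(TypeNode type) {
    if (typesWithSchema.contains(type.raw)) {
      return "SchemaCoder(" + type.raw.getSimpleName() + ")";
    }
    return inferFromCoders(type, false);
  }

  /** New behavior: the schema check is repeated for each type parameter. */
  String inferRecursively(TypeNode type) {
    if (typesWithSchema.contains(type.raw)) {
      return "SchemaCoder(" + type.raw.getSimpleName() + ")";
    }
    return inferFromCoders(type, true);
  }

  private String inferFromCoders(TypeNode type, boolean checkSchemasForParameters) {
    if (type.raw == Iterable.class && type.args.size() == 1) {
      TypeNode element = type.args.get(0);
      String elementCoder =
          checkSchemasForParameters
              ? inferRecursively(element)
              : inferFromCoders(element, false);
      return "IterableCoder(" + elementCoder + ")";
    }
    // Fallback standing in for "whatever Coder accepts the type".
    return "SerializableCoder(" + type.raw.getSimpleName() + ")";
  }
}

/** Example element type with a (toy) registered schema. */
final class Foo implements java.io.Serializable {}
```

With a schema registered for `Foo`, `inferTopLevelOnly` on the model of `Iterable<Foo>` yields `IterableCoder(SerializableCoder(Foo))`, whereas `inferRecursively` yields `IterableCoder(SchemaCoder(Foo))`, which mirrors the fix described in this PR.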