-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Zstd codec in SerializableAvroCodecFactory #32352
Conversation
@@ -50,6 +50,9 @@ class SerializableAvroCodecFactory implements Externalizable { | |||
private static final Pattern deflatePattern = Pattern.compile(DEFLATE_CODEC + "-(?<level>-?\\d)"); | |||
private static final Pattern xzPattern = Pattern.compile(XZ_CODEC + "-(?<level>\\d)"); | |||
|
|||
// Don't reference `DataFileConstants.ZSTANDARD_CODEC` directly for Avro 1.8 compat | |||
private static final Pattern zstdPattern = Pattern.compile("zstandard\\[(?<level>\\d+)\\]"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: DataFileConstants.ZSTANDARD_CODEC
was added in Avro 1.9, so I didn't want to reference it here and break 1.8 users. When Avro 1.8 is dropped it can be used directly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for late question. zstandard compression level range from negative 7 to 22. This regex does not handle negative, is this a concern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh great catch! added a -?
to the regex and updated the tests.
Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment |
Assigning reviewers. If you would like to opt out of this review, comment R: @kennknowles for label java. Available commands:
The PR bot will only process comments in the main thread (not review comments). |
cc @Abacn, since I saw you made some recent commits to Beam's avro module? :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, would you mind add this change to https://github.com/apache/beam/blob/master/CHANGES.md ?
Under New Features / Improvements Section,
Support for Zstd codec SerializableAvroCodecFactory
is added (Java/Python) ([#32349](https://github.com/apache/beam/issues/32349)).
merge conflicts after other change pushed CHANGES.md change, would you mind rebase onto the latest master? thanks! |
d0d9635
to
1f8d48b
Compare
cc @Abacn done! 🫡 |
thank you for prompt responses! |
* Support Zstd codec in SerializableAvroCodecFactory * Test AvroIO.write * Make tests compilable on Avro 1.8 * format * Update CHANGES.md * Support negative levels for zstd
fixes #32349
Adds support for Zstandard compression in Avro writes
Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
addresses #123
), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, commentfixes #<ISSUE NUMBER>
instead.CHANGES.md
with noteworthy changes.See the Contributor Guide for more tips on how to make review process smoother.
To check the build health, please visit https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md
GitHub Actions Tests Status (on master branch)
See CI.md for more information about GitHub Actions CI or the workflows README to see a list of phrases to trigger workflows.