Add compression support to bijection-avro #174

Merged
merged 1 commit into from
Aug 14, 2014


@miguno commented Aug 11, 2014

This API enhancement is backwards compatible.

We add withCompression() factory methods to SpecificAvroCodecs and GenericAvroCodecs, and also add the following three convenience methods:

  • withBzip2Compression
  • withDeflateCompression
  • withSnappyCompression

(I did not add a withXzCompression method as this codec was apparently introduced in Avro 1.7.6, and Bijection is currently still using the older 1.7.5 version.)

Usage examples

// Encode data, no compression (already supported in bijection-avro)
implicit val specificInjection = SpecificAvroCodecs[FiscalRecord]

// Alternative to the above: encode data with Snappy compression (new feature)
implicit val specificInjection = SpecificAvroCodecs.withSnappyCompression[FiscalRecord]

Compression in Avro is transparent to readers of the data, which means that there is no change needed on the decoding side of things.
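To illustrate why the decoding side needs no change, here is a minimal sketch that uses Avro's container file API directly rather than Bijection. The `Example` schema and the object and method names are made up for illustration, and Deflate is used instead of Snappy only so that the sketch needs no extra codec library. The writer sets a codec; the reader is never told which one was used and picks it up from the container header.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream}
import org.apache.avro.Schema
import org.apache.avro.file.{CodecFactory, DataFileStream, DataFileWriter}
import org.apache.avro.generic.{GenericData, GenericDatumReader, GenericDatumWriter, GenericRecord}

object TransparentCompressionSketch {
  // Hypothetical schema, for illustration only.
  val schema: Schema = new Schema.Parser().parse(
    """{"type":"record","name":"Example","fields":[{"name":"note","type":"string"}]}""")

  // Writer side: the codec is configured on the container file writer.
  def encode(record: GenericRecord, codec: CodecFactory): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val writer = new DataFileWriter[GenericRecord](new GenericDatumWriter[GenericRecord](schema))
    writer.setCodec(codec) // has to happen before create()
    writer.create(schema, out)
    writer.append(record)
    writer.close()
    out.toByteArray
  }

  // Reader side: no codec argument; schema and codec come from the container header.
  def decode(bytes: Array[Byte]): GenericRecord = {
    val stream = new DataFileStream[GenericRecord](
      new ByteArrayInputStream(bytes), new GenericDatumReader[GenericRecord]())
    try stream.next() finally stream.close()
  }

  def main(args: Array[String]): Unit = {
    val record = new GenericData.Record(schema)
    record.put("note", "hello " * 100)

    val plain = encode(record, CodecFactory.nullCodec())
    val deflated = encode(record, CodecFactory.deflateCodec(6))

    // The same decode() call handles both encodings.
    assert(decode(plain) == decode(deflated))
    println(s"uncompressed: ${plain.length} bytes, deflate: ${deflated.length} bytes")
  }
}
```

The same `decode` function reads both byte arrays, which is the property that lets existing consumers keep working after a producer switches to one of the new compressed Injections.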

No compression support added to toBinary methods

Please note that I did not add corresponding compression support to GenericAvroCodecs.toBinary and SpecificAvroCodecs.toBinary (i.e. the Injection variants that do not embed the Avro schema into the encoded binary data). The reason is that Avro's API provides compression only at the file container level (i.e. block compression). In other words, we cannot set a compression codec without using Avro's DataFileWriter class, which Bijection uses for apply but not for toBinary. We could try to work around that limitation, but doing so would turn toBinary into a renamed apply method. It would also make the code inconsistent, because toBinary would suddenly embed the Avro schema into each encoded record, which it does not do in the current code and which is IMHO the core semantic difference between toBinary and apply.
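For comparison, this is roughly what schema-less binary encoding looks like with Avro's own API. It is a sketch, not Bijection's actual toBinary code, and the `Example` schema and the names are hypothetical. The encoder writes only the record's field data, so there is no container header and no place to configure a codec the way `DataFileWriter.setCodec` allows.

```scala
import java.io.ByteArrayOutputStream
import org.apache.avro.Schema
import org.apache.avro.generic.{GenericData, GenericDatumWriter, GenericRecord}
import org.apache.avro.io.EncoderFactory

object RawBinarySketch {
  // Hypothetical schema, for illustration only.
  val schema: Schema = new Schema.Parser().parse(
    """{"type":"record","name":"Example","fields":[{"name":"note","type":"string"}]}""")

  // Encodes just the datum: no container header, no embedded schema, no codec.
  def toRawBinary(record: GenericRecord): Array[Byte] = {
    val out = new ByteArrayOutputStream()
    val writer = new GenericDatumWriter[GenericRecord](schema)
    val encoder = EncoderFactory.get().binaryEncoder(out, null)
    writer.write(record, encoder)
    encoder.flush() // the binary encoder buffers, so flush before reading the bytes
    out.toByteArray
  }

  def main(args: Array[String]): Unit = {
    val record = new GenericData.Record(schema)
    record.put("note", "hi")
    // The result holds only the string's length prefix and its UTF-8 bytes.
    println(toRawBinary(record).length)
  }
}
```

A reader of these bytes has to be given the schema out of band, which is the trade-off the toBinary variants make and the reason wrapping them in a container file would change their meaning.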

@MansurAshraf
Contributor

+1 LGTM

@johnynek
Collaborator

Looks great! Thanks for doing this.

johnynek added a commit that referenced this pull request Aug 14, 2014
Add compression support to bijection-avro
@johnynek johnynek merged commit f8c12ac into twitter:develop Aug 14, 2014