
Does KSQL support the binary Avro format? #1820

Closed
yana2301 opened this issue Sep 3, 2018 · 8 comments


yana2301 commented Sep 3, 2018

Hello,
I have a question about Avro format support.
There is a topic in Kafka that contains binary Avro records (byte arrays).
I can define a stream or table in Avro format for this topic, but when I try to read from it, I get a deserialization exception.
Could anyone please tell me whether this use case is currently supported?


miguno commented Sep 3, 2018

Yes, Avro is a supported data format.

Note that Avro support in KSQL requires Confluent Schema Registry, which means the Kafka messages must follow Schema Registry's wire format for Avro.
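For reference, here is a minimal sketch of a producer that conforms to that wire format by delegating to Confluent's KafkaAvroSerializer; the topic name and URLs below are placeholders:

    import java.util.Properties
    import org.apache.avro.generic.GenericRecord
    import org.apache.kafka.clients.producer.{KafkaProducer, ProducerRecord}

    // Sketch only: KafkaAvroSerializer registers the value schema with
    // Schema Registry and prepends the magic byte and schema ID for you.
    // The topic name and URLs are placeholders.
    def sendAvro(record: GenericRecord): Unit = {
      val props = new Properties()
      props.put("bootstrap.servers", "localhost:9092")
      props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer")
      props.put("value.serializer", "io.confluent.kafka.serializers.KafkaAvroSerializer")
      props.put("schema.registry.url", "http://localhost:8081")

      val producer = new KafkaProducer[String, GenericRecord](props)
      producer.send(new ProducerRecord[String, GenericRecord]("my-topic", record))
      producer.close()
    }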

miguno added the question label Sep 3, 2018

rodesai commented Sep 3, 2018

Also be advised that the BYTES type is currently not supported.


miguno commented Sep 3, 2018

BYTES type: see #1742


yana2301 commented Sep 4, 2018

Thanks for the answers, but it's still not clear to me whether binary Avro records are supported.
I use only the varchar type in the record. I encode the whole record as a byte array before sending it to Kafka. Here's the code snippet (entity is a generic Avro record):

      import java.io.ByteArrayOutputStream
      import org.apache.avro.file.{CodecFactory, DataFileWriter}
      import org.apache.avro.specific.SpecificDatumWriter

      // Writes the record in Avro's object-container-file format
      // (file header + embedded schema + data blocks).
      val datumWriter = new SpecificDatumWriter[T](entity.getSchema)
      val objectWriter = new DataFileWriter[T](datumWriter)
      val bytes = new ByteArrayOutputStream()
      objectWriter.setCodec(CodecFactory.nullCodec())
      objectWriter.create(entity.getSchema, bytes)
      objectWriter.append(entity)
      objectWriter.close()
      bytes.toByteArray

I have registered the corresponding value schema in Schema Registry. When I try to read from the stream, I get a deserialization exception.
So my question is: are binary Avro records (not just the BYTES type, but a whole record encoded as binary) supported?


miguno commented Sep 4, 2018

@yana2301: Sorry if my original reply wasn't clear.

So my question is: are binary Avro records (not just the BYTES type, but a whole record encoded as binary) supported?

Yes, binary Avro is supported (as far as I understand your question), but the Kafka message must include two additional pieces of information in addition to the serialized Avro bytes in order to be compatible with Schema Registry's wire format.

If you just serialize the bytes, you have covered only row #3 in the screenshot below (bytes 5 onwards in the wire format), but you must also provide (1) the magic byte and (2) the Avro schema id.

[Screenshot: Schema Registry's Avro wire format. Byte 0: magic byte (always 0); bytes 1-4: 4-byte schema ID; bytes 5 onwards: serialized Avro data]
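For illustration, here is a minimal sketch of framing serialized Avro bytes by hand, assuming the schema was registered beforehand and schemaId is the ID Schema Registry returned. (Note that the DataFileWriter snippet above produces Avro's object-container-file format instead, which is a different layout again; in practice you would normally let Confluent's KafkaAvroSerializer do all of this for you.)

    import java.io.ByteArrayOutputStream
    import java.nio.ByteBuffer
    import org.apache.avro.generic.{GenericDatumWriter, GenericRecord}
    import org.apache.avro.io.EncoderFactory

    // Sketch only: wraps a record in Schema Registry's Avro wire format.
    // `schemaId` is assumed to be the ID assigned when the schema was registered.
    def toWireFormat(record: GenericRecord, schemaId: Int): Array[Byte] = {
      val out = new ByteArrayOutputStream()
      out.write(0)                                               // byte 0: magic byte
      out.write(ByteBuffer.allocate(4).putInt(schemaId).array()) // bytes 1-4: schema ID (big-endian)
      val encoder = EncoderFactory.get().binaryEncoder(out, null)
      new GenericDatumWriter[GenericRecord](record.getSchema).write(record, encoder)
      encoder.flush()                                            // bytes 5 onwards: Avro binary encoding
      out.toByteArray
    }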


miguno commented Sep 4, 2018

Oh, and a note: KSQL supports only Kafka message keys of type String at the moment. That is, KSQL supports Avro for message values (in the format described above), but not yet for message keys.

See #824


yana2301 commented Sep 4, 2018

@miguno thanks for response, this is really helpful!

yana2301 closed this as completed Sep 4, 2018
emirhosseini commented

@miguno So what about plain Apache Kafka? Meaning, what if I'm not using Confluent Kafka?
