-
Notifications
You must be signed in to change notification settings - Fork 14
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Enhance Datagen with Protobuf Support and Nested Key References #112
base: main
Are you sure you want to change the base?
Conversation
…mat. protobuf schema definitions are referenced in the _meta of the json schema.
Thanks for the PR! I’ll take a look |
README.md
Outdated
@@ -213,7 +213,11 @@ Here is the general syntax for a JSON input schema: | |||
{ | |||
"_meta": { | |||
"topic": "<my kafka topic>", | |||
"key": "<field to be used for kafka record key>" , | |||
"key": "<field to be used for kafka record key>" , | |||
"proto": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It’s ok to add an example that uses protobuf output format, but we shouldn’t make breaking changes to what is required for the input json schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am updating the README to add a separate section called:
JSON Schema with Protobuf.
The proto {}
meta field is intended to be optional.
@@ -82,6 +82,12 @@ export async function generateMegaRecord(schema: any) { | |||
}); | |||
} | |||
|
|||
// specify the proto field for the topic | |||
if ("proto" in _meta) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be invalid if the user doesn’t specify proto output format. In other words, if they have proto in _meta but are using -f avro for output, then _meta.proto should be ignored.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The problem with this is that I would have to pass in the format
into the generateMegaRecord
method to skip the if("proto" in _meta)
check.
the _meta.proto
is effectively ignored already by all output formats, including Avro, because _meta.proto
is only referenced in the protoFormat.ts
tests/iterationIndex.json
Outdated
@@ -1,7 +1,11 @@ | |||
{ | |||
"_meta": { | |||
"topic": "air_quality", | |||
"key": "id" | |||
"key": "id", | |||
"proto": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s not change existing tests, but rather add a new test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am adding a interationIndex-proto.json
file to use instead
tests/schema.json
Outdated
@@ -3,6 +3,10 @@ | |||
"_meta": { | |||
"topic": "mz_datagen_users", | |||
"key": "id", | |||
"proto": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment. Let’s not change existing tests to require proto
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am adding a schema-proto.json
file to use instead
async register(megaRecord: any): Promise<void> { | ||
this.schemaFiles = await ProtoFormat.getProtoSchemaFiles(megaRecord); | ||
this.schemas = await ProtoFormat.getProtoSchemas(megaRecord, Array.from(this.schemaFiles)); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for that PR, it looks great!
One thing that I believe might be missing is that the schemas do not get registered in the CSR, so not 100% sure how consumers will decode the Protobuf messages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You do not need a CSR to use protos with Kafka. Often, consumers will have have their own references to the schemas. The datagen tool working with protos should be independent on how consumers get the protos. The datagen tool is about message production. How consumers decrypt the messages they have the option of
- having the .proto files or the generated classes from the .protos
- CSR.
Is (2) a requirement for this PR?
Hi. Thank you for the feedback! I'll reply to comments, and I'll make sure to update the PR, once I get time to do so! Thank you so much for this program! It's great! |
What type of PR is this?
Description
8.0.0
.
) separatorAdded to documentation?