Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing bugs related to avro model processing: #582

Merged
merged 9 commits into from
Feb 10, 2024

Conversation

raphaeldelio
Copy link
Contributor

When a custom record references another custom record, we need to reverse the order so that we post-process the inner records first.

Even though the schemas are removed from the definitions, they are still processed and throw an error when they don't find themselves in the definitions.

I added comments to the code to make it easier to spot the changes. They can be removed. I'm not sure if I'm following the correct coding styles or if this operation of reversing the LinkedHashMap would be ideal here. Let me know if I can enhance it further.

- Nested models
- Schemas without definitions
- Small documentation fixes
Copy link

netlify bot commented Feb 5, 2024

Deploy Preview for springwolf-ui canceled.

Name Link
🔨 Latest commit 6f0f2a4
🔍 Latest deploy log https://app.netlify.com/sites/springwolf-ui/deploys/65c75cf19a3cc100084a1c93

Copy link

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Welcome to Springwolf. Thanks a lot for creating your first pull request. Please check out our contributors guide and feel free to join us on discord.

@sam0r040
Copy link
Collaborator

sam0r040 commented Feb 9, 2024

Hi @raphaeldelio, thanks for your PR. Could you provide a test case in core (ideally a unit test) that illustrates the problem? It is unclear for me, which problem the reversing resolves. Where does the error happen when the schemas "do not find themselves"?

@raphaeldelio
Copy link
Contributor Author

Hey @sam0r040,

I'm not sure how I can test it without the Avro dependency. Maybe you can help me if I elaborate better on the issue.

I changed the Avro message to be a nested message. Before, we had only an ExamplePayload, now we have an ExamplePayload inside of an AnotherPayload.

If you remove the reversing part of the code and run the tests, you will see two tests fail:

Screenshot 2024-02-09 at 16 24 43

This is because, the registerSchema function will start by reading all the schemas. In this case, it will start by reading first AnotherPayload and then reading ExamplePayload, resulting in the first coming before the second in the LinkedHashMap.

After that, we preProcess the schemas and put them all in this.schemas. So far, so good.

The problem starts during the postProcessor. In the post processing part, we will process each schema at a time, starting from the first in the LinkedHashMap (AnotherPayload).

First, we run the AvroPostProcessor. That's fine. It will work on the AnotherPayload and will remove the avro properties from that schema and also remove all schemas coming from org.apache.avro from the definitions.

Now, it comes to the second post processor, ExampleGeneratorPostProcessor. Bear in mind that we're processing the first object from the LinkedHashMap (AnotherPayload).

During the ExampleJsonGenerator, it will eventually fall into:

handleObjectProperties()

This will find the ExamplePayload as a property of AnotherPayload and try to build the example for it.

And now it becomes a problem because ExamplePayload will also fall into handleObjectProperties(). And since ExamplePayload hasn't been postProcessed yet, it still has all the Avro properties (Schema and SpecificData) that it will try to find the definition for. Since the definitions have already been removed (The post processor for AnotherPayload already got rid of them), it will fall into:

 throw new ExampleGeneratingException("Missing schema during example json generation: " + schemaName);

This exception is thrown and it won't finish building the example for AnotherPayload as a consequence.

This can be tested by removing the reversing part. You will see that tests will fail because the example part will be missing from the result JSON:
Screenshot 2024-02-09 at 16 20 41

And this error will be thrown:

Generate example for io.github.stavshamir.springwolf.example.kafka.dto.avro.AnotherPayloadAvroDto
 Failed to build json example for schema io.github.stavshamir.springwolf.example.kafka.dto.avro.AnotherPayloadAvroDto

io.github.stavshamir.springwolf.schemas.example.ExampleGenerator$ExampleGeneratingException: Missing schema during example json generation: org.apache.avro.Schema
  at io.github.stavshamir.springwolf.schemas.example.ExampleJsonGenerator.buildSchemaInternal(ExampleJsonGenerator.java:81) ~[main/:na]

You can see that the error above occurred while generating the example for AnotherPayloadAvroDto. AnotherPayloadAvroDto doesn't have the property Schema at this point, but it has the ExamplePayload property which still has the Schema property in it.

So, if we invert the order, ExamplePayload will be processed first and its Schema and SpecificData will be removed. Then, the post-processor will run on the outer object, which is AnotherPayload and so on in case there were multiple nested objects. From inside out.

raphaeldelio and others added 2 commits February 9, 2024 17:18
…emoved the current schema

Example: an avro schema is removed. Then the next post-processor (example generator) does not need to be called.
Also, it avoids the error message due to unresolvable avro schemas
@timonback
Copy link
Member

Hi @raphaeldelio , thank you for the detailed explanation.

I fixed the issue that already removed (avro) schemas are still being processed in #593
I believe this resolves your issue as well.

Still, I like to keep your adjustment of the example project (avro, dto + asyncapi.json) as this provides a better test case and avoids regressions.

I opened raphaeldelio#1.
Assuming that the code works, we you can revert the code obsolete code adjustments.
Or I/you can also open a PR containing only the example adjustments.

@raphaeldelio
Copy link
Contributor Author

@timonback looks good! I merged your changes into this PR. Thank you. I hope my analysis was helpful.

@timonback timonback merged commit 6d90cb3 into springwolf:master Feb 10, 2024
16 checks passed
@timonback
Copy link
Member

@raphaeldelio Yes indeed, the analysis was very helpful.
Thank you for the contribution and improving Springwolf!

@raphaeldelio raphaeldelio deleted the bugfix/nested-avro-models branch February 10, 2024 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants