[Schema Registry Avro] Use LRU cache policy #20108

Merged
deyaaeldeen merged 8 commits into Azure:main from schemaregistryavro/update-cache on Jan 28, 2022

deyaaeldeen
Member

Packages impacted by this PR

@azure/schema-registry-avro

Issues associated with this PR

Fixes #20064

Describe the problem that is addressed by this PR

The cache could grow without bound, because its size is a function of the number of unique schemas seen.

What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?

There are many possible cache designs, but the architects recommended an LRU cache with a maximum of 128 entries.
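
To illustrate the recommended policy, here is a minimal sketch of an LRU cache capped at a fixed number of entries. It is built on a plain `Map` so that it runs without dependencies; it is not the `lru-cache` package's implementation, and the class name `LruCache` and its methods are assumptions made for illustration.

```typescript
// Minimal LRU cache sketch built on Map, which iterates keys in insertion order.
// Illustrative only: this is NOT how the lru-cache package is implemented.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();

  constructor(private readonly max: number) {
    if (max < 1) {
      throw new RangeError("max must be at least 1");
    }
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used one.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    // Delete first so an existing key moves to the most recent position.
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.max) {
      // The first key in iteration order is the least recently used one.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
  }

  get size(): number {
    return this.entries.size;
  }
}

// With a cap of 128, the cache stops growing no matter how many schemas are seen.
const schemaCache = new LruCache<string, string>(128);
for (let i = 0; i < 200; i++) {
  schemaCache.set(`schema-${i}`, `definition-${i}`);
}
```

The point of the cap is that memory use no longer depends on how many unique schemas pass through; only the most recently used entries are retained.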

Are there test cases added in this PR? (If not, why?)

Yes!

Provide a list of related PRs (if any)

N/A

Command used to generate this PR (applicable only to SDK release request PRs)

N/A

Checklists

  • Added impacted package name to the issue description
  • Does this PR need any fixes in the SDK Generator? (If so, create an issue in the Autorest/typescript repository and link it here)
  • Added a changelog (if necessary)

Comment on lines -47 to -50
// REVIEW: signature.
//
// - Should we wrap all errors thrown by avsc to avoid having our exception
//   contract being tied to its implementation details?
Member Author

This is tracked by #20072 already.

@@ -74,6 +74,7 @@
     "@azure/schema-registry": "1.0.2",
     "avsc": "^5.5.1",
     "buffer": "^6.0.0",
+    "lru-cache": "^6.0.0",
Member

One small thing to look at: we have a list of "blessed" dependencies and a guideline to avoid other external dependencies. I don't know how up to date this guideline is, so it's up to you whether to follow up on it: https://azure.github.io/azure-sdk/typescript_implementation.html#ts-dependencies-no-other-packages

Member

This one's a pretty core JS ecosystem tool. It has 66M weekly downloads (4x more than React).

Member

I'm good with this one, it's already in our graph in several places

sdk/schemaregistry/schema-registry-avro/package.json (outdated, resolved)
 interface CacheEntry {
   /** Schema ID */
   id: string;

   /** avsc-specific representation for schema */
-  type: avro.Type;
+  encoder: AVSCEncoder;
Member

Nit: this is really the only part of the PR that tripped me up. It's clearly just a lexical change, but I can't help but wonder why we went from "avro.Type" to "AVSCEncoder" here.

Member Author

This is basically the encoder for a specific schema, which we use to encode and decode values. I renamed it to make it more readable: when I first read this code, the name Type did not give me a good idea of what this object is about.
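
As a rough sketch of the idea in this thread, the cache entry below pairs a schema ID with an encoder that is built once and then reused for every encode and decode call. The `SchemaEncoder` interface, the JSON-based `makeJsonEncoder`, and `getEntry` are hypothetical stand-ins so that the example runs without avsc; the PR's entry stores an `AVSCEncoder` instead.

```typescript
// Hypothetical stand-in for the per-schema encoder (the PR's entry stores an AVSCEncoder).
interface SchemaEncoder {
  toBuffer(value: unknown): Uint8Array;
  fromBuffer(buffer: Uint8Array): unknown;
}

interface CacheEntry {
  /** Schema ID */
  id: string;
  /** Encoder for this schema, built once and reused */
  encoder: SchemaEncoder;
}

// Toy JSON-based encoder (ASCII-only), used only so the sketch runs without avsc.
function makeJsonEncoder(): SchemaEncoder {
  return {
    toBuffer: (value) =>
      Uint8Array.from(Array.from(JSON.stringify(value), (ch) => ch.charCodeAt(0))),
    fromBuffer: (buffer) =>
      JSON.parse(Array.from(buffer, (byte) => String.fromCharCode(byte)).join("")),
  };
}

let encodersBuilt = 0;
const entriesById = new Map<string, CacheEntry>();

// Return the cached entry for a schema ID, building the encoder only on a miss.
function getEntry(id: string): CacheEntry {
  let entry = entriesById.get(id);
  if (entry === undefined) {
    encodersBuilt++;
    entry = { id, encoder: makeJsonEncoder() };
    entriesById.set(id, entry);
  }
  return entry;
}
```

Naming the field `encoder` makes the call sites read as `entry.encoder.toBuffer(value)`, which is the readability argument made above.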

-  if (idCounter === testSchemaIds.length) {
-    throw new Error("Out of IDs. Generate more GUIDs and paste them above.");
+  if (idCounter >= testSchemaIds.length) {
+    return uuid();
Member

I guess I'm wondering a bit why we have any pre-generated GUIDs in the tests unless we rely on specific GUIDs.

Contributor

@nguerrera Jan 28, 2022

Fixed, pre-generated GUIDs was me and I forget my reason. LOL. I would agree that if it's ok to generate some it should be ok to generate all. I wouldn't keep a mix.

Member Author

It is my understanding that the hard-coded ones are meant to serve as already-registered schemas, so that some tests can call decode without first calling encode or registering a schema explicitly. Based on this, I feel it is reasonable to keep the hard-coded ones, though it could make the tests a bit harder to read.

I can look into refactoring this in another PR.

Contributor

Ah, yes, that seems to ring a bell.
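
For reference, the behavior discussed in this thread (hand out the pre-generated IDs first, then fall back to generated ones instead of throwing) can be sketched as follows. The helper name `createIdSource` and the injected `generate` callback are assumptions for illustration; in the test code the fallback is `uuid()`.

```typescript
// Hand out pre-generated IDs first, then fall back to a generator.
// `generate` is a hypothetical stand-in for a UUID function such as uuid().
function createIdSource(
  pregenerated: readonly string[],
  generate: () => string
): () => string {
  let idCounter = 0;
  return () => {
    if (idCounter >= pregenerated.length) {
      // Out of hard-coded IDs: generate a fresh one instead of throwing.
      return generate();
    }
    const id = pregenerated[idCounter] as string;
    idCounter++;
    return id;
  };
}

// A counter-based generator keeps this sketch deterministic.
let generatedCount = 0;
const nextSchemaId = createIdSource(["id-a", "id-b"], () => `generated-${++generatedCount}`);
```

Injecting the generator keeps the helper deterministic in tests while leaving the hard-coded IDs available for tests that rely on already-registered schemas.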

@deyaaeldeen merged commit b88c0ba into Azure:main Jan 28, 2022
@deyaaeldeen deleted the schemaregistryavro/update-cache branch January 28, 2022 21:36
azure-sdk pushed a commit to azure-sdk/azure-sdk-for-js that referenced this pull request Aug 4, 2022
Successfully merging this pull request may close these issues.

[Schema Registry Avro] Update cache policy to be LRU with max 128 entries
5 participants