[Schema Registry Avro] Use LRU cache policy #20108
// REVIEW: signature.
//
// - Should we wrap all errors thrown by avsc to avoid having our exception
//   contract being tied to its implementation details?
This is tracked by #20072 already.
@@ -74,6 +74,7 @@
     "@azure/schema-registry": "1.0.2",
     "avsc": "^5.5.1",
     "buffer": "^6.0.0",
+    "lru-cache": "^6.0.0",
One small thing to look at: we have a list of "blessed" dependencies and a guideline to avoid other external dependencies. I don't know how up to date that guideline is, so it's up to you whether to follow up on it: https://azure.github.io/azure-sdk/typescript_implementation.html#ts-dependencies-no-other-packages
This one's a pretty core JS ecosystem tool. It has 66M weekly downloads (4x more than React).
I'm good with this one, it's already in our graph in several places
  interface CacheEntry {
    /** Schema ID */
    id: string;

    /** avsc-specific representation for schema */
-   type: avro.Type;
+   encoder: AVSCEncoder;
Nit: this is really the only part of the PR that tripped me up. It's clearly just a lexical change, but I can't help but wonder why we went from "avro.Type" to "AVSCEncoder" here.
This is the encoder for a specific schema, which we use to encode and decode values. I renamed it for readability: when I first read this code, the name "Type" did not give me a good idea of what this object is.
-   if (idCounter === testSchemaIds.length) {
-     throw new Error("Out of IDs. Generate more GUIDs and paste them above.");
+   if (idCounter >= testSchemaIds.length) {
+     return uuid();
I guess I'm wondering a bit why we have any pre-generated GUIDs in the tests unless we rely on specific GUIDs.
The fixed, pre-generated GUIDs were my doing, and I forget my reason. LOL. I agree that if it's OK to generate some, it should be OK to generate all. I wouldn't keep a mix.
My understanding is that the hard-coded IDs stand in for already registered schemas, so that some tests can call decode without first calling encode or explicitly registering a schema. On that basis I think it is reasonable to keep the hard-coded ones, though they can make the tests a bit harder to read.
I can look into refactoring this in another PR.
Ah, yes, that seems to ring a bell.
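For readers following this thread, the hand-out behavior under discussion (use the pre-generated IDs first, then fall back to freshly generated ones instead of throwing) can be sketched roughly as below. This is only an illustration: `createIdSource`, `fixedIds`, and `generate` are hypothetical names, and the generator is injected so the sketch does not assume any particular UUID library.

```typescript
// Hypothetical helper, not the code in this PR: hands out fixed IDs in order,
// then falls back to a caller-supplied generator once they are exhausted.
function createIdSource(
  fixedIds: readonly string[],
  generate: () => string
): () => string {
  let counter = 0;
  return () => {
    if (counter >= fixedIds.length) {
      // Out of pre-generated IDs: generate a new one rather than throwing.
      return generate();
    }
    const id = fixedIds[counter] as string;
    counter += 1;
    return id;
  };
}
```

Tests that rely on a known, already registered schema ID would draw from the fixed list, while any further calls get generated IDs.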
Update readme.python.md (Azure#20108)
Packages impacted by this PR
@azure/schema-registry-avro
Issues associated with this PR
Fixes #20064
Describe the problem that is addressed by this PR
The cache could grow without bound, because its size is a function of the number of unique schemas seen.
What are the possible designs available to address the problem? If there are more than one possible design, why was the one in this PR chosen?
There are many cache designs, but the architects recommended an LRU cache with a maximum of 128 entries.
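To illustrate the eviction policy, here is a minimal, self-contained LRU sketch built on the insertion order of `Map`. It is not the `lru-cache` package's API and not the code in this PR; `LruCache` and `maxEntries` are hypothetical names chosen for the example.

```typescript
// Minimal LRU sketch (hypothetical; the PR itself depends on the lru-cache package).
// A Map iterates keys in insertion order, so the first key is the least recently used.
class LruCache<K, V> {
  private readonly entries = new Map<K, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number) {
    if (!Number.isInteger(maxEntries) || maxEntries < 1) {
      throw new RangeError("maxEntries must be a positive integer");
    }
    this.maxEntries = maxEntries;
  }

  get size(): number {
    return this.entries.size;
  }

  // Membership check; does not change recency.
  has(key: K): boolean {
    return this.entries.has(key);
  }

  get(key: K): V | undefined {
    if (!this.entries.has(key)) {
      return undefined;
    }
    const value = this.entries.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: K, value: V): void {
    if (this.entries.has(key)) {
      // Updating an existing key also refreshes its recency.
      this.entries.delete(key);
    } else if (this.entries.size >= this.maxEntries) {
      // At capacity: evict the least recently used entry.
      const oldest = this.entries.keys().next().value as K;
      this.entries.delete(oldest);
    }
    this.entries.set(key, value);
  }
}
```

With `new LruCache<string, CacheEntry>(128)`, the 129th distinct schema ID would evict whichever entry was used least recently, so memory stays bounded regardless of how many unique schemas are seen.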
Are there test cases added in this PR? (If not, why?)
Yes!
Provide a list of related PRs (if any)
N/A
Command used to generate this PR (applicable only to SDK release request PRs)
N/A
Checklists