perf: add cache to get evaluation rules storage method #1910
Conversation
Force-pushed 347e65e to 9e86c1b
Thanks for the PR @rudineirk !! Definitely love any changes that help improve performance.
I've been thinking about whether moving the caching back to wrap the storage layer makes sense instead of the current interceptor/server cache that we have, so this is a good first step!
I'm trying to remember why I decided to move the caching to the server layer in the first place, as it was initially in the storage layer, similar to this approach 🤔. Here is when it changed: https://github.com/flipt-io/flipt/pull/954/files#diff-ee870b8fa81722aba3a872f0a81e22f5e100f4595dda3a4de59b39e23e8e882bR120
I think it makes sense to rely on TTL cache expiration (like you are doing), since that's already what we rely on for the existing entityID evaluation cache.
I'm also wondering if we even need the existing evaluation cache after this change, as I'm not sure how much performance benefit we get there, since this is likely the slowest part of the process as you described.. something for us to benchmark, I suppose.
Have you seen any performance overhead from the marshalling to/from JSON for large sets of EvaluationRules? I'm guessing most flags don't actually have that many rules to begin with, so maybe it's a non-issue?
Force-pushed 9e86c1b to 5dc0558
Codecov Report
@@ Coverage Diff @@
## main #1910 +/- ##
==========================================
+ Coverage 71.21% 71.30% +0.09%
==========================================
Files 60 61 +1
Lines 5777 5820 +43
==========================================
+ Hits 4114 4150 +36
- Misses 1431 1436 +5
- Partials 232 234 +2
I think the final evaluation result cache is still valid. In our use case we make multiple evaluation requests for the same flag/entityId, and the evaluation result cache helps with this, shaving a few ms off each request. The cache added in this PR helps with evaluation requests with different entityIDs for the same flag, which is where we saw the worst delays. As for the overhead of decoding large JSON, it has some CPU impact, but I think it's still far better than querying a lot of data from the DB/storage.
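The two cache layers described above cover different access patterns, which shows up in their key shapes. A small sketch, with purely illustrative key formats (Flipt's real keys may differ):

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// rulesCacheKey is shared by every entityID evaluating a flag: one DB
// query saved per flag, regardless of how many distinct entities hit it.
func rulesCacheKey(flagKey string) string {
	return "eval:rules:" + flagKey
}

// resultCacheKey covers one (flag, entityID) pair: the final evaluation
// response for a repeated identical request. Hashing keeps keys short
// and uniform even for long entity IDs.
func resultCacheKey(flagKey, entityID string) string {
	sum := sha256.Sum256([]byte(flagKey + ":" + entityID))
	return "eval:result:" + hex.EncodeToString(sum[:])
}

func main() {
	fmt.Println(rulesCacheKey("my-flag")) // prints: eval:rules:my-flag
	// Different entities share the rules key but get distinct result keys.
	fmt.Println(resultCacheKey("my-flag", "user-1") == resultCacheKey("my-flag", "user-2")) // prints: false
}
```

This is why the rules cache helps with many distinct entityIDs on one flag, while the result cache only helps when the same flag/entityId pair repeats.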
Thanks for the PR indeed! Awesome to see users running the software and making improvements @rudineirk.
I am wondering if we should keep the original semantics of serializing the rules as protobuf, as we did for EvaluationResponse, but if the JSON serialization is not too big of a CPU hit, as you've stated, then maybe it is not worth worrying about right now.
Code looks good though.
Force-pushed 5dc0558 to 7333573
I thought about that too, using protobuf to serialize the cache, but it would require converting all the storage structs into protobuf, and I don't know if the performance gains over JSON would be worth it.
Thanks again @rudineirk ! Added some thoughts around error handling.
Force-pushed 7333573 to 5135f2c
lgtm! thank you again @rudineirk !!
@all-contributors please add @rudineirk for code
I've put up a pull request to add @rudineirk! 🎉
Will go out in v1.24.0, likely Monday 🎉
* 'main' of https://github.com/flipt-io/flipt:
  docs: add rudineirk as a contributor for code (#1913)
  perf: add cache to get evaluation rules storage method (#1910)
  fix: renable pprof (#1912)
This improves flag evaluation performance. After some investigation we found the main cause of delays in evaluation requests was the rules query; in our case, running Flipt with a PostgreSQL DB on AWS RDS, we were seeing delays of up to 1s, and with this fix they dropped to a few milliseconds.
This structure could also be used later to implement the cache for ListFlags mentioned in issue #935.