[FEATURE] Metadata-less entries in distributed cache #337
Hi @jarodriguez-itsoft , the idea is not to require a manual setting, but to make it automatic based on concrete usage: I'm already doing this for memory entries (eg: cache entries in L1, the memory level), whereby I only populate the metadata if it's needed, based on the features you are using, like eager refresh & co. I've never done this for the L2 (distributed level) for... reasons 🤔 ... but I'm getting back to this again. Will update on this.
I figured out why I did that: it was to make sure that when an entry comes in from L2 (distributed) to L1 (memory), the actual expiration would be "aligned". So anyway, now I'm trying to find a way to do the same while avoiding the metadata. Will update.
Hi @jarodriguez-itsoft , since most of the work is done for v2, I'm now working on this. Upon further inspection, I have a question regarding this:

```json
{
  "$id": "1",
  "v": "bar",
  "m": {
    "$id": "2",
    "e": "2024-12-29T16:47:37.5660416+00:00"
  },
  "t": 638684956575645632
}
```

As I explained, FusionCache can avoid storing the metadata part in case it's not needed: for the memory level it's already doing it, but for the distributed level it is not.
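For reference, without the metadata the same entry would presumably shrink to something like this (a sketch based on the structure above, not actual output):

```json
{
  "$id": "1",
  "v": "bar",
  "t": 638684956575645632
}
```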
Corrections, see below. UPDATE 2024-12-31. Hope this helps.
Hi @jarodriguez-itsoft , I just released preview-4, which most probably will be the last preview before going GA with FusionCache V2 🥳 If you can play with it and let me know, it would be great, thanks!
Hi @jodydonetti, sorry for the late response, just returning from holidays! I have set up a test project which basically adds a string, with the key containing the version so we can compare.
I have tested with 1.4.1 and with preview-4.
First thing I noticed are a couple of keys "RedisTESTv2p4:__fc:t:*" and "RedisTESTv2p4:__fc:t:**" which I don't know where they come from. The cached entry from 1.4.1:
The cached entry from preview-4:
The metadata part seems to have disappeared. That's cool! To reduce it further, we created a custom serializer to skip JSON when using base types, so when using this FusionCacheRawNewtonsoftJsonSerializer the result is:
The thing is that, although it is working, I feel it is quite inefficient due to all the unneeded boxing/unboxing required. Any chance of having a built-in mechanism to skip working with FusionCacheDistributedEntry and directly work with values? Regards
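For context, a toy sketch of the general idea (not the actual FusionCacheRawNewtonsoftJsonSerializer): strings are passed through as raw UTF-8, everything else falls back to JSON. Note that FusionCache normally hands the serializer its FusionCacheDistributedEntry&lt;T&gt; wrapper rather than the bare value, which is exactly where the boxing/unwrapping mentioned above comes from. The interface shape below follows v1 and may differ across versions.

```csharp
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json;
using ZiggyCreatures.Caching.Fusion.Serialization;

// Toy sketch: raw UTF-8 pass-through for strings, JSON fallback for
// everything else. Not the serializer discussed above; the interface
// members are assumed from v1 and may differ in your version.
public class RawOrJsonSerializer : IFusionCacheSerializer
{
    public byte[] Serialize<T>(T? obj) =>
        obj is string s
            ? Encoding.UTF8.GetBytes(s)
            : Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(obj));

    public T? Deserialize<T>(byte[] data) =>
        typeof(T) == typeof(string)
            ? (T)(object)Encoding.UTF8.GetString(data)
            : JsonConvert.DeserializeObject<T>(Encoding.UTF8.GetString(data));

    public ValueTask<byte[]> SerializeAsync<T>(T? obj) => new(Serialize(obj));

    public ValueTask<T?> DeserializeAsync<T>(byte[] data) => new(Deserialize<T>(data));
}
```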
Forgot to check the content of those weird wildcard entries created by preview-4:
Seems those are metadata entries with no value.
Hi @jarodriguez-itsoft thanks for the update! The new metadata-less mode in preview-4 is looking good, happy about it 👍 The 2 extra entries are there to support the new Clear mechanism: they are 2 instead of 1 to support both a "remove all" and an "expire all" behavior. If you really don't want any extra cache entry you can disable Tagging (which is the underlying mechanism for Clear, too) by setting DisableTagging to true. Hope this helps.
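In code, that would look something like this minimal sketch (DisableTagging is the property referenced later in this thread; check the docs of your exact preview for the final option surface):

```csharp
using ZiggyCreatures.Caching.Fusion;

// Minimal sketch: opt out of Tagging (and therefore Clear) so the two
// extra per-cache entries are never created. The cache name is just an
// example taken from the test above.
var cache = new FusionCache(new FusionCacheOptions
{
    CacheName = "RedisTESTv2p4",
    DisableTagging = true
});
```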
Oh, regarding the custom serializer you implemented: I'll have a better look at it later, but keep in mind that the other values being serialized are there so FusionCache can work as expected. By removing them you'll probably end up with some nasty surprises, like stuff lasting in the cache (particularly in L1) longer or shorter than expected, fail-safe not working correctly, and so on. Something you can do to get a taste of these potential issues is to temporarily add your own serializer to the ones currently available in the test suite (there's an enum where you add a value and a method that instantiates it, that's it), then run all the tests. Hope this helps, let me know!
It seems so, but no: the value is the timestamp of when the last Clear happened, and when nothing is in the cache the default value 0 (zero) is used. Then, when serializing, there's a setting for which default values are not emitted (to save bandwidth etc), and 0 is the default value, therefore it is not emitted. It looks like it's useless, but it's actually not ;-) PS: anyway you just gave me an idea for how to optimize it even more, by skipping writing to distributed when the value is 0, thanks 😬 ! Will update you after I've tried it.
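In other words, something along these lines (an assumed sketch of the logic, not FusionCache's actual code):

```csharp
// Assumed sketch: the special entry stores the timestamp of the last Clear,
// with 0 meaning "never cleared". An entry is considered logically cleared
// when it was created before that timestamp.
static bool IsLogicallyCleared(long entryCreationTimestamp, long lastClearTimestamp)
    => lastClearTimestamp != 0 && entryCreationTimestamp < lastClearTimestamp;
```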
Hi Jody:
I have made a series of tests using different combinations of the DisableTagging property and ClearCache methods. Here are some results:
The only available batch-cleaning command in Redis is FLUSHALL, which clears all the entries. Regarding the tagging overhead, I tested adding some keys and no other tagging keys were created apart from those 2.
IMHO it is completely OK, as long as tagging keys do not increase linearly with the real keys.
Well, as I commented before, our use case is basically L1-only caches + L2-only caches. Your library does a great job as an abstraction over both use cases, and helps with the cache-stampede problem ;) Thanks!
The design is such that a clear method call will not instantly clear the cache, although from "the outside" the end result is the same. You can read more on the design in the original proposal.
To be more precise, the command would be slightly different. Anyway, I can't use that, for 3 reasons:
So an actual real "remove all" operation is not suitable, on top of being quite heavy in caches with a ton of entries.
Eheh yep, been there done that 😅
Exactly: only 2 extra entries globally per-cache + 1 for each tag used (more on this right below, it's now even better!). Oh, and regarding this last passage: remember the idea you gave me last time, when I said this?
I just implemented it (to be precise: skip distributed write + skip backplane when value is zero), and can confirm it works beautifully! So, to recap, in preview-4 it was: "1 extra entry per each tag ASSOCIATED to any entry + 2 extra entries globally ALWAYS", while now it has become: "1 extra entry per each tag for which a removal has actually been requested + 2 extra entries globally, only after an actual Clear". Another nice optimization, thanks for the inspiration!
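Conceptually the optimization boils down to something like this (an assumed sketch with hypothetical helper names, not the actual implementation):

```csharp
// Assumed sketch: when the last-clear timestamp is still the default 0
// (no Clear or RemoveByTag ever happened), skip both the distributed write
// and the backplane notification, so the extra entries never materialize.
public class ClearTimestampWriter
{
    public void Save(long lastClearTimestamp)
    {
        if (lastClearTimestamp == 0)
            return; // nothing ever cleared: no L2 write, no backplane message

        WriteToDistributedCache(lastClearTimestamp);
        NotifyBackplane(lastClearTimestamp);
    }

    // Hypothetical placeholders for the real L2 write and backplane publish.
    private void WriteToDistributedCache(long timestamp) { /* ... */ }
    private void NotifyBackplane(long timestamp) { /* ... */ }
}
```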
Good 👍
Love to hear that 😬
One thing about the stampede protection when using only L2: instead of skipping L1 completely, I'd suggest specifying a super low duration locally, as in the sketch below. Thanks!
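As an illustration (a sketch: cache is an existing IFusionCache instance, LoadFromDatabaseAsync is a hypothetical loader, durations are arbitrary, and the entry-option names should be checked against your FusionCache version):

```csharp
using System;
using ZiggyCreatures.Caching.Fusion;

// Sketch of the suggestion above: keep L1 enabled with a very short duration
// so in-memory stampede protection still kicks in, while L2 keeps the real,
// longer duration.
var value = await cache.GetOrSetAsync<string>(
    "foo",
    async (ctx, ct) => await LoadFromDatabaseAsync(ct), // hypothetical loader
    options => options
        .SetDuration(TimeSpan.FromSeconds(5))                  // short L1 lifetime
        .SetDistributedCacheDuration(TimeSpan.FromMinutes(30)) // real L2 lifetime
);
```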
Hi all, I still can't believe it but v2.0.0 is finally out 🎉
That's amazing Jody! Been quite busy, but as soon as I have some spare time I will test the final release and (hopefully not) report back if I find something ;)
Problem
I’m using this amazing library to manage both combined (in-memory and distributed) caches and distributed-only caches.
Some of these caches hold millions of entries, accessed concurrently by dozens of app nodes (following a producer/consumer pattern). Thanks to a new feature introduced in recent versions of the library, we can now use distributed-only caches, which has spared us from having nodes consume massive amounts of memory unnecessarily.
So far, so good, but KeyDB is also consuming a lot of memory, and we aim to reduce its usage as well.
The issue is that FusionCache stores a lot of metadata in Redis, which is essentially useless since Redis already handles expiration, and the in-memory part of the cache is disabled.
For example, to simply store the string value "bar", we get something like this:

```
127.0.0.1:6379> HGETALL "v0:foo"
```

That's far too much memory overhead when the caches hold millions of entries.
Solution
Add a cache-level (or entry-level) option to control, at the distributed level, whether the value should be serialized/deserialized as usual or simply stored/retrieved in its raw form:

```
127.0.0.1:6379> GET "foo"
"bar"
```