[OEP 8] Asynchronous read cache with state machine #8
Comments
This is very nicely thought out. If you don't mind, I have a few minor questions. 1. Will the "operations buffer" be used for reads and writes, or only for reads? Writes cannot be lost, but they are a much rarer event; we use a dedicated buffer for writes instead of blocking on the exclusive lock. For efficiency we're using JCTools' 2. Do you plan on revisiting the 2Q policy and considering alternatives? 3. What are the pros/cons of a custom implementation vs. using Caffeine directly? Let me know if I can be of help.
Hi @ben-manes, thank you very much for the feedback. About your questions:
1. We plan to use a separate queue to log the addition of new entries. It would be great to use
2. Not yet. We tried LIRS and TinyLFU: LIRS results were even worse than 2Q, and for TinyLFU the cache hit rate was the same as for 2Q; maybe we missed something. We are going to create a tool that gathers traces of cache accesses from production deployments and will then rerun our tests.
3. Maybe I am wrong, but Caffeine does not support acquire/release semantics. I mean that if I cache#get() data and it is in use, it may still be removed from the cache. But in our design the cache is the single source of pages for all DB components; it also tracks dirty pages and flushes them to disk when they are "released" back to the cache. So pages cannot be evicted while they are acquired from the cache and in use by components.
That would be great. I'd appreciate it if you could supply traces so that I can digest them with my simulator. It operates on
Yes, this is not a use case it was designed for, as it is not a general-purpose scenario. It can be emulated in awkward ways, which is good enough for limited cases but perhaps not yours. The first way is to use a … The second way is to use … But the fact that it might be hacked on may not be good enough, so a custom implementation is very reasonable.
The only open question I see is when to free the direct-memory pointers of pages removed during cache entry eviction. Because of step … we may access a page which has already been reclaimed to the pool of pages.
You might also leverage phantom references, either to always defer or as a failsafe, to release native resources when the page is garbage collected.
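The phantom-reference idea above can be sketched with `java.lang.Cleaner` (Java 9+), which is built on phantom references. This is a minimal, self-contained illustration, not OrientDB code: the `Page`/`NativeState` names are hypothetical, and the "native free" is simulated with a flag instead of a real direct-memory release.

```java
import java.lang.ref.Cleaner;
import java.util.concurrent.atomic.AtomicBoolean;

public class PageCleanup {
    private static final Cleaner CLEANER = Cleaner.create();

    // The cleanup action must NOT hold a reference to the Page itself,
    // otherwise the Page would never become phantom-reachable.
    static final class NativeState implements Runnable {
        final AtomicBoolean freed = new AtomicBoolean(false);

        @Override public void run() {
            // In a real cache this would return the direct-memory pointer
            // to the page pool; here we just flip a flag for illustration.
            freed.set(true);
        }
    }

    static final class Page implements AutoCloseable {
        final NativeState state = new NativeState();
        private final Cleaner.Cleanable cleanable;

        Page() {
            cleanable = CLEANER.register(this, state);
        }

        @Override public void close() {
            // Deterministic release on the normal path; the GC-triggered
            // cleanup is only a failsafe for pages that were never closed.
            cleanable.clean();
        }
    }

    public static void main(String[] args) {
        Page page = new Page();
        page.close();
        System.out.println("freed=" + page.state.freed.get()); // freed=true
    }
}
```

`Cleanable.clean()` runs the action at most once, so an explicit `close()` followed by a later GC of the page does not double-free.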
Latest log from the YCSB tests:
As you can see, read performance is generally high, but because of the presence of a global lock on cache eviction we see periodic slowdowns. This problem should be fixed by the proposed change.
Implemented in versions 3.0.15 and 3.1.
Oh cool. Can you point me to the code? Curious to see what you built.
Thanks! Glancing through it, a few ideas that you might find interesting.
Otherwise it looks great :)
@ben-manes, super cool. Thank you very much for the feedback.
I will reopen the issue to incorporate further improvements.
Reference:
https://github.com/orientechnologies/orientdb-labs/blob/master/OEP_8.md
Summary:
Asynchronous read cache with state machine
Goals:
Non-Goals:
Success metrics:
Motivation:
To implement thread-safety guarantees inside the read cache, we use an implementation of group locks close to Google Guava's Striped.
But this approach has a couple of disadvantages:
The screenshot at https://screencloud.net/v/3PZd shows the states of threads during the YCSB "read only" benchmark. All "wait" states of threads are caused by exclusive page locks inside the 2Q cache.
An alternative approach is proposed to overcome these disadvantages.
It is proposed to:
The proposed design is an adaptation of the design of the Caffeine framework, which has excellent scalability characteristics.
Description:
The current workflow of the 2Q cache looks like the following:
As an alternative, the following design is proposed:
Lock-free operations buffer.
To gather statistics, we will use a ring buffer implemented on top of a plain array. This buffer will not be a single array, however, but an array of arrays; each sub-array is used by a subset of threads to minimize contention between them.
When the threshold on one of those arrays is reached, all of the arrays are emptied by one of the threads, which uses a tryLock operation to avoid contention between threads. The lock mentioned above is taken only during buffer flush and is not used while logging statistics into the buffers. Pointers inside the buffer will be maintained without CAS operations; as a result a few operations may be lost, but this will not cause significant changes in the overall statistics. There is a limit on the number of records flushed at once by a single thread: each thread flushes no more than 2 * threshold elements from the buffer.
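The buffer described above can be sketched as follows. This is a simplified, single-file illustration (stripe count, capacity, and threshold values are made up, and stripes are selected by thread id rather than a real hash): indices are plain fields, so concurrent recordings may be lost by design, and only one thread at a time drains via `tryLock` while others simply skip.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Consumer;

// Sketch of the striped, lossy operations buffer described in the text
// (inspired by Caffeine's read buffer). Not the actual OrientDB code.
final class StripedReadBuffer<E> {
    private static final int STRIPES = 4;          // number of sub-arrays
    private static final int CAPACITY = 16;        // per-stripe ring size
    private static final int THRESHOLD = CAPACITY / 2;

    private final Object[][] buffers = new Object[STRIPES][CAPACITY];
    private final int[] writeIndex = new int[STRIPES]; // plain counters, no CAS
    private final int[] readIndex = new int[STRIPES];
    private final ReentrantLock drainLock = new ReentrantLock();
    private final Consumer<E> drainHandler;

    StripedReadBuffer(Consumer<E> drainHandler) {
        this.drainHandler = drainHandler;
    }

    /** Records an access; racy plain writes mean some events may be lost. */
    void record(E event) {
        int stripe = (int) (Thread.currentThread().getId() % STRIPES);
        int idx = writeIndex[stripe];
        buffers[stripe][idx % CAPACITY] = event;
        writeIndex[stripe] = idx + 1; // plain write: a concurrent update may be lost
        if (idx + 1 - readIndex[stripe] >= THRESHOLD) {
            tryDrain();
        }
    }

    /** Only one thread drains at a time; others skip instead of blocking. */
    void tryDrain() {
        if (!drainLock.tryLock()) {
            return;
        }
        try {
            for (int s = 0; s < STRIPES; s++) {
                int flushed = 0;
                // Flush at most 2 * threshold elements per stripe, as in the text.
                while (readIndex[s] < writeIndex[s] && flushed < 2 * THRESHOLD) {
                    @SuppressWarnings("unchecked")
                    E e = (E) buffers[s][readIndex[s] % CAPACITY];
                    readIndex[s]++;
                    flushed++;
                    if (e != null) {
                        drainHandler.accept(e);
                    }
                }
            }
        } finally {
            drainLock.unlock();
        }
    }
}
```

In a single thread no events are lost, so after a final explicit drain every recorded event has been handed to the drain handler; only under contention does the lossy behavior appear.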
State machine.
To achieve thread-safety guarantees, a state machine with the following states will be introduced:
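The OEP's actual state names are not listed in this excerpt, but the acquire/release semantics described earlier (pages cannot be evicted while in use by a component) suggest CAS-driven transitions along these lines. The states below (FREE, ACQUIRED, EVICTING) are purely illustrative, not the OEP's.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of a per-page state machine enforced with CAS.
// State names are invented for illustration; the real OEP state machine
// differs (e.g. it must also handle dirty-page flushing on release).
final class PageState {
    static final int FREE = 0, ACQUIRED = 1, EVICTING = 2;

    private final AtomicInteger state = new AtomicInteger(FREE);

    /** A component acquires the page before use; eviction cannot intervene. */
    boolean tryAcquire() {
        return state.compareAndSet(FREE, ACQUIRED);
    }

    /** Releasing returns the page to the cache (dirty pages flushed here). */
    void release() {
        state.set(FREE);
    }

    /** Eviction succeeds only for pages that are not currently acquired. */
    boolean tryEvict() {
        return state.compareAndSet(FREE, EVICTING);
    }
}
```

The key property is that both `tryAcquire` and `tryEvict` race on the same atomic word, so exactly one of "component uses the page" and "evictor reclaims the page" can win.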
Eviction process
The removal process is performed during the operations buffer flush (when the state of the eviction policy is updated).
During this process:
To prevent the case when we remove a page that was concurrently reloaded (see item 3 of the "state machine" description), the concurrent hash map entry is removed using map.remove(key, value) (https://docs.oracle.com/javase/8/docs/api/java/util/concurrent/ConcurrentHashMap.html#remove-java.lang.Object-java.lang.Object-).
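The two-argument `remove` guard can be demonstrated in isolation. In this toy example (keys and values are stand-ins, not real page objects), the evictor holds a stale value, so the removal becomes a no-op and the freshly reloaded entry survives:

```java
import java.util.concurrent.ConcurrentHashMap;

public class EvictionGuard {
    public static void main(String[] args) {
        ConcurrentHashMap<Long, String> pages = new ConcurrentHashMap<>();
        String stale = "page-v1";
        pages.put(42L, stale);

        // Another thread reloads the page before the evictor runs.
        String reloaded = "page-v2";
        pages.put(42L, reloaded);

        // remove(key, value) removes the entry only if the map still holds
        // the exact value the evictor observed; here it does not, so the
        // reloaded page is preserved.
        boolean removed = pages.remove(42L, stale);
        System.out.println(removed + " " + pages.get(42L)); // false page-v2
    }
}
```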
About the ETA: we have already implemented a similar algorithm (with a different state machine) in the file auto-close functionality. So the worst-case ETA is 10 days, but it will probably be smaller.
Alternatives:
There is no other proposal for cache lock models at the moment.
Risks and assumptions:
In the case of an incorrect or poorly tested implementation, there is a risk of data corruption.
Impact matrix