fix(iroh-dns-server): remove accidental blocking from store #2985
base: main
Conversation
Documentation for this PR has been generated and is available at: https://n0-computer.github.io/iroh/pr/2985/docs/iroh/ Last updated: 2024-12-02T10:26:25Z
Does this lead to a measurable improvement? I would say the big performance problem is that we are creating a write transaction per op, so we are limited to a small number of transactions per second (~1000/s) no matter how beefy the machine is. If we want to really speed this up, we need a design similar to what is done in the blob store: batch write transactions and sync at regular intervals. In the case of the discovery server we can be even more sloppy. E.g. losing the last second of updates in a crash is really no big deal, since discovery information gets republished.
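The batching design suggested above can be sketched with stdlib primitives only. Everything here is illustrative, not the actual blob store code: `Op` is a hypothetical write operation, and the batch size / flush interval are made-up parameters. The point is that many logical writes share one (expensive) write transaction:

```rust
use std::sync::mpsc;
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical write operation; in the real store this would be a
// signed-packet insert or removal.
enum Op {
    Put(String, Vec<u8>),
}

// Spawn a writer that commits ops in batches; returns the total
// number of ops committed.
fn run_batched(n: usize) -> usize {
    let (tx, rx) = mpsc::channel::<Op>();

    let writer = thread::spawn(move || {
        let max_batch = 64;
        let max_delay = Duration::from_millis(100);
        let mut committed = 0;
        loop {
            // Block for the first op of a batch.
            let first = match rx.recv() {
                Ok(op) => op,
                Err(_) => break, // all senders dropped, shut down
            };
            let mut batch = vec![first];
            let deadline = Instant::now() + max_delay;
            // Drain more ops until the batch is full or the deadline passes.
            while batch.len() < max_batch {
                let now = Instant::now();
                if now >= deadline {
                    break;
                }
                match rx.recv_timeout(deadline - now) {
                    Ok(op) => batch.push(op),
                    Err(_) => break,
                }
            }
            // Here the real store would open ONE write transaction,
            // apply the whole `batch`, and commit (one fsync per batch).
            committed += batch.len();
        }
        committed
    });

    for i in 0..n {
        tx.send(Op::Put(format!("key{i}"), vec![i as u8])).unwrap();
    }
    drop(tx);
    writer.join().unwrap()
}

fn main() {
    println!("committed {} ops", run_batched(10));
}
```

A crash between commits loses at most `max_delay` worth of updates, which, as noted above, is acceptable for a discovery server since records are republished.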
@@ -14,7 +14,7 @@ const SIGNED_PACKETS_TABLE: TableDefinition<&SignedPacketsKey, &[u8]> =
 
 #[derive(Debug)]
 pub struct SignedPacketStore {
-    db: Database,
+    db: Arc<Database>,
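Since redb's `Database` is not `Clone`, wrapping it in an `Arc` is what lets each blocking task own a handle. A minimal std-only sketch of the pattern — the `Database` type below is a stand-in, not the redb type, and plain threads stand in for `tokio::task::spawn_blocking`:

```rust
use std::sync::Arc;
use std::thread;

// Stand-in for redb::Database: Send + Sync, but not Clone.
struct Database {
    name: String,
}

impl Database {
    fn read(&self) -> String {
        format!("read from {}", self.name)
    }
}

// Each blocking task receives its own Arc clone; the database itself
// is opened once and never cloned.
fn read_concurrently(db: Arc<Database>, workers: usize) -> Vec<String> {
    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let db = db.clone();
            // In the real store this closure would run via
            // tokio::task::spawn_blocking instead of a raw thread.
            thread::spawn(move || db.read())
        })
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}

fn main() {
    let db = Arc::new(Database { name: "signed-packets".into() });
    for line in read_concurrently(db, 4) {
        println!("{line}");
    }
}
```

Cloning the `Arc` only bumps a reference count, so handing a handle to every blocking task is cheap.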
redb should really implement Clone for Database.
You mean Clone? This is how we do it with the protocols, they don't (need to) impl Clone, and people have to wrap them in an Arc... 🤣
Yeah sorry, I edited my comment :D
I really should try to set up a benchmark for this.
Yeah, good fixes.
I also like @rklaehn's suggestion (happy to quickly add another PR for that if needed).
I am also slightly confused by ZoneCache. Why is one cache an LRU and the other a TTL? We wrap them both with a mutex in the end, even though we don't need the mutex for the TTL. And I don't think we need ZoneCache::cache to be an LruCache, do we? They're both caches; we could use TtlCache for both of them, avoiding the top-level mutex entirely.
I don't think we should do this until we have figured out the write amplification issues. Current write amplification is around 10x, which means we might need to drop redb anyway. This is just a stopgap until we have done so.
Unclear (yet), as this hasn't been deployed yet. But we have seen reports that both read and write calls are quite slow under heavy load.
Force-pushed from f4ed24c to 3fb51ae
Description
I found this during the investigation of #2972. It turned out that both blocking locks and blocking IO calls are made in the DNS server's store implementation.
Breaking Changes
None
Notes & open questions
This might not be optimal, but it is the safest way to stop blocking the whole runtime for now.
Change checklist