cosmos-sdk doesn't have to sync db for each block #14966
Comments
Are both atomic? What if …
I think atomicity is the same as before; a batch write should be represented as a single entry in the WAL.
I think both are still true without sync.
I would like to confirm prior to changing anything ;-) What is the semantic difference between Write and WriteSync?
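For context, a minimal sketch of the distinction as it typically plays out on the default goleveldb backend (the backend choice, key names, and options here are assumptions, not the SDK's actual code). In both cases the batch is applied atomically as a single WAL record; the Sync flag only controls whether the call waits for an fsync before returning.

```go
// Minimal sketch, assuming a goleveldb backend; key names are made up.
package main

import (
	"log"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func main() {
	db, err := leveldb.OpenFile("demo.db", nil)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()

	batch := new(leveldb.Batch)
	batch.Put([]byte("store1/node"), []byte("...")) // illustrative keys
	batch.Put([]byte("latest"), []byte("42"))

	// Write: the batch is appended to leveldb's WAL and the OS page cache,
	// but the call returns before the data is fsync'd to stable storage.
	// It survives a process crash, not necessarily a machine crash.
	if err := db.Write(batch, &opt.WriteOptions{Sync: false}); err != nil {
		log.Fatal(err)
	}

	// WriteSync: same atomicity, but the call returns only after an fsync,
	// so the data also survives power loss or a kernel panic.
	if err := db.Write(batch, &opt.WriteOptions{Sync: true}); err != nil {
		log.Fatal(err)
	}
}
```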
I'm in support of this -- atomicity should not be done at the layer of a single write to a DB, but instead with just a small write confirming that the write was done completely; otherwise, rewrite the last block's data. This becomes much easier in IAVL's new storage layout. There's even an IAVL-side optimization trying to improve this: cosmos/iavl#619
@yihuang where do you see …
Just there; the iavl tree doesn't seem to write with sync.
Oh, I misunderstood this -- isn't it important for the metadata to be written synchronously? This is the data that is read to determine the last successfully completed write to that store, AFAIU?
The idea is to just let tendermint replay some blocks if the system crashes and loses some data.
Hrmm, are we sure there are no edge cases around this? Agreed that in the average case this is safe. It's equivalent to failing to write midway through commit. But can't we get situations where the store for module A is committed successfully for height H+2, while the store for module B was last synced at height H? If we crash, we now have a complex situation to recover from. (We'd need to roll back to H, which makes sense to do, but we don't currently have the software to correctly handle this or apply such rollbacks.)
Our current flow of loading the multistore is like this:
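A rough sketch of that flow, with hypothetical type and helper names standing in for the SDK's rootmulti code (this is not the actual implementation): the latest commit info is read from the metadata keys, then every store is loaded at the version recorded there.

```go
// Rough sketch of the multistore load flow; all names are hypothetical
// stand-ins for the SDK's rootmulti store code.
package rootmultisketch

type CommitID struct {
	Version int64
	Hash    []byte
}

type StoreInfo struct {
	Name     string
	CommitID CommitID
}

type CommitInfo struct {
	Version    int64
	StoreInfos []StoreInfo
}

// loadLatestMultistore reads the latest commit info, then loads every
// store at the version recorded in it.
func loadLatestMultistore(
	loadCommitInfo func() (CommitInfo, error),                // e.g. reads a "latest" key, then that height's commit info
	loadStore func(name string, version int64) (any, error), // e.g. mounts an IAVL store at that version
) (map[string]any, error) {
	ci, err := loadCommitInfo()
	if err != nil {
		return nil, err
	}
	stores := make(map[string]any, len(ci.StoreInfos))
	for _, si := range ci.StoreInfos {
		s, err := loadStore(si.Name, si.CommitID.Version)
		if err != nil {
			return nil, err
		}
		stores[si.Name] = s
	}
	return stores, nil
}
```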
I think it's correct here. The basic assumption of the db commit pipeline is that it can only get corrupted at the tail, not in the middle. For example (see the sketch below):
If …
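A hedged reconstruction of that argument, assuming all stores share one goleveldb handle and therefore one WAL (key names are illustrative): all three batches are appended to the same WAL in order, so a crash can only drop a suffix of them; if the commit-info batch is readable after restart, the store batches written before it are readable too, even without Sync.

```go
// Sketch of one block's commit against a single shared DB handle.
package commitsketch

import (
	"fmt"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func commitBlock(db *leveldb.DB, height int64, store1, store2 *leveldb.Batch) error {
	async := &opt.WriteOptions{Sync: false}

	if err := db.Write(store1, async); err != nil { // WAL record 1
		return err
	}
	if err := db.Write(store2, async); err != nil { // WAL record 2
		return err
	}

	// The commit info is appended last, so it can never be durable while
	// an earlier store batch from the same block is lost.
	commitInfo := new(leveldb.Batch)
	commitInfo.Put([]byte("latest"), []byte(fmt.Sprintf("%d", height))) // WAL record 3
	return db.Write(commitInfo, async)
}
```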
But why does store 2's write getting synced to disk imply anything about store 1's write being synced to disk? Isn't the commit info potentially in a separate physical DB from store1 and store2? If it's in the same DB, then I agree with your reasoning! (Though it's not clear to me that we want to imagine it being in the same DB longer term.) If they're in separate DBs, couldn't we have a time diagram that looks like this:
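A hedged sketch of the scenario being described, with made-up handle names: each DB has its own WAL and its own flush schedule, so there is no ordering guarantee across them, and the metadata record can hit disk while a store's batch for the same height is still only in the OS cache, leaving the stores at different heights after a crash.

```go
// Separate-DB variant of the commit sketch above; handle names are made up.
package commitsketch

import (
	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/opt"
)

func commitBlockSeparateDBs(db1, db2, dbMeta *leveldb.DB,
	store1, store2, commitInfo *leveldb.Batch) error {

	async := &opt.WriteOptions{Sync: false}
	if err := db1.Write(store1, async); err != nil {
		return err
	}
	if err := db2.Write(store2, async); err != nil {
		return err
	}
	// Written last, but it may become durable before db1's or db2's
	// writes do, since each handle flushes independently.
	return dbMeta.Write(commitInfo, async)
}
```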
I'm perhaps vastly misunderstanding something about syncing here! Happy to just be told I'm not getting it and move on :)
Yeah, I was assuming they are the same DB; it's more tricky if they are different DBs.
The same DB object is passed to each store.
I'm closing this one, since benchmarks show that the real cost is writing the nodes, not the sync.
Summary
Currently we always WriteSync in the commit event of each block, but we don't have to do that, because tendermint will replay the blocks for us if we lose some data.
Problem Definition
We do an unnecessary sync when writing to the db.
Proposal
Conservative option: Write instead of WriteSync in commit.
Aggressive option: disable the WAL? This risks losing data when pruning blocks.
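A minimal sketch of the conservative option, assuming a tm-db-style Batch interface; the interface, key names, and helper below are assumptions, not the SDK's actual commit path.

```go
// Sketch of the conservative option: plain Write instead of WriteSync.
package proposalsketch

type Batch interface {
	Set(key, value []byte) error
	Write() error     // returns before fsync
	WriteSync() error // returns only after fsync
	Close() error
}

// flushCommitInfo persists the per-block metadata at the end of Commit.
func flushCommitInfo(b Batch, commitInfo, latestVersion []byte) error {
	defer b.Close()
	if err := b.Set([]byte("commitInfo"), commitInfo); err != nil {
		return err
	}
	if err := b.Set([]byte("latest"), latestVersion); err != nil {
		return err
	}
	// Conservative option: if the process crashes before the OS flushes,
	// tendermint replays the missing blocks on restart, so per-block
	// durability is not strictly required.
	return b.Write()
}
```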