[Merged by Bors] - sql: store ballots, blocks and layers in sqlite #3047
wow! super happy to move to relational database for data integrity. a couple high level questions
i observed an improvement with more data, and small degradation when there is very little data.
but relative improvement will be much larger than -13.71% when encoding will be replaced, as xdr wastes a lot of time. i actually expected sqlite to be faster, as leveldb data structure is write-optimized. @countvonzero do you remember if
@noamnelke was against updating versions if we are not using new features, so i decided not to rush with 1.17. anyway, i think we will want to use 1.18 in the near future, as it introduces significant improvements to the language (fuzzing support in stdlib, and generics)
yes. we were. we didn't explore other libraries further because core geth devs did not support the change.
initial pass only on sql package
	if err != io.EOF {
		return nil, fmt.Errorf("copy sig %w", err)
	}
	sigBytes = nil
shouldn't this be an error too? same for pubkeyBytes
a ballot has to have signature and public key
i made it this way because for testing i don't need signature and pubkey, can change it to a stricter validation
actually, it is very inconvenient to use valid signature and pubkey for testing, so i would rather leave it as it is, and enforce validation in other places (like it works now)
hmmm... for me, the biggest benefit of using a sql-like database is the guarantee on data integrity. if we cannot trust that whatever is in the database is correct, then it loses a lot of its value to me.
we should have NOT NULL constraints for these two fields for ballot.
it is a benefit for sure, but not the most significant. i think it doesn't matter much: we didn't enforce it with leveldb, so for me it is irrelevant whether it will be enforced with sqlite.
what i see as the main benefit is that it is much harder to make mistakes writing custom keys. when i joined, almost all mesh db code was subtly (and some of it not so subtly) broken
but what is the cost of adding them? why not add them? isn't it nice to be able to reap more benefit from sqlite?
added for layers, but can't add them for signature and pubkey because genesis ballot doesn't have them and it messes up some tests.
bors merge
## Motivation

The existing approach lacks atomicity and causal durability (e.g. an atx may not be on disk when a ballot is saved), and durability can't be enforced in general without running every leveldb operation in sync mode. All these problems will result in subtle bugs that are hard to diagnose.

For those 3 requirements, we want to maintain all state in a single db. Moving state to a single leveldb would require us to enforce isolation ourselves (by maintaining separate namespaces and manually concatenating keys, like we do in some modules). Besides that, we would have to create every single index manually, while with sqlite we can just do `CREATE INDEX` and sqlite will do it for us, and probably do a better job.

Another significant benefit is that we can duplicate some state in a sql table to avoid loading the whole structure into memory. For instance, this will be relevant for the atx, which is large (10kb); usually, after it was validated, we want to know only the associated smesher and weight of the atx. This would be problematic with leveldb and would require adding a custom index.

related: #2918

## Changes

- general plumbing for core database stuff (db, transaction, migrations) using https://github.com/crawshaw/sqlite, which is a relatively simple wrapper around C sqlite
- tables for layers, blocks and ballots
- reworked ZeroLayer: it is relevant only for hare_output, which can be in 3 states (nil, empty, non-empty). SetZeroLayer updates hare_output to the empty state, at which tortoise will vote against all blocks within hdist.
- updated to golang 1.16 for the `embed` module. note that there is a bug with go mod tidy in 1.16, so i had to manually fix go.sum and disable go mod tidy on ci - golang/go#44129 (cmd/go: missing sum after updating a different package)

## Test Plan

existing and new uts
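The three hare_output states mentioned in the description (nil, empty, non-empty) could be modelled roughly as below. This is a sketch, not the PR's code: it assumes the value is read back as a byte slice where a NULL column becomes a nil slice, and `HareState`, `classify` and the constant names are hypothetical.

```go
package main

import "fmt"

// HareState is a hypothetical name for the three states of hare_output.
type HareState int

const (
	HareUndecided HareState = iota // nil: nothing stored (assumed meaning)
	HareEmpty                      // empty: the state SetZeroLayer moves a layer to
	HareNonEmpty                   // non-empty: an output is stored
)

// String returns a short label for printing.
func (s HareState) String() string {
	switch s {
	case HareUndecided:
		return "undecided"
	case HareEmpty:
		return "empty"
	default:
		return "non-empty"
	}
}

// classify distinguishes a nil slice from an empty but non-nil one.
// The order of the cases matters: len(nil) is also 0.
func classify(out []byte) HareState {
	switch {
	case out == nil:
		return HareUndecided
	case len(out) == 0:
		return HareEmpty
	default:
		return HareNonEmpty
	}
}

func main() {
	for _, out := range [][]byte{nil, {}, {0x01}} {
		fmt.Println(classify(out))
	}
}
```

Keeping nil and empty distinct is what lets a reader tell "no decision recorded" apart from "decided, and the result is empty".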
Build failed:
bors merge
Pull request successfully merged into develop. Build succeeded: