Farmer checksums #1783
…ta structure doing checksum creation/verification on encoding/decoding
Am I correct that we don't use KZG to verify piece records because it's inefficient?
let expected_hash = &remaining_bytes[..mem::size_of::<Blake3Hash>()];
if actual_hash != expected_hash {
let actual_checksum = blake3_hash(encoded_bytes);
let expected_checksum = &remaining_bytes[..mem::size_of::<Blake3Hash>()];
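The check in this diff can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the PR's code: `toy_hash` is a non-cryptographic stand-in for `blake3_hash`, the `[u8; 32]` definition of `Blake3Hash` is an assumption, and `append_checksum` and `verify_checksum` are hypothetical helper names. The layout assumed here is the encoded bytes followed immediately by their checksum.

```rust
use std::mem;

/// Assumed shape of the hash type referenced in the diff (32-byte output).
type Blake3Hash = [u8; 32];

/// Stand-in for `blake3_hash` (NOT cryptographic): four FNV-1a lanes with
/// different seeds, used only so this sketch needs no external crates.
fn toy_hash(data: &[u8]) -> Blake3Hash {
    let mut out = [0u8; 32];
    for (lane, chunk) in out.chunks_exact_mut(8).enumerate() {
        let mut state: u64 = 0xcbf2_9ce4_8422_2325 ^ (lane as u64);
        for &byte in data {
            state ^= u64::from(byte);
            state = state.wrapping_mul(0x0000_0100_0000_01b3);
        }
        chunk.copy_from_slice(&state.to_le_bytes());
    }
    out
}

/// Hypothetical helper: returns `encoded_bytes` followed by its checksum.
fn append_checksum(encoded_bytes: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(encoded_bytes.len() + mem::size_of::<Blake3Hash>());
    out.extend_from_slice(encoded_bytes);
    out.extend_from_slice(&toy_hash(encoded_bytes));
    out
}

/// Hypothetical helper: splits off the trailing checksum, recomputes it over
/// the payload, and returns the payload only if both match.
fn verify_checksum(bytes: &[u8]) -> Option<&[u8]> {
    let payload_len = bytes.len().checked_sub(mem::size_of::<Blake3Hash>())?;
    let (encoded_bytes, remaining_bytes) = bytes.split_at(payload_len);

    let actual_checksum = toy_hash(encoded_bytes);
    let expected_checksum = &remaining_bytes[..mem::size_of::<Blake3Hash>()];

    if actual_checksum.as_slice() != expected_checksum {
        return None;
    }
    Some(encoded_bytes)
}
```

A caller would write `append_checksum(payload)` to disk and pass the bytes read back to `verify_checksum`, treating `None` as corruption.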
Why not use `BLAKE3_HASH_SIZE` in all places?
Very good question. Actually, I'd like to make both that constant and the similar one for Blake2 private. We don't have standalone constants for the sizes of other data structures, so it seems more consistent not to use them here either. Also this way size and type are inherently linked and usage of type in IDE reveals all relevant places.
Maybe I'm just trying to rationalize irrational things 🤷‍♂️
> Also this way size and type are inherently linked and usage of type in IDE reveals all relevant places.

This was my initial theory: dependency chain management.
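To make the trade-off in this thread concrete, here is a small sketch of the two styles. It assumes `Blake3Hash` is defined as `[u8; 32]`; the value given to `BLAKE3_HASH_SIZE` and the `split_checksum` helper are illustrative, not taken from the repository.

```rust
use std::mem;

/// Assumed definition of the hash type discussed in the thread.
type Blake3Hash = [u8; 32];

/// Standalone constant in the style of `BLAKE3_HASH_SIZE` (value assumed here).
const BLAKE3_HASH_SIZE: usize = 32;

// With a standalone constant, the link to the type has to be kept by hand,
// for example with a compile-time assertion like this one.
const _: () = assert!(BLAKE3_HASH_SIZE == mem::size_of::<Blake3Hash>());

/// Hypothetical helper: splits a leading checksum off `bytes`, sizing the
/// split from the type itself, so searching for usages of `Blake3Hash`
/// also finds this place.
fn split_checksum(bytes: &[u8]) -> Option<(&[u8], &[u8])> {
    let size = mem::size_of::<Blake3Hash>();
    if bytes.len() < size {
        return None;
    }
    Some(bytes.split_at(size))
}
```

Deriving the size from the type removes the need for the assertion entirely, which is the "inherently linked" property mentioned above.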
> Am I correct that we don't use KZG to verify piece records because it's inefficient?

Yes, it'd be crazy expensive compared to just hashing with Blake3.
…pace#1791 autonomys/subspace#1786 autonomys/subspace#1787 autonomys/subspace#1785 autonomys/subspace#1783 autonomys/subspace#1761 autonomys/subspace#1782 autonomys/subspace#1784 autonomys/subspace#1778 autonomys/subspace#1776 autonomys/subspace#1762 autonomys/subspace#1772 autonomys/subspace#1777 autonomys/subspace#1767 autonomys/subspace#1775 autonomys/subspace#1768 autonomys/subspace#1771 autonomys/subspace#1760 autonomys/subspace#1766 autonomys/subspace#1742 autonomys/subspace#1765 autonomys/subspace#1770 autonomys/subspace#1764
This implements the first two items of #1723 step by step and fully unlocks #1725.

Essentially, we look at every major thing the farmer writes to disk and ensure there is a checksum for it. Sectors have two-level checksums: one per piece and one for the whole sector. The piece checksum is verified during retrieval (but not during proving, since it takes time and any corruption will be caught by consensus anyway). The whole-sector checksum is not verified during normal operation, but will be useful for the `subspace-farmer scrub` command to quickly check a whole sector without trying to interpret it in any way.

The checksumming wrapper in `subspace-core-primitives` is useful, but in some cases more domain-specific handling was chosen for efficiency.

This is certainly a big breaking change to many on-disk farmer data structures.
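The two-level scheme described above could look roughly like this. It is a sketch under stated assumptions, not the farmer's actual on-disk format: the layout (each piece followed by its checksum, then one checksum over everything before it), the fixed piece size, and the names `Checksum`, `encode_sector`, `scrub_sector`, and `read_piece` are all hypothetical, and `toy_hash` is a non-cryptographic stand-in for Blake3.

```rust
type Checksum = [u8; 32];
const CHECKSUM_SIZE: usize = std::mem::size_of::<Checksum>();

/// Stand-in for Blake3 (NOT cryptographic): four seeded FNV-1a lanes.
fn toy_hash(data: &[u8]) -> Checksum {
    let mut out = [0u8; 32];
    for (lane, chunk) in out.chunks_exact_mut(8).enumerate() {
        let mut state: u64 = 0xcbf2_9ce4_8422_2325 ^ (lane as u64);
        for &byte in data {
            state ^= u64::from(byte);
            state = state.wrapping_mul(0x0000_0100_0000_01b3);
        }
        chunk.copy_from_slice(&state.to_le_bytes());
    }
    out
}

/// Assumed layout: [piece_0 | checksum_0 | piece_1 | checksum_1 | ... | sector_checksum].
fn encode_sector(pieces: &[&[u8]]) -> Vec<u8> {
    let mut sector = Vec::new();
    for piece in pieces {
        sector.extend_from_slice(piece);
        sector.extend_from_slice(&toy_hash(piece));
    }
    // Second level: one checksum over all bytes written so far.
    let sector_checksum = toy_hash(&sector);
    sector.extend_from_slice(&sector_checksum);
    sector
}

/// Scrub-style check: hashes the sector body as opaque bytes, without
/// interpreting pieces at all.
fn scrub_sector(sector: &[u8]) -> bool {
    match sector.len().checked_sub(CHECKSUM_SIZE) {
        Some(body_len) => {
            let (body, expected) = sector.split_at(body_len);
            toy_hash(body).as_slice() == expected
        }
        None => false,
    }
}

/// Retrieval-style check: verifies only the requested piece's checksum.
fn read_piece(sector: &[u8], piece_size: usize, index: usize) -> Option<&[u8]> {
    let body = &sector[..sector.len().checked_sub(CHECKSUM_SIZE)?];
    let stride = piece_size.checked_add(CHECKSUM_SIZE)?;
    let start = index.checked_mul(stride)?;
    let entry = body.get(start..start.checked_add(stride)?)?;
    let (piece, expected) = entry.split_at(piece_size);
    if toy_hash(piece).as_slice() == expected {
        Some(piece)
    } else {
        None
    }
}
```

With this layout, a corrupted byte in one piece fails `scrub_sector` for the whole sector while `read_piece` still succeeds for the other pieces, which matches the split between a quick whole-sector scrub and per-piece verification on retrieval.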