Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Farmer checksums #1783

Merged
merged 6 commits into from
Aug 9, 2023
Merged

Farmer checksums #1783

merged 6 commits into from
Aug 9, 2023

Conversation

nazar-pc
Copy link
Member

@nazar-pc nazar-pc commented Aug 9, 2023

This implements first two items of #1723 step by step and fully unlocks #1725.

Essentially we look at every major thing farmer writes to disk and ensure we have a checksum there. For sectors we have two-level checksums: for pieces and whole sector. Piece checksum is checked during retrieval (but not proving since it takes time and will be checked by consensus anyway in case corruption happened), while whole sector is not really checked, but will be useful for subspace-farmer scrub command to quickly check the whole sector without trying to interpret it in any way.

Checksumming wrapper in subspace-core-primitives is useful, but in some cases more domain specific handling was chosen for efficiency purposes.

This is certainly a big breaking change to many on-disk farmer data structures.

Code contributor checklist:

@nazar-pc nazar-pc added the breaking-farmer This PR introduces breaking changes to the farmer implementation label Aug 9, 2023
Base automatically changed from small-farmer-refactoring to main August 9, 2023 07:01
Copy link
Contributor

@shamil-gadelshin shamil-gadelshin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct that we don't use KZG to verify piece records because it's inefficient?

let expected_hash = &remaining_bytes[..mem::size_of::<Blake3Hash>()];
if actual_hash != expected_hash {
let actual_checksum = blake3_hash(encoded_bytes);
let expected_checksum = &remaining_bytes[..mem::size_of::<Blake3Hash>()];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use BLAKE3_HASH_SIZE in all places?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good question. Actually I'd like to make both that constant and similar for Blake2 private. We don't have standalone constants for sizes of other data structures, so it seems more consistent to not use them here either. Also this way size and type are inherently linked and usage of type in IDE reveals all relevant places.

Maybe I'm just trying to rationalize irrational things 🤷‍♂️

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also this way size and type are inherently linked and usage of type in IDE reveals all relevant places.

This was my initial theory: dependency chain management.

Copy link
Member Author

@nazar-pc nazar-pc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I correct that we don't use KZG to verify piece records because it's inefficient?

Yes, it'd be crazy expensive comparing to just hashing with Blake3.

let expected_hash = &remaining_bytes[..mem::size_of::<Blake3Hash>()];
if actual_hash != expected_hash {
let actual_checksum = blake3_hash(encoded_bytes);
let expected_checksum = &remaining_bytes[..mem::size_of::<Blake3Hash>()];
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good question. Actually I'd like to make both that constant and similar for Blake2 private. We don't have standalone constants for sizes of other data structures, so it seems more consistent to not use them here either. Also this way size and type are inherently linked and usage of type in IDE reveals all relevant places.

Maybe I'm just trying to rationalize irrational things 🤷‍♂️

@nazar-pc nazar-pc merged commit 73b5ca0 into main Aug 9, 2023
@nazar-pc nazar-pc deleted the farmer-checksums branch August 9, 2023 14:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-farmer This PR introduces breaking changes to the farmer implementation
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants