refactor(autonomi): deterministic archive #2599

Merged
merged 4 commits into from
Jan 8, 2025


b-zee
Contributor

@b-zee b-zee commented Jan 6, 2025

This ensures that the same set of files and directories yields the same serialized archive, provided the files haven't changed. As a result, re-uploading the same files no longer requires uploading a new version of the archive, which previously would have differed slightly from the old one.

@happybeing
Contributor

Hi @b-zee , hope you had a good break.

I'm curious about this change because I wondered about the same thing for my website metadata, which also uses a HashMap, and I considered making it deterministic with a BTreeMap.

Is it that you don't update the time unless the content changes, that this saves chunks?

@b-zee
Contributor Author

b-zee commented Jan 6, 2025

Hey Mark! Thanks for asking, I had a good break, hope you've had a nice time as well!

> Is it that you don't update the time unless the content changes, that this saves chunks?

This takes the metadata from the filesystem, so unless that changes you'll end up with the same Archive. That translates to the same bytes being put on the network, and thus no need to put or pay again (given that, with these changes, the Archive uses a deterministically serialized B-tree map). Previously, the uploaded metadata was always set to the current time ('now'), so every upload produced a unique Archive.
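To illustrate the point about filesystem metadata, here is a minimal sketch using only the Rust standard library. The `FileMeta` struct and the `meta_from_fs` helper are hypothetical names for illustration, not the autonomi API. The idea is only that values read from the filesystem stay stable between runs, whereas a 'now' timestamp would differ on every upload.

```rust
use std::fs;
use std::io;
use std::path::Path;
use std::time::UNIX_EPOCH;

/// Hypothetical per-file metadata record (not the actual autonomi type).
#[derive(Debug, Clone, PartialEq, Eq)]
struct FileMeta {
    /// Last modification time in seconds since the Unix epoch.
    modified: u64,
    /// File size in bytes.
    size: u64,
}

/// Build metadata purely from what the filesystem reports.
/// No call to `SystemTime::now()` is involved, so an unchanged file
/// yields an identical record on every run.
fn meta_from_fs(path: &Path) -> io::Result<FileMeta> {
    let md = fs::metadata(path)?;
    let modified = md
        .modified()?
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_secs())
        .unwrap_or(0); // assumption: treat pre-epoch timestamps as 0
    Ok(FileMeta {
        modified,
        size: md.len(),
    })
}
```

Calling `meta_from_fs` twice on an untouched file returns equal records, which is the property that lets the resulting archive stay byte-identical.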

b-zee added 2 commits January 6, 2025 15:36
The uploaded timestamp caused the archive to change between uploads,
which required the archive to be re-uploaded.
Use BTreeMap instead of HashMap so serde serializes deterministically.
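As a rough illustration of why the map type matters: a `BTreeMap` always iterates in sorted key order, so any serializer that walks the map emits entries in the same order regardless of how they were inserted. The sketch below uses a hand-rolled `encode` function as a stand-in for serde (an assumption made to keep the example dependency-free), and `build` is a hypothetical helper. With the standard library's `HashMap`, iteration order is unspecified and each map is seeded randomly by default, so the same encoder could produce different bytes for equal contents.

```rust
use std::collections::BTreeMap;

/// Toy encoder standing in for serde: writes entries in the map's
/// iteration order as (path length, path bytes, size).
fn encode(map: &BTreeMap<String, u64>) -> Vec<u8> {
    let mut out = Vec::new();
    for (path, size) in map {
        out.extend_from_slice(&(path.len() as u64).to_le_bytes());
        out.extend_from_slice(path.as_bytes());
        out.extend_from_slice(&size.to_le_bytes());
    }
    out
}

/// Hypothetical helper: build a map by inserting entries in the order given.
fn build(entries: &[(&str, u64)]) -> BTreeMap<String, u64> {
    entries
        .iter()
        .map(|(path, size)| (path.to_string(), *size))
        .collect()
}
```

Building two maps from the same entries in different insertion orders and passing both to `encode` gives identical byte vectors, because the B-tree orders keys before the encoder ever sees them.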
@b-zee b-zee force-pushed the refactor-upload-deterministic-archive branch from 988654b to f13415e Compare January 6, 2025 14:36
Member

@maqi maqi left a comment


Restore the output balance check during the second client upload within the memory_check CI test,
to ensure there is no extra cost to be paid during repeated uploads.
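The kind of check being asked for can be sketched with a toy model. Everything here is hypothetical (`ToyNetwork`, `address_of`, the flat price of 1) and is not the actual CI test or network client. It only shows the invariant: when the bytes are identical, the content address is identical, so a second upload should leave the balance untouched.

```rust
use std::collections::HashMap;

/// Toy content address: 64-bit FNV-1a over the bytes.
/// Assumption: a real network would use a cryptographic hash instead.
fn address_of(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325;
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3);
    }
    hash
}

/// Hypothetical stand-in for a pay-per-store network plus a wallet.
struct ToyNetwork {
    stored: HashMap<u64, Vec<u8>>,
    balance: u64,
}

impl ToyNetwork {
    fn new(balance: u64) -> Self {
        ToyNetwork {
            stored: HashMap::new(),
            balance,
        }
    }

    /// Store `bytes` and return the cost paid.
    /// Data already present at its content address costs nothing.
    fn put(&mut self, bytes: &[u8]) -> u64 {
        let addr = address_of(bytes);
        if self.stored.contains_key(&addr) {
            return 0;
        }
        let cost = 1; // flat hypothetical price; assumes sufficient balance
        self.balance -= cost;
        self.stored.insert(addr, bytes.to_vec());
        cost
    }
}
```

A balance check then amounts to recording `balance` after the first `put` and asserting it is unchanged after a second `put` of the same bytes.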

@b-zee b-zee enabled auto-merge January 7, 2025 13:36
@b-zee b-zee added this pull request to the merge queue Jan 7, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Jan 7, 2025
@b-zee b-zee added this pull request to the merge queue Jan 8, 2025
Merged via the queue into maidsafe:main with commit 6f94a1e Jan 8, 2025
26 checks passed
@b-zee b-zee deleted the refactor-upload-deterministic-archive branch January 8, 2025 07:14
@maqi maqi mentioned this pull request Jan 13, 2025
1 task
@jacderida jacderida mentioned this pull request Jan 14, 2025
jacderida added a commit that referenced this pull request Jan 14, 2025
jacderida added a commit that referenced this pull request Jan 14, 2025
3 participants