Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add new chunk count column for tracking #342

Merged
merged 22 commits into from
Feb 19, 2025
Merged

Conversation

snowmead
Copy link
Contributor

@snowmead snowmead commented Jan 27, 2025

Adds new CHUNK_COUNT_COLUMN to track number of chunks for a given file key. This avoids having to iterate through the entire file trie to count all the chunks and conclude if we have uploaded all the chunks yet for a file.

These changes also include a refactor of our column definitions to have consistent index retrievals and column length constant. In addition, I added some documentation to these columns to have a clear definition for their purpose in the context of RocksDB file storage.

Resolves issue #20

Copy link
Contributor

@TDemeco TDemeco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice job! Left some comments but overall great 💯

client/file-manager/src/in_memory.rs Outdated Show resolved Hide resolved
client/file-manager/src/in_memory.rs Outdated Show resolved Hide resolved
client/file-manager/src/rocksdb.rs Show resolved Hide resolved
@@ -20,23 +20,50 @@ use crate::{
LOG_TARGET,
};
use codec::{Decode, Encode};
use strum::{EnumCount, VariantArray};

#[derive(Debug, Clone, Copy, EnumCount, VariantArray)]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the refactor 💯


#[derive(Debug, Clone, Copy, EnumCount, VariantArray)]
pub enum Column {
Metadata,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have docs for the Metadata member as well, and if possible for all the columns

@@ -58,7 +85,8 @@ fn open_or_creating_rocksdb(db_path: String) -> io::Result<kvdb_rocksdb::Databas
Ok(db)
}

/// Storage backend for RocksDB.
/// Storage backend implementation for RocksDB.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love the improved docs on all functions!

@@ -46,7 +73,7 @@ fn open_or_creating_rocksdb(db_path: String) -> io::Result<kvdb_rocksdb::Databas
path.push(db_path.as_str());
path.push("storagehub/file_storage/");

let db_config = kvdb_rocksdb::DatabaseConfig::with_columns(8);
let db_config = kvdb_rocksdb::DatabaseConfig::with_columns(NUMBER_OF_COLUMNS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice improvement.
Indeed in the past I had errors when forgetting to increase the columns number hardcoded value. It would avoid it in the future.

Copy link
Contributor

@links234 links234 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good but seems to me we need another locker for the write_chunk so that we don't end up with wrong chunk_count value.w

Cargo.toml Show resolved Hide resolved
client/file-manager/src/rocksdb.rs Show resolved Hide resolved
@@ -15,10 +15,14 @@ pub enum FileStorageWriteError {
FailedToGetFileChunk,
/// File metadata fingerprint does not match the stored file fingerprint.
FingerprintAndStoredFileMismatch,
/// Failed to construct file trie.
FailedToContructFileTrie,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
FailedToContructFileTrie,
FailedToConstructFileTrie,

@ffarall ffarall self-requested a review February 19, 2025 22:52
Copy link
Contributor

@ffarall ffarall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing job here! Loved the added docs. Much needed. Just left a minor comment.

Comment on lines +46 to +52
/// Exclude* columns stores keys of 32 bytes representing the `file_key` with empty values.
///
/// These columns are used primarily to mark file keys as being excluded from certain operations.
ExcludeFile,
ExcludeUser,
ExcludeBucket,
ExcludeFingerprint,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd copy and paste the doc comment for the other items, or make one for each. Otherwise this doc comment will only be linked to ExcludeFile. So for example, if you hover over ExcludeBucket, you won't see any doc comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add this in another PR to avoid restarting CI for such a small change

@snowmead snowmead merged commit 5eab169 into main Feb 19, 2025
25 checks passed
@snowmead snowmead deleted the feat/chunk-count-column branch February 19, 2025 23:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants