Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add possibility to submit a list of changes to rocksdb #2619

Open
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

AurelienFT
Copy link
Contributor

@AurelienFT AurelienFT commented Jan 22, 2025

Description

The goal of this PR is to allow to insert a list of changes instead of one HashMap of changes. This avoid multiple update produced by different threads for example, to be merged in one transaction.

By removing this step we remove a whole iteration and insert/deletion in a big HashMap of changes.

It's currently not used anywhere in the code but you can see a usage example here : #2618

In this PR when we have to write history modifications we are still merging the changes in one big transaction before calling store_modifications_history because I don't think it's possible to use this algorithm with changes splitted in different parts. @acerone85 if you have an idea I would be happy to here it but in any case I think this should go to follow-up PR to not overcharge this one. (I made an attempt in branch try_adapt_history_vec branch but it's mehhh)

Checklist

  • Breaking changes are clearly marked as such in the PR description and changelog
  • New behavior is reflected in tests
  • The specification matches the implemented behavior (link update PR if changes are needed)

Before requesting review

  • I have reviewed the code myself
  • I have created follow-up issues caused by this PR and linked them here

@AurelienFT AurelienFT self-assigned this Jan 22, 2025
@AurelienFT AurelienFT added the xxx label Jan 22, 2025
@AurelienFT AurelienFT marked this pull request as ready for review January 23, 2025 14:12
@AurelienFT AurelienFT requested a review from acerone85 January 23, 2025 14:12
@AurelienFT AurelienFT marked this pull request as draft January 23, 2025 15:14
@AurelienFT
Copy link
Contributor Author

Works but draft until mature enough to fit requirements of : #2618

@AurelienFT AurelienFT marked this pull request as ready for review January 24, 2025 13:51
@AurelienFT AurelienFT requested a review from a team January 24, 2025 13:51
AurelienFT and others added 6 commits January 24, 2025 14:53
## Description
<!-- List of detailed changes -->
Was looking into:
```rs
                // We have to clone the prefix and start, because we need to pass them to the iterator
                // if someone finds a solution without making it a vec, feel free to contribute :)
                let column = column.id();
                let prefix = prefix.map(|prefix| prefix.to_vec());
                let start = start.map(|start| start.to_vec());
```
and did some refactoring to get to the bottom of the problem.

I don't know if there is a solution unless we decide to change the
signature of `iter_store_keys` to something like:

```rs
    fn iter_store_keys<'a>(
        &'a self,
        column: Self::Column,
        prefix: Option<&'a [u8]>,
        start: Option<&'a [u8]>,
        direction: IterDirection,
    ) -> BoxedIter<KeyItem>;
```
But that's not realistic in this PR.

Might as well keep the refactoring because I feel like it's cleaner.
Comment on lines +152 to +170
StorageChanges::ChangesList(changes_list) => {
// We have to clone the prefix and start, because we need to pass them to the iterator
// if someone finds a solution without making it a vec, feel free to contribute :)
let column = column.id();
let prefix = prefix.map(|prefix| prefix.to_vec());
let start = start.map(|start| start.to_vec());
changes_list
.iter()
.flat_map(move |changes| {
get_insert_keys_from_changes(
changes,
column,
prefix.as_deref(),
start.as_deref(),
direction,
)
})
.into_boxed()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IDK if this would be better or worse, but it avoids the to_vecs :P

Suggested change
StorageChanges::ChangesList(changes_list) => {
// We have to clone the prefix and start, because we need to pass them to the iterator
// if someone finds a solution without making it a vec, feel free to contribute :)
let column = column.id();
let prefix = prefix.map(|prefix| prefix.to_vec());
let start = start.map(|start| start.to_vec());
changes_list
.iter()
.flat_map(move |changes| {
get_insert_keys_from_changes(
changes,
column,
prefix.as_deref(),
start.as_deref(),
direction,
)
})
.into_boxed()
}
StorageChanges::ChangesList(changes_list) => {
let column = column.id();
changes_list
.iter()
.flat_map(move |changes| {
get_insert_keys_from_changes(
changes, column, prefix, start, direction,
)
})
.collect::<Vec<_>>()
.into_iter()
.into_boxed()
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mehhh, i don't really like that because start and prefix are pretty small in most of the cases where the number of elements iterated on could be pretty big so iterating twice and creating a heap allocation with all elements will be more costly than 2 little allocation

@AurelienFT AurelienFT requested review from MitchTurner and a team January 30, 2025 10:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants