Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

don't enumerate index_uid when requesting splits to gc #5489

Merged
merged 5 commits into from
Oct 18, 2024

Conversation

trinity-1686a
Copy link
Contributor

No description provided.

@fulmicoton fulmicoton force-pushed the trinity/gc-no-index_uid-in branch from 873d64d to a98ec47 Compare October 15, 2024 02:41
@fulmicoton fulmicoton force-pushed the trinity/gc-no-index_uid-in branch from a98ec47 to 5b63045 Compare October 15, 2024 09:08
@fulmicoton fulmicoton force-pushed the trinity/gc-no-index_uid-in branch from 5b63045 to c1cf01d Compare October 15, 2024 09:15
Comment on lines 365 to 368
if let Some(splits) = splits_metadata_to_delete_per_index.get_mut(&meta.index_uid) {
splits.push(meta);
} else {
splits_metadata_to_delete_per_index.insert(meta.index_uid.clone(), vec![meta]);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be expressed with the entry api, which imo is easier to read

Copy link
Contributor

@fulmicoton fulmicoton Oct 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. I was avoiding the needless 10k allocation, but yeah, maybe it does not matter. I will switch to the entry API.

.into_group_map();
for meta in splits_metadata_to_delete {
if !storages.contains_key(&meta.index_uid) {
info!(index_uid=?meta.index_uid, "split not listed in storage map: skipping");
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't log often, but when it does, it may log a lot. this should be rate-limited

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point

Copy link

github-actions bot commented Oct 15, 2024

On SSD:

Average search latency is 1.0x that of the reference (lower is better).
Ref run id: 3782, ref commit: 184286b
Link

On GCS:

Average search latency is 0.981x that of the reference (lower is better).
Ref run id: 3783, ref commit: 184286b
Link

Copy link
Contributor Author

@trinity-1686a trinity-1686a left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟢 (i can't approve, this is technically my pr)

@fulmicoton fulmicoton merged commit 6b4acd0 into main Oct 18, 2024
5 checks passed
@fulmicoton fulmicoton deleted the trinity/gc-no-index_uid-in branch October 18, 2024 01:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants