Prefetch page index (#4090) #4216
Thank you @tustvold -- I went through this PR carefully and it looks really nice 👌
I wonder if the same basic pattern could be applied to the Bloom filters as well, or if they suffer from the issue that they don't actually appear in the footer 🤔
cc @thinkharderdev and @Ted-Jiang
/// the last 8 bytes to determine the footer's precise length, before
/// issuing a second request to fetch the metadata bytes
///
/// If a `prefetch` is `Some`, this will read the specified number of bytes
👍
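For readers unfamiliar with the footer layout that the doc comment above relies on, here is a minimal sketch of the decision it describes. It assumes the standard unencrypted Parquet trailer (a 4-byte little-endian metadata length followed by the magic `PAR1`). The helper names `metadata_len` and `prefetch_covers_metadata` are hypothetical and are not part of the crate or of this PR.

```rust
/// Size of the Parquet trailer: a 4-byte little-endian metadata length
/// followed by the 4-byte magic `PAR1`.
const FOOTER_SIZE: usize = 8;

/// Hypothetical helper (not the crate's API): given the trailing bytes of a
/// file, return the metadata length recorded in the trailer. Returns `None`
/// if the suffix is shorter than the trailer or the magic does not match.
/// Only the unencrypted `PAR1` magic is handled in this sketch.
fn metadata_len(suffix: &[u8]) -> Option<usize> {
    if suffix.len() < FOOTER_SIZE {
        return None;
    }
    let footer = &suffix[suffix.len() - FOOTER_SIZE..];
    if &footer[4..] != b"PAR1" {
        return None;
    }
    let len = u32::from_le_bytes([footer[0], footer[1], footer[2], footer[3]]);
    Some(len as usize)
}

/// Hypothetical helper: `Some(true)` if the prefetched suffix already holds
/// the whole metadata block plus trailer, so no second request is needed;
/// `Some(false)` if a second fetch is required.
fn prefetch_covers_metadata(suffix: &[u8]) -> Option<bool> {
    let len = metadata_len(suffix)?;
    Some(suffix.len() >= len + FOOTER_SIZE)
}
```

With only the 8-byte trailer in hand, the second helper reports that another request is needed for any non-empty metadata; with a large enough prefetch it reports that the first read already suffices, which is the case the prefetch option is meant to enable.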
let mut loader = MetadataLoader::load(f, len, Some(130650)).await.unwrap();
assert_eq!(fetch_count.load(Ordering::SeqCst), 1);
loader.load_page_index(true, true).await.unwrap();
assert_eq!(fetch_count.load(Ordering::SeqCst), 1);
🎉
Thanks for the ping, I will review this carefully today.
👍
Yes, this was designed with them in mind. Bloom filters can be stored at the end of the file, which would allow prefetching to help. I'm not sure the writer currently does this, though.
Which issue does this PR close?
Closes #4090
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?
This adds `Send` constraints to `fetch_parquet_metadata`; this is unlikely to trip people up in practice.