[PERF] Use binary search in positional posting list (#2424)
## Description of changes

*Summarize the changes made by this PR.*
 - Improvements & Bug fixes
	 - Uses binary search in the positional posting list (PPL), since its
	   doc id entries are ordered. On a dataset of 1107 posting lists with
	   ~450,000 doc ids, the aggregate time spent inserting was ~100
	   seconds. This change reduces it to 0.01 seconds. :)
 - New functionality
	 - None
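
For reviewers, a minimal sketch of the difference between the two lookups. It assumes a plain `&[i32]` slice rather than the Arrow `Int32Array` used in the actual code, and the helper names `find_linear` and `find_binary` are hypothetical, made up for illustration. The binary search version is only correct if the doc ids are sorted in ascending order.

```rust
/// Linear scan: O(n) per lookup, works on unsorted input.
/// (Hypothetical helper, not part of the codebase.)
fn find_linear(doc_ids: &[i32], doc_id: i32) -> Option<usize> {
    doc_ids.iter().position(|&x| x == doc_id)
}

/// Binary search: O(log n) per lookup.
/// Assumes `doc_ids` is sorted ascending; on unsorted input the result
/// is unspecified. `binary_search` returns `Err(insertion_point)` on a
/// miss, which `.ok()` turns into `None`.
fn find_binary(doc_ids: &[i32], doc_id: i32) -> Option<usize> {
    doc_ids.binary_search(&doc_id).ok()
}

fn main() {
    let doc_ids: Vec<i32> = vec![2, 5, 9, 14, 21];

    // Both strategies agree on sorted input without duplicates.
    assert_eq!(find_linear(&doc_ids, 14), Some(3));
    assert_eq!(find_binary(&doc_ids, 14), Some(3));

    // A doc id that is absent yields None in both cases.
    assert_eq!(find_linear(&doc_ids, 7), None);
    assert_eq!(find_binary(&doc_ids, 7), None);
}
```

If a list could contain duplicate doc ids, `binary_search` may return the index of any matching entry, whereas `position` always returns the first one, so the two are interchangeable only when doc ids are unique.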

## Test plan
*How are these changes tested?*
- [x] Tests pass locally with `pytest` for python, `yarn test` for js,
`cargo test` for rust

## Documentation Changes
None
HammadB authored Jun 26, 2024
1 parent 7b3a751 commit e37a24c
Showing 1 changed file with 1 addition and 1 deletion.
@@ -20,7 +20,7 @@ impl PositionalPostingList {
     }

     pub(crate) fn get_positions_for_doc_id(&self, doc_id: i32) -> Option<Int32Array> {
-        let index = self.doc_ids.iter().position(|x| x == Some(doc_id));
+        let index = self.doc_ids.values().binary_search(&doc_id).ok();
         match index {
             Some(index) => {
                 let target_positions = self.positions.value(index);
