-
Notifications
You must be signed in to change notification settings - Fork 3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
enhance: speed up search iterator stage 1 #37947
base: master
Are you sure you want to change the base?
Conversation
17da513
to
a6bd30c
Compare
@PwzXxm cpp-unit-test check failed, comment |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #37947 +/- ##
===========================================
+ Coverage 69.32% 80.98% +11.65%
===========================================
Files 292 1381 +1089
Lines 26183 194670 +168487
===========================================
+ Hits 18152 157644 +139492
- Misses 8031 31463 +23432
- Partials 0 5563 +5563
|
rerun cpp-unit-test |
@PwzXxm cpp-unit-test check failed, comment |
00a762f
to
97467bf
Compare
@PwzXxm E2e jenkins job failed, comment |
@PwzXxm go-sdk check failed, comment |
@PwzXxm cpp-unit-test check failed, comment |
97467bf
to
e74fbfd
Compare
@PwzXxm cpp-unit-test check failed, comment |
@PwzXxm E2e jenkins job failed, comment |
e74fbfd
to
de74b0c
Compare
@PwzXxm cpp-unit-test check failed, comment |
/hold |
/unhold |
@PwzXxm cpp-unit-test check failed, comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: just some questions
heap.pop(); | ||
|
||
// last_bound may change between NextBatch calls, discard any invalid results | ||
if (!IsValid(cur_rst, last_bound, radius, range_filter)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so v2 iterator will not return better results compared to former iterations page?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not in this stage, the next one will try to take care of this.
const float dist = result.first; | ||
const bool is_valid = | ||
!last_bound.has_value() || dist > last_bound.value(); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: no need to consider the positive or negative metrics for dist and last_bound?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The distances are converted when entering this class, no need to worry about it here
@@ -124,6 +125,19 @@ SearchOnGrowing(const segcore::SegmentGrowingImpl& segment, | |||
|
|||
// step 3: brute force search where small indexing is unavailable | |||
auto vec_ptr = record.get_data_base(vecfield_id); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cached itrator will be created every time, so what is 'cached'?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It will introduce a pool of results in the next stage, as commented in https://github.com/milvus-io/milvus/pull/37947/files/9f6b88743198a575eb84cb427bcd41a7631676b7#diff-7344957165f4632a9363de767323618b7db0bd2d0f7cf7165965d3fb2612f18b. This class tries to provide a framework for the further implementation. If you think this name is confusing, I will change the naming if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no need to bother, just follow your scheme
search_result.distances_.resize(nq_ * batch_size_); | ||
|
||
for (size_t query_idx = 0; query_idx < nq_; ++query_idx) { | ||
auto rst = GetBatchedNextResults(query_idx, search_info); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
NIT: seems that offsets and distances data retrieved are copied twice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The distance-id pairs need to be sorted before copy to the search_result
. Knowhere needs to provide the ability to give batched results via iterator to eliminate this copy.
bd0b450
to
8774ff5
Compare
@PwzXxm go-sdk check failed, comment |
8774ff5
to
9e649c7
Compare
@PwzXxm cpp-unit-test check failed, comment |
5b68678
to
1995a47
Compare
/hold |
/unhold |
/lgtm |
/hold |
1995a47
to
0552b7d
Compare
New changes are detected. LGTM label has been removed. |
@PwzXxm go-sdk check failed, comment |
rerun go-sdk |
Signed-off-by: Patrick Weizhi Xu <[email protected]>
0552b7d
to
9016c4a
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: PwzXxm The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@PwzXxm go-sdk check failed, comment |
rerun go-sdk |
issue: #37548