Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
InnerProduct Distance Metric for CAGRA search #2260
InnerProduct Distance Metric for CAGRA search #2260
Changes from 27 commits
b6d9980
fac65b8
ad48b03
2b8d898
e44ab17
c506216
cb7fbba
0029bba
293bc8f
c91d895
7a5f876
c37aa27
890372b
810ddd1
f92e68b
5bbbc70
7febd73
7e19937
e40b967
b102393
6da7d55
1e666e3
c1f4dcd
2e9e3fe
d8a4b39
c5cd0e7
eabb031
77ff0c2
b61c1c3
54061d0
dda1810
743d5bc
5791743
f5f30b7
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I like this approach as it doesn't follow the standard API design flow that we've been using in hte other algos. I wonder if instead of adding this additional helper function if we could instead just add an overloaded constructor to the
index_params
object that accepts an mdsan with the dataset and sets the defaults automatically. This would then follow the standard build flow but add an option for the user to establish the ressonable defaultsby passing in the dataset.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't add constructors to the
ivf_pq::index_params
struct because the aggregate assertion here. A helper is needed. One thing that can be done is that instead of a constructor within theindex_params
struct, we can have a helperget_default_index_params
outside of the struct, but inivf_pq_types.hpp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, I'd just remove the static assertion there. It's not necessary. My problem with exposing helper functions like this is that they make the APIs harder to use as they aren't immediatley obvious and don't provide for a consistent experience. Keeping everything together within the
index_params
object itself allows there to be a single entrypoint (e.g. index_params) for construction. The other option would be to have factory functions which would always be used for construction, however I think that will also make things even more confusing unless we used that pattern across all the ANN APis.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we are constructing CAGRA specific
ivf_pq::index_params
. I do not think the constructor ofivf_pq::index_params
shall have the job to define parameters according to CAGRA's requirements.For consistent experience, the user just need to fill the
metric
field ofcagra::index_params
, which is inherited from ann::index_params, and callcagra::build
as usual.Users of
build_knn_graph
are advanced users who are usingbuild_knn_graph
&optimize
in a separate step. Other algos don't have these steps, this is custom for CAGRA. Before this PR, the user either passes a manually filledivf_pq::index_params
, or just uses a default. I think a helper function is an improvement.