Re-Call Issue Fix with Binary Quantized Vectors #2071
Signed-off-by: VIKASH TIWARI <[email protected]>
Overall, the code looks good to me. There are some minor comments; resolve them as you see fit.
Apart from this, I would recommend doing two things:
- Update the PR description with details on the bugs that were found.
- Also add details on how you tested the fix; for example, if you ran benchmarks to measure recall, add those results too.
b6e6ed1 to a390c1d
One minor nit but not blocking. Otherwise, LGTM!
Looks like there are some merge conflicts that need to be fixed @Vikasht34
Signed-off-by: VIKASH TIWARI <[email protected]>
Signed-off-by: Vikasht34 <[email protected]>
* Re-Call Issue Fix with Binary Quantized Vectors Signed-off-by: VIKASH TIWARI <[email protected]> * Feedback Fix Signed-off-by: VIKASH TIWARI <[email protected]> --------- Signed-off-by: VIKASH TIWARI <[email protected]> Signed-off-by: Vikasht34 <[email protected]> (cherry picked from commit ce735c4)
@@ -33,4 +58,11 @@ public interface QuantizationOutput<T> {
 * @return true if the quantized vector is already prepared, false otherwise.
 */
boolean isPrepared(int vectorLength);
nit: We should remove this, as it is dead code at this point. We can add it back if we ever need it in the future.
quantizationOutput.prepareQuantizedVector(vectorLength);

// Assert
assertNotNull(quantizationOutput.getQuantizedVector());
A not-null check isn't tight enough. Can you check that the length is 10 and that every element is 0?
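A minimal sketch of the tighter check, written in plain Java rather than the project's test framework. `ZeroedVectorCheck` and `isZeroedWithLength` are hypothetical names, and the `byte[]` element type is an assumption; only the expected length of 10 and the all-zero requirement come from the comment above.

```java
class ZeroedVectorCheck {
    /**
     * Returns true only if the vector is non-null, has exactly the expected
     * length, and every element is 0. This is stricter than a not-null check.
     */
    static boolean isZeroedWithLength(byte[] vector, int expectedLength) {
        if (vector == null || vector.length != expectedLength) {
            return false;
        }
        for (byte b : vector) {
            if (b != 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] fresh = new byte[10];          // Java zero-initializes new arrays
        byte[] dirty = new byte[10];
        dirty[3] = 1;                         // one non-zero element

        System.out.println(isZeroedWithLength(fresh, 10)); // true
        System.out.println(isZeroedWithLength(dirty, 10)); // false
        System.out.println(isZeroedWithLength(fresh, 8));  // false: wrong length
    }
}
```

In a JUnit test the same idea would usually be expressed as a length assertion followed by an array comparison against a zero-filled array of the expected size.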
@Before
public void setUp() throws Exception {
    super.setUp();
    quantizationOutput = new BinaryQuantizationOutput(BITS_PER_COORDINATE);
This member variable seems to create an interdependency between tests. Can we create a new instance in every test, please?
 * Returns a copy of the quantized vector. This is because of during transfer same vectors was getting
 * added due to reference.
 */
return indexBuildSetup.getQuantizationOutput().getQuantizedVectorCopy();
Can we update the unit test for this class to make sure we always return a copy? We need an assertNotSame check in the unit test of this class.
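To illustrate why returning a copy matters, here is a small self-contained sketch of the aliasing problem the code comment describes. `QuantizedVectorCopyDemo` and `ReusedBuffer` are hypothetical stand-ins, not the PR's classes; the sketch assumes the quantization output reuses a single internal `byte[]` across vectors.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class QuantizedVectorCopyDemo {
    /** Hypothetical stand-in for an output object that reuses one internal buffer. */
    static class ReusedBuffer {
        private final byte[] buffer;

        ReusedBuffer(int length) {
            buffer = new byte[length];
        }

        void write(byte value) {
            Arrays.fill(buffer, value);       // overwrite the buffer in place
        }

        byte[] get() {
            return buffer;                    // exposes the shared reference
        }

        byte[] getCopy() {
            return Arrays.copyOf(buffer, buffer.length); // independent snapshot
        }
    }

    /** Quantizes three "vectors" (values 1, 2, 3) and stores each result. */
    static List<byte[]> collect(boolean copy) {
        ReusedBuffer out = new ReusedBuffer(2);
        List<byte[]> stored = new ArrayList<>();
        for (byte v = 1; v <= 3; v++) {
            out.write(v);
            stored.add(copy ? out.getCopy() : out.get());
        }
        return stored;
    }

    public static void main(String[] args) {
        // Without copying, every entry is the same array holding the last value.
        for (byte[] v : collect(false)) System.out.println(Arrays.toString(v));
        // With copying, each entry keeps the value it had when it was stored.
        for (byte[] v : collect(true)) System.out.println(Arrays.toString(v));
    }
}
```

A reference-identity assertion such as assertNotSame catches the first case directly, because two calls on the same output object would return the very same array.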
* Re-Call Issue Fix with Binary Quantized Vectors Signed-off-by: VIKASH TIWARI <[email protected]> * Feedback Fix Signed-off-by: VIKASH TIWARI <[email protected]> --------- Signed-off-by: VIKASH TIWARI <[email protected]> Signed-off-by: Vikasht34 <[email protected]> (cherry picked from commit ce735c4) Co-authored-by: Vikasht34 <[email protected]>
Description
This PR addresses a critical issue that was identified during benchmarking, where the recall performance unexpectedly dropped below 1. The root cause of the issue was traced to two main problems in the quantization and vector handling process:
Testing
Benchmarked with the NQ dataset. Results are at s3://disk-based-ann-bq/NQ-1M-768/
Recall for one-bit quantization with 2x oversampling is 0.94.
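The PR does not say how the benchmark computed recall. As a reference point, the sketch below uses the common definition: the fraction of true nearest neighbors that appear in the returned results. `RecallAtK` and its method are hypothetical names, and the document IDs are made-up illustration values, not benchmark data.

```java
import java.util.HashSet;
import java.util.Set;

class RecallAtK {
    /**
     * Recall = |retrieved ∩ groundTruth| / |groundTruth|.
     * Duplicate IDs in either array are counted once.
     */
    static double recall(int[] groundTruth, int[] retrieved) {
        if (groundTruth.length == 0) {
            throw new IllegalArgumentException("groundTruth must not be empty");
        }
        Set<Integer> truth = new HashSet<>();
        for (int id : groundTruth) {
            truth.add(id);
        }
        Set<Integer> counted = new HashSet<>();
        int hits = 0;
        for (int id : retrieved) {
            // Count an ID only if it is a true neighbor and not already counted.
            if (truth.contains(id) && counted.add(id)) {
                hits++;
            }
        }
        return (double) hits / truth.size();
    }

    public static void main(String[] args) {
        int[] exactNeighbors = {1, 2, 3, 4};  // illustrative exact top-4
        int[] approxNeighbors = {1, 2, 3, 9}; // illustrative approximate top-4
        System.out.println(recall(exactNeighbors, approxNeighbors)); // 0.75
    }
}
```

Under this definition, a recall of 0.94 means that on average 94% of the exact nearest neighbors were present in the approximate results.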
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.