Add documentation for k-NN Faiss SQfp16 #6249

naveentatikonda · 2024-01-24T00:13:29Z

Description

Add documentation for the new k-NN Faiss encoder SQfp16, which uses scalar quantization to convert 32-bit float vectors into 16-bit float values, reducing memory usage with minimal loss of precision. It also boosts overall performance by enabling SIMD support (the vector dimension must be a multiple of 8) on Linux and macOS.

Issues Resolved

Closes #5038

Checklist

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and subject to the Developer Certificate of Origin.
For more information on following the Developer Certificate of Origin and signing off your commits, please check here.

kolchfa-aws

Thank you, @naveentatikonda! A couple of suggestions, and then we'll move this PR to editorial review.

_search-plugins/knn/knn-index.md

naveentatikonda · 2024-01-25T23:40:21Z

@kolchfa-aws Thanks for reviewing it. I have addressed your review comments.

natebower

@kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

_search-plugins/knn/knn-index.md

naveentatikonda · 2024-02-02T22:04:47Z

@kolchfa-aws @natebower I need to make more changes to this existing documentation. Will address all the review comments and update it on Monday.

naveentatikonda · 2024-02-06T16:55:36Z

Unfortunately, we need to postpone this feature to 2.13 because of some build-related issues. @kolchfa-aws, can you please help update the labels on the PR and the GitHub issue? Thanks!
opensearch-project/opensearch-build#4386 (comment)

hdhalter · 2024-03-12T19:03:10Z

@naveentatikonda - Has anything changed, or is this content good to go? Thanks!

naveentatikonda · 2024-03-12T20:35:06Z

> @naveentatikonda - Has anything changed, or is this content good to go? Thanks!

This documentation needs to be updated. I will make changes this week. Thanks!

kolchfa-aws · 2024-03-22T17:54:18Z

@naveentatikonda I addressed your comments.

_search-plugins/knn/knn-vector-quantization.md

jmazanec15 · 2024-03-28T16:07:27Z

@kolchfa-aws can this be merged?

kolchfa-aws · 2024-03-28T16:10:13Z

@jmazanec15 @naveentatikonda requested a tech review on this PR. Once the tech review is done, we will do an editorial review and then we'll merge.

naveentatikonda · 2024-03-28T16:13:33Z

> @kolchfa-aws can this be merged?

@jmazanec15 I'm waiting for @vamshin to review this PR before moving it to editorial review

vamshin · 2024-03-28T17:32:53Z

_search-plugins/knn/knn-index.md

 ## Lucene byte vector

 Starting with k-NN plugin version 2.9, you can use `byte` vectors with the `lucene` engine in order to reduce the amount of storage space needed. For more information, see [Lucene byte vector]({{site.url}}{{site.baseurl}}/field-types/supported-field-types/knn-vector#lucene-byte-vector).

+## SIMD optimization for the Faiss engine
+
+Starting with version 2.13, the k-NN plugin supports [Single Instruction Multiple Data (SIMD)](https://en.wikipedia.org/wiki/Single_instruction,_multiple_data) processing if the underlying hardware supports SIMD instructions (AVX2 on x64 architecture and Neon on ARM64 architecture). SIMD is supported by default on Linux machines only for the Faiss engine. SIMD architecture helps boost the overall performance by improving indexing throughput and reducing search latency.


SIMD is supported by default on Linux machines only for the Faiss engine.

SIMD should be CPU architecture dependent, right? Why do we say only Linux machines?

Yes, SIMD is CPU architecture dependent. However, we are currently running into issues on Windows because of compiler limitations, so we support SIMD only on Linux and on macOS (for development only). That's why we explicitly call out that it works on Linux.

vamshin · 2024-03-28T17:35:20Z

_search-plugins/knn/knn-index.md

-You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. faiss has
-several encoder types, but the plugin currently only supports *flat* and *pq* encoding.
+You can use encoders to reduce the memory footprint of a k-NN index at the expense of search accuracy. Faiss has
+several encoder types, but the plugin currently only supports `flat`, `pq`, and `sq` encoding.


Faiss has
several encoder types, but the plugin currently only supports flat, pq, and sq encoding

The k-NN plugin currently supports the flat, pq, and sq encoders from the Faiss library?

vamshin · 2024-03-28T17:38:50Z

_search-plugins/knn/knn-index.md

+
+Parameter name | Required | Default | Updatable | Description
+:--- | :--- | :-- | :--- | :---
+`type` | false | `fp16` | false |  The type of scalar quantization to be used to encode 32-bit float vectors into the corresponding type. As of OpenSearch 2.13, only the `fp16` encoder type is supported. For the `fp16` encoder, vector values must be in the [-65504.0, 65504.0] range. 


For the fp16 encoder, vector values must be in the [-65504.0, 65504.0] range.

By default, the fp16 encoder expects vector values to be in the [-65504.0, 65504.0] range.

Also, let's add the above as a note and probably bold or highlight it.

We normally don't format sentences as a note in the parameter table.

Got it. Shall we add a note about this in the Faiss scalar quantization section?
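
For readers following this thread, here is a minimal sketch of what an index body using the `sq` encoder with `type` set to `fp16` might look like, built as a Python dictionary. Only `method.parameters.encoder`, the `sq` encoder name, and the `type` and `clip` parameters come from the diff under review; the helper name `sq_fp16_mapping`, the field name `my_vector`, the dimension, and the `hnsw`/`l2` choices are illustrative assumptions rather than text from this PR.

```python
import json


def sq_fp16_mapping(field, dimension, clip=False):
    """Build a hypothetical index body that uses the Faiss `sq` encoder.

    Everything outside the `encoder` object is an assumption for illustration.
    """
    return {
        "settings": {"index": {"knn": True}},
        "mappings": {
            "properties": {
                field: {
                    "type": "knn_vector",
                    "dimension": dimension,
                    "method": {
                        "name": "hnsw",        # assumed method
                        "engine": "faiss",
                        "space_type": "l2",    # assumed space type
                        "parameters": {
                            # The part discussed in this review thread.
                            "encoder": {
                                "name": "sq",
                                "parameters": {"type": "fp16", "clip": clip},
                            }
                        },
                    },
                }
            }
        },
    }


# Dimension 8 is chosen because the PR description says SIMD requires a multiple of 8.
print(json.dumps(sq_fp16_mapping("my_vector", 8, clip=True), indent=2))
```

The printed JSON could serve as the body of an index creation request; the exact request shape should be checked against the published k-NN documentation.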

vamshin · 2024-03-28T17:47:07Z

_search-plugins/knn/knn-index.md

@@ -221,6 +322,8 @@ If you want to use less memory and index faster than HNSW, while maintaining sim

 If memory is a concern, consider adding a PQ encoder to your HNSW or IVF index. Because PQ is a lossy encoding, query quality will drop.

+If you want to reduce the memory requirements by a factor of 2 (with very minimal loss of search quality) or by a factor of 4 (with a significant drop in search quality), consider vector quantization. To learn more about vector quantization options, see [k-NN vector quantization]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-vector-quantization/). 


You can reduce the memory footprint by a factor of 2 by using the fp_16 encoder technique (provide link?) with minimal loss in search quality. If your vector dimensions fit in the byte range [-128, 128], we recommend using the byte quantizer (provide link?) to cut down the memory footprint by a factor of 4.

The byte range is [-128, 127], correct?

Yes, the byte range is [-128, 127].
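
The factors of 2 and 4 mentioned in this thread follow from the number of bytes stored per vector dimension. The sketch below is a rough illustration only: it counts raw vector storage and ignores graph or index overhead, and the function name `raw_vector_bytes` and the corpus size are hypothetical.

```python
# Bytes needed to store one dimension under each representation.
BYTES_PER_DIMENSION = {"float32": 4, "fp16": 2, "byte": 1}


def raw_vector_bytes(num_vectors, dimension, encoding="float32"):
    """Return the raw storage for the vectors alone, in bytes."""
    return num_vectors * dimension * BYTES_PER_DIMENSION[encoding]


# Hypothetical corpus: one million 256-dimensional vectors.
baseline = raw_vector_bytes(1_000_000, 256)
for encoding in ("fp16", "byte"):
    reduced = raw_vector_bytes(1_000_000, 256, encoding)
    print(f"{encoding}: {reduced} bytes, reduction factor {baseline // reduced}")
```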

vamshin

LGTM! Thanks

natebower

@naveentatikonda @kolchfa-aws Please see my comments and changes and let me know if you have any questions. Thanks!

_search-plugins/knn/knn-index.md

natebower · 2024-03-29T11:52:50Z

_search-plugins/knn/knn-vector-quantization.md

+
+Optionally, you can specify the parameters in `method.parameters.encoder`. For more information about parameters within the `encoder` object, see [SQ parameters]({{site.url}}{{site.baseurl}}/search-plugins/knn/knn-index/#sq-parameters).
+
+The `fp16` encoder converts 32-bit vectors into their 16-bit counterparts. For this encoder type, the vector values must be in the [-65504.0, 65504.0] range. To define handling out-of-range values, the preceding request specifies the `clip` parameter. By default, this parameter is `false` and any vectors containing out-of-range values are rejected. When `clip` is set to `true` (as in the preceding request), out-of-range vector values are rounded up or down so that they are in the supported range. For example, if the original 32-bit vector is `[65510.82, -65504.1]`, the vector will be indexed as a 16-bit vector `[65504.0, -65504.0]`.


What do we mean by "To define handling"?
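
The out-of-range handling described in the paragraph under review can be sketched in a few lines of Python. This is a minimal illustration of the documented behavior, not the plugin's implementation: the function name `quantize_fp16` is hypothetical, and the round trip through the standard library's half-precision `struct` format merely stands in for the Faiss encoder.

```python
import struct

FP16_MAX = 65504.0  # Largest finite value in the documented fp16 range.


def quantize_fp16(vector, clip=False):
    """Convert float values to their nearest 16-bit float representation.

    With clip=False, out-of-range values cause the whole vector to be rejected.
    With clip=True, they are pulled back to the nearest range boundary.
    """
    quantized = []
    for value in vector:
        if abs(value) > FP16_MAX:
            if not clip:
                raise ValueError(f"{value} is outside [-{FP16_MAX}, {FP16_MAX}]")
            value = max(-FP16_MAX, min(FP16_MAX, value))
        # Round-trip through the IEEE 754 half-precision ("e") format.
        quantized.append(struct.unpack("<e", struct.pack("<e", value))[0])
    return quantized


print(quantize_fp16([65510.82, -65504.1], clip=True))  # [65504.0, -65504.0]
```

Calling `quantize_fp16([65510.82, -65504.1])` without `clip=True` raises a `ValueError`, mirroring the default rejection described in the diff.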

kolchfa-aws self-assigned this Jan 24, 2024

kolchfa-aws added v2.12.0 release-notes PR: Include this PR in the automated release notes 4 - Doc review PR: Doc review in progress labels Jan 24, 2024

kolchfa-aws approved these changes Jan 25, 2024

_search-plugins/knn/knn-index.md Outdated

_search-plugins/knn/knn-index.md Outdated

_search-plugins/knn/knn-index.md Outdated

naveentatikonda force-pushed the add_knn_sqfp16 branch from f73a858 to 9ef2e13 Compare January 25, 2024 23:39

naveentatikonda marked this pull request as ready for review January 25, 2024 23:39

naveentatikonda requested review from hdhalter, Naarcha-AWS, vagimeli, AMoo-Miki, natebower and dlvenable as code owners January 25, 2024 23:39

natebower reviewed Jan 26, 2024

naveentatikonda mentioned this pull request Jan 30, 2024

Change the Artifact Build Process for k-NN Plugin opensearch-project/opensearch-build#4386

Closed

kolchfa-aws added v2.13.0 and removed 4 - Doc review PR: Doc review in progress v2.12.0 labels Feb 6, 2024

hdhalter added the 3 - Tech review PR: Tech review in progress label Mar 4, 2024

naveentatikonda force-pushed the add_knn_sqfp16 branch from 9ef2e13 to 3541335 Compare March 18, 2024 22:52

naveentatikonda requested a review from stephen-crawford as a code owner March 18, 2024 22:52

naveentatikonda force-pushed the add_knn_sqfp16 branch 2 times, most recently from fd2db18 to 7459e8d Compare March 18, 2024 23:22

Add Documentation for k-NN Faiss SQFP16

8980923

Signed-off-by: Naveen Tatikonda <[email protected]>

naveentatikonda force-pushed the add_knn_sqfp16 branch from 7459e8d to 8980923 Compare March 18, 2024 23:27

kolchfa-aws reviewed Mar 22, 2024

_search-plugins/knn/knn-vector-quantization.md Outdated

kolchfa-aws and others added 7 commits March 22, 2024 14:10

Update _search-plugins/knn/knn-vector-quantization.md

1e57c91

Signed-off-by: kolchfa-aws <[email protected]>

Add note about SIMD

533f594

Signed-off-by: Fanit Kolchina <[email protected]>

Merge branch 'add_knn_sqfp16' of github.com:naveentatikonda/documentation-website into add_knn_sqfp16

3e2f0be

Reworded recall loss

318ab5b

Signed-off-by: Fanit Kolchina <[email protected]>

Reword according to tech review feedback

b98837f

Signed-off-by: Fanit Kolchina <[email protected]>

Tech review comment

b26511a

Signed-off-by: Fanit Kolchina <[email protected]>

Add warning about Windows

e84d905

Signed-off-by: Fanit Kolchina <[email protected]>

vamshin reviewed Mar 28, 2024

Tech review comments

4d62a9b

Signed-off-by: Fanit Kolchina <[email protected]>

vamshin approved these changes Mar 28, 2024

natebower approved these changes Mar 29, 2024

hdhalter added 5 - Editorial review PR: Editorial review in progress and removed 4 - Doc review PR: Doc review in progress labels Mar 29, 2024

kolchfa-aws reviewed Mar 29, 2024

_search-plugins/knn/knn-index.md Outdated

kolchfa-aws reviewed Mar 29, 2024

_search-plugins/knn/knn-vector-quantization.md Outdated

kolchfa-aws and others added 2 commits March 29, 2024 10:54

Apply suggestions from code review

6a6d38e

Co-authored-by: Nathan Bower <[email protected]> Signed-off-by: kolchfa-aws <[email protected]>

Define IVF

341daad

Signed-off-by: Fanit Kolchina <[email protected]>

kolchfa-aws reviewed Mar 29, 2024

_search-plugins/knn/knn-index.md Outdated

Update _search-plugins/knn/knn-index.md

74c4c75

Signed-off-by: kolchfa-aws <[email protected]>

kolchfa-aws reviewed Mar 29, 2024

_search-plugins/knn/knn-index.md Outdated

kolchfa-aws added 2 commits March 29, 2024 11:41

Update _search-plugins/knn/knn-index.md

9a6c4e2

Signed-off-by: kolchfa-aws <[email protected]>

Merge branch 'main' into add_knn_sqfp16

900478d

Signed-off-by: kolchfa-aws <[email protected]>

kolchfa-aws merged commit 5d9edcb into opensearch-project:main Mar 29, 2024
3 checks passed

hdhalter added 3 - Done Issue is done/complete and removed 5 - Editorial review PR: Editorial review in progress labels Mar 29, 2024
