Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib. #1602

Collaborator

@navneet1v navneet1v commented Apr 9, 2024

Description

Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib.

Previously, the vector memory was freed in the Java layer, on the reasoning that the layer that initialized the memory should also release it.

However, because of the nmslib limitations described in #1600, doing so can triple the memory footprint during index creation. This change therefore frees the memory in the JNI layer instead.

This code will be removed once the long-term fix tracked in #1600 is in place.
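
To illustrate the idea, here is a minimal sketch of freeing the input buffer in the native layer as soon as the data has been copied, rather than waiting for the Java caller to free it after index creation returns. It is not the plugin's actual code: `LibraryDataset`, `allocateVectors`, and `createIndex` are hypothetical names, `std::int64_t` stands in for the `jlong` address handed over from Java, and the sketch assumes the library keeps its own copy of every vector.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <vector>

// Hypothetical stand-in for the copy of the data that the library owns.
struct LibraryDataset {
    std::vector<std::vector<float>> points;
};

// Hypothetical allocator: returns the opaque address the Java layer would hold.
std::int64_t allocateVectors(std::size_t count, std::size_t dim) {
    auto* buffer = new std::vector<float>(count * dim, 1.0f);
    return reinterpret_cast<std::int64_t>(buffer);
}

// Copies the vectors into the library-owned dataset and frees the input
// buffer before any further work, so both copies are not held during the build.
LibraryDataset createIndex(std::int64_t vectorAddress, std::size_t dim) {
    assert(dim > 0);
    // Take ownership so the buffer is released even if copying throws.
    std::unique_ptr<std::vector<float>> buffer(
        reinterpret_cast<std::vector<float>*>(vectorAddress));

    LibraryDataset dataset;
    const std::size_t count = buffer->size() / dim;
    dataset.points.reserve(count);
    for (std::size_t i = 0; i < count; ++i) {
        const float* start = buffer->data() + i * dim;
        dataset.points.emplace_back(start, start + dim);
    }

    // Free in the native layer; the caller must not use vectorAddress again.
    buffer.reset();

    // Index construction would run here, without the input buffer resident.
    return dataset;
}
```

The trade-off in this sketch is that ownership of the address passes to the native call, so the Java side must treat it as invalid once the call returns.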

JNI Test output

(base) 13:22 ~/workplace/k-NN/jni (stream-vectors-v2)$ ./bin/jni_test 
Running main() from /Users/navneev/workplace/k-NN/jni/googletest-src/googletest/src/gtest_main.cc
[==========] Running 22 tests from 20 test suites.
[----------] Global test environment set-up.
[----------] 1 test from FaissCreateIndexTest
[ RUN      ] FaissCreateIndexTest.BasicAssertions
[       OK ] FaissCreateIndexTest.BasicAssertions (11 ms)
[----------] 1 test from FaissCreateIndexTest (11 ms total)

[----------] 1 test from FaissCreateIndexFromTemplateTest
[ RUN      ] FaissCreateIndexFromTemplateTest.BasicAssertions
[       OK ] FaissCreateIndexFromTemplateTest.BasicAssertions (4 ms)
[----------] 1 test from FaissCreateIndexFromTemplateTest (4 ms total)

[----------] 3 tests from FaissLoadIndexTest
[ RUN      ] FaissLoadIndexTest.BasicAssertions
[       OK ] FaissLoadIndexTest.BasicAssertions (5 ms)
[ RUN      ] FaissLoadIndexTest.HNSWPQDisableSdcTable
WARNING clustering 256 points to 16 centroids: please provide at least 624 training points
[       OK ] FaissLoadIndexTest.HNSWPQDisableSdcTable (420 ms)
[ RUN      ] FaissLoadIndexTest.IVFPQDisablePrecomputeTable
WARNING clustering 256 points to 16 centroids: please provide at least 624 training points
[       OK ] FaissLoadIndexTest.IVFPQDisablePrecomputeTable (414 ms)
[----------] 3 tests from FaissLoadIndexTest (840 ms total)

[----------] 1 test from FaissQueryIndexTest
[ RUN      ] FaissQueryIndexTest.BasicAssertions
[       OK ] FaissQueryIndexTest.BasicAssertions (6 ms)
[----------] 1 test from FaissQueryIndexTest (6 ms total)

[----------] 1 test from FaissQueryIndexWithFilterTest1435
[ RUN      ] FaissQueryIndexWithFilterTest1435.BasicAssertions
[       OK ] FaissQueryIndexWithFilterTest1435.BasicAssertions (11 ms)
[----------] 1 test from FaissQueryIndexWithFilterTest1435 (11 ms total)

[----------] 1 test from FaissQueryIndexWithParentFilterTest
[ RUN      ] FaissQueryIndexWithParentFilterTest.BasicAssertions
[       OK ] FaissQueryIndexWithParentFilterTest.BasicAssertions (5 ms)
[----------] 1 test from FaissQueryIndexWithParentFilterTest (5 ms total)

[----------] 1 test from FaissFreeTest
[ RUN      ] FaissFreeTest.BasicAssertions
[       OK ] FaissFreeTest.BasicAssertions (0 ms)
[----------] 1 test from FaissFreeTest (0 ms total)

[----------] 1 test from FaissInitLibraryTest
[ RUN      ] FaissInitLibraryTest.BasicAssertions
[       OK ] FaissInitLibraryTest.BasicAssertions (0 ms)
[----------] 1 test from FaissInitLibraryTest (0 ms total)

[----------] 1 test from FaissTrainIndexTest
[ RUN      ] FaissTrainIndexTest.BasicAssertions
[       OK ] FaissTrainIndexTest.BasicAssertions (0 ms)
[----------] 1 test from FaissTrainIndexTest (0 ms total)

[----------] 1 test from FaissCreateHnswSQfp16IndexTest
[ RUN      ] FaissCreateHnswSQfp16IndexTest.BasicAssertions
[       OK ] FaissCreateHnswSQfp16IndexTest.BasicAssertions (5 ms)
[----------] 1 test from FaissCreateHnswSQfp16IndexTest (5 ms total)

[----------] 1 test from FaissIsSharedIndexStateRequired
[ RUN      ] FaissIsSharedIndexStateRequired.BasicAssertions
[       OK ] FaissIsSharedIndexStateRequired.BasicAssertions (0 ms)
[----------] 1 test from FaissIsSharedIndexStateRequired (0 ms total)

[----------] 1 test from FaissInitAndSetSharedIndexState
[ RUN      ] FaissInitAndSetSharedIndexState.BasicAssertions
WARNING clustering 256 points to 16 centroids: please provide at least 624 training points
[       OK ] FaissInitAndSetSharedIndexState.BasicAssertions (368 ms)
[----------] 1 test from FaissInitAndSetSharedIndexState (368 ms total)

[----------] 1 test from IDGrouperBitMapTest
[ RUN      ] IDGrouperBitMapTest.BasicAssertions
[       OK ] IDGrouperBitMapTest.BasicAssertions (0 ms)
[----------] 1 test from IDGrouperBitMapTest (0 ms total)

[----------] 1 test from NmslibIndexWrapperSearchTest
[ RUN      ] NmslibIndexWrapperSearchTest.BasicAssertions
[       OK ] NmslibIndexWrapperSearchTest.BasicAssertions (0 ms)
[----------] 1 test from NmslibIndexWrapperSearchTest (0 ms total)

[----------] 1 test from NmslibCreateIndexTest
[ RUN      ] NmslibCreateIndexTest.BasicAssertions
[       OK ] NmslibCreateIndexTest.BasicAssertions (3 ms)
[----------] 1 test from NmslibCreateIndexTest (3 ms total)

[----------] 1 test from NmslibLoadIndexTest
[ RUN      ] NmslibLoadIndexTest.BasicAssertions
[       OK ] NmslibLoadIndexTest.BasicAssertions (1 ms)
[----------] 1 test from NmslibLoadIndexTest (2 ms total)

[----------] 1 test from NmslibQueryIndexTest
[ RUN      ] NmslibQueryIndexTest.BasicAssertions
[       OK ] NmslibQueryIndexTest.BasicAssertions (2 ms)
[----------] 1 test from NmslibQueryIndexTest (2 ms total)

[----------] 1 test from NmslibFreeTest
[ RUN      ] NmslibFreeTest.BasicAssertions
[       OK ] NmslibFreeTest.BasicAssertions (0 ms)
[----------] 1 test from NmslibFreeTest (0 ms total)

[----------] 1 test from NmslibInitLibraryTest
[ RUN      ] NmslibInitLibraryTest.BasicAssertions
[       OK ] NmslibInitLibraryTest.BasicAssertions (0 ms)
[----------] 1 test from NmslibInitLibraryTest (0 ms total)

[----------] 1 test from CommonsTests
[ RUN      ] CommonsTests.BasicAssertions
[       OK ] CommonsTests.BasicAssertions (0 ms)
[----------] 1 test from CommonsTests (0 ms total)

[----------] Global test environment tear-down
[==========] 22 tests from 20 test suites ran. (1265 ms total)
[  PASSED  ] 22 tests.

Issues Resolved

#1506

Check List

  • New functionality includes testing.
    • All tests pass
  • New functionality has been documented.
    • New functionality has javadoc added
  • Commits are signed as per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@navneet1v navneet1v added the skip-changelog, v2.14.0, and Enhancements labels Apr 9, 2024
Member

@jmazanec15 jmazanec15 left a comment

This looks good to me

@navneet1v
Collaborator Author

Fixing the tests.

@navneet1v navneet1v force-pushed the stream-vectors-v2 branch from 6719c99 to d6a98bb Compare April 9, 2024 21:27
@navneet1v navneet1v force-pushed the stream-vectors-v2 branch from d6a98bb to 0b8c2d0 Compare April 9, 2024 22:13
@navneet1v navneet1v merged commit ab16553 into opensearch-project:feature/stream-vectors Apr 9, 2024
43 of 48 checks passed
navneet1v added a commit to navneet1v/k-NN that referenced this pull request Apr 9, 2024
…layer to enable creation of larger segments for vector indices

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(opensearch-project#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (opensearch-project#1602)

Signed-off-by: Navneet Verma <[email protected]>
navneet1v added a commit that referenced this pull request Apr 10, 2024
…layer to enable creation of larger segments for vector indices (#1604)

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (#1602)

Signed-off-by: Navneet Verma <[email protected]>
navneet1v added a commit to navneet1v/k-NN that referenced this pull request Apr 10, 2024
…layer to enable creation of larger segments for vector indices (opensearch-project#1604)

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(opensearch-project#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (opensearch-project#1602)

Signed-off-by: Navneet Verma <[email protected]>
navneet1v added a commit that referenced this pull request Apr 10, 2024
…layer to enable creation of larger segments for vector indices (#1604) (#1608)

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (#1602)

Signed-off-by: Navneet Verma <[email protected]>
navneet1v added a commit to navneet1v/k-NN that referenced this pull request Apr 11, 2024
…layer to enable creation of larger segments for vector indices (opensearch-project#1604) (opensearch-project#1608)

Changes include:
1. Add the interface for streaming the vectors from java to jni layer with initial capacity (opensearch-project#1586)
2. Integrating storeVectors interfaces with createIndex and createIndexTemplate functions. (opensearch-project#1588)
3. Update KNN80BinaryDocValues reader count live docs and use live docs as initial capacity to initialize vector address(opensearch-project#1595)
4. Move free vectorAddress from Java to JNI layer to reduce the memory footprint for Nmslib (opensearch-project#1602)

Signed-off-by: Navneet Verma <[email protected]>