-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Load knn vectors format with mmapfs #78724
Conversation
Before the format used niofs. The current knn vectors implementation is based on the HNSW algorithm, which is designed for the case where the graph and vectors are be held in memory. Switching to mmapfs from niofs made a huge difference in ANN benchmarks, speeding up some searches over 3x.
Example benchmark results on nio (NIOFSDirectory)
mmap (MMapDirectory)
|
Pinging @elastic/es-search (Team:Search) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jtibshirani Oh very nice, indeed impressive gains, especially with higher recall, more than 3x times! Great notice, Julie!.
…' into feature/data_stream_support_routing * wjp/feature/data_stream_support_routing: (44 commits) Revert "Adjust /_cat/templates not to request all metadata (elastic#78812)" Allow indices lookup to be built lazily (elastic#78745) [DOCS] Document default security in alpha2 (elastic#78227) Add cluster applier stats (elastic#77552) Fix failing URLDecodeProcessorTests::testProcessor test (elastic#78690) Upgrade to lucene snapshot ba75dc5e6bf (elastic#78817) Adjust /_cat/templates not to request all metadata (elastic#78812) Simplify build plugin license handling (elastic#77009) Fix SearchableSnapshotsBlobStoreCacheIntegTests.testBlobStoreCache (elastic#78616) Improve Docker image caching and testing (elastic#78552) Load knn vectors format with mmapfs (elastic#78724) Fix date math zone test to use negative minutes (elastic#78796) Changing name of shards field in node/stats api to shard_stats (elastic#78531) [DOCS] Fix system index refs in restore tutorial (elastic#78582) Add previously removed settings back for 8.0 (elastic#78784) TSDB: Fix template name in test Add a system property to forcibly format everything (elastic#78768) Revert "Adding config so that some tests will break if over-the-wire encryption fails (elastic#78409)" (elastic#78787) Must date math test failure Adding config so that some tests will break if over-the-wire encryption fails (elastic#78409) ...
Before the format used niofs. The current knn vectors implementation is based on
the HNSW algorithm, which is designed for the case where the graph and vectors
are be held in memory. Switching to mmapfs from niofs made a big difference in
ANN benchmarks, speeding up some searches over 3x.
Relates to #78473.