Partial loading implementation for FAISS HNSW #2405
base: main
Conversation
Signed-off-by: Dooyong Kim <[email protected]>
Please note that will make sure all
package org.opensearch.knn.partialloading;

public class KdyPerfCheck {
This is a temporary class for tracking performance.
It will be removed before merging to main.
@@ -106,7 +106,7 @@ public void flush(int maxDoc, final Sorter.DocMap sortMap) throws IOException {
final QuantizationState quantizationState = train(field.getFieldInfo(), knnVectorValuesSupplier, totalLiveDocs);
// Check only after quantization state writer finish writing its state, since it is required
// even if there are no graph files in segment, which will be later used by exact search
if (shouldSkipBuildingVectorDataStructure(totalLiveDocs)) {
if (false /*TMP*/ && shouldSkipBuildingVectorDataStructure(totalLiveDocs)) {
This is temporary code. It will be reverted before merging.
Partial Loading Code Review Breakdown

1. Goal
This document provides a comprehensive overview of a large PR on partial loading, to minimize the time reviewers need to complete a review.

2. Scope
Design Document: RFC
1. Supported Vector Types
2. Supported Metrics
3. Filtered Query
4. Nested Vectors
5. Sparse Vector Documents

3. Breakdown
The PR can be divided into two main parts, with the search part further split into five subparts:

[Part 1] Index partial loading
[Part 2] Search
2.1. Partial Loading Basic Framework
2.2. Normal Case — Happy Path
This is the straightforward case with no filtering IDs, no parent IDs, and all documents having indexed vectors.
2.3. Having a Filtering
- With Filtering:
- No Integer List Conversion:
2.4. Having Parent IDs
- Parent IDs Handling:
- Conversion to BitSet:
- Grouper Creation:
- Parent-Level BFS in HNSW:
2.5. Sparse Vector Documents
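To make the "Conversion to BitSet" and parent-lookup steps above concrete, here is a minimal, hedged sketch. It uses `java.util.BitSet` for simplicity (the actual PR works with Lucene bit sets), and all class and method names here are illustrative, not the PR's real API. It relies on the Lucene block-join convention that a parent document is stored immediately after its children, so a child's parent is the next set bit after the child's doc ID.

```java
import java.util.BitSet;

// Illustrative sketch only; names do not come from the PR.
public class FilterBitSetSketch {

    // Convert a sorted array of filtered doc IDs into a BitSet so the HNSW
    // traversal can test membership in O(1) instead of scanning an int list.
    static BitSet toBitSet(int[] filteredDocIds, int maxDoc) {
        BitSet bits = new BitSet(maxDoc);
        for (int docId : filteredDocIds) {
            bits.set(docId);
        }
        return bits;
    }

    // With nested vectors, a child doc's parent is the first parent bit
    // strictly after the child's doc ID (parents follow their children).
    static int parentOf(int childDocId, BitSet parentBits) {
        return parentBits.nextSetBit(childDocId + 1);
    }

    public static void main(String[] args) {
        BitSet filter = toBitSet(new int[] {1, 4, 7}, 10);
        System.out.println(filter.get(4)); // doc 4 passes the filter
        System.out.println(filter.get(5)); // doc 5 does not

        // Docs 3 and 8 are parent documents; children 0-2 roll up to parent 3.
        BitSet parents = toBitSet(new int[] {3, 8}, 10);
        System.out.println(parentOf(1, parents));
    }
}
```

A per-parent deduplication of this kind is what lets the parent-level BFS return at most one result per parent document instead of several near-identical child vectors.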
Description
RFC : #2401
OpenSearch KNN plugin supports three engines: NMSLIB, FAISS, and Lucene.
The first two native engines, NMSLIB and FAISS, require all vector-related data structures (such as HNSW graphs) to be loaded into memory for search operations.
For large workloads, this memory cost can quickly become substantial if quantization techniques are not applied.
Therefore, 'Partial Loading' must be enabled as an option in native engines to control the available memory for KNN search. The objective of partial loading is twofold:
To allow users to control the maximum memory available for KNN searching.
To enable native engines to partially load only the necessary data within the constraint.
If we look closely, an HNSW graph mainly consists of the following:
Full-precision 32-bit vectors.
Graph representations.
Metadata such as dimensions, number of vectors, space type, headers, etc.
Of the above, most memory is consumed by the full-precision vectors: 4 bytes * the number of vectors * the number of dimensions.
FAISS stores these vectors in a Flat Index; during serialization and deserialization they are written to and read from the file and placed in main memory, which increases memory consumption.
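The formula above (4 bytes per float32 component times vectors times dimensions) can be made concrete with a back-of-the-envelope estimate. The corpus size and dimension below are illustrative values, not figures from the PR:

```java
// Estimate of full-precision vector memory per the formula above.
public class VectorMemoryEstimate {

    // 4 bytes per float32 component * vectors * dimensions.
    static long fullPrecisionBytes(long numVectors, int dimension) {
        return 4L * numVectors * dimension;
    }

    public static void main(String[] args) {
        // Example: 10 million 768-dimensional vectors.
        long bytes = fullPrecisionBytes(10_000_000L, 768);
        System.out.printf("%.1f GiB%n", bytes / (1024.0 * 1024 * 1024)); // ~28.6 GiB
    }
}
```

Even this modest corpus needs tens of GiB of heap-external memory if fully loaded, which is the cost that partial loading and quantization aim to cap.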
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
#2401
Check List
--signoff
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.