-
Notifications
You must be signed in to change notification settings - Fork 125
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce a loading layer in NMSLIB. #2185
Introduce a loading layer in NMSLIB. #2185
Conversation
@@ -27,301 +27,354 @@ | |||
#include <string> | |||
|
|||
#include "hnswquery.h" | |||
#include "method/hnsw.h" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please search 'LoadIndexWithStream' in nmslib_wrapper.cpp
and ignore the formatting changes. 🫠
return (jlong) indexWrapper; | ||
} | ||
|
||
jlong knn_jni::nmslib_wrapper::LoadIndexWithStream(knn_jni::JNIUtilInterface *jniUtil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the place for using an input stream to load index within NMSLIB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other loadFunction will be removed in next iterations right?
@@ -12,79 +12,102 @@ | |||
#include "org_opensearch_knn_jni_NmslibService.h" | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please ignore the formatting changes and directly go to Java_org_opensearch_knn_jni_NmslibService_loadIndexWithStream
.
} | ||
|
||
JNIEXPORT jobjectArray JNICALL Java_org_opensearch_knn_jni_NmslibService_queryIndex(JNIEnv * env, jclass cls, | ||
JNIEXPORT jlong JNICALL Java_org_opensearch_knn_jni_NmslibService_loadIndexWithStream(JNIEnv *env, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the new method added in this PR
src/main/java/org/opensearch/knn/index/memory/NativeMemoryAllocation.java
Show resolved
Hide resolved
@@ -182,7 +157,7 @@ class TrainingLoadStrategy | |||
NativeMemoryLoadStrategy<NativeMemoryAllocation.TrainingDataAllocation, NativeMemoryEntryContext.TrainingDataEntryContext>, | |||
Closeable { | |||
|
|||
private static TrainingLoadStrategy INSTANCE; | |||
private static volatile TrainingLoadStrategy INSTANCE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make it volatile in Singleton pattern to avoid possible instruction order changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add this as a java doc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do in the next update
@@ -48,7 +48,7 @@ private void tripCb() throws Exception { | |||
createKnnIndex(indexName2, settings, createKnnIndexMapping(FIELD_NAME, 2)); | |||
|
|||
Float[] vector = { 1.3f, 2.2f }; | |||
int docsInIndex = 5; // through testing, 7 is minimum number of docs to trip circuit breaker at 1kb | |||
int docsInIndex = 7; // through testing, 7 is minimum number of docs to trip circuit breaker at 1kb |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's weird though, setting this to 5, it was slightly less than 1KB making this test fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think overall it looks good. A couple of comments
jni/include/nmslib_stream_support.h
Outdated
} | ||
|
||
protected: | ||
std::streamsize xsgetn(std::streambuf::char_type *destination, std::streamsize count) final { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this name mean? Also, why is this access protected?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please refer to the official doc : https://en.cppreference.com/w/cpp/io/basic_streambuf/sgetn
In short, xsgetn
is a protect virtual method defined in std::basic_streambuf
, which is called whenever sgetn
is called for the read delegation to the implementation.
x -> extended
s -> sequence
xs + getn = extended sequence getter function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh this is really cool
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this explanation as comments on the function. Hard to remember for future.
src/main/java/org/opensearch/knn/index/memory/NativeMemoryAllocation.java
Show resolved
Hide resolved
Loading Time ComparisonThe numbers below were measured through
ObservationUnlike FAISS, it took almost 81% more time when loading a system cached file. ExperimentIndex size : 30G 1. Baseline (Using fread)
2. Using Stream (4KB)
|
Performance BenchmarkMachine :
|
you might want to update it to say loading layer for nmslib. |
Will holding the merging until root cause the big gap in warmup time. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Code looks good to me. Few minor comments. For me the biggest blocker right now to approve the code is increase in latency for warmups. Once we fix that, I will approve the code.
writeLock(); | ||
try { | ||
writeLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any reason for moving this out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If an exception was thrown there, it will called unmatched unlock call in finally block.
I know the current implementation is not throwing any exceptions. Here I was trying to make the code robust no matter how other parts changed.
jni/include/nmslib_stream_support.h
Outdated
} | ||
|
||
protected: | ||
std::streamsize xsgetn(std::streambuf::char_type *destination, std::streamsize count) final { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add this explanation as comments on the function. Hard to remember for future.
return (jlong) indexWrapper; | ||
} | ||
|
||
jlong knn_jni::nmslib_wrapper::LoadIndexWithStream(knn_jni::JNIUtilInterface *jniUtil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other loadFunction will be removed in next iterations right?
@@ -182,7 +157,7 @@ class TrainingLoadStrategy | |||
NativeMemoryLoadStrategy<NativeMemoryAllocation.TrainingDataAllocation, NativeMemoryEntryContext.TrainingDataEntryContext>, | |||
Closeable { | |||
|
|||
private static TrainingLoadStrategy INSTANCE; | |||
private static volatile TrainingLoadStrategy INSTANCE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lets add this as a java doc.
Performance tuning planPlanning to continue below two tuning plans.
TODO : Can we allocate an aligned memory layout?
|
Wont page cache typically be write through? In which case, if graph is created and written on same node it is searched on, wont it be cached? |
1. NMSLIB Loading Perf Issue Analysis2. Performance Degradation In FAISSAfter switching from direct file API usage to an abstract IO loading layer, additional overhead was introduced due to JNI calls and buffer copying via In NMSLIB, we expected a similar level of performance regression as seen in FAISS. However, we're observing a 70% increase in load time when loading a 6GB vector index. (baseline=4.144 sec, the modified one=7.503 sec) 3. Why is it more than twice as severe as in FAISS?The key performance difference in index loading between FAISS and NMSLIB stems from their file formats. FAISS stores chunks of the neighbor list in a single location and loads them all at once. See the code below:
In NMSLIB, each neighbor list is stored individually, requiring O(N) reads, where N is the total number of vectors.
4. Solution 1. Patch in NMSLIBWe can patch NMSLIB to avoid making JNI calls for each vector element. The idea is to load data in bulk, then parse the neighbor lists from that buffer, rather than reading bytes individually. This approach would reduce the number of JNI calls to O(Index size / Buffer size). For example, with a 6GB vector index containing 1 million vectors and a 64KB buffer size, the required JNI calls would be reduced to O(6GB / 64KB) = 98,304, which is a significant improvement over 1 million calls, achieving nearly a 90% reduction in operations. Result: Surprisingly, it is 8% faster than the baseline. (Note: I reindexed on a new single node, which is why the loading time differs from the one mentioned earlier in the issue.)
4.1 Pros
4.2 Cons
4.3. Patch in hnsw.cc
5. Solution 2. Disable Streaming When FSDirectorySince we're deprecating NMSLIB in version 3.x, we can disable loading layer in NMSLIB until then.
5.1. Pros :
5.2. Cons :
6. Solution 3. Live with it :)Since we're deprecating NMSLIB in version 3.x, we can tolerate this issue in the short term. 7. Micro Tuning Results
|
@navneet1v @jmazanec15 |
Sorry, I just saw it. Yes it is configured by default in pretty much general file system. The reasons that I assumed that it is going to be 'likely' write-back cache does not longer exist are in two fold:
Let me share your thoughts on it! You can call me an aggressive dreamer. 😛 |
@0ctopus13prime the approach of patching nmslib looks good to me. I think if it is providing a good latency we should do that. Since the pros of having a patch means improvements in load time and also getting away of FSDirectory dependency. |
Signed-off-by: Dooyong Kim <[email protected]>
Signed-off-by: Dooyong Kim <[email protected]>
…vector index. Signed-off-by: Dooyong Kim <[email protected]>
Signed-off-by: Dooyong Kim <[email protected]>
…ding index performance. Signed-off-by: Dooyong Kim <[email protected]>
a623149
to
2398d49
Compare
Signed-off-by: Dooyong Kim <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving. Please ans some of the clarifying questions.
+ | ||
+ template <typename dist_t> | ||
+ void Hnsw<dist_t>::LoadIndexWithStream(NmslibIOReader& input) { | ||
+ LOG(LIB_INFO) << "Loading index from an input stream(NmslibIOReader)."; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Question]:
is this info log enabled by default? Since I am not to see all the code from nmslib in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's disabled by default. (LIB_LOGNONE)
In nmslib_wrapper
void knn_jni::nmslib_wrapper::InitLibrary() {
similarity::initLibrary();
}
and in NMSLIB
void initLibrary(int seed = 0, LogChoice choice = LIB_LOGNONE, const char*pLogFile = NULL);
+ } | ||
+ | ||
+ template <typename dist_t> | ||
+ void Hnsw<dist_t>::LoadIndexWithStream(NmslibIOReader& input) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[Question]:
in this function what is the extra step we are doing to ensure that we are able to fetch more data from NmslibIOReader? I can understand some part where we might be doing it, but it will be good if you can point exactly what are the lines that improved the loading logic, so that it becomes easy to review that part itself..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added remainingBytes
method to return the number of unread bytes in IndexInputWithBuffer.
private long remainingBytes() {
return contentLength - indexInput.getFilePointer();
}
And in NMSLIB, within void Hnsw<dist_t>::LoadOptimizedIndex(NmslibIOReader& input)
method, we keep track of safe amount of bytes to read.
for (size_t i = 0, remainingBytes = input.remainingBytes(); i < totalElementsStored_; i++) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi , I am reviewing this PR , Please wait till EOD to merge this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me
Merging the PR, as we have 2 approvals and author is requesting for merge., |
* Introduce a loading layer in NMSLIB. Signed-off-by: Dooyong Kim <[email protected]> * Added NMSLIB istream implementation. Signed-off-by: Dooyong Kim <[email protected]> * Fix integer overflow issue when passing read size for loading NMSLIB vector index. Signed-off-by: Dooyong Kim <[email protected]> * Added unit test for NMSLIB loading layer. Signed-off-by: Dooyong Kim <[email protected]> * Made a patch in NMSLIB to avoid frequently calling JNI for better loading index performance. Signed-off-by: Dooyong Kim <[email protected]> * Compliance constexpr function in C++11 having nullstatement. Signed-off-by: Dooyong Kim <[email protected]> --------- Signed-off-by: Dooyong Kim <[email protected]> Co-authored-by: Dooyong Kim <[email protected]> (cherry picked from commit 7cf45c8)
* Introduce a loading layer in NMSLIB. Signed-off-by: Dooyong Kim <[email protected]> * Added NMSLIB istream implementation. Signed-off-by: Dooyong Kim <[email protected]> * Fix integer overflow issue when passing read size for loading NMSLIB vector index. Signed-off-by: Dooyong Kim <[email protected]> * Added unit test for NMSLIB loading layer. Signed-off-by: Dooyong Kim <[email protected]> * Made a patch in NMSLIB to avoid frequently calling JNI for better loading index performance. Signed-off-by: Dooyong Kim <[email protected]> * Compliance constexpr function in C++11 having nullstatement. Signed-off-by: Dooyong Kim <[email protected]>
* Introduce a loading layer in NMSLIB. Signed-off-by: Dooyong Kim <[email protected]> * Added NMSLIB istream implementation. Signed-off-by: Dooyong Kim <[email protected]> * Fix integer overflow issue when passing read size for loading NMSLIB vector index. Signed-off-by: Dooyong Kim <[email protected]> * Added unit test for NMSLIB loading layer. Signed-off-by: Dooyong Kim <[email protected]> * Made a patch in NMSLIB to avoid frequently calling JNI for better loading index performance. Signed-off-by: Dooyong Kim <[email protected]> * Compliance constexpr function in C++11 having nullstatement. Signed-off-by: Dooyong Kim <[email protected]> --------- Signed-off-by: Dooyong Kim <[email protected]> Co-authored-by: Dooyong Kim <[email protected]>
* Introduce a loading layer in NMSLIB. (#2185) * Introduce a loading layer in NMSLIB. Signed-off-by: Dooyong Kim <[email protected]> * Added NMSLIB istream implementation. Signed-off-by: Dooyong Kim <[email protected]> * Fix integer overflow issue when passing read size for loading NMSLIB vector index. Signed-off-by: Dooyong Kim <[email protected]> * Added unit test for NMSLIB loading layer. Signed-off-by: Dooyong Kim <[email protected]> * Made a patch in NMSLIB to avoid frequently calling JNI for better loading index performance. Signed-off-by: Dooyong Kim <[email protected]> * Compliance constexpr function in C++11 having nullstatement. Signed-off-by: Dooyong Kim <[email protected]> --------- Signed-off-by: Dooyong Kim <[email protected]> Co-authored-by: Dooyong Kim <[email protected]> * Fixed that it's failing to resolve a package in import statement. Signed-off-by: Dooyong Kim <[email protected]> * Move the element in the changelog from 3.x to 2.x. Signed-off-by: Dooyong Kim <[email protected]> --------- Signed-off-by: Dooyong Kim <[email protected]> Co-authored-by: Dooyong Kim <[email protected]>
Description
This PR is the first commit introducing the loading layer in NMSLIB.
Please refer to this issue for more details. - #2033
FYI : FAISS Loading Layer PR - #2139
Related Issues
Resolves #[Issue number to be closed when this PR is merged]
#2033
Check List
--signoff
.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.