Fix KNN build due to Lucene 10.3 upgrade #2878
navneet1v merged 1 commit into opensearch-project:main
Conversation
Need to fix the Lucene codec too, since OpenSearch core has moved to Lucene 10.3.
I see the dependency conflict fixed on this branch, but ITs still fail at runtime with a failureaccess loading error when I run locally. It's confusing to me that this class is not loaded, as I can see it in the bundle. I think @karenyrx has an upstream change to resolve this. My only thought is that it's kind of awkward that the gRPC transport SPI is supplying failureaccess when KNN uses it natively.
Waiting for opensearch-project/OpenSearch#19339 to be available via nightly snapshots. Tested locally that the build works with this change.
src/main/java/org/opensearch/knn/index/codec/KNN990Codec/NativeEngines990KnnVectorsReader.java
src/main/java/org/opensearch/knn/index/query/TopApproxKnnCollector.java
final Bits liveDocs = reader.getLiveDocs();
acceptDocs = AcceptDocs.fromLiveDocs(liveDocs, reader.maxDoc());
} else {
acceptDocs = new AcceptDocs() {
This is too verbose to put in the else statement. Please find a better way to write this.
Can you please elaborate on your comment? What do you mean by "this is too verbose to put in the else statement"?
I think we can factor this out as a small function in this class with a short comment, e.g. when cardinality is 0, it indicates there is no filtering that requires combining the user-provided filter with liveDocs.
It was not clear from the comment. I can move this to a private function.
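The refactor suggested above can be sketched as a small private helper. Everything below is illustrative: the `Bits` stand-in and `buildAcceptDocs` are hypothetical names for this sketch, not the actual Lucene or k-NN plugin types.

```java
// Sketch of factoring the verbose else branch into one small helper
// instead of an inline anonymous class. Bits here is a minimal stand-in
// for org.apache.lucene.util.Bits (hypothetical, for illustration only).
public class AcceptDocsSketch {

    interface Bits {
        boolean get(int index);
    }

    // filterCardinality == 0 means no user-provided filter, so only liveDocs
    // (deleted-doc filtering) applies; a null liveDocs means no deletions,
    // i.e. every doc is accepted.
    static Bits buildAcceptDocs(Bits liveDocs, long filterCardinality, Bits filterBits) {
        if (filterCardinality == 0) {
            if (liveDocs != null) {
                return liveDocs;
            }
            return index -> true; // no filter, no deletions: accept everything
        }
        // With a filter present, a doc must pass both the filter and liveDocs.
        return index -> filterBits.get(index) && (liveDocs == null || liveDocs.get(index));
    }

    public static void main(String[] args) {
        Bits noFilterNoDeletes = buildAcceptDocs(null, 0, null);
        System.out.println(noFilterNoDeletes.get(7));   // true

        Bits noFilterWithDeletes = buildAcceptDocs(i -> i != 1, 0, null);
        System.out.println(noFilterWithDeletes.get(1)); // false: doc 1 is deleted
    }
}
```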
NIT: If cardinality == 0, then can we just pass null?
acceptDocs = AcceptDocs.fromLiveDocs(null, reader.maxDoc());
@navneet1v just want to confirm we can't use fromIteratorSupplier, right?
Another thought: should we create the AcceptDocs when we create the filter bitset and pass it in?
Synced with Navneet, we can just pass liveDocs. If there is no filtering and no deleted docs, liveDocs will always be null.
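A minimal sketch of why passing liveDocs straight through works: in Lucene, `LeafReader.getLiveDocs()` returns null when a segment has no deletions, and a null `Bits` conventionally means every doc is live. The `Bits` and `fromLiveDocs` below are simplified stand-ins for this sketch, not the real `org.apache.lucene.search.AcceptDocs` API.

```java
// Illustrative stand-ins, not the Lucene API: shows that a
// fromLiveDocs-style factory can take a null liveDocs directly,
// so the call site needs no special "no deletions" branch.
public class LiveDocsSketch {

    interface Bits {
        boolean get(int index);
    }

    static Bits fromLiveDocs(Bits liveDocs, int maxDoc) {
        if (liveDocs == null) {
            // null liveDocs => no deleted docs => accept every doc in [0, maxDoc)
            return index -> index >= 0 && index < maxDoc;
        }
        return liveDocs;
    }

    public static void main(String[] args) {
        Bits accept = fromLiveDocs(null, 3);
        System.out.println(accept.get(2)); // true
        System.out.println(accept.get(3)); // false: beyond maxDoc
    }
}
```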
> @navneet1v just want to confirm we can't use fromIteratorSupplier right?
> Another thought would be, should we create acceptdocs when we create filterbit set and pass it in?
@shatejas to use fromIteratorSupplier, we would have to pass the iterator right from the base KNNWeight class. I thought about doing that first and then realized it would break the whole code and lead to changes across many files; we are already at 25 files in this PR. With the new interface we should move our code to using AcceptDocs completely. Will take that change in upcoming PRs.
So my strategy was to first make this change to get the build working, and as a next step see whether we can use fromIteratorSupplier. Based on my deep dive, AcceptDocs is just a fancy wrapper over Bits and a DocIdSetIterator. For HNSW the usage of Bits remains the same; it is only when Lucene falls back to exact search that it uses iterator() to get the doc IDs. https://github.com/apache/lucene/blob/586dc103361c99fefa2f67946855691a073e805c/lucene/core/src/java/org/apache/lucene/search/AbstractKnnVectorQuery.java#L234
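The "wrapper over Bits and DISI" reading above can be sketched as follows. `AcceptDocsView`, `accept`, and `acceptedDocIds` are hypothetical names for illustration only: the point is that graph (HNSW) traversal wants a random-access bit check, while exact search wants an ordered doc-ID iteration over the same accepted set.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch (not the Lucene AcceptDocs API): one accepted-docs
// wrapper exposing two views over the same underlying bits.
public class AcceptDocsWrapperSketch {

    interface Bits {
        boolean get(int index);
    }

    static class AcceptDocsView {
        private final Bits bits;
        private final int maxDoc;

        AcceptDocsView(Bits bits, int maxDoc) {
            this.bits = bits;
            this.maxDoc = maxDoc;
        }

        // Random-access check, consulted while walking the HNSW graph.
        boolean accept(int docId) {
            return bits.get(docId);
        }

        // Iterator-style view, used when falling back to exact search:
        // yields accepted doc IDs in order.
        List<Integer> acceptedDocIds() {
            List<Integer> docs = new ArrayList<>();
            for (int d = 0; d < maxDoc; d++) {
                if (bits.get(d)) {
                    docs.add(d);
                }
            }
            return docs;
        }
    }

    public static void main(String[] args) {
        AcceptDocsView view = new AcceptDocsView(d -> d % 2 == 0, 6);
        System.out.println(view.accept(2));        // true
        System.out.println(view.acceptedDocIds()); // [0, 2, 4]
    }
}
```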
@navneet1v I think I am aligned, as long as we do it as a fast follow-up.
Sure, I will create a GH issue for this and take it up next.
src/test/java/org/opensearch/knn/memoryoptsearch/AbstractFaissCagraHnswIndexTests.java
0ctopus13prime left a comment:
LGTM! Added a few nitpicks, thank you for the hard work!
src/test/java/org/opensearch/knn/memoryoptsearch/FaissCagraHnswIndexTests.java
src/test/java/org/opensearch/knn/memoryoptsearch/FaissMemoryOptimizedSearcherTests.java
Signed-off-by: Navneet Verma <navneev@amazon.com>
Description
This change fixes the failing builds caused by the Lucene 10.3 upgrade and other changes in OpenSearch core.
Changes done:
Known Issue
macOS CI will fail; it is unrelated to this PR. Will raise another PR for that. GH issue: #2876
Related Issues
#2888
Check List
Commits are signed per the DCO using --signoff. By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.