Skip to content

[Bug] Fix bug when dimension <= 56 in Faiss SQ.#3229

Merged
0ctopus13prime merged 1 commit intoopensearch-project:mainfrom
0ctopus13prime:small-dimension-faiss-sq-fix
Mar 31, 2026
Merged

[Bug] Fix bug when dimension <= 56 in Faiss SQ.#3229
0ctopus13prime merged 1 commit intoopensearch-project:mainfrom
0ctopus13prime:small-dimension-faiss-sq-fix

Conversation

@0ctopus13prime
Copy link
Copy Markdown
Collaborator

Description

The FaissSQDistanceComputer in faiss_sq_flat.h computes the 1-bit popcount dot product using only 8-byte (uint64_t) word loops: words = quantizedVectorBytes >> 3. When quantizedVectorBytes is not a multiple of 8, the trailing bytes are silently dropped. For example, dimension 56 produces 7 quantized bytes — words = 7 >> 3 = 0 — so the popcount loop never executes and the dot product is always zero. The HNSW graph is then built with no meaningful distance discrimination, resulting in ~16% recall. At dimension 57, Lucene's getDiscreteDimensions rounds up to 64 discretized dimensions (8 bytes), so words = 1 and the loop works correctly, giving ~88% recall. This creates a sharp recall cliff rather than a gradual degradation.

Fix:

Added a byte remainder loop in the IsBytesMultipleOf8=false branch of operator(), distances_batch_4, and symmetric_dis. After the 8-byte word loop, the remaining bytes are processed individually with __builtin_popcount. The IsBytesMultipleOf8=true path is unchanged — the remainder code only exists in the unaligned template instantiation, so there is zero performance impact for the common aligned case.

Related Issues

Resolves #[Issue number to be closed when this PR is merged]
N/A

Check List

  • [O] New functionality includes testing.
  • [O] New functionality has been documented.
  • [O] API changes companion pull request created.
  • [O] Commits are signed per the DCO using --signoff.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@naveentatikonda
Copy link
Copy Markdown
Member

@0ctopus13prime pls fix the failing CI

Suite: Test class org.opensearch.knn.memoryoptsearch.faiss.FaissScalarQuantizedBulkSimdScorerTests
  2> REPRODUCE WITH: ./gradlew ':test' --tests 'org.opensearch.knn.memoryoptsearch.faiss.FaissScalarQuantizedBulkSimdScorerTests.testBBQCosineScoring' -Dtests.seed=66759F1A3FC1EA74 -Dtests.security.manager=false -Dtests.locale=ku -Dtests.timezone=Etc/GMT+12 -Druntime.java=21
  2> java.lang.AssertionError: Score mismatch at ord=224 for COSINE expected:<1.0> but was:<1.010622501373291>
        at __randomizedtesting.SeedInfo.seed([66759F1A3FC1EA74:517D2C11C2F232DE]:0)
        at org.junit.Assert.fail(Assert.java:89)
        at org.junit.Assert.failNotEquals(Assert.java:835)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.opensearch.knn.memoryoptsearch.faiss.FaissScalarQuantizedBulkSimdScorerTests.doTest(FaissScalarQuantizedBulkSimdScorerTests.java:210)
        at org.opensearch.knn.memoryoptsearch.faiss.FaissScalarQuantizedBulkSimdScorerTests.testBBQCosineScoring(FaissScalarQuantizedBulkSimdScorerTests.java:69)

@0ctopus13prime 0ctopus13prime force-pushed the small-dimension-faiss-sq-fix branch from b23402d to 7855595 Compare March 30, 2026 18:59
@0ctopus13prime
Copy link
Copy Markdown
Collaborator Author

Hey @naveentatikonda
Fixed it, thank you

Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
@0ctopus13prime 0ctopus13prime force-pushed the small-dimension-faiss-sq-fix branch from 7855595 to e743fb6 Compare March 30, 2026 19:43
@codecov
Copy link
Copy Markdown

codecov bot commented Mar 30, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 83.02%. Comparing base (a1ee88c) to head (e743fb6).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff            @@
##               main    #3229   +/-   ##
=========================================
  Coverage     83.02%   83.02%           
  Complexity     4160     4160           
=========================================
  Files           447      447           
  Lines         15313    15313           
  Branches       1964     1964           
=========================================
  Hits          12714    12714           
  Misses         1806     1806           
  Partials        793      793           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@0ctopus13prime 0ctopus13prime merged commit 77ac70e into opensearch-project:main Mar 31, 2026
58 checks passed
opensearch-trigger-bot bot pushed a commit that referenced this pull request Mar 31, 2026
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
(cherry picked from commit 77ac70e)
0ctopus13prime added a commit that referenced this pull request Mar 31, 2026
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
0ctopus13prime added a commit to 0ctopus13prime/k-NN that referenced this pull request Mar 31, 2026
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
naveentatikonda pushed a commit to naveentatikonda/k-NN that referenced this pull request Mar 31, 2026
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
0ctopus13prime added a commit to 0ctopus13prime/k-NN that referenced this pull request Mar 31, 2026
Signed-off-by: Dooyong Kim <kdooyong@amazon.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants