fix: apply score_threshold filtering after fusion queries in local mode #1138
joein merged 2 commits into qdrant:dev
The local/memory client was not applying score_threshold filtering after RRF and DBSF fusion operations. This caused query_points with prefetch and fusion queries to return results below the specified score_threshold.

This fix adds score_threshold filtering after fusion results are computed, matching the behavior of the remote Qdrant server.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
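The post-fusion filtering described in this PR can be illustrated with a minimal sketch. The `Scored` class and `finalize_fused` function are hypothetical names made up for this example and are not the code in `local_collection.py`. The sketch only assumes what the description and review state: the threshold is applied to the fused scores, and the offset is applied after filtering.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Scored:
    # Hypothetical stand-in for a scored point; not the qdrant_client model.
    id: int
    score: float


def finalize_fused(
    fused: list[Scored],
    score_threshold: Optional[float] = None,
    offset: int = 0,
    limit: int = 10,
) -> list[Scored]:
    """Drop fused results below the threshold, then apply offset and limit."""
    if score_threshold is not None:
        fused = [p for p in fused if p.score >= score_threshold]
    # Offset is applied only after filtering, so it skips qualifying points.
    return fused[offset : offset + limit]


# Made-up fused scores, already sorted from best to worst.
fused = [Scored(1, 1.0), Scored(2, 0.5), Scored(3, 0.25)]
kept = finalize_fused(fused, score_threshold=0.5)
kept_with_offset = finalize_fused(fused, score_threshold=0.5, offset=1)
```

Filtering before slicing matters: if the offset were applied first, a page could end up shorter than expected because points that would be filtered out anyway had consumed offset positions.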
📝 Walkthrough
This pull request implements score_threshold filtering for fusion queries (RRF and DBSF) in the Qdrant client. The primary change adds a score_threshold filter in the …
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
Actionable comments posted: 0
🧹 Nitpick comments (1)
tests/congruence_tests/test_query.py (1)
431-444: Clarify the TODO comment for future maintenance.
The TODO indicates that score_threshold is not applied in formula queries in core. This test verifies that local mode matches this behavior (i.e., both should return the same unfiltered results). Consider rewording to make the intent clearer:
    - score_threshold=1.0, # todo: score threshold is not applied in formula queries in core
    + score_threshold=1.0, # Note: score_threshold is intentionally NOT applied in formula queries (matches core behavior)
This distinguishes it from a "to-do" item and clarifies that it documents expected behavior rather than a bug to fix.
📜 Review details
📒 Files selected for processing (3)
- qdrant_client/local/local_collection.py (1 hunk)
- tests/congruence_tests/test_query.py (2 hunks)
- tests/test_in_memory.py (1 hunk)
🧰 Additional context used
🧬 Code graph analysis (2)
tests/test_in_memory.py (1)
qdrant_client/http/models/models.py (2)
- FusionQuery (962-963)
- Fusion (950-959)
tests/congruence_tests/test_query.py (3)
qdrant_client/http/models/models.py (6)
- ScoredPoint (2510-2521)
- Prefetch (2023-2047)
- RrfQuery (2466-2467)
- Rrf (2458-2463)
- FormulaQuery (945-947)
- MultExpression (1679-1680)
qdrant_client/async_qdrant_remote.py (1)
- query_points (364-469)
qdrant_client/http/api/search_api.py (2)
- query_points (550-565)
- query_points (773-788)
🔇 Additional comments (5)
qdrant_client/local/local_collection.py (1)
826-839: LGTM! Score threshold filtering correctly applied after fusion.
The implementation correctly:
- Filters fused results by score >= score_threshold before payload/vector retrieval (efficient)
- Applies offset after filtering (correct order of operations)
- Excludes FormulaQuery from threshold filtering, matching server behavior
tests/test_in_memory.py (2)
123-208: Well-structured test for RRF score threshold filtering.
The test properly validates:
- Baseline behavior without threshold (all 5 points returned)
- Filtered behavior with threshold (only qualifying points)
- Per-point score assertions for correctness
The docstring clearly documents expected RRF scores for the test data.
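To make the kind of RRF scores such a test reasons about concrete, here is a minimal sketch of reciprocal rank fusion followed by a threshold. The function name `rrf_scores`, the constant `k = 2`, the 0-based positions, and the rankings are all assumptions chosen for illustration. They are not taken from the test in this PR and may differ from what the client actually uses.

```python
from __future__ import annotations

from collections import defaultdict


def rrf_scores(rankings: list[list[int]], k: int = 2) -> dict[int, float]:
    """Sum 1 / (k + position) over every ranking a point appears in.

    Positions are 0-based here; both that and the default k are assumptions.
    """
    scores: dict[int, float] = defaultdict(float)
    for ranking in rankings:
        for position, point_id in enumerate(ranking):
            scores[point_id] += 1.0 / (k + position)
    return dict(scores)


# Two made-up prefetch results that rank five point ids identically.
rankings = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
scores = rrf_scores(rankings)

# Baseline: every point gets a fused score.
# With a threshold, only points whose fused score reaches it survive.
filtered = {pid: s for pid, s in scores.items() if s >= 0.5}
```

With these inputs the baseline contains all five points, while the threshold of 0.5 keeps only the three best-ranked ones, which mirrors the baseline-versus-filtered structure the review describes.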
211-296: Good parallel test for DBSF fusion with score threshold.
Follows the same pattern as the RRF test with appropriate threshold value (1.0) for DBSF score distribution. This ensures both fusion algorithms are covered.
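A rough sketch of why a threshold around 1.0 is a plausible cut for DBSF: assuming DBSF normalizes each source's scores into a range of mean ± 3 standard deviations and then sums the normalized scores per point, a point found by two sources gets a fused score between 0 and 2. The normalization details, function names, and data below are assumptions for illustration, not the client's implementation.

```python
from __future__ import annotations

import statistics


def dbsf_normalize(scores: list[float]) -> list[float]:
    """Rescale scores so that mean - 3*std maps to 0 and mean + 3*std maps to 1."""
    if len(scores) < 2:
        # Assumption: with no spread to measure, fall back to the midpoint.
        return [0.5 for _ in scores]
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)
    low, high = mean - 3 * std, mean + 3 * std
    if high == low:
        return [0.5 for _ in scores]
    return [(s - low) / (high - low) for s in scores]


def dbsf_fuse(sources: list[dict[int, float]]) -> dict[int, float]:
    """Sum the normalized scores of each point id across all sources."""
    fused: dict[int, float] = {}
    for source in sources:
        ids = list(source)
        normalized = dbsf_normalize([source[i] for i in ids])
        for point_id, value in zip(ids, normalized):
            fused[point_id] = fused.get(point_id, 0.0) + value
    return fused


# Two made-up prefetch results with identical raw scores per point id.
sources = [{1: 0.9, 2: 0.6, 3: 0.1}, {1: 0.9, 2: 0.6, 3: 0.1}]
fused = dbsf_fuse(sources)
kept = {pid: s for pid, s in fused.items() if s >= 1.0}
```

Under these assumptions the two higher-scoring points end up above 1.0 and the weakest one falls below it, so the threshold separates them.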
tests/congruence_tests/test_query.py (2)
416-429: Good congruence test for RRF fusion with score threshold.
The test validates that local mode matches remote behavior for RRF queries with score_threshold. The inline comment documenting expected results (3 results: 1.0, 0.5, 0.3̅) is helpful for debugging failures.
1333-1338: LGTM! Congruence tests added for both fusion score threshold variants.
The new test methods are properly integrated into test_dense_query_fusion, ensuring local-remote consistency for score-thresholded fusion queries.
@cbcoutinho thanks for addressing my comment and your contribution!
…de (#1138)

* fix: apply score_threshold filtering after fusion queries in local mode

  The local/memory client was not applying score_threshold filtering after RRF and DBSF fusion operations. This caused query_points with prefetch and fusion queries to return results below the specified score_threshold.

  This fix adds score_threshold filtering after fusion results are computed, matching the behavior of the remote Qdrant server.

* tests: simplify score threshold tests, add formula threshold test

---------

Co-authored-by: George Panchuk <george.panchuk@qdrant.tech>
Summary
This PR reimplements the fix from the main branch onto the dev branch.
Related: Closes original PR targeting main branch
This PR was generated with the help of AI, and reviewed by a Human