Milvus vector store similarity scores are not normalized #4879

@ejrtks1020

Describe the bug

The Milvus vector store implementation does not normalize the similarity scores returned from search operations, which breaks threshold filtering in SimilarityThresholdRetriever. The Milvus metric types (L2, IP, COSINE) return scores with different ranges and different orderings, but the current implementation passes these raw scores directly to the retriever.
The metric types differ as follows:

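  1. L2 (Euclidean distance): Returns distance values in the range 0 to infinity, where smaller values indicate higher similarity
  2. IP (Inner Product): Returns values where larger values indicate higher similarity; the range is unbounded in general and is limited to -1 to 1 only when the embeddings are normalized to unit length
  3. COSINE: Returns values in the range of -1 to 1, where larger values indicate higher similarity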

The SimilarityThresholdRetriever expects normalized scores between 0 and 1, where higher values indicate higher similarity. Without proper normalization, the threshold filtering becomes inconsistent and unreliable across different metric types.
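To illustrate the mismatch, here is a minimal sketch of a threshold filter applied to raw L2 distances. The `ScoredDoc` type, the `filterByThreshold` helper, and the sample values are hypothetical and are not taken from the Flowise or Milvus code; they only mimic a retriever that assumes "higher score means more similar".

```typescript
// Hypothetical illustration; these names do not come from the Flowise codebase.
interface ScoredDoc {
    id: string
    score: number // raw score as returned by the vector store
}

// Naive filter that assumes scores are similarities in the 0..1 range.
function filterByThreshold(docs: ScoredDoc[], threshold: number): ScoredDoc[] {
    return docs.filter((d) => d.score >= threshold)
}

// Made-up raw L2 distances: the closest document has the smallest value.
const l2Results: ScoredDoc[] = [
    { id: 'near', score: 0.05 },
    { id: 'far', score: 1.9 }
]

const kept = filterByThreshold(l2Results, 0.8).map((d) => d.id)
console.log(kept) // only 'far' survives; the nearest document is dropped
```

With a distance metric the filter keeps exactly the wrong documents, which matches the inconsistent behavior described above.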

To Reproduce

  1. Set up a Milvus vector store with any metric type (L2, IP, or COSINE)
  2. Configure a SimilarityThresholdRetriever with a reasonable threshold (e.g., 0.8)
  3. Perform a similarity search
  4. Observe that the threshold filtering doesn't work as expected due to unnormalized scores

Expected behavior

  • All similarity scores should be normalized to a 0-1 range where 1 represents the highest similarity
  • L2 distances should be converted to similarity scores (e.g., using 1 / (1 + distance))
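  • COSINE scores should be normalized from their [-1, 1] range to a [0, 1] range (e.g., using (score + 1) / 2); the same mapping applies to IP scores only when the embeddings are unit-normalized, since IP is otherwise unbounded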
  • SimilarityThresholdRetriever should work consistently across all metric types
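The conversions suggested above could look roughly like the following sketch. The `normalizeScore` function and the `MetricType` union are hypothetical names, not part of the Flowise or Milvus SDK APIs. Clamping IP scores to [-1, 1] is an added assumption that only makes sense for unit-normalized embeddings.

```typescript
// Hypothetical helper; not an existing Flowise or Milvus function.
type MetricType = 'L2' | 'IP' | 'COSINE'

function normalizeScore(raw: number, metric: MetricType): number {
    switch (metric) {
        case 'L2':
            // Distance -> similarity: 0 maps to 1, larger distances approach 0.
            return 1 / (1 + Math.max(raw, 0))
        case 'IP':
        case 'COSINE': {
            // Assumption: IP scores come from unit-normalized embeddings,
            // so values outside [-1, 1] are clamped before rescaling.
            const clamped = Math.min(1, Math.max(-1, raw))
            return (clamped + 1) / 2
        }
        default:
            throw new Error(`Unsupported metric type: ${String(metric)}`)
    }
}

console.log(normalizeScore(0.25, 'L2')) // 0.8
console.log(normalizeScore(0.6, 'COSINE')) // 0.8
```

Note that these mappings are not equivalent across metrics: under this sketch a threshold of 0.8 corresponds to an L2 distance of at most 0.25 but to a cosine score of at least 0.6, so thresholds may still need tuning per metric type.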

Screenshots

No response

Flow

No response

Use Method

None

Flowise Version

No response

Operating System

macOS

Browser

Chrome

Additional context

See also: https://milvus.io/docs/ko/metric.md

Labels

bug (Something isn't working)