Skip to content

Support derived source for knn with other fields#3209

Open
neetikasinghal wants to merge 1 commit intoopensearch-project:3.5from
neetikasinghal:3.5
Open

Support derived source for knn with other fields#3209
neetikasinghal wants to merge 1 commit intoopensearch-project:3.5from
neetikasinghal:3.5

Conversation

@neetikasinghal
Copy link
Copy Markdown
Contributor

@neetikasinghal neetikasinghal commented Mar 24, 2026

Description

Implement derived source for k-NN fields with core implementation, with a fallback to k-NN plugin's implementation.

Related PR: opensearch-project/OpenSearch#20991

Before this change, if index.derived_source.enabled and there is a knn field type, there will be error thrown saying that knn vector field doesnt supprt derived source feature.
After this change, if index.derived_source.enabled and there is a knn field type, the derived source feature will be turned on so that users can benefit out of both knn fields and other fields.

Related Issues

Resolves #3058

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Neetika Singhal <neetiks@amazon.com>
@Override
protected DerivedFieldGenerator derivedFieldGenerator() {
// using knn vector fetcher to fetch the vector and passing it as a doc values fetcher
return new DerivedFieldGenerator(mappedFieldType, new KnnVectorValuesFetcher((KNNVectorFieldType) mappedFieldType, simpleName()), null);
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have one quick question Here , So Dervied Source is enabled at Index Time and can be mappedFieldType.hasDocValues() is false so DerivedFieldGenerator will choose stored path but storedFieldFetcher = null in this PR , will it not cause NullPointerException .

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes for that, i was planning to make a change like:

return new DerivedFieldGenerator(mappedFieldType, new KnnVectorValuesFetcher((KNNVectorFieldType) mappedFieldType, simpleName()), null) {
            @Override
            public FieldValueType getDerivedFieldPreference() {
                return FieldValueType.DOC_VALUES;
            }
        };

I was planning to use FieldValueType.DOC_VALUES always for knn case, KnnVectorValuesFetcher can be used as the FieldValueFetcher. Ideally we should have a new FieldValueType something like FieldValueType.KNN_VALUES, but since i dont see knn using DOC_VALUES, we can leverage FieldValueType.DOC_VALUES for vector values. what do u think?


@Override
protected void canDeriveSourceInternal() {
// skipping any checks here
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I understand why we are skipping here because KNN is having Mapping Validation , correct me if I am wrong but how we will make sure if I excluded-from-source fields are never later considered derivable accidentally

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in other mappers, canDeriveSourceInternal calls checkStoredAndDocValuesForDerivedSource but for knn field mapper it is not needed for either of the doc values and stored flag to return true, so dont see a need to put any checks here.

protected void checkStoredAndDocValuesForDerivedSource() {
        if (!mappedFieldType.isStored() && !mappedFieldType.hasDocValues()) {
            throw new UnsupportedOperationException("Unable to derive source for [" + name() + "] with stored and " + "docValues disabled");
        }
    }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants