Support derived source for knn with other fields#3209
Support derived source for knn with other fields#3209neetikasinghal wants to merge 1 commit intoopensearch-project:3.5from
Conversation
Signed-off-by: Neetika Singhal <neetiks@amazon.com>
| @Override | ||
| protected DerivedFieldGenerator derivedFieldGenerator() { | ||
| // using knn vector fetcher to fetch the vector and passing it as a doc values fetcher | ||
| return new DerivedFieldGenerator(mappedFieldType, new KnnVectorValuesFetcher((KNNVectorFieldType) mappedFieldType, simpleName()), null); |
There was a problem hiding this comment.
I have one quick question Here , So Dervied Source is enabled at Index Time and can be mappedFieldType.hasDocValues() is false so DerivedFieldGenerator will choose stored path but storedFieldFetcher = null in this PR , will it not cause NullPointerException .
There was a problem hiding this comment.
yes for that, i was planning to make a change like:
return new DerivedFieldGenerator(mappedFieldType, new KnnVectorValuesFetcher((KNNVectorFieldType) mappedFieldType, simpleName()), null) {
@Override
public FieldValueType getDerivedFieldPreference() {
return FieldValueType.DOC_VALUES;
}
};
I was planning to use FieldValueType.DOC_VALUES always for knn case, KnnVectorValuesFetcher can be used as the FieldValueFetcher. Ideally we should have a new FieldValueType something like FieldValueType.KNN_VALUES, but since i dont see knn using DOC_VALUES, we can leverage FieldValueType.DOC_VALUES for vector values. what do u think?
|
|
||
| @Override | ||
| protected void canDeriveSourceInternal() { | ||
| // skipping any checks here |
There was a problem hiding this comment.
So I understand why we are skipping here because KNN is having Mapping Validation , correct me if I am wrong but how we will make sure if I excluded-from-source fields are never later considered derivable accidentally
There was a problem hiding this comment.
in other mappers, canDeriveSourceInternal calls checkStoredAndDocValuesForDerivedSource but for knn field mapper it is not needed for either of the doc values and stored flag to return true, so dont see a need to put any checks here.
protected void checkStoredAndDocValuesForDerivedSource() {
if (!mappedFieldType.isStored() && !mappedFieldType.hasDocValues()) {
throw new UnsupportedOperationException("Unable to derive source for [" + name() + "] with stored and " + "docValues disabled");
}
}
Description
Implement derived source for k-NN fields with core implementation, with a fallback to k-NN plugin's implementation.
Related PR: opensearch-project/OpenSearch#20991
Before this change, if
index.derived_source.enabledand there is a knn field type, there will be error thrown saying that knn vector field doesnt supprt derived source feature.After this change, if
index.derived_source.enabledand there is a knn field type, the derived source feature will be turned on so that users can benefit out of both knn fields and other fields.Related Issues
Resolves #3058
Check List
--signoff.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.