[BUG] update document with _bulk fails to generate embeddings in inference processors #17494
Describe the bug
Embeddings are not generated when documents are updated through the /index/_bulk API, although they are generated correctly when a document is ingested through the /index/_doc API. Has a change been made in OpenSearch that skips embedding generation only for bulk operations?
Related component
No response
To Reproduce
- deploy model
POST /_plugins/_ml/models/{model_id}/_deploy
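Before configuring the pipeline, the deployment can be verified with the ML Commons get-model API (a sketch; the exact response fields, such as model_state, may vary by version):

```
GET /_plugins/_ml/models/{model_id}
```

A deployed model is expected to report a deployed state in the response.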
- configure pipeline (PUT /_ingest/pipeline/{pipeline_name})
{
  "processors": [
    {
      "text_embedding": {
        "model_id": "{model_id}",
        "field_map": {
          "text": "passage_embedding"
        }
      }
    }
  ]
}
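For the processor to run at ingest time, the pipeline must be attached to the index (or passed per request). A minimal sketch, assuming the pipeline was created as nlp-ingest-pipeline and the model produces 768-dimension vectors (both names and the dimension are assumptions, not from the report):

```
PUT /my-nlp-index
{
  "settings": {
    "index.knn": true,
    "default_pipeline": "nlp-ingest-pipeline"
  },
  "mappings": {
    "properties": {
      "text": { "type": "text" },
      "passage_embedding": {
        "type": "knn_vector",
        "dimension": 768
      }
    }
  }
}
```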
- ingest doc
PUT /my-nlp-index/_doc/1
{
"text": "hello world"
}
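To confirm the processor ran on the initial ingest, the stored document can be fetched; if the pipeline fired, _source should contain a passage_embedding vector alongside the text field (a cross-check, not part of the original report):

```
GET /my-nlp-index/_doc/1
```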
- update doc with _bulk
PUT /my-nlp-index/_bulk
{ "update": { "_index": "my-nlp-index", "_id": "1" } }
{ "doc" : { "text": "bye world" } }
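As a point of comparison (an assumption worth verifying, not confirmed in this report): ingest pipelines run on index operations, so re-sending the full document with a bulk index action, rather than a partial update action, may regenerate the embedding:

```
POST /my-nlp-index/_bulk
{ "index": { "_index": "my-nlp-index", "_id": "1" } }
{ "text": "bye world" }
```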
Expected behavior
Embeddings are created for the initial ingest of "text": "hello world", but are not regenerated when the document is updated through the bulk operation with { "doc": { "text": "bye world" } }.
Embeddings should be regenerated for the bulk update operation by invoking the text_embedding processor.
Additional Details
Plugins
opensearch-ml, opensearch-knn, opensearch-neural-search
Host/Environment (please complete the following information):
- OS: macOS
- Version: Sequoia 15.3