Description
Describe the bug
When an update is sent as part of a bulk request, the request gets stuck in an infinite loop in TransportShardBulkAction due to repeated retries. This code pointer is expected to limit the number of retries to the retry_on_conflict value specified by the user in the bulk request, but the retryCounter in BulkPrimaryExecutionContext is never incremented in the resetForExecutionForRetry method. As a result, when an update hits repeated conflicts, the loop in TransportShardBulkAction never terminates. The same behaviour is not seen when the _update API is invoked to update a single document, because for the _update API retries are handled in TransportUpdateAction.
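To make the failure mode concrete, here is a minimal, self-contained sketch of a bounded retry loop. It is not the OpenSearch code: ItemContext, resetForRetry and executeWithRetries are hypothetical names, loosely modelled on the retryCounter and resetForExecutionForRetry described above. It only illustrates why the counter has to be incremented on each reset for retry_on_conflict to take effect.

```java
import java.util.function.IntPredicate;

public class RetryOnConflictSketch {

    /** Hypothetical stand-in for the per-item state kept during a bulk shard request. */
    static final class ItemContext {
        private int retryCounter = 0;

        int getRetryCounter() {
            return retryCounter;
        }

        /** Prepares the item for another attempt; without the increment the loop below never ends. */
        void resetForRetry() {
            retryCounter++;
        }
    }

    /**
     * Executes an update until it succeeds or the retry budget is used up.
     *
     * @param retryOnConflict   maximum number of retries after the first attempt
     * @param conflictOnAttempt returns true if the given attempt (0-based) hits a version conflict
     * @return total number of attempts made
     */
    public static int executeWithRetries(int retryOnConflict, IntPredicate conflictOnAttempt) {
        ItemContext context = new ItemContext();
        int attempts = 0;
        while (true) {
            boolean conflict = conflictOnAttempt.test(attempts);
            attempts++;
            if (!conflict) {
                return attempts; // update applied
            }
            if (context.getRetryCounter() >= retryOnConflict) {
                return attempts; // retry budget exhausted: the item should fail with the conflict
            }
            context.resetForRetry();
        }
    }

    public static void main(String[] args) {
        // Every attempt conflicts: one initial attempt plus three retries.
        System.out.println(executeWithRetries(3, attempt -> true));
        // The first two attempts conflict, the third succeeds.
        System.out.println(executeWithRetries(3, attempt -> attempt < 2));
    }
}
```

If resetForRetry did not increment the counter, the first call in main would loop forever, which matches the behaviour reported here.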
To Reproduce
We need to throw VersionConflictEngineExceptions repeatedly for this to show up. I have created a remote branch on my fork where I have modified the update code to always throw VersionConflictEngineExceptions: https://github.com/raghuvanshraj/OpenSearch/tree/retry-on-conflict-testing
Steps to reproduce the behavior:
- Clone the branch linked above
- Bring up the OpenSearch process
- Create an index
- Ingest a single document on the index
- Update the document with the bulk API with retry_on_conflict set in the update request. Sample:
{ "update" : { "_index" : "{{index_name}}", "retry_on_conflict": 3, "_id": "IYGCtIsBspX0Krzt2kus" } }
{ "doc": { "counter": 3 } }
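For anyone scripting the reproduction, the sample above can be assembled programmatically. The following is only a sketch: BulkUpdateBodySketch and buildBody are hypothetical names, the index name in main is a placeholder, and the values are assumed not to need JSON escaping. The one detail it relies on is that a bulk body is newline-delimited JSON and must end with a newline.

```java
public class BulkUpdateBodySketch {

    /**
     * Builds a two-line bulk body: the update action line followed by the partial document.
     * Assumes index and id contain no characters that need JSON escaping.
     */
    public static String buildBody(String index, String id, int retryOnConflict, int counter) {
        String action = String.format(
            "{ \"update\" : { \"_index\" : \"%s\", \"retry_on_conflict\": %d, \"_id\": \"%s\" } }",
            index, retryOnConflict, id);
        String doc = String.format("{ \"doc\": { \"counter\": %d } }", counter);
        // Each line is terminated by a newline, including the last one.
        return action + "\n" + doc + "\n";
    }

    public static void main(String[] args) {
        // "my-index" is a placeholder index name; the id is the one from the sample above.
        System.out.print(buildBody("my-index", "IYGCtIsBspX0Krzt2kus", 3, 3));
    }
}
```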
Expected behavior
The expected behavior in this case is for retry_on_conflict to be honored and for the request to succeed or fail gracefully once the retries are exhausted.
Plugins
NA
Screenshots
NA
Host/Environment (please complete the following information):
NA
Additional context
NA