[Internal] DTS: Adds retries in DTS when isRetriable is true and on timeout #5689

Draft
Meghana-Palaparthi wants to merge 1 commit into master from users/Meghana-Palaparthi/DTS_timeout_handling

Conversation

@Meghana-Palaparthi
Contributor

Description

This pull request adds retry logic with exponential backoff for distributed transaction commits and improves how the Cosmos DB SDK handles and parses distributed transaction responses. Commit operations are now retried on timeouts and on retriable errors, with idempotency token support so that retries are safe. The changes also enhance diagnostics, make response parsing resilient to partial failures, and refactor the request and response classes for safer stream handling.

Distributed transaction commit improvements:

  • Added exponential backoff retry logic for distributed transaction commits in DistributedTransactionCommitter, specifically handling timeouts and retriable errors, with idempotency token support.
  • Improved error handling to distinguish between cancellation and other exceptions during commit attempts.
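
The retry behaviour described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the code in this PR: `commit_with_retry`, `CommitResponse`, the `send` callable, and the attempt and delay values are all hypothetical names and numbers chosen for the example. It assumes the same idempotency token is reused across attempts and that only timeouts and responses flagged as retriable trigger another attempt; any other exception (including cancellation) propagates without a retry.

```python
import random
import time
import uuid
from dataclasses import dataclass


@dataclass
class CommitResponse:
    """Hypothetical response shape; not the SDK's DistributedTransactionResponse."""
    status_code: int
    is_retriable: bool = False


def commit_with_retry(send, max_attempts=4, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call `send(token)` until it succeeds, fails permanently, or attempts run out.

    `send` is a hypothetical callable that performs one commit attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    # One token for the whole commit, so the server can recognise repeats.
    idempotency_token = str(uuid.uuid4())

    for attempt in range(max_attempts):
        last_attempt = attempt == max_attempts - 1
        try:
            response = send(idempotency_token)
        except TimeoutError:
            # A timeout is retried unless no attempts remain.
            if last_attempt:
                raise
        else:
            # Return on a non-retriable outcome, or when attempts are exhausted.
            if not response.is_retriable or last_attempt:
                return response

        # Exponential backoff capped at max_delay, plus up to 10% jitter.
        delay = min(max_delay, base_delay * (2 ** attempt))
        sleep(delay + random.uniform(0, delay / 10))
```

Passing `sleep` as a parameter keeps the sketch easy to exercise without real waiting; a production implementation would more likely take a cancellation token and a configurable retry policy.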

Response parsing and diagnostics enhancements:

  • Enhanced distributed transaction response parsing to extract isRetriable and serverDiagnostics fields, and improved resilience to partial JSON parsing failures.
  • Added the IsRetriable property to DistributedTransactionResponse and ensured it is correctly populated from server responses.
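
As a rough illustration of what resilient parsing can look like, the sketch below reads the `isRetriable` and `serverDiagnostics` fields named above from a JSON body and falls back to safe defaults when the body or an individual field is unusable. The function name, the returned dictionary shape, and the choice of defaults are assumptions made for this example, and the actual wire format of the response may differ.

```python
import json


def parse_commit_response(body: bytes) -> dict:
    """Extract retry and diagnostics fields, tolerating malformed input."""
    parsed = {"is_retriable": False, "server_diagnostics": None}

    try:
        doc = json.loads(body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes: keep the defaults.
        return parsed

    if not isinstance(doc, dict):
        return parsed

    # Each field is read independently, so one bad field does not
    # discard the others.
    flag = doc.get("isRetriable")
    if isinstance(flag, bool):
        parsed["is_retriable"] = flag

    parsed["server_diagnostics"] = doc.get("serverDiagnostics")
    return parsed
```

Defaulting `is_retriable` to `False` is a conservative choice in this sketch: a response that cannot be understood is not retried on the strength of the body alone.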

Request stream handling improvements:

  • Refactored DistributedTransactionServerRequest to use a pre-serialized byte array for the request body, enabling safe creation of new memory streams for each retry and preventing disposal issues.
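
The idea behind the pre-serialized body can be shown with a small sketch. The class and method names here are hypothetical and written in Python for brevity; the point is only that the payload is serialized once into immutable bytes, and every attempt wraps those bytes in its own stream, so closing one attempt's stream cannot affect the next.

```python
import io
import json


class ServerRequest:
    """Hypothetical request holder; not the SDK's DistributedTransactionServerRequest."""

    def __init__(self, operations):
        # Serialize once up front; bytes are immutable and safe to share.
        self._body = json.dumps(operations).encode("utf-8")

    def create_body_stream(self) -> io.BytesIO:
        # A fresh stream per attempt, positioned at the start of the body.
        return io.BytesIO(self._body)
```

A retry loop would call `create_body_stream()` at the start of each attempt instead of rewinding or reusing a stream that an earlier attempt may already have closed.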

Reliability and correctness fixes:

  • Ensured proper disposal checks in enumerator and count properties of DistributedTransactionResponse.
  • Improved deserialization error handling in DistributedTransactionOperationResult to throw explicit exceptions on failure.
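
Both fixes follow a fail-fast pattern, sketched below under assumptions. The names `TransactionResponse`, `OperationResultError`, and `parse_operation_result` are hypothetical, and Python's `close()` stands in for disposal: accessing the count or iterating after the response is closed raises immediately, and a result that cannot be deserialized raises an explicit error rather than yielding an empty value.

```python
import json


class OperationResultError(ValueError):
    """Raised when an operation result cannot be deserialized."""


def parse_operation_result(raw: str) -> dict:
    try:
        result = json.loads(raw)
    except ValueError as exc:
        raise OperationResultError(
            f"could not deserialize operation result: {exc}") from exc
    if not isinstance(result, dict):
        raise OperationResultError("operation result must be a JSON object")
    return result


class TransactionResponse:
    """Hypothetical response wrapper that guards access after close()."""

    def __init__(self, results):
        self._results = list(results)
        self._closed = False

    def close(self):
        self._closed = True
        self._results = []

    def _ensure_open(self):
        if self._closed:
            raise ValueError("response has been closed")

    def __len__(self):
        self._ensure_open()
        return len(self._results)

    def __iter__(self):
        self._ensure_open()
        # Iterate over a copy so a later close() does not disturb callers.
        return iter(list(self._results))
```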

Miscellaneous:

  • Minor cleanup and refactoring for resource URI handling and idempotency token extraction.

Type of change

  • [] Bug fix (non-breaking change which fixes an issue)
  • [✓] New feature (non-breaking change which adds functionality)
  • [] Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • [] This change requires a documentation update
