feat(channel): add message send retry mechanism with exponential backoff #2478
chengyongru commented Mar 25, 2026
- Add send_max_retries config option (default: 3, range: 0-10)
- Implement _send_with_retry in ChannelManager with 1s/2s/4s backoff
- Propagate CancelledError for graceful shutdown
- Fix telegram send_delta to raise exceptions for Manager retry
- Add comprehensive tests for retry logic
- Document channel settings in README
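For readers skimming the thread, the retry loop described above can be sketched roughly as follows. This is not the code from the PR: the standalone function `send_with_retry`, its `base_delay` and `sleep` parameters, and the reading that the first attempt is followed by up to `max_retries` retries are all assumptions made for illustration.

```python
import asyncio
from typing import Awaitable, Callable


async def send_with_retry(
    send: Callable[[], Awaitable[None]],
    max_retries: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> None:
    """Call send() once, then retry up to max_retries times with doubling delays.

    Hypothetical helper: the names and the exact retry semantics are assumptions.
    """
    attempt = 0
    while True:
        try:
            await send()
            return
        except asyncio.CancelledError:
            # Shutdown in progress: never retry, let the cancellation propagate.
            raise
        except Exception:
            if attempt >= max_retries:
                # Retries exhausted: surface the last failure to the caller.
                raise
            # Delay doubles each time: base_delay, 2 * base_delay, 4 * base_delay, ...
            await sleep(base_delay * 2 ** attempt)
            attempt += 1
```

The `sleep` parameter exists only so tests can replace the real delay with a recorder; with the defaults, three retries wait 1s, 2s, and 4s.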
Manually tested.
Make channel delivery failures raise consistently so retry policy lives in ChannelManager rather than being split across individual channels. Tighten Telegram stream finalization, clarify sendMaxRetries semantics, and align the docs with the behavior the system actually guarantees.
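To make the documented bounds concrete, here is a minimal sketch of validating the setting, assuming the default of 3 and the range 0-10 stated in the PR description. The `ChannelSettings` dataclass is hypothetical and uses only the standard library; the project's real config model may look quite different.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelSettings:
    """Hypothetical settings holder, not the project's actual config class."""

    send_max_retries: int = 3  # default from the PR description

    def __post_init__(self) -> None:
        value = self.send_max_retries
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int):
            raise TypeError("send_max_retries must be an integer")
        if not 0 <= value <= 10:
            raise ValueError("send_max_retries must be between 0 and 10")
```

Rejecting out-of-range values at load time keeps the retry loop itself free of defensive checks.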
Re-bin left a comment
This PR is going in the right direction.
Retries should not be scattered across channels. They should live in one place. Failure should mean one thing. And the system should not promise behavior it does not actually deliver.
So I tightened the design.
What changed
- kept retry policy in ChannelManager
- made channel send failures raise consistently instead of being silently swallowed
- fixed Telegram streaming so finalization failure can also participate in retry
- clarified sendMaxRetries semantics so config and behavior say the same thing
- tightened the README so it describes the real contract, not an aspirational one
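The difference between swallowing and raising can be illustrated with a small sketch. Everything here is hypothetical (`DeliveryError`, `_transport`, both channel classes, and `delivered`); it only shows why a caller cannot retry a failure it never sees.

```python
import asyncio
import logging

logger = logging.getLogger(__name__)


class DeliveryError(Exception):
    """Hypothetical error type for a failed channel send."""


async def _transport(text: str) -> None:
    # Stand-in for a real network call; it always fails in this sketch.
    raise ConnectionError("network down")


class SwallowingChannel:
    """Old style: logs the failure and returns, so the caller sees success."""

    async def send(self, text: str) -> None:
        try:
            await _transport(text)
        except Exception:
            logger.warning("send failed")


class RaisingChannel:
    """New style: reports the failure so the caller can decide to retry."""

    async def send(self, text: str) -> None:
        try:
            await _transport(text)
        except Exception as exc:
            raise DeliveryError("send failed") from exc


async def delivered(channel, text: str) -> bool:
    """Return True if send() completed without raising DeliveryError."""
    try:
        await channel.send(text)
        return True
    except DeliveryError:
        return False
```

With the swallowing channel, `delivered` reports success even though nothing was sent; only the raising channel gives a manager-level retry loop anything to act on.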
Why this matters
A system like this should feel simple.
A channel should do one job: send.
If sending fails, it should say so.
The manager should do one job: decide whether to retry.
That is cleaner.
That is easier to reason about.
That is easier to test.
And most importantly, that is easier to trust.
Result
The structure is now more minimal and more decoupled.
Instead of adding retry behavior on top of inconsistent failure handling, the code now has a clearer contract:
- channels perform delivery
- the manager owns retry strategy
- documentation matches reality
That is a much better foundation.
Validation
- focused channel and Telegram retry tests passed
- full test suite passed locally