Patch for "Publisher connection timeout causes exception and message losing" #45
debasishg merged 2 commits into debasishg:master
|
I still have an issue with this fix. It only resolves the case where the input is closed. I still hit the issue when the output is closed, and it is really hard to reproduce: I can't trigger it manually, and it only happens in production. I have added a new patch to my system. If it works, I will update it |
|
Thanks a lot .. I will check it over the weekend and merge .. |
|
My latest change makes IO.write throw new RedisConnectionException("connection is closed. write error") when an exception occurs during os.write(data) or os.flush. The problem with reconnecting at that point is that, although the new socket connection is established, there is a read from the result function immediately afterwards, and that read blocks until the timeout. The default timeout is 5 minutes, so a message is blocked for 5 minutes by default. I hope that explains the blocking we saw on prod. Judging from the earlier timeout exception, the "connected" method doesn't seem to work well. I changed reconnect to throw a new RedisConnectionException so that the catch block in the send method reconnects and retries quickly. This issue is really hard to reproduce, so I have to test it on prod. I will let you know how it goes after the weekend. Meanwhile, if you figure out any other cause, please let me know. Thanks |
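The approach described above can be illustrated with a minimal sketch. This is not the actual patch or the library's real `IO` trait: `SketchConnection`, its `openStream` parameter, and the `reconnects` counter are hypothetical names introduced only for illustration, and `RedisConnectionException` is redefined locally as a stand-in. The idea shown is only that a failed write is turned into a dedicated exception, so that `send` can reconnect and retry right away rather than going on to a read that blocks until the socket timeout.

```scala
import java.io.{IOException, OutputStream}

// Stand-in for the library's exception type (assumption: redefined here
// so that the sketch is self-contained).
case class RedisConnectionException(message: String)
    extends RuntimeException(message)

// Hypothetical, simplified connection wrapper; not the real IO trait.
// `openStream` abstracts "open a new socket and return its output stream".
class SketchConnection(openStream: () => OutputStream) {
  private var os: OutputStream = openStream()
  var reconnects = 0 // illustration only: counts reconnect attempts

  def reconnect(): Unit = {
    // Best-effort close of the dead stream, then open a fresh one.
    try os.close() catch { case _: IOException => () }
    os = openStream()
    reconnects += 1
  }

  // Any I/O failure during write or flush is reported as a
  // RedisConnectionException, using the message quoted in the comment above.
  def write(data: Array[Byte]): Unit =
    try {
      os.write(data)
      os.flush()
    } catch {
      case _: IOException =>
        throw RedisConnectionException("connection is closed. write error")
    }

  // On a connection failure, reconnect and retry the write once. A second
  // failure propagates to the caller.
  def send(data: Array[Byte]): Unit =
    try write(data)
    catch {
      case _: RedisConnectionException =>
        reconnect()
        write(data)
    }
}
```

In this sketch the retry happens exactly once. Whether the real client should retry more than once, or also guard the subsequent read, is a separate design question that the thread does not settle.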
|
All fixes are in. So far, no delay has been seen on the first message in the connection-timeout edge cases. |
Patch for "Publisher connection timeout causes exception and message losing"
the quick fix for your consideration