
fix waitall deadlock if any errors occur #60030

Merged
vtjnash merged 1 commit into master from jn/waitall-failfast-hang on Nov 5, 2025

@vtjnash vtjnash commented Nov 3, 2025

When an error occurs, `waitall` may skip allocating the Channel producers, which deadlocks the subsequent loop if the caller asked it to `failfast` (ironically). This has shown up as frequent failures of the `threads_exec` test ever since the test for this call was added. Simplify the implementation by using separate loops for the wait and for computing the return value.
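
For readers unfamiliar with the call, here is a minimal sketch of the kind of usage affected, assuming Julia 1.12 or later, where `waitall` is available. The two tasks are hypothetical and are not taken from the `threads_exec` test; the snippet only illustrates a `failfast` call in which one task errors, and it is not claimed to reproduce the hang.

```julia
# Illustrative only: two made-up tasks, one of which fails immediately.
tasks = [
    Threads.@spawn(error("boom")),  # fails right away
    Threads.@spawn(sleep(0.5)),     # finishes normally a little later
]

# failfast=true: return as soon as any task fails.
# throw=false: hand back the task lists instead of rethrowing the failure.
done, remaining = waitall(tasks; failfast=true, throw=false)

@assert any(istaskfailed, done)               # the failed task is reported as done
@assert length(done) + length(remaining) == 2 # every task is in exactly one list
```

With `throw=true` (the default), the same call would instead raise an exception for the failed task.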

@vtjnash vtjnash requested a review from jakobnissen November 3, 2025 21:51
@vtjnash vtjnash added the backport 1.12 and backport 1.13 labels Nov 3, 2025
@KristofferC KristofferC mentioned this pull request Nov 5, 2025
17 tasks
@vtjnash vtjnash merged commit e2f3178 into master Nov 5, 2025
10 checks passed
@vtjnash vtjnash deleted the jn/waitall-failfast-hang branch November 5, 2025 22:26
KristofferC pushed a commit that referenced this pull request Nov 7, 2025
(cherry picked from commit e2f3178)
@KristofferC KristofferC mentioned this pull request Nov 7, 2025
35 tasks
KristofferC pushed a commit that referenced this pull request Nov 10, 2025
(cherry picked from commit e2f3178)
@KristofferC KristofferC removed the backport 1.12 label Nov 21, 2025
@KristofferC KristofferC removed the backport 1.13 label Nov 28, 2025
kpamnany pushed a commit to RelationalAI/julia that referenced this pull request Jan 21, 2026
(cherry picked from commit e2f3178)

2 participants