Skip to content

Fix panic in handleSubscribeHTTP when client disconnects during publish#1598

Merged
binwiederhier merged 1 commit into
mainfrom
crashfix
Feb 8, 2026
Merged

Fix panic in handleSubscribeHTTP when client disconnects during publish#1598
binwiederhier merged 1 commit into
mainfrom
crashfix

Conversation

@binwiederhier
Copy link
Copy Markdown
Owner

@binwiederhier binwiederhier commented Feb 8, 2026

(This PR is AI generated, but a human [me] has verified that it is accurate)

Summary

  • Fixes a server crash (nil pointer dereference / SIGSEGV) in handleSubscribeHTTP caused by a race between a topic.Publish goroutine flushing the HTTP response writer and the handler returning after client disconnect.
  • Replaces the existing wlock.TryLock() hack with a proper wlock.Lock() + closed flag pattern that correctly waits for in-flight writes to finish and prevents future writes to a cleaned-up response writer.

Root cause

topic.Publish() copies the subscribers map and calls each subscriber in its own goroutine. When a client disconnects:

  1. handleSubscribeHTTP returns, triggering deferred cleanup
  2. Go's HTTP server cleans up the response writer
  3. But a Publish goroutine may still be in-flight, calling Flush() on the now-invalid writer

The old TryLock() approach had two failure modes:

  • TryLock succeeds (no sub running): future sub() calls deadlock on wlock.Lock() (goroutine leak, since the lock is never released)
  • TryLock fails (sub is running): the handler returns immediately without waiting, and Go cleans up the writer while sub is still flushing — exactly the panic

Fix

The new approach:

  • Deferred cleanup uses a blocking wlock.Lock() to wait for any in-flight sub() call to finish, then sets closed = true
  • sub() function checks the closed flag after acquiring the lock — if the connection has been closed, it returns without touching the response writer

This guarantees the response writer is never accessed after the handler returns.

Stack trace this fixes

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x5b3525]

goroutine 4355370 [running]:
bufio.(*Writer).Flush(0xc061282940)
net/http.(*response).FlushError(0xc0329b7ce0)
net/http.(*response).Flush(0xc04935ba10?)
heckel.io/ntfy/v2/server.(*Server).handleSubscribeHTTP.func2(...)
        server/server.go:1454
heckel.io/ntfy/v2/server.(*topic).Publish.func1.1(...)
        server/topic.go:117

Test plan

  • go vet ./server/ passes
  • Verify with make check
  • Soak test under load with concurrent publishers and disconnecting subscribers

Replace wlock.TryLock() with a proper Lock() + closed flag to prevent
writing to a response writer that has been cleaned up after the handler
returns. The previous TryLock approach could not guarantee the response
writer was still valid when a concurrent Publish goroutine called Flush.
@binwiederhier binwiederhier changed the title Fix panic in handleSubscribeHTTP when client disconnects during publish AI: Fix panic in handleSubscribeHTTP when client disconnects during publish Feb 8, 2026
@binwiederhier
Copy link
Copy Markdown
Owner Author

This is strongly related to this issue from 3 years ago: #338 (comment)

I am verifying that this fix is accurate and correct.

@binwiederhier binwiederhier merged commit 3647d39 into main Feb 8, 2026
4 checks passed
@binwiederhier
Copy link
Copy Markdown
Owner Author

I verified by THINKING that this is a good fix, and it actually does likely fix a goroutine leak as well.

Cursor wrote a fantastic test TestServer_SubscribeHTTP_NoWriteAfterHandlerReturn to test this as best as possible. I struggled back then to reproduce this because it only happens in a very narrow time window. This test is much better because it reproduces the "write after close" situation reliably, albeit not exactly like before.

I verified that the test fails without the fix.

@binwiederhier binwiederhier changed the title AI: Fix panic in handleSubscribeHTTP when client disconnects during publish Fix panic in handleSubscribeHTTP when client disconnects during publish Feb 8, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant