Skip to content

Infinite loop when connection unexpectedly dies #11

@nvx

Description

@nvx

When the connection is interrupted unexpectedly (eg, stateful firewall loses track of the connection) it seems that when database/sql tries to signal to the driver to close the connection it gets stuck in a busy loop causing 100% CPU usage.

Stack trace of the offending goroutine running under go version go1.12.9 linux/amd64:

goroutine 855659 [runnable]:
syscall.Syscall(0x0, 0x7, 0xc0001565a8, 0x1, 0x0, 0x1, 0x0)
        /usr/local/go/src/syscall/asm_linux_amd64.s:18 +0x5
syscall.read(0x7, 0xc0001565a8, 0x1, 0x8, 0x0, 0x0, 0x0)
        /usr/local/go/src/syscall/zsyscall_linux_amd64.go:732 +0x5a
syscall.Read(...)
        /usr/local/go/src/syscall/syscall_unix.go:172
internal/poll.(*FD).Read(0xc00014eb00, 0xc0001565a8, 0x1, 0x8, 0x0, 0x0, 0x0)
        /usr/local/go/src/internal/poll/fd_unix.go:165 +0x131
net.(*netFD).Read(0xc00014eb00, 0xc0001565a8, 0x1, 0x8, 0x0, 0xe97160, 0xc000096050)
        /usr/local/go/src/net/fd_unix.go:202 +0x4f
net.(*conn).Read(0xc0000ac2b8, 0xc0001565a8, 0x1, 0x8, 0x0, 0x0, 0x0)
        /usr/local/go/src/net/net.go:177 +0x69
io.ReadAtLeast(0x7f47d8c79a08, 0xc0004741c0, 0xc0001565a8, 0x1, 0x8, 0x1, 0x0, 0xe97160, 0xc000096050)
        /usr/local/go/src/io/io.go:310 +0x88
io.ReadFull(...)
        /usr/local/go/src/io/io.go:329
github.com/thda/tds/binary.(*Encoder).Read(0xc000156588, 0xc0001565a8, 0x1, 0x8, 0x0, 0xe97160, 0xc000096050)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/binary/encoder.go:78 +0x6b
github.com/thda/tds/binary.(*Encoder).ReadByte(...)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/binary/encoder.go:306
github.com/thda/tds.(*header).read(0xc000156490, 0xc000156588, 0xe97160, 0xc000096050)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/buffer.go:42 +0x1ab
github.com/thda/tds.(*buf).readPkt(0xc000156480, 0xc000499001, 0xe97160, 0xc000096050)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/buffer.go:140 +0x54
github.com/thda/tds.(*buf).processCancel(0xc000156480, 0x0, 0x0)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/buffer.go:554 +0xcf
github.com/thda/tds.(*buf).sendPkt(0xc000156480, 0xeb3401, 0xc0003c8350, 0x0)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/buffer.go:187 +0x1c0
github.com/thda/tds.(*buf).send(0xc000156480, 0xeadfe0, 0xc0002fb620, 0xc0003c830f, 0xc000499180, 0x1, 0x1, 0x0, 0x0)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/buffer.go:398 +0xee
github.com/thda/tds.(*session).Close(0xc0003ff040, 0x0, 0x0)
        /home/jenkins_slave/workspace/tds-test/src/github.com/thda/tds/session.go:342 +0x192
database/sql.(*driverConn).finalClose.func2()
        /usr/local/go/src/database/sql/sql.go:521 +0x49
database/sql.withLock(0xea1e80, 0xc000119280, 0xc000499270)
        /usr/local/go/src/database/sql/sql.go:3097 +0x63
database/sql.(*driverConn).finalClose(0xc000119280, 0x0, 0x4)
        /usr/local/go/src/database/sql/sql.go:519 +0x130
database/sql.(*driverConn).Close(0xc000119280, 0xc000097390, 0xc000097390)
        /usr/local/go/src/database/sql/sql.go:500 +0x138
database/sql.(*DB).putConn(0xc00010e180, 0xc000119280, 0xe97160, 0xc000097390, 0xc000119201)
        /usr/local/go/src/database/sql/sql.go:1250 +0x2d5
database/sql.(*driverConn).releaseConn(...)
        /usr/local/go/src/database/sql/sql.go:421
database/sql.(*DB).pingDC(0xc00010e180, 0xeae020, 0xc000520e40, 0xc000119280, 0xc0004994a8, 0x0, 0x0)
        /usr/local/go/src/database/sql/sql.go:719 +0x84
database/sql.(*DB).PingContext(0xc00010e180, 0xeae020, 0xc000520e40, 0xc000499520, 0xd567e0)
        /usr/local/go/src/database/sql/sql.go:742 +0x170
<truncated>

In this instance, db.PingContext() was called which made the database/sql package realise there was a bad connection in the pool. Relevant bit of code in the database/sql package that kicks off the chain of events: https://github.com/golang/go/blob/release-branch.go1.12/src/database/sql/sql.go#L1250

Subsequent calls to db.PingContext() worked fine using a fresh connection while the offending go routine was still caught up in a busy loop.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions