Skip to content

PSQL server stops responding #27

@hmntsharma

Description

@hmntsharma

Hi @TimEvens ,

Thanks for your help so far.

I tried to connect multiple routers all at once with bmp collector and it created a large data dump to be injested by the psql-app and to be finally stored in the psql database.

I have tested it repeatedly now, that the psql docker container remains up but the database stops responding and the psql-app connection times out.

I have found the following errors


==> Done upgrading to 2.2.2 SQL ==================================
===> Done with upgrade
===> Starting consumer
Archive:  /usr/local/openbmp/obmp-psql-consumer.jar
  inflating: obmp-psql.yml
===> Now running!!!
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.







org.postgresql.util.PSQLException: The connection attempt failed.
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:331)
        at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
        at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
        at org.postgresql.Driver.makeConnection(Driver.java:400)
        at org.postgresql.Driver.connect(Driver.java:259)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:681)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:190)
        at org.openbmp.PSQLHandler.connect(PSQLHandler.java:94)
        at org.openbmp.PSQLHandler.updateQuery(PSQLHandler.java:193)
        at org.openbmp.WriterRunnable.run(WriterRunnable.java:120)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
        at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
        at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
        at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1455)
        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1059)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
        at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        at org.postgresql.core.PGStream.receiveChar(PGStream.java:453)
        at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:631)
        at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:180)
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:235)
        ... 14 more

More logs

Caused by: java.net.SocketTimeoutException: Read timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
        at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
        at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
        at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1455)
        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1059)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
        at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        at org.postgresql.core.PGStream.receiveChar(PGStream.java:453)
        at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:631)
        at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:180)
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:235)
        ... 14 more
org.postgresql.util.PSQLException: The connection attempt failed.
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:331)
        at org.postgresql.core.ConnectionFactory.openConnection(ConnectionFactory.java:49)
        at org.postgresql.jdbc.PgConnection.<init>(PgConnection.java:223)
        at org.postgresql.Driver.makeConnection(Driver.java:400)
        at org.postgresql.Driver.connect(Driver.java:259)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:681)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:190)
        at org.openbmp.PSQLHandler.connect(PSQLHandler.java:94)
        at org.openbmp.PSQLHandler.updateQuery(PSQLHandler.java:193)
        at org.openbmp.WriterRunnable.run(WriterRunnable.java:120)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
        at java.base/java.lang.Thread.run(Thread.java:833)
Caused by: java.net.SocketTimeoutException: Read timed out
        at java.base/sun.nio.ch.NioSocketImpl.timedRead(NioSocketImpl.java:283)
        at java.base/sun.nio.ch.NioSocketImpl.implRead(NioSocketImpl.java:309)
        at java.base/sun.nio.ch.NioSocketImpl.read(NioSocketImpl.java:350)
        at java.base/sun.nio.ch.NioSocketImpl$1.read(NioSocketImpl.java:803)
        at java.base/java.net.Socket$SocketInputStream.read(Socket.java:966)
        at java.base/sun.security.ssl.SSLSocketInputRecord.read(SSLSocketInputRecord.java:478)
        at java.base/sun.security.ssl.SSLSocketInputRecord.readHeader(SSLSocketInputRecord.java:472)
        at java.base/sun.security.ssl.SSLSocketInputRecord.bytesInCompletePacket(SSLSocketInputRecord.java:70)
        at java.base/sun.security.ssl.SSLSocketImpl.readApplicationRecord(SSLSocketImpl.java:1455)
        at java.base/sun.security.ssl.SSLSocketImpl$AppInputStream.read(SSLSocketImpl.java:1059)
        at org.postgresql.core.VisibleBufferedInputStream.readMore(VisibleBufferedInputStream.java:161)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:128)
        at org.postgresql.core.VisibleBufferedInputStream.ensureBytes(VisibleBufferedInputStream.java:113)
        at org.postgresql.core.VisibleBufferedInputStream.read(VisibleBufferedInputStream.java:73)
        at org.postgresql.core.PGStream.receiveChar(PGStream.java:453)
        at org.postgresql.core.v3.ConnectionFactoryImpl.doAuthentication(ConnectionFactoryImpl.java:631)
        at org.postgresql.core.v3.ConnectionFactoryImpl.tryConnect(ConnectionFactoryImpl.java:180)
        at org.postgresql.core.v3.ConnectionFactoryImpl.openConnectionImpl(ConnectionFactoryImpl.java:235)
        ... 14 more

There are no errors or logs in the obmp-psql container itself

The openbmp database does not connect either via CLI or grafana (all dashboards get timedout)

I have 24 vCPU core and 128 GB RAM with 500GB storage, I thought perhaps my disks are slow, although ioping suggests it should be good enough

Read test:

$ ioping -q -c 10 -s 8k .

--- . (xfs /dev/dm-2) ioping statistics ---
9 requests completed in 3.91 ms, 72 KiB read, 2.30 k iops, 18.0 MiB/s
generated 10 requests in 9.00 s, 80 KiB, 1 iops, 8.89 KiB/s
min/avg/max/mdev = 383.3 us / 434.1 us / 506.8 us / 37.6 us

Write test:

 $ ioping -q -c 10 -s 8k -W .

--- . (xfs /dev/dm-2) ioping statistics ---
9 requests completed in 5.10 ms, 72 KiB written, 1.76 k iops, 13.8 MiB/s
generated 10 requests in 9.00 s, 80 KiB, 1 iops, 8.89 KiB/s
min/avg/max/mdev = 452.8 us / 566.7 us / 647.4 us / 54.9 us

To rule out disk I/O issue, I even mounted the tmpfs with the podman container but the result was same over and over again.

The high write workload request is causing this, with less number of tcp sessions, there is no issue at all.

Please help!

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions