Session restoration fails permanently due to gopcua CloseSession during reconnect #18694

@skartikey

Relevant telegraf.conf

[[inputs.opcua_listener]]
    endpoint = "opc.tcp://<server>:4863"
    security_policy = "None"
    security_mode = "None"
    # subscription with 78 monitored items

Logs from Telegraf

debug: client: monitor: disconnected
debug: client: monitor: auto-reconnecting
debug: client: monitor: action: createSecureChannel
debug: client: monitor: secure channel recreated
debug: client: monitor: action: restoreSession
debug: client: monitor: trying to restore session
# ActivateSession succeeds, then CloseSession is sent immediately
uasc 3/2: recv *ua.ActivateSessionResponse
uasc 3/3: send *ua.CloseSessionRequest
uasc 3/3: recv *ua.CloseSessionResponse
debug: client: monitor: session restored
debug: client: monitor: trying to update namespaces
uasc 3/4: err: The session id is not valid. StatusBadSessionIDInvalid (0x80250000)
debug: client: monitor: updating namespaces failed: The session id is not valid.
# Second attempt: recreateSession succeeds but TransferSubscriptions fails
debug: client: monitor: session recreated
debug: client: monitor: action: transferSubscriptions
uacp 4: recv ERRF with 16 bytes
debug: client: monitor: transfer subscriptions failed. Recreating all subscriptions: EOF
debug: client: monitor: action: restoreSubscriptions
debug: sub 11: recreate_create: failed to recreate subscription
debug: client: monitor: recreate subscripitions failed: write tcp ...: broken pipe
debug: client: monitor: no subscriptions to resume
# Telegraf is now permanently disconnected, no further reconnection attempts

System info

Telegraf 1.38.2 (Docker), connecting to Siemens OpenPCS7 V9.2 OPC UA Server

Docker

No response

Steps to reproduce

  1. Configure opcua_listener with subscription to a Siemens OpenPCS7 server
  2. Wait for a period of low data change activity (15-25 minutes with few/no value changes)
  3. The server drops the TCP connection (EOF)
  4. gopcua's auto-reconnect triggers but fails permanently

Expected behavior

Telegraf should recover from connection drops and re-establish subscriptions, either through gopcua's auto-reconnect or through its own reconnection logic.

Actual behavior

After the connection drops, gopcua's restoreSession path calls ActivateSession, which internally calls CloseSession on the same session (with DeleteSubscriptions: true). This destroys the session and its subscriptions. The subsequent TransferSubscriptions and restoreSubscriptions calls both fail because the server closes the connection. Telegraf ends up permanently disconnected, with no further reconnection attempts.

Additional info

Root cause analysis:

The bug is in gopcua v0.8.0's ActivateSession method (client.go:906). During reconnection, when restoreSession calls ActivateSession(ctx, existingSession), the method's response handler calls c.CloseSession() on the "previous" session. But during restoreSession, the previous session IS the session being restored, so it immediately closes what it just activated.

This was partially fixed in gopcua/opcua#700 for the recreateSession path (by setting c.setSession(nil) before calling ActivateSession), but the restoreSession path was not fixed.
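The failure mode can be reproduced in a small standalone model. This is a simplified sketch, not gopcua's actual code: `Session`, `Client`, `activateBuggy`, `activateGuarded`, and `restore` are hypothetical names, and the guarded variant is only one possible way to avoid the self-close, not necessarily how upstream would fix it.

```go
package main

import "fmt"

// Session is a stand-in for a server-side session.
type Session struct {
	ID     int
	Closed bool
}

// Client is a hypothetical model that tracks one current session.
// It is NOT gopcua's Client type.
type Client struct {
	current *Session
}

// activateBuggy models the reported behavior: after activating s it closes
// the previously current session without checking whether that is s itself.
func (c *Client) activateBuggy(s *Session) {
	prev := c.current
	c.current = s
	if prev != nil {
		prev.Closed = true // models CloseSession on the "previous" session
	}
}

// activateGuarded closes the previous session only if it differs from s.
func (c *Client) activateGuarded(s *Session) {
	prev := c.current
	c.current = s
	if prev != nil && prev != s {
		prev.Closed = true
	}
}

// restore re-activates the client's existing session and reports whether
// the session is still usable afterwards.
func restore(c *Client, activate func(*Session)) bool {
	activate(c.current)
	return !c.current.Closed
}

func main() {
	buggy := &Client{current: &Session{ID: 1}}
	fmt.Println("buggy restore usable:", restore(buggy, buggy.activateBuggy))

	guarded := &Client{current: &Session{ID: 1}}
	fmt.Println("guarded restore usable:", restore(guarded, guarded.activateGuarded))

	// Recreate path, modelled on the setSession(nil) approach described above:
	// clearing the current session first means there is nothing to close.
	rec := &Client{current: &Session{ID: 1}}
	rec.current = nil
	rec.activateBuggy(&Session{ID: 2})
	fmt.Println("recreate usable:", !rec.current.Closed)
}
```

In this model the buggy restore leaves the session closed, while both the guarded restore and the clear-before-activate recreate path leave it usable, which matches the asymmetry between the two paths described in the report.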

Labels

bug (unexpected problem or unintended behavior)
upstream (bug or issues that rely on dependency fixes)
