fix restoreSession closing the session it just re-activated #855

Open
skartikey wants to merge 1 commit into gopcua:main from skartikey:fix/restore-session-close

@skartikey

During reconnect, the restoreSession path calls ActivateSession with the existing session. ActivateSession's internal callback always calls CloseSession on the client's current session before setting the new one. Because the old and new sessions are the same object, this sends a CloseSessionRequest with DeleteSubscriptions: true for the session that was just re-activated, destroying all subscriptions on the server.

Clear the session reference before calling ActivateSession, matching the fix applied to recreateSession in #700. The session object is still held by the local variable and gets re-set inside the callback.
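
To make the failure mode concrete, here is a minimal, self-contained sketch of the pattern. It is a toy model, not the gopcua code: the `client` and `session` types and the `activateSession`, `restoreBuggy`, and `restoreFixed` names are hypothetical, and the "close" is reduced to setting a flag. It only illustrates why clearing the reference first prevents the close from hitting the session being restored.

```go
package main

import "fmt"

// session is a hypothetical stand-in for the client's session object.
type session struct {
	id     string
	closed bool
}

// client is a toy model, not the gopcua Client type.
type client struct {
	sess *session
}

// closeSession models the effect of a CloseSessionRequest: afterwards the
// session (and, on a real server, its subscriptions) is gone.
func (c *client) closeSession(s *session) {
	if s == nil {
		return // nothing to close
	}
	s.closed = true
}

// activateSession models the behaviour described in this PR: it closes
// whatever session the client currently references, then stores s.
func (c *client) activateSession(s *session) {
	c.closeSession(c.sess)
	c.sess = s
}

// restoreBuggy re-activates the existing session without clearing the
// reference, so the "old" session that gets closed is the same object.
func (c *client) restoreBuggy() {
	s := c.sess
	c.activateSession(s)
}

// restoreFixed clears the reference first; the local variable keeps the
// session alive and activateSession sets it again.
func (c *client) restoreFixed() {
	s := c.sess
	c.sess = nil
	c.activateSession(s)
}

func main() {
	buggy := &client{sess: &session{id: "s1"}}
	buggy.restoreBuggy()
	fmt.Println("buggy: closed =", buggy.sess.closed) // true

	fixed := &client{sess: &session{id: "s1"}}
	fixed.restoreFixed()
	fmt.Println("fixed: closed =", fixed.sess.closed) // false
}
```

In the sketch, the only difference between the two paths is the `c.sess = nil` assignment before activation, which is the shape of the change proposed here.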

Fixes #854

@skartikey
Author

Adding context from a production report. A user running Telegraf 1.38.2 with inputs.opcua_listener against a Siemens OpenPCS7 V9.2 server hit this exact bug. The connection dropped after ~25 minutes of low data-change activity, and the auto-reconnect sequence played out exactly as described in #854:

restoreSession -> ActivateSession (success) -> CloseSession (!!!) -> "session restored"
-> ReadRequest -> StatusBadSessionIDInvalid
-> recreateSession (success) -> TransferSubscriptions -> ERRF/EOF
-> restoreSubscriptions -> broken pipe
-> "no subscriptions to resume"

After this, the client was permanently disconnected with no further reconnection attempts. The fix here matches what #700 did for recreateSession and is the minimal correct change.

One question: should closeSession(nil) also be guarded with a debug log (e.g., dlog.Printf("skipping close of nil session")) to make future troubleshooting easier, or is the current silent return-early sufficient?

Development

Successfully merging this pull request may close these issues.

restoreSession closes the session it just re-activated, deleting all subscriptions
