-
Notifications
You must be signed in to change notification settings - Fork 13
Closed
Description
After
Recovering from split-brain; firing leftCluster()/joinedCluster() events
the ClusterListener proceeds to fireLeftClusterAndWaitToComplete. This is not guaranteed to wait for leftCluster handling on other nodes. Possibly this results in interleaving of execution of various left/joinCluster handlers.
2021.10.14 09:56:41 �[33mWARN �[m [hz.openfire.event-2]: org.jivesoftware.openfire.plugin.util.cache.ClusterListener - Recovering from split-brain; firing leftCluster()/joinedCluster() events
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.priority-generic-operation.thread-0]: com.hazelcast.internal.partition.InternalPartitionService - [172.60.0.20]:5701 [openfire] [3.12.5] Applied partition state update with version: 1808
2021.10.14 09:56:41 �[32mINFO �[m [hz.openfire.cached.thread-4]: com.hazelcast.nio.tcp.TcpIpConnector - [172.60.0.20]:5701 [openfire] [3.12.5] Connecting to /172.60.0.10:5701, timeout: 10000, bind-any: true
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: com.hazelcast.nio.tcp.TcpIpEndpointManager - [172.60.0.20]:5701 [openfire] [3.12.5] Established socket connection between /172.60.0.20:40609 and /172.60.0.10:5701
2021.10.14 09:56:41 �[32mINFO �[m [hz.openfire.IO.thread-in-0]: com.hazelcast.nio.tcp.TcpIpConnection - [172.60.0.20]:5701 [openfire] [3.12.5] Initialized new cluster connection between /172.60.0.20:5701 and /172.60.0.10:55053
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.event-2]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Firing leftCluster() event
2021.10.14 09:56:41 �[32mINFO �[m [hz.openfire.event-2]: org.jivesoftware.openfire.cluster.ClusterManager - Firing left cluster event for this node
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.event-2]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Waiting for leftCluster() event to be called [timeout=30 seconds]
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-2]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Going to send 'join' presence stanzas on the local cluster node for 0 occupant(s) of cluster node b6a7aa53-e796-4d23-b2b1-b80650fe7181 that is now part of our cluster.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-2]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Going to send 'join' presence stanzas on the local cluster node for 0 occupant(s) of cluster node b6a7aa53-e796-4d23-b2b1-b80650fe7181 that is now part of our cluster.
2021.10.14 09:56:41 �[32mINFO �[m [ClusterManager events dispatcher]: org.jivesoftware.util.cache.CacheFactory - Clustering stopped; cache migration complete
2021.10.14 09:56:41 �[32mINFO �[m [hz.openfire.IO.thread-in-1]: com.hazelcast.nio.tcp.TcpIpConnection - [172.60.0.20]:5701 [openfire] [3.12.5] Initialized new cluster connection between /172.60.0.20:40609 and /172.60.0.10:5701
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.spi.RoutingTableImpl - Restoring cache content for cache 'Routing Servers Cache' by adding all outgoing server routes that are connected to the local cluster node.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.spi.RoutingTableImpl - Restoring cache content for cache 'Routing Components Cache' by adding all component routes that are connected to the local cluster node.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.spi.RoutingTableImpl - Restoring cache content for cache 'Routing Users Cache', 'Routing AnonymousUsers Cache' and 'Routing User Sessions' by adding all client routes that are connected to the local cluster node.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.spi.RoutingTableImpl - Adding client route user2@xmpp.localhost.example/jsxc-fc5ea58c
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.spi.RoutingTableImpl - Adding client route ClientRoute{nodeID=a7206884-1c97-4915-a969-68d5c33dce18, available=true} to users cache under key user2@xmpp.localhost.example/jsxc-fc5ea58c
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.muc.spi.OccupantManager - Replaced occupants with arguments null Occupant{roomName='muc1', nickname='user1', realJID=user1@xmpp.localhost.example/jsxc-7d2c6fd3, lastActive=2021-10-14T09:56:41.642849Z} 73230fd6-f29d-488e-aae4-775607a41246
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Going to send 'join' presence stanzas on the local cluster node for 1 occupant(s) of cluster node 73230fd6-f29d-488e-aae4-775607a41246 that is now part of our cluster.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.muc.spi.MUCPersistenceManager - Attempting to load room 'muc1' from the database.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.muc.spi.OccupantManager - Replaced occupants with arguments Occupant{roomName='muc1', nickname='user1', realJID=user1@xmpp.localhost.example/jsxc-7d2c6fd3, lastActive=2021-10-14T09:56:41.642849Z} null 73230fd6-f29d-488e-aae4-775607a41246
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.muc.spi.OccupantManager - Replaced occupants with arguments null Occupant{roomName='muc1', nickname='user1', realJID=user1@xmpp.localhost.example/jsxc-7d2c6fd3, lastActive=2021-10-14T09:56:41.652447Z} 73230fd6-f29d-488e-aae4-775607a41246
2021.10.14 09:56:41 �[32mINFO �[m [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.muc.spi.OccupantManager - We received a copy of local MUC occupants from node 73230fd6-f29d-488e-aae4-775607a41246, but we already had this information. This hints at a possible inefficient sharing of data across the cluster.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Going to send 'join' presence stanzas on the local cluster node for 1 occupant(s) of cluster node 73230fd6-f29d-488e-aae4-775607a41246 that is now part of our cluster.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Cluster event: service conference.xmpp.localhost.example left a cluster - going to restore 0 rooms
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.OccupantManager - Reset occupants because we left the cluster
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.LocalMUCRoomManager - Restoring cache content for cache 'MUC Service 'conference' Rooms' after we left the cluster, by adding all MUC Rooms that are known to the local node.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.group.GroupJID - Parsing JID from string: admin@xmpp.localhost.example
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.muc.spi.FMUCHandler - (room: 'muc1@conference.xmpp.localhost.example'): Changing outbound join configuration. Existing: null, New: null
2021.10.14 09:56:41 �[33mWARN �[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - A remote cluster node (73230fd6-f29d-488e-aae4-775607a41246) tells us that user user1@xmpp.localhost.example/jsxc-7d2c6fd3 is supposed to be an occupant (using nickname 'user1') of a room named 'muc1' but the data in the cluster cache does not indicate that this is true.
2021.10.14 09:56:41 �[33mWARN �[m [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - A remote cluster node (73230fd6-f29d-488e-aae4-775607a41246) tells us that user user1@xmpp.localhost.example/jsxc-7d2c6fd3 is supposed to be an occupant (using nickname 'user1') of a room named 'muc1' but the data in the cluster cache does not indicate that this is true.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-5]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Finished sending 'join' presence stanzas on the local cluster node.
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.cached.thread-4]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Finished sending 'join' presence stanzas on the local cluster node.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Going to send 'leave' presence stanzas on the local cluster node for 1 occupant(s) of one or more cluster nodes that are no longer part of our cluster.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Stanza now being sent: <presence type="unavailable" to="user2@xmpp.localhost.example/jsxc-fc5ea58c" from="muc1@conference.xmpp.localhost.example/user1"><x xmlns="http://jabber.org/protocol/muc#user"><item role="none" jid="user1@xmpp.localhost.example/jsxc-7d2c6fd3"/></x></presence>
2021.10.14 09:56:41 �[33mWARN �[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Unable to find MUCRole for recipient 'user2@xmpp.localhost.example/jsxc-fc5ea58c' in room muc1@conference.xmpp.localhost.example while broadcasting 'leave' presence for occupants on disconnected cluster node(s).
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Finished sending 'leave' presence stanzas on the local cluster node.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Cluster event: service conference.xmpp.localhost.example left a cluster - going to restore 2 rooms
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.OccupantManager - Reset occupants because we left the cluster
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.LocalMUCRoomManager - Restoring cache content for cache 'MUC Service 'conference' Rooms' after we left the cluster, by adding all MUC Rooms that are known to the local node.
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Going to send 'leave' presence stanzas on the local cluster node for 0 occupant(s) of one or more cluster nodes that are no longer part of our cluster.
2021.10.14 09:56:41 �[32mINFO �[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.cluster.ClusterMonitor - This node (a7206884-1c97-4915-a969-68d5c33dce18/xmpp2.localhost.example) has left the cluster
2021.10.14 09:56:41 �[32mINFO �[m [ClusterManager events dispatcher]: org.jivesoftware.openfire.cluster.ClusterMonitor - Sending message to admins: The local node ('xmpp2.localhost.example') has left the cluster - this node no longer has any resilience (enabled=true)
2021.10.14 09:56:41 �[36mDEBUG�[m [TaskEngine-pool-12]: org.jivesoftware.openfire.XMPPServer - Sending message to admin [jid=admin@xmpp.localhost.example, message=The local node ('xmpp2.localhost.example') has left the cluster - this node no longer has any resilience]
2021.10.14 09:56:41 �[36mDEBUG�[m [TaskEngine-pool-12]: org.jivesoftware.openfire.spi.RoutingTableImpl - Unable to route packet. No session is available so store offline. <message from="xmpp.localhost.example" to="admin@xmpp.localhost.example"><body>The local node ('xmpp2.localhost.example') has left the cluster - this node no longer has any resilience</body></message>
2021.10.14 09:56:41 �[36mDEBUG�[m [TaskEngine-pool-12]: org.jivesoftware.openfire.spi.RoutingTableImpl - Failed to route packet to JID: admin@xmpp.localhost.example packet: <message from="xmpp.localhost.example" to="admin@xmpp.localhost.example"><body>The local node ('xmpp2.localhost.example') has left the cluster - this node no longer has any resilience</body></message>
2021.10.14 09:56:41 �[36mDEBUG�[m [TaskEngine-pool-12]: org.jivesoftware.openfire.MessageRouter - Message sent to unreachable address: <message from="xmpp.localhost.example" to="admin@xmpp.localhost.example"><body>The local node ('xmpp2.localhost.example') has left the cluster - this node no longer has any resilience</body></message>
2021.10.14 09:56:41 �[36mDEBUG�[m [ClusterManager events dispatcher]: org.jivesoftware.util.cache.CacheFactory - CacheFactory is not listening for cluster events anymore because it left the cluster
2021.10.14 09:56:41 �[36mDEBUG�[m [hz.openfire.event-2]: org.jivesoftware.openfire.plugin.util.cache.ClusterListener - Firing joinedCluster() event```
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
No labels