move stop signal listen to run function and refactor launch batcher#648
Conversation
There was a problem hiding this comment.
Pull request overview
This PR refactors node startup/shutdown wiring to support re-invoking SIGTERM listeners during lifecycle changes (e.g., dynamic reconfig), and aligns batcher launch flow with the router’s “start service then run” pattern.
Changes:
- Moved SIGTERM handling into
common/utilsasStopSignalListen(stopChan, ...)and wired it intoBatcher.Run(). - Refactored batcher launch to start its gRPC service via
Batcher.StartBatcherService()and adjustedCreateBatcherto no longer require aNetat construction time. - Updated tests and router service setup to match the new batcher initialization flow.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
common/utils/net.go |
Introduces reusable StopSignalListen with optional lifecycle stop channel. |
node/batcher/batcher.go |
Adds StartBatcherService(), Address(), and installs SIGTERM listener from within Run(). |
node/batcher/batcher_builder.go |
Changes CreateBatcher signature to no longer accept Net. |
node/server/arma.go |
Updates batcher launch path to use StartBatcherService(); switches SIGTERM wiring to utils.StopSignalListen. |
node/router/router.go |
Sets r.net earlier in StartRouterService(). |
node/batcher/utils_test.go |
Updates batcher creation to set Net after construction. |
test/utils_test.go |
Updates batcher creation to set Net after construction. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| b.logger.Infof("Starting batcher") | ||
| b.batcher.Start() | ||
| b.metrics.Start() | ||
| utils.StopSignalListen(b.stopChan, b, b.logger, b.Address()) | ||
| } |
There was a problem hiding this comment.
CreateBatcher no longer wires Net, but Run() now installs StopSignalListen (which will call b.Stop() on SIGTERM) and Stop() unconditionally calls b.Net.Stop(). To keep the API hard-to-misuse, consider enforcing b.Net is set before Run() (panic/return error with a clear message) or changing construction so Net is always initialized (e.g., StartBatcherService inside Run, or a constructor option).
There was a problem hiding this comment.
This follows the launch router design
There was a problem hiding this comment.
b.StartBatcherService() is responsible for initializing a gRPC server.
common/utils/net.go
Outdated
| func StopSignalListen(stopChan chan struct{}, node NodeStopper, logger *flogging.FabricLogger, nodeAddr string) { | ||
| signalChan := make(chan os.Signal, 1) | ||
| signal.Notify(signalChan, syscall.SIGTERM) | ||
|
|
||
| go func() { | ||
| defer signal.Stop(signalChan) | ||
|
|
||
| for { | ||
| select { | ||
| case <-signalChan: | ||
| logger.Infof("SIGTERM signal caught, the node listening on %s is about to shutdown:", nodeAddr) | ||
| node.Stop() | ||
| return | ||
| case <-stopChan: | ||
| logger.Infof("Exit StopSignalListen routine") | ||
| return | ||
| } | ||
| } | ||
| }() |
There was a problem hiding this comment.
StopSignalListen is newly added to net.go, but there are already unit tests for this file (common/utils/net_test.go) and the new shutdown behavior isn’t covered. Adding a test that closes stopChan and asserts the listener exits (and does not call node.Stop()) would help prevent goroutine leaks/regressions.
There was a problem hiding this comment.
Lets add an issue to unit test it after the dust settles on the main/launch refactoring.
| "github.com/hyperledger/fabric-x-orderer/request" | ||
| ) | ||
|
|
||
| func CreateBatcher(config *node_config.BatcherNodeConfig, logger *flogging.FabricLogger, net Net, cdrc ConsensusDecisionReplicatorCreator, senderCreator ConsenterControlEventSenderCreator, signer Signer) *Batcher { |
There was a problem hiding this comment.
why do we need to remove net from the batcher?
and now always remember to set it from the outside (otherwise it will panic on batcher stop)
5d84ed2 to
b40df83
Compare
There was a problem hiding this comment.
I would put this under node/server in a separate file "signal_listener.go".
common/utils/net.go
Outdated
| func StopSignalListen(stopChan chan struct{}, node NodeStopper, logger *flogging.FabricLogger, nodeAddr string) { | ||
| signalChan := make(chan os.Signal, 1) | ||
| signal.Notify(signalChan, syscall.SIGTERM) | ||
|
|
||
| go func() { | ||
| defer signal.Stop(signalChan) | ||
|
|
||
| for { | ||
| select { | ||
| case <-signalChan: | ||
| logger.Infof("SIGTERM signal caught, the node listening on %s is about to shutdown:", nodeAddr) | ||
| node.Stop() | ||
| return | ||
| case <-stopChan: | ||
| logger.Infof("Exit StopSignalListen routine") | ||
| return | ||
| } | ||
| } | ||
| }() |
There was a problem hiding this comment.
Lets add an issue to unit test it after the dust settles on the main/launch refactoring.
| go func() { | ||
| err := srv.Start() | ||
| if err != nil { | ||
| panic(err) |
There was a problem hiding this comment.
Is every time the server exits with an error deserves a panic?
There was a problem hiding this comment.
That is what happens in the router.
In the other nodes, we ignore the error
b40df83 to
07a4c95
Compare
Signed-off-by: May.Buzaglo <May.Buzaglo@ibm.com>
Signed-off-by: May.Buzaglo <May.Buzaglo@ibm.com>
Signed-off-by: May.Buzaglo <May.Buzaglo@ibm.com>
07a4c95 to
6c243b5
Compare
StopSignalListenmethod is currently running as part of the launch method.To be able to close the inner go routine in that method and call it again during dynamic reconfig, a refactor is needed.
The
launchBatchermethod is also refactored to be similar to thelaunchRoutermethod.Relevant issue: #504