fix: Force Lease Expiration When Leader Exits #2379
RaghavRoy145 wants to merge 1 commit into kubernetes-client:master
Currently, when the leader exits (say, after receiving a SIGINT), the workers need to wait for its lease to expire before a new leader is elected. This patch mimics the Go client implementation's use of ctx.Done(): it captures the SIGINT, forces the lease expiration to a date in the past, and sets acquire_time to None so that a new leader election starts.
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: RaghavRoy145
Welcome @RaghavRoy145!
/assign @yliaog
Oops, I was supposed to do that after the reviews 🙃
The Kubernetes project currently lacks enough contributors to adequately respond to all PRs.
Please send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale
/lifecycle rotten
/close
@k8s-triage-robot: Closed this PR.
What type of PR is this?
/kind bug
What this PR does / why we need it:
Currently, when the leader exits (say, after receiving a SIGINT), the workers need to wait for its lease to expire before a new leader is elected. This patch mimics the behaviour of the Go client implementation, which uses ctx.Done(): https://github.com/kubernetes/client-go/blob/1309f64d6648411b4a36a2f7fa84dd8df31884b6/tools/leaderelection/leaderelection.go#L265-L291. It captures the SIGINT and forces the lease to expire by setting the expiration to a date in the past, and it also sets acquire_time to None to force a leader election.
Issue Reproduction
As mentioned in the issue "leaderelection do not stop leading properly" (#2075), to reproduce this issue you can follow leaderelection/example.py. Run it on 2-3 nodes (or tmux screens) and, once a leader is elected, hit Ctrl+C to force the leader to exit. The workers then wait for the leader's lease to expire before a new leader is elected.
Expected behavior
The leader exiting should trigger a leader election without having the workers wait for the lease to expire.
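As a rough illustration of the idea described above (not the actual patch), the sketch below shows what forcing a lease to expire on SIGINT could look like. LeaseRecord, install_sigint_handler, and the publish callback are hypothetical stand-ins invented for this example. Only the name force_expire_lease() and the acquire_time field come from the PR description, and the function body here is an assumption about what it does.

```python
import signal
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Callable, Optional


@dataclass
class LeaseRecord:
    """Hypothetical stand-in for the election record stored in the lock object."""
    holder_identity: str
    lease_duration: int  # seconds
    acquire_time: Optional[datetime]
    renew_time: datetime

    def is_expired(self, now: datetime) -> bool:
        # A lease is expired once renew_time + lease_duration has passed.
        return self.renew_time + timedelta(seconds=self.lease_duration) <= now


def force_expire_lease(record: LeaseRecord, now: Optional[datetime] = None) -> LeaseRecord:
    """Backdate the record so that other candidates see it as already expired."""
    now = now or datetime.now(timezone.utc)
    # Push renew_time far enough into the past that the lease window has closed.
    record.renew_time = now - timedelta(seconds=record.lease_duration + 1)
    # Clearing acquire_time marks the record as no longer actively held.
    record.acquire_time = None
    return record


def install_sigint_handler(record: LeaseRecord, publish: Callable[[LeaseRecord], None]):
    """Expire the lease and publish it before the process exits on Ctrl+C.

    `publish` is an assumed callback that writes the record back to the lock
    object. Returns the previously installed SIGINT handler.
    """
    def handler(signum, frame):
        publish(force_expire_lease(record))
        raise KeyboardInterrupt

    return signal.signal(signal.SIGINT, handler)
```

With a record like this, a waiting candidate that checks is_expired() on its next retry would see the lease as free immediately, instead of waiting out the full lease duration.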
Which issue(s) this PR fixes:
Fixes #2075
Special notes for your reviewer:
This is still not a complete fix. It is definitely hacky at the moment and I would love any guidance here! Currently, the patch only handles SIGINT, but a leader may exit for various reasons, and there should be a more elegant way of handling this, probably using the thread context, but I was not able to figure that out. Further, the implementation of the force_expire_lease() function is not elegant; you shouldn't need to set acquire_time to None, and setting expiration to the past is also a code smell in my opinion. For these reasons, this patch is a proof of concept.
I also had to change the imports to point to my definitions of electionconfig.py and leaderelectionrecord.py for this to work, and I'm sure there is a better way of handling this.
If it's more sensible to mark this PR a draft, I'm happy to do so!
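One possible direction for the "more elegant way" mentioned above, offered only as a sketch: tie the release of the lock to leaving a scope rather than to one specific signal, so that any exit path that unwinds the stack (a normal return, an exception, or the KeyboardInterrupt that Python raises on SIGINT) gives up leadership. InMemoryLock and leadership() are hypothetical names for this example and are not part of the Python client. A real lock would read and write a Kubernetes object instead of an attribute.

```python
from contextlib import contextmanager
from typing import Iterator, Optional


class InMemoryLock:
    """Hypothetical lock used only to make the sketch runnable."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def try_acquire(self, identity: str) -> bool:
        # Succeeds when the lock is free or already held by this identity.
        if self.holder is None or self.holder == identity:
            self.holder = identity
            return True
        return False

    def release(self, identity: str) -> None:
        # Only the current holder may clear the lock.
        if self.holder == identity:
            self.holder = None


@contextmanager
def leadership(lock: InMemoryLock, identity: str) -> Iterator[None]:
    """Hold the lock for the duration of the `with` block, then release it."""
    if not lock.try_acquire(identity):
        raise RuntimeError(f"{identity} could not acquire the lock")
    try:
        yield
    finally:
        # Runs on normal exit, on exceptions, and on KeyboardInterrupt.
        lock.release(identity)
```

Usage would look like `with leadership(lock, "leader-a"): run_leader_work()`, where run_leader_work is whatever the leader does. Note that SIGTERM does not raise an exception in Python by default, so covering it would still need a signal handler that converts it into one.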
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.: