vault-agent-injector ClusterRole missing 'get nodes' permission for leader election #1174
Description
The vault-agent-injector-clusterrole ClusterRole is missing get permission on nodes, which causes continuous error logs from the leader election mechanism in the non-leader injector pod(s).
Current behavior
When running the injector with replicas > 1, the non-leader pod logs the following error approximately every 16 seconds:
{"@level":"error","@message":"Failed to get Node","@module":"handler.operator-lib.leader","Node.Name":"<node-name>","error":"nodes \"<node-name>\" is forbidden: User \"system:serviceaccount:vault:vault-agent-injector\" cannot get resource \"nodes\" in API group \"\" at the cluster scope"}
This happens because operator-framework/operator-lib's leader election calls isNotReadyNode() during the election loop to check whether the current leader's node has gone NotReady. This function calls getNode(), which requires get on nodes at cluster scope, a permission not included in the chart's ClusterRole.
Impact
The error is handled gracefully: isNotReadyNode() returns false on failure, so the non-leader pod simply keeps waiting as usual. However:
- Noisy logs: The error is logged every ~16 seconds per non-leader pod, which adds up to significant log noise in environments with log aggregation/alerting.
- Degraded failover: If the leader pod's node enters a NotReady state, the non-leader cannot detect this and proactively take over leadership. Instead, it must wait for Kubernetes garbage collection to delete the leader ConfigMap, which can take significantly longer (see also Leader Elect Duration Configurations #743).
Current ClusterRole
rules:
  - apiGroups: ["admissionregistration.k8s.io"]
    resources: ["mutatingwebhookconfigurations"]
    verbs: ["get", "list", "watch", "patch"]
Proposed fix
Add get permission on nodes to the vault-agent-injector-clusterrole:
rules:
  - apiGroups: ["admissionregistration.k8s.io"]
    resources: ["mutatingwebhookconfigurations"]
    verbs: ["get", "list", "watch", "patch"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
This is a minimal, read-only permission that enables the operator-lib leader elector to function as designed.
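Until the chart itself is changed, the same permission could probably be granted out-of-band with a supplementary ClusterRole and ClusterRoleBinding. The sketch below is only an illustration: the name vault-agent-injector-node-reader is hypothetical, and the service account name and namespace are taken from the error message above, so they may differ in other installations.

```yaml
# Hypothetical supplementary RBAC, applied alongside the chart.
# The resource name "vault-agent-injector-node-reader" is illustrative only.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: vault-agent-injector-node-reader
rules:
  # Read-only access to Node objects (core API group).
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: vault-agent-injector-node-reader
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: vault-agent-injector-node-reader
subjects:
  # Service account and namespace as they appear in the logged error;
  # adjust if the release uses different names.
  - kind: ServiceAccount
    name: vault-agent-injector
    namespace: vault
```

After applying it, kubectl auth can-i get nodes --as=system:serviceaccount:vault:vault-agent-injector should answer yes, and the "Failed to get Node" errors should stop appearing in the non-leader pod.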
Environment
- Chart version: 0.22.1
- Kubernetes: 1.34 (EKS)
- Injector replicas: 2
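For reproduction, the setup above corresponds roughly to Helm values like the following. This is a sketch that assumes the chart's injector.replicas and injector.leaderElector.enabled settings and leaves everything else at its defaults; the actual values file used in this environment is not shown here.

```yaml
# Assumed minimal values for reproducing the issue; all other settings left at chart defaults.
injector:
  # More than one replica, so that at least one pod is a non-leader.
  replicas: 2
  leaderElector:
    # Leader election must be active for the node lookup to happen.
    enabled: true
```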