Description
Describe the bug
When installing the latest version of the Helm chart (v2.18.0), we could not apply changes to the generated webhook configuration YAML files included in the chart, because the resulting objects were too large to store in etcd.
Error message: etcdserver: request is too large.
On investigation, we found that this happened in all of our persistent clusters where cert-manager had performed a certificate rotation at some point. After a rotation, the clientConfig.caBundle field of a webhook configuration contains the two latest versions of the CA, which increases the total size of the object.
In clusters where no certificate rotation had been performed yet, the upgrade went fine because the objects were smaller.
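To check whether a given cluster is affected, the share of the object taken up by caBundle data can be estimated from an exported copy. The following is a minimal sketch, assuming the webhook configuration has been exported as JSON (for example with `kubectl get ... -o json`) and loaded into a dict. The function name `ca_bundle_stats` and the returned keys are hypothetical, and the compact JSON size is only an approximation of what the API server stores.

```python
import base64
import json

PEM_HEADER = b"-----BEGIN CERTIFICATE-----"


def ca_bundle_stats(webhook_config: dict) -> dict:
    """Summarize how much of a webhook configuration is caBundle data.

    Hypothetical helper: sizes are measured on compact JSON and are
    only an estimate of the stored object size.
    """
    webhooks = webhook_config.get("webhooks", [])
    bundle_bytes = 0
    max_certs = 0
    for hook in webhooks:
        # caBundle is base64-encoded PEM and is repeated on every webhook entry.
        encoded = hook.get("clientConfig", {}).get("caBundle", "")
        bundle_bytes += len(encoded)
        pem = base64.b64decode(encoded) if encoded else b""
        # After a rotation the bundle may hold more than one certificate.
        max_certs = max(max_certs, pem.count(PEM_HEADER))
    total_bytes = len(
        json.dumps(webhook_config, separators=(",", ":")).encode("utf-8")
    )
    return {
        "webhooks": len(webhooks),
        "max_certs_per_bundle": max_certs,
        "ca_bundle_bytes": bundle_bytes,
        "total_bytes": total_bytes,
    }
```

A `max_certs_per_bundle` of 2 would match the post-rotation state described above, and `ca_bundle_bytes` shows how much the repeated bundle contributes across all webhook entries.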
Files:
- admissionregistration.k8s.io_v1_validatingwebhookconfiguration_azureserviceoperator-validating-webhook-configuration.yaml
- admissionregistration.k8s.io_v1_mutatingwebhookconfiguration_azureserviceoperator-mutating-webhook-configuration.yaml
We already deploy changes with Argo CD using Kubernetes server-side apply, to try to bring the size of the object down.
Azure Service Operator Version: mcr.microsoft.com/k8s/azureserviceoperator:v2.18.0
Kubernetes Version: 1.33.6
Temporary Remediation:
Deleting the webhook configurations from the cluster and applying them again (letting Argo CD sync them) works, because then only the latest ca.crt is injected as part of the caBundle.
Expected behavior
I would expect the webhook configurations to respect the crdPattern field of the Helm chart and only create webhooks for the CRDs actually installed and used in the cluster. With the current setup, where webhook configurations are created for every resource regardless, I would still expect this to work seamlessly, perhaps by splitting the objects into several smaller ones.
To Reproduce
I have not tried to reproduce this in a fresh cluster, but this is roughly the chain of events that we have identified as likely.
- Install Azure Service Operator v2.17.0 using the helm chart.
- Wait for the caBundle to be injected into the webhook configurations by cert-manager.
- Perform a certificate rotation so that cert-manager injects the two latest versions of ca.crt into the caBundle.
- Attempt to upgrade to v2.18.0.
Additional context
We have a way of getting around the issue for now, and we could probably solve it ourselves using a Kustomize layer or similar, but an official fix or a recommended workaround would be appreciated.