feat: webhook readiness wait for informer cache sync #367
Merged
weng271190436 merged 4 commits into kubefleet-dev:main on Dec 13, 2025
Pull request overview
This pull request adds a webhook readiness check that prevents the webhook from accepting requests until all resource informer caches are synced. This addresses an issue where webhook replicas could report ready and start serving requests before the discovery cache was fully populated, causing validation failures for valid resources.
Key Changes:
- Adds a new `ResourceInformerReadinessChecker` function that verifies all informer caches are synced before marking the webhook pod as ready
- Implements a `GetAllResources()` method in the informer manager interface to retrieve all watched resources (both cluster-scoped and namespace-scoped)
- Integrates the readiness check into the hub agent startup sequence after controllers are initialized
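To make the readiness gate concrete, here is a minimal sketch of the idea, not the code from this PR. `ResourceInformerReadinessChecker` and `GetAllResources()` are named above; everything else is an assumption for illustration: the `GVR` struct stands in for the Kubernetes `schema.GroupVersionResource` type, `IsInformerSynced` is a hypothetical method, and treating an empty resource list as "not ready" is a guess at the intended behavior.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// GVR is a simplified stand-in for schema.GroupVersionResource (assumption).
type GVR struct {
	Group, Version, Resource string
}

// InformerManager is a trimmed-down, hypothetical view of the informer manager.
// GetAllResources is named in the PR; IsInformerSynced is assumed for this sketch.
type InformerManager interface {
	GetAllResources() []GVR
	IsInformerSynced(gvr GVR) bool
}

// ResourceInformerReadinessChecker returns a check that fails until every
// watched resource's informer cache reports synced.
func ResourceInformerReadinessChecker(m InformerManager) func(*http.Request) error {
	return func(_ *http.Request) error {
		if m == nil {
			return errors.New("resource informer is not initialized")
		}
		resources := m.GetAllResources()
		if len(resources) == 0 {
			// Assumption: nothing discovered yet means discovery has not run.
			return errors.New("no resources discovered yet")
		}
		for _, gvr := range resources {
			if !m.IsInformerSynced(gvr) {
				return fmt.Errorf("informer cache not synced for %s/%s %s",
					gvr.Group, gvr.Version, gvr.Resource)
			}
		}
		return nil
	}
}

// fakeManager is a test double that tracks a synced flag per resource.
type fakeManager struct {
	synced map[GVR]bool
}

func (f *fakeManager) GetAllResources() []GVR {
	out := make([]GVR, 0, len(f.synced))
	for gvr := range f.synced {
		out = append(out, gvr)
	}
	return out
}

func (f *fakeManager) IsInformerSynced(gvr GVR) bool { return f.synced[gvr] }

func main() {
	ns := GVR{Version: "v1", Resource: "namespaces"}
	fm := &fakeManager{synced: map[GVR]bool{ns: false}}
	check := ResourceInformerReadinessChecker(fm)

	fmt.Println("before sync:", check(nil))
	fm.synced[ns] = true
	fmt.Println("after sync:", check(nil))
}
```

The returned function has the shape `func(*http.Request) error`, which matches the health-check signature used by controller-runtime's `healthz` package, so a check like this could plausibly be registered as a readiness probe handler.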
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| pkg/webhook/readiness.go | New file implementing the readiness check function that validates all informer caches are synced |
| pkg/webhook/readiness_test.go | Comprehensive unit tests for the readiness checker covering nil informer, no resources, synced/unsynced states |
| pkg/utils/informer/informermanager.go | Adds GetAllResources() method to return all present resources (cluster + namespace scoped) with proper locking |
| pkg/utils/informer/informermanager_test.go | Unit tests for GetAllResources() including edge cases and concurrency testing |
| test/utils/informer/manager.go | Implements GetAllResources() in the fake manager for testing purposes |
| cmd/hubagent/main.go | Integrates the readiness check after controller setup to ensure ResourceInformer is initialized |
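The "proper locking" noted for `GetAllResources()` can be illustrated with a read-locked snapshot over a shared map. This is a sketch under assumptions rather than the actual `informermanager.go` code: the `manager` struct, its `present` flag, and the `add` and `markDeleted` helpers are hypothetical, and `GVR` again stands in for `schema.GroupVersionResource`.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// GVR is a simplified stand-in for schema.GroupVersionResource (assumption).
type GVR struct {
	Group, Version, Resource string
}

// resourceState is a hypothetical per-resource record.
type resourceState struct {
	clusterScoped bool
	present       bool // false once the resource disappears from discovery
}

// manager is a hypothetical informer manager guarding its map with an RWMutex.
type manager struct {
	mu        sync.RWMutex
	resources map[GVR]resourceState
}

func newManager() *manager {
	return &manager{resources: make(map[GVR]resourceState)}
}

// add registers a resource as present; writers take the exclusive lock.
func (m *manager) add(gvr GVR, clusterScoped bool) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.resources[gvr] = resourceState{clusterScoped: clusterScoped, present: true}
}

// markDeleted flags a resource as no longer present.
func (m *manager) markDeleted(gvr GVR) {
	m.mu.Lock()
	defer m.mu.Unlock()
	if st, ok := m.resources[gvr]; ok {
		st.present = false
		m.resources[gvr] = st
	}
}

// GetAllResources returns a copy of all present resources, cluster-scoped and
// namespace-scoped alike, so callers never touch the shared map directly.
func (m *manager) GetAllResources() []GVR {
	m.mu.RLock()
	defer m.mu.RUnlock()
	out := make([]GVR, 0, len(m.resources))
	for gvr, st := range m.resources {
		if st.present {
			out = append(out, gvr)
		}
	}
	// Sorted only to keep this example deterministic.
	sort.Slice(out, func(i, j int) bool { return out[i].Resource < out[j].Resource })
	return out
}

func main() {
	m := newManager()
	var wg sync.WaitGroup
	for i := 0; i < 4; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			// Concurrent writers and readers exercise the lock.
			m.add(GVR{Group: "example.io", Version: "v1", Resource: fmt.Sprintf("widgets%d", i)}, i%2 == 0)
			_ = m.GetAllResources()
		}(i)
	}
	wg.Wait()
	fmt.Println(len(m.GetAllResources()))
}
```

Returning a fresh slice instead of the internal map means a readiness probe can iterate over the result without holding the lock while it queries each informer.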
Force-pushed from 5527613 to 1b9dd16.
Codecov Report: ✅ All modified and coverable lines are covered by tests.
Force-pushed from 686f3e2 to 3048a51.
Force-pushed from a6a7d9e to 44004f9.
Force-pushed from 7d474b3 to a798337.
Added 3 commits, December 11, 2025 01:37
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Force-pushed from a798337 to 27ebb6b.
ryanzhang-oss approved these changes on Dec 12, 2025
Description of your changes
While working on another PR, #366, I enabled multiple webhook replicas and frequently hit an issue in e2e tests where the webhook informer cache was not yet synced.
The error reports that Namespace is not found in the schema. This happens because some replicas are still fetching API resource data from the Kubernetes API server even though they already report ready and have started serving requests.
    [FAILED] Failed to create cluster resource placement
    Expected success, but got an error:
        <*errors.StatusError | 0xc0011c80a0>:
        admission webhook "fleet.clusterresourceplacementv1beta1.validating" denied the request: deny create/update v1beta1 CRP has invalid fields the resource is not found in schema (please retry) or it is not a cluster scoped resource: /v1, Kind=Namespace
        {
            ErrStatus: {
                TypeMeta: {Kind: "", APIVersion: ""},
                ListMeta: {
                    SelfLink: "",
                    ResourceVersion: "",
                    Continue: "",
                    RemainingItemCount: nil,
                },
                Status: "Failure",
                Message: "admission webhook \"fleet.clusterresourceplacementv1beta1.validating\" denied the request: deny create/update v1beta1 CRP has invalid fields the resource is not found in schema (please retry) or it is not a cluster scoped resource: /v1, Kind=Namespace",
                Reason: "Forbidden",
                Details: nil,
                Code: 403,
            },
        }

I have:
- Run `make reviewable` to ensure this PR is ready for review.

How has this code been tested
Special notes for your reviewer