feat: skip work applier status updates if possible by michaelawyu · Pull Request #375 · kubefleet-dev/kubefleet

michaelawyu · 2025-12-11T14:05:50Z

Description of your changes

As a performance improvement, set the work applier to skip status updates if possible.

I have:

Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Integration tests

Special notes for your reviewer

Signed-off-by: michaelawyu <chenyu1@microsoft.com>

michaelawyu · 2025-12-11T14:06:49Z

Note: as anticipated, after this PR the status update logic in the work applier has reached the linter's code complexity limit. To unblock progress, that specific method is now exempted from the linter; I will submit separate PRs to refactor the method.

michaelawyu · 2025-12-11T14:08:03Z

This PR precedes #118.

ryanzhang-oss

How can we make sure that a work's status in the cache is actually the same as in etcd? If we just compare with the cache and stop updating, we run the risk of the status never getting updated to the correct state (although the chance is small).

michaelawyu · 2025-12-17T21:17:26Z

Hi Ryan! The work applier already requeues periodically (with back-off), so unless the client-side cache becomes significantly out of sync with the API server (in which case all writes would be rejected by optimistic locking), any inconsistent status will be overwritten within at most 15 minutes.


ryanzhang-oss · 2025-12-18T20:49:41Z

pkg/controllers/workapplier/status.go

-		return controller.NewAPIServerError(false, err)
+
+	// Skip the status update if no change found.
+	if equality.Semantic.DeepEqual(originalStatus, &appliedWork.Status) {


do we have an idea of how many calls this generates?

Hi Ryan! Do you mean the number of status updates this setup will skip, or the time complexity of the DeepEqual calls?

For the former, in the target perf test environment each member agent was generating roughly 950K status updates per 24h (though this number might be a bit biased due to the agents being restarted occasionally).

For the latter, deep-equaling is usually expensive, especially when the object is complex. I don't have specifics on deep-equaling Work object status data now, but I could run a mini benchmark if you are interested.

Thanks, I was asking about the former. However, just to make sure that we are on the same page, this is appliedWork, so it's an update on the member cluster API. 1M calls per day isn't too bad though.

Hi Ryan! This part is about the AppliedWork, which is updated via the member cluster API server. So far we haven't seen any cases of member cluster API server being overloaded by such updates.
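For readers following along, the skip-if-unchanged pattern discussed in this thread can be sketched as below. This is not the PR's code: `Status`, `Condition`, `statusWriter`, and `updateIfChanged` are hypothetical stand-ins, and the standard library's `reflect.DeepEqual` is used in place of `equality.Semantic.DeepEqual` so the example has no external dependencies. Note that `reflect.DeepEqual` is stricter in some cases (for example, it treats a nil slice and an empty slice as different).

```go
package main

import (
	"fmt"
	"reflect"
)

// Condition and Status are hypothetical stand-ins for the AppliedWork
// status types; the real types live in the fleet API package.
type Condition struct {
	Type   string
	Status string
}

type Status struct {
	Conditions []Condition
}

// statusWriter counts writes so the effect of skipping is visible.
// In the real controller this would be a call to the member cluster API server.
type statusWriter struct {
	writes int
}

func (w *statusWriter) update(s *Status) error {
	w.writes++
	return nil
}

// updateIfChanged writes the status only when it differs from the snapshot
// taken before reconciliation. It reports whether a write was issued.
func updateIfChanged(w *statusWriter, original, current *Status) (bool, error) {
	if reflect.DeepEqual(original, current) {
		return false, nil // nothing changed; skip the API call
	}
	return true, w.update(current)
}

func main() {
	w := &statusWriter{}
	original := &Status{Conditions: []Condition{{Type: "Applied", Status: "True"}}}

	unchanged := &Status{Conditions: []Condition{{Type: "Applied", Status: "True"}}}
	wrote, _ := updateIfChanged(w, original, unchanged)
	fmt.Println("unchanged status written:", wrote)

	changed := &Status{Conditions: []Condition{{Type: "Applied", Status: "False"}}}
	wrote, _ = updateIfChanged(w, original, changed)
	fmt.Println("changed status written:", wrote)

	fmt.Println("total writes:", w.writes)
}
```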

ryanzhang-oss · 2025-12-18T20:49:54Z

pkg/controllers/workapplier/status.go

 }
+
+func shouldSkipStatusUpdate(isDriftedOrDiffed, isStatusBackReportingOn bool, originalStatus, currentStatus *fleetv1beta1.WorkStatus) bool {
+	if isDriftedOrDiffed || isStatusBackReportingOn {


This will reduce the effectiveness of this PR by a lot; is there any way to soften the blow a bit further?

Hi Ryan! We might need a flag to omit observation timestamps if needed. That said, with exponential backoff, as long as the number of drifted/diffed placements plus status back-reporting placements is low and the system does not restart often, the number of calls should be relatively limited (~96 writes per placement) once everything stabilizes.
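The gating discussed here can be sketched as follows. The diff excerpt above shows only the signature and the first condition of `shouldSkipStatusUpdate`; the `return false` branch and the final comparison below are assumptions about how the rest of the function might look, `WorkStatus` is a simplified hypothetical type, and `reflect.DeepEqual` stands in for the semantic equality helper.

```go
package main

import (
	"fmt"
	"reflect"
)

// WorkStatus is a simplified, hypothetical stand-in for fleetv1beta1.WorkStatus.
type WorkStatus struct {
	Conditions []string
}

// shouldSkipStatusUpdate mirrors the signature from the diff. Everything after
// the first condition is an assumption made for this sketch.
func shouldSkipStatusUpdate(isDriftedOrDiffed, isStatusBackReportingOn bool, originalStatus, currentStatus *WorkStatus) bool {
	if isDriftedOrDiffed || isStatusBackReportingOn {
		// Assumed: these cases always write, for example because the status
		// carries observation timestamps that are refreshed on every pass.
		return false
	}
	// Assumed: otherwise skip only when nothing changed.
	return reflect.DeepEqual(originalStatus, currentStatus)
}

func main() {
	before := &WorkStatus{Conditions: []string{"Applied", "Available"}}
	after := &WorkStatus{Conditions: []string{"Applied", "Available"}}

	fmt.Println("plain, unchanged:", shouldSkipStatusUpdate(false, false, before, after))
	fmt.Println("drifted, unchanged:", shouldSkipStatusUpdate(true, false, before, after))
	fmt.Println("back-reporting, unchanged:", shouldSkipStatusUpdate(false, true, before, after))
}
```

Under these assumptions, only placements that are neither drifted/diffed nor back-reporting benefit from the skip, which is the reduced effectiveness the review comment points at.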

ryanzhang-oss · 2025-12-22T00:11:12Z

pkg/controllers/workapplier/status_integration_test.go

+		MockGet: func(ctx context.Context, key client.ObjectKey, obj client.Object) error {
+			return realClient.Get(ctx, key, obj)
+		},
+		MockList: func(ctx context.Context, list client.ObjectList, opts ...client.ListOption) error {
+			return realClient.List(ctx, list, opts...)
+		},
+		MockCreate: func(ctx context.Context, obj client.Object, opts ...client.CreateOption) error {
+			return realClient.Create(ctx, obj, opts...)
+		},
+		MockDelete: func(ctx context.Context, obj client.Object, opts ...client.DeleteOption) error {
+			return realClient.Delete(ctx, obj, opts...)
+		},
+		MockDeleteAllOf: func(ctx context.Context, obj client.Object, opts ...client.DeleteAllOfOption) error {
+			return realClient.DeleteAllOf(ctx, obj, opts...)
+		},
+		MockUpdate: func(ctx context.Context, obj client.Object, opts ...client.UpdateOption) error {
+			return realClient.Update(ctx, obj, opts...)
+		},
+		MockPatch: func(ctx context.Context, obj client.Object, patch client.Patch, opts ...client.PatchOption) error {
+			return realClient.Patch(ctx, obj, patch, opts...)
+		},
+		MockApply: func(ctx context.Context, config runtime.ApplyConfiguration, opts ...client.ApplyOption) error {
+			return realClient.Apply(ctx, config, opts...)
+		},


are those not default?

Hi Ryan! The default is a nil function, whose invocation would trigger an error. Some methods (e.g., DeleteAllOf) are not used in our controllers at all; I added them only for completeness.
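The point about nil defaults can be illustrated with a small, self-contained Go sketch. In Go, calling a nil function value panics, so a mock built from function fields needs every method the code under test touches to be wired up. The types below (`Getter`, `realStore`, `MockStore`) and the helper `callSafely` are hypothetical and far smaller than the mock client used in the test; they only demonstrate the delegation pattern.

```go
package main

import (
	"errors"
	"fmt"
)

// Getter is a hypothetical, much smaller stand-in for a Kubernetes client interface.
type Getter interface {
	Get(key string) (string, error)
}

// realStore plays the role of the real client that the mock delegates to.
type realStore map[string]string

func (s realStore) Get(key string) (string, error) {
	v, ok := s[key]
	if !ok {
		return "", errors.New("not found: " + key)
	}
	return v, nil
}

// MockStore forwards each call to a function field. A nil field is not a
// no-op: calling it panics, so every method a test exercises must be set.
type MockStore struct {
	MockGet func(key string) (string, error)
	Calls   int // lets a test count how often Get was invoked
}

func (m *MockStore) Get(key string) (string, error) {
	m.Calls++
	return m.MockGet(key)
}

// callSafely turns the panic from a nil function field into an error.
func callSafely(g Getter, key string) (val string, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("mock method not configured: %v", r)
		}
	}()
	return g.Get(key)
}

func main() {
	backing := realStore{"work-1": "Applied"}

	wired := &MockStore{MockGet: backing.Get} // delegates to the real store
	fmt.Println(callSafely(wired, "work-1"))

	unwired := &MockStore{} // MockGet left nil
	fmt.Println(callSafely(unwired, "work-1"))
}
```

Wrapping the real client this way also gives a test a place to intercept or count specific calls while leaving the others untouched.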

codecov · 2025-12-29T07:37:11Z

Codecov Report

❌ Patch coverage is 73.91304% with 6 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
pkg/controllers/workapplier/status.go	73.91%	4 Missing and 2 partials ⚠️

michaelawyu · 2025-12-29T13:16:49Z

Merging this PR to unblock progress (approval was acquired before the merge conflicts were resolved); if there are any concerns, please let me know.

Set the work applier to skip status updates if possible

40dc51c

Signed-off-by: michaelawyu <chenyu1@microsoft.com>

ryanzhang-oss reviewed Dec 16, 2025

Merge branch 'main' into feat/deep-equal-before-status-update-work-applier

4649f16

ryanzhang-oss reviewed Dec 18, 2025

ryanzhang-oss previously approved these changes Dec 22, 2025

michaelawyu added 2 commits December 22, 2025 16:48

Merge branch 'main' into feat/deep-equal-before-status-update-work-applier

91fb9c3

Merge branch 'main' of https://github.com/kubefleet-dev/kubefleet into feat/deep-equal-before-status-update-work-applier

4415e77

michaelawyu dismissed ryanzhang-oss’s stale review via 4415e77 December 29, 2025 06:58

michaelawyu merged commit 9291964 into kubefleet-dev:main Dec 29, 2025
13 of 15 checks passed

michaelawyu deleted the feat/deep-equal-before-status-update-work-applier branch December 29, 2025 13:17

britaniar mentioned this pull request Jan 13, 2026

fix: add state and remove generation labels from update run metrics #389

Merged

1 task
