Skip to content

feat: wire namespace affinity scheduler plugin#461

Merged
weng271190436 merged 26 commits intokubefleet-dev:mainfrom
weng271190436:weiweng/namespace-affinity-plugin
Mar 25, 2026
Merged

feat: wire namespace affinity scheduler plugin#461
weng271190436 merged 26 commits intokubefleet-dev:mainfrom
weng271190436:weiweng/namespace-affinity-plugin

Conversation

@weng271190436
Copy link
Copy Markdown
Member

@weng271190436 weng271190436 commented Feb 23, 2026

Description of your changes

Namespace affinity scheduler plugin that schedules RP to clusters with namespace based on MemberCluster namespaces status.

The user scenario:

  1. Platform team only place namespace to a few clusters.
  2. Application team use RP to place their services to all or some of the clusters where their namespace is available.

Fixes #501

I have:

  • Run make reviewable to ensure this PR is ready for review.

How has this code been tested

Special notes for your reviewer

@codecov
Copy link
Copy Markdown

codecov bot commented Feb 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@weng271190436 weng271190436 force-pushed the weiweng/namespace-affinity-plugin branch 2 times, most recently from eebd744 to 8fa620f Compare February 24, 2026 20:16
Wei Weng added 3 commits February 26, 2026 15:50
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
@weng271190436 weng271190436 force-pushed the weiweng/namespace-affinity-plugin branch from 32318e6 to 08353f0 Compare February 26, 2026 17:06
Wei Weng added 3 commits February 26, 2026 17:19
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
return nil
}
Eventually(rpStatusUpdatedActual, eventuallyDuration, eventuallyInterval).Should(Succeed(), "Failed to update RP %s status as expected", rpName)
rpStatusUpdatedActual := rpStatusUpdatedActual(appConfigMapIdentifiers(), []string{memberCluster2EastCanaryName, memberCluster3WestProdName}, nil, "0")
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we expect memberCluster1EastProdName to fail. Now with namespace affinity plugin, memberCluster1EastProdName won't be selected so the test expectation changes

}
})

It("should wait for namespace collection to sync on all member clusters", func() {
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Namespace collection can take a while to finish so placing this in all RP tests where a CRP was created to place a namespace to member clusters where RP resources go

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the namepace is not reported in the MC yet? The scheduler should be triggered again when the namespace is reported. Does the eventually block in the following IT not enough?

Copy link
Copy Markdown
Member Author

@weng271190436 weng271190436 Mar 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the namespace is finally reported in MC, the member cluster watcher can detect it and triggers placement reconciliation.

The eventually block can work if we wait long enough. The current 10 second wait time is not enough. And waiting for namespace collection makes it more deterministic

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After pulling most waitForNamespaceCollectionOnClusters into its own It block based on
michaelawyu's comment, many of the waits are after createRP because many createRP happens inside BeforeAll. So waitForNamespaceCollectionOnClusters doesn't have to happen before createRP.

There are exceptions. For example in "Test cluster scale out and shrink using pickN policy with namespaced staged update run," we expect PickN where N = 1 to pick kind-cluster-3. This can also be deterministic if the namespaces are ready before createRP.

@weng271190436 weng271190436 marked this pull request as ready for review February 26, 2026 20:19
@weng271190436 weng271190436 changed the title feat: namespace affinity scheduler plugin feat: wire namespace affinity scheduler plugin Feb 26, 2026
Wei Weng and others added 2 commits February 27, 2026 14:47
}

// TestNewProfileWithOptions tests the creation of a scheduling profile with custom options.
func TestNewProfileWithOptions(t *testing.T) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this test actually test?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am testing calling NewProfile with option

// default plugin list
clusterAffinityPlugin := clusteraffinity.New()
if opts.ClusterAffinityPlugin != nil {
clusterAffinityPlugin = *opts.ClusterAffinityPlugin
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why is ClusterAffinityPlugin customizable while others are static?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We want to avoid merge conflict when backporting so allowing customization here while using the customization capability in Azure/fleet

Azure/fleet uses a customized cluster affinity plugin to support cluster affinity based on VM SKU availability

https://github.com/Azure/fleet/pull/1221/changes

}
})

It("should wait for namespace collection to sync on all member clusters", func() {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if the namepace is not reported in the MC yet? The scheduler should be triggered again when the namespace is reported. Does the eventually block in the following IT not enough?

Copy link
Copy Markdown
Member

@michaelawyu michaelawyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some comments, PTAL.

Eventually(crpStatusUpdatedActual, eventuallyDuration, eventuallyInterval).Should(Succeed(), "Failed to update CRP status as expected")

By("waiting for namespace to be collected on all member clusters")
waitForNamespaceCollectionOnClusters(nsName, allMemberClusterNames)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi weng271190436! Just a nit: I understand that this might be a must considering that it takes ~15 seconds for a namespace to be reported and our eventually block runs only for 10 seconds; but since it is a necessary step, would it be better if we make it a separate It block when applicable?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled them into It blocks in most cases. There are 3 cases where waiting in It block doesn't work and I commented below

--set workApplierRequeueRateLimiterMaxFastBackoffDelaySeconds=5 \
--set propertyProvider=$PROPERTY_PROVIDER \
--set region=${REGIONS[$i]} \
--set enableNamespaceCollectionInPropertyProvider=true \
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi weng271190436! Since this is enabled per member agent and as a part of the property provider, we might need a check in the setup as well to make sure that it is running properly.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The member clusters are not joined to the hub during setup.sh. If they are joined I can verify that the namespaces are populated on the hub membercluster CRs. But since they are not joined, I can't think of a good way to make sure that it is running properly here. When they are joined, I check NamespaceCollectionSucceededCondType in utils_test.go to make sure that it is working.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi Wei! I mean that there was a separate part of the check at the environment setup step, checkIfAzurePropertyProviderIsWorking(), for verifying if property provider is working properly before the tests start up.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Come to think of that, for the test steps that require the property provider, we also need to mark them specifically so that they can be skipped if the environment has not provider available (isAzurePropertyProviderEnabled).


// default plugin list
clusterAffinityPlugin := clusteraffinity.New()
if opts.ClusterAffinityPlugin != nil {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, a note on this part: originally we added the profile support so that developers (or users) can switch scheduler profiles as needed (e.g., for some placements, use profile A, and the others profile B). Each profile can have its own plugin chain (e.g., B has cluster affinity support, A does not); it seems that the options part was added to achieve the same purpose? Of course we haven't exposed the profile switch API yet.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the code and the options is not in use at all. Was it added back then to faciliate the development of the RP API?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I get it. We want to avoid merge conflict when backporting so allowing customization here while using the customization capability in Azure/fleet

https://github.com/Azure/fleet/pull/1221/changes

weng271190436 and others added 9 commits March 5, 2026 10:34
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
@weng271190436 weng271190436 force-pushed the weiweng/namespace-affinity-plugin branch from c848cb8 to b1c572d Compare March 5, 2026 23:41
Wei Weng added 2 commits March 6, 2026 14:37
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Signed-off-by: Wei Weng <Wei.Weng@microsoft.com>
Expect(memberCluster.KubeClient.Create(ctx, &ns)).To(Succeed())
}

By("waiting for namespace to be collected on all member clusters")
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't convert this to It block (based on a comment) because the RP created right below depends on namespaces created above. So need to wait inside BeforeAll

// Ensure the namespace collection is synced on all member clusters before creating the RP.
// We expect kind-cluster-3 to be picked by PickN with N=1 policy due to the alphabetical order of cluster names.
// If kind-cluster-3 does not have namespace synced during schedule time, the RP might pick kind-cluster-1 or kind-cluster-2.
By("should wait for namespace collection to sync on all member clusters")
Copy link
Copy Markdown
Member Author

@weng271190436 weng271190436 Mar 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here also needs to wait inside BeforeAll. Commented on why

crpStatusUpdatedActual := crpStatusUpdatedActual(workNamespaceIdentifiers(), allMemberClusterNames, nil, "0")
Eventually(crpStatusUpdatedActual, eventuallyDuration, eventuallyInterval).Should(Succeed(), "Failed to update CRP status as expected")

By("waiting for namespace collection to sync on all member clusters")
Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In custom test, we expect "should not update RP status immediately". If I wait for namespace collection after the RP creation, the RP is likely to be updated when we expect it to be not updated yet.

Copy link
Copy Markdown
Member

@michaelawyu michaelawyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ;)

@weng271190436 weng271190436 merged commit 0c13d1a into kubefleet-dev:main Mar 25, 2026
22 of 23 checks passed
@weng271190436 weng271190436 deleted the weiweng/namespace-affinity-plugin branch March 25, 2026 23:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Feature] namespace affinity scheduler plugin that schedules resource placement to member clusters where namespace exists

3 participants