Skip to content

Conversation

@marek-szews
Copy link

@marek-szews marek-szews commented Oct 14, 2025

Implements [gRFC A68]https://github.com/grpc/proposal/blob/master/A68-random-subsetting.md.

Note that this PR only implements the LB policy and does not implement the xDS integration specified here: https://github.com/grpc/proposal/blob/master/A68-random-subsetting.md#xds-integration

RELEASE NOTES:

  • balancer/randomsubsetting: Implementation of the random_subsetting LB policy

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Oct 14, 2025

CLA Signed

The committers listed above are authorized under a signed CLA.

@easwars easwars self-assigned this Oct 15, 2025
@easwars easwars self-requested a review October 15, 2025 19:17
@easwars easwars added Type: Feature New features or improvements in behavior Area: xDS Includes everything xDS related, including LB policies used with xDS. Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. and removed Area: xDS Includes everything xDS related, including LB policies used with xDS. labels Oct 15, 2025
@easwars
Copy link
Contributor

easwars commented Oct 15, 2025

@marek-szews : Could you please get the tests to pass. Thanks.

@github-actions
Copy link

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions bot added stale and removed stale labels Oct 23, 2025
@dfawley
Copy link
Member

dfawley commented Oct 29, 2025

Hi @marek-szews - I think we're waiting on you to make some updates so we can continue the review. We can discuss Friday but I wanted to ping this in part to prevent the stale bot from closing it.

@arjan-bal arjan-bal added this to the 1.78 Release milestone Oct 30, 2025
Copy link
Author

@marek-szews marek-szews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rework done, tests passed

@dfawley dfawley assigned easwars and unassigned marek-szews Oct 31, 2025
@easwars easwars changed the title Implementation of A68 random_subsetting LB policy. balancer/randomsubsetting: Implementation of the LB policy Nov 13, 2025
@easwars easwars changed the title balancer/randomsubsetting: Implementation of the LB policy balancer/randomsubsetting: Implementation of the random_subsetting LB policy Nov 13, 2025
@easwars
Copy link
Contributor

easwars commented Nov 13, 2025

Rework done, tests passed

Tests are still failing. Please read this document for guidelines: https://github.com/grpc/grpc-go/blob/master/CONTRIBUTING.md

TL;dr

All tests need to be passing before your change can be merged. We recommend you run tests locally before creating your PR to catch breakages early on:

./scripts/vet.sh to catch vet errors.
go test -cpu 1,4 -timeout 7m ./... to run the tests.
go test -race -cpu 1,4 -timeout 7m ./... to run tests in race mode

@easwars easwars assigned marek-szews and unassigned easwars Nov 13, 2025
}

if b.cfg == nil || b.cfg.ChildPolicy.Name != lbCfg.ChildPolicy.Name {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove this newline.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm talking about the newline after the if and before the call to SwitchTo.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

maxDiff int
}{
{
name: "backends could be evenly distributed between small number of clients",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please see here for guidelines on how best to name subtests: https://google.github.io/styleguide/go/decisions#subtest-names

TL;dr
Think of subtest names more like a function identifier than a prose description.

The test runner replaces spaces with underscores, and escapes non-printing characters. To ensure accurate correlation between test logs and source code, it is recommended to avoid using these characters in subtest names.

@github-actions
Copy link

This PR is labeled as requiring an update from the reporter, and no update has been received after 6 days. If no update is provided in the next 7 days, this issue will be automatically closed.

@github-actions github-actions bot added the stale label Nov 27, 2025
@github-actions github-actions bot closed this Dec 4, 2025
@codecov
Copy link

codecov bot commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 80.64516% with 12 lines in your changes missing coverage. Please review.
✅ Project coverage is 83.19%. Comparing base (8110884) to head (db49490).
⚠️ Report is 51 commits behind head on master.

Files with missing lines Patch % Lines
balancer/randomsubsetting/randomsubsetting.go 80.64% 7 Missing and 5 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8650      +/-   ##
==========================================
+ Coverage   79.45%   83.19%   +3.74%     
==========================================
  Files         415      419       +4     
  Lines       41339    32429    -8910     
==========================================
- Hits        32844    26978    -5866     
+ Misses       6621     4062    -2559     
+ Partials     1874     1389     -485     
Files with missing lines Coverage Δ
balancer/randomsubsetting/randomsubsetting.go 80.64% <80.64%> (ø)

... and 373 files with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

}

if b.cfg == nil || b.cfg.ChildPolicy.Name != lbCfg.ChildPolicy.Name {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm talking about the newline after the if and before the call to SwitchTo.

func (b *subsettingBalancer) UpdateClientConnState(s balancer.ClientConnState) error {
lbCfg, ok := s.BalancerConfig.(*lbConfig)
if !ok {
b.logger.Errorf("Received config with unexpected type %T: %v", s.BalancerConfig, s.BalancerConfig)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be logged at WARNING level instead of error. Please see https://github.com/grpc/grpc-go/blob/master/Documentation/log_levels.md for guidance on what logging levels to use.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

func (b *subsettingBalancer) prepareChildResolverState(s balancer.ClientConnState) resolver.State {
subsetSize := b.cfg.SubsetSize
endpoints := s.ResolverState.Endpoints
if len(endpoints) <= int(subsetSize) || subsetSize < 2 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subsetting algorithm described in the gRFC does not specify any special considerations for subset sizes less than two. Please see: https://github.com/grpc/proposal/blob/master/A68-random-subsetting.md#subsetting-algorithm

So, please remove this condition subsetSize < 2. If the user wants a subset size of 1, they should get just that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done


// implements the subsetting algorithm,
// as described in A68: https://github.com/grpc/proposal/blob/master/A68-random-subsetting.md
func (b *subsettingBalancer) prepareChildResolverState(s balancer.ClientConnState) resolver.State {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason for this method to accept a balancer.ClientConnState and return a resolver.State. Why not accept a []resolver.Endpoint and return a []resolver.Endpoint? and let the caller deal with creating the resolver.State to be passed to the child.

Also, when we do that, we should rename this method appropriately.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's a good idea to do this, especially since the interface of this function has changed completely (in/out).
Done

}

// implements the subsetting algorithm,
// as described in A68: https://github.com/grpc/proposal/blob/master/A68-random-subsetting.md
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If adding a link, please add it to the specific sub-section in the gRFC that talks about the subsetting algorithm or just don't add one.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

*
*/

// Package e2e_test contains e2e test cases for the Subsetting LB Policy.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment is not valid anymore.

wantErr string
}{
{
name: "invalid-json",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

{
name: "invalid-json",
input: "{{invalidjson{{",
wantErr: "invalid character",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has now become a change detector test for the json package. Please change the error string returned when unmarshal fails so that you can look for that here instead. So, maybe something like:

	if err := json.Unmarshal(s, lbCfg); err != nil {
		return nil, fmt.Errorf("randomsubsetting: json.Unmarshal failed for configuration: %s with error: %v", string(s), err)
	}

Then, your wantErr here could be "json.Unmarshal failed for configuration"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Comment on lines +116 to +118
if diff := cmp.Diff(gotCfg, test.wantCfg); diff != "" {
t.Fatalf("ParseConfig(%v) got unexpected output, diff (-got +want): %v", test.input, diff)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the order of arguments passed to cmp.Diff and in the error message is inverted. See here for examples: https://google.github.io/styleguide/go/decisions#equality-comparison-and-diffs

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
}

func (s) TestSubsettingE2E(t *testing.T) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only real piece of functionality provided by this LB policy is the subsetting algorithm. If we refactor it the way I recommended in another comment, then we could simply have unit tests for that method to ensure that subsetting behavior is as expected.

And then we could have a functional test where we:

There is very little value is having an E2E style for this LB policy.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: Resolvers/Balancers Includes LB policy & NR APIs, resolver/balancer/picker wrappers, LB policy impls and utilities. Type: Feature New features or improvements in behavior

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants