A115: disable priority LB child policy retention cache #541
Merged
10 commits
5798457 A56 update: disable priority LB child policy retention cache (apolcyn)
f4ff35a specify env var (apolcyn)
5b2a3fc rename grfc (apolcyn)
0063acd add updated header (apolcyn)
27826d1 updates (apolcyn)
ed5414a add updates (apolcyn)
a063941 review comments (apolcyn)
1850df2 address comments (apolcyn)
926c692 address comments (apolcyn)
11a7dfc Update A115-remove-priority-lb-child-policy-cache.md (apolcyn)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,83 @@ | ||
| A115: disable Priority LB policy child policy retention cache | ||
| ---- | ||
| * Author(s): @apolcyn, @markdroth | ||
| * Approvers: @markdroth, @ejona86, @dfawley, @easwars | ||
| * Status: {Draft} | ||
| * Implemented in: C-core, Java, Go, Node | ||
| * Last updated: 2026-03-17 | ||
| * Discussion at: <google group thread> (filled after thread exists) | ||
|
|
||
| ## Abstract | ||
|
|
||
| [A56](A56-priority-lb-policy.md) describes | ||
| [mechanisms](A56-priority-lb-policy.md#child-lifetime-management) | ||
| whereby priority LB child policies are cached. There are two cases: | ||
|
|
||
| 1) When a higher priority child becomes reachable, we deactivate | ||
| the lower-priority children, and remove them only after an expiry. | ||
|
|
||
| 2) When a child is removed from the LB policy config. | ||
|
|
||
| This proposal removes the usage of a cache for case 2. | ||
|
|
||
| ## Background | ||
|
|
||
| The priority LB child policy retention cache can consume excessive memory under | ||
| certain circumstances (depending on the rate and pattern of locality updates). | ||
|
|
||
| This is especially the case when a locality is flapping between failover and primary | ||
| priorities: child names keep increasing across locality updates, and each time a child | ||
| name changes, the previous child policy is added to the retention cache. | ||
|
|
||
| For example, consider the following sequence of updates: | ||
| 1. P0=[AA, BB], P1=[CC, DD]: P0 will be assigned "child0", P1 will be assigned "child1". | ||
| 2. P0=[CC], P1=[DD, EE]: P0 will be assigned "child1" (reusing the child name containing "CC" from the previous update), P1 will be assigned "child2" (new child number). | ||
| 3. P0=[AA, BB], P1=[CC, DD]: P0 will be assigned "child3" (new child number), P1 will be assigned "child1" (reusing the child name containing "CC" from the previous update). | ||
|
|
||
| Additionally, priority LB child names are generated with strictly increasing numbers | ||
| (once a priority LB child name is removed from the config, it will never be configured | ||
| again). As such, caching a removed child provides no value. | ||
|
|
||
| ## Proposal | ||
|
|
||
| The priority LB policy should stop using the child policy retention cache | ||
| when a child is removed from its config (i.e., case 2 only). | ||
|
|
||
| Note this should be done for Java and Go only. | ||
|
|
||
| For C-core, we may actually be benefiting from this | ||
| behavior due to subchannel pooling, so we do not plan to drop | ||
| it there until the longer-term solution is ready. | ||
|
|
||
| ### Temporary environment variable protection | ||
|
|
||
| Implementations should provide an environment variable to revert | ||
| to the previous behavior (child policy cache enabled with 15-minute timer). | ||
|
|
||
|
|
||
| This should be kept around for a few releases, and then removed. | ||
|
|
||
| Env var name: `GRPC_EXPERIMENTAL_ENABLE_PRIORITY_LB_CHILD_POLICY_CACHE`. | ||
|
|
||
| ## Rationale | ||
|
|
||
|
|
||
| - Caching the child when it gets removed from the config does not actually | ||
| accomplish anything useful in the case of choosing a priority within an | ||
| xDS cluster (which is the primary case where this policy is used), because | ||
| the heuristic in the cds policy that assigns the child names will never | ||
| reuse a child name once it has been removed from the config. | ||
|
|
||
| - We have seen cases where retaining the children has used up a lot of memory | ||
| and file descriptors, which has caused problems for users. | ||
|
|
||
| - In the long run, we want a better solution involving a separate layer for | ||
| caching subchannels rather than LB policies, but that will be a separate | ||
| project to be undertaken later. | ||
|
|
||
| ## Implementation | ||
|
|
||
| N/A | ||
|
|
||
| ## Open issues (if applicable) | ||
|
|
||
| N/A | ||
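
The update sequence from the Background section, and the effect of the proposal on it, can be sketched as a small simulation. This is only an illustration under stated assumptions, not the code of any gRPC implementation: `ChildNamer`, `simulate`, and `cache_enabled` are hypothetical names, the naming logic is a simplified reading of the heuristic described above, all updates are assumed to arrive within the retention window, and the gRFC does not say how the environment variable value is parsed (the sketch assumes the string `true` enables the old behavior).

```python
import os

ENV_VAR = "GRPC_EXPERIMENTAL_ENABLE_PRIORITY_LB_CHILD_POLICY_CACHE"


def cache_enabled(env=None):
    """Return True if the old caching behavior is requested.

    Assumption: the value "true" (case-insensitive) enables the cache;
    the gRFC only specifies the variable name, not its accepted values.
    """
    env = os.environ if env is None else env
    return env.get(ENV_VAR, "").strip().lower() == "true"


class ChildNamer:
    """Simplified child-name heuristic (hypothetical helper).

    A priority reuses the name that one of its localities had in the
    previous update, unless that name is already taken in this update;
    otherwise it gets a new, strictly increasing child number.
    """

    def __init__(self):
        self.next_id = 0
        self.prev = {}  # locality -> child name in the previous update

    def assign(self, priorities):
        names, used, current = [], set(), {}
        for localities in priorities:
            name = next(
                (self.prev[loc] for loc in localities
                 if loc in self.prev and self.prev[loc] not in used),
                None,
            )
            if name is None:
                name = f"child{self.next_id}"
                self.next_id += 1
            used.add(name)
            names.append(name)
            for loc in localities:
                current[loc] = name
        self.prev = current
        return names


def simulate(updates, retain_removed):
    """Apply config updates and track active vs. retained children."""
    namer, active, retained, history = ChildNamer(), set(), set(), []
    for priorities in updates:
        names = namer.assign(priorities)
        removed = active - set(names)
        if retain_removed:
            # Old behavior (case 2): removed children stay in the cache
            # until the retention timer fires (timer not modeled here).
            retained |= removed
        # New behavior: removed children are simply dropped.
        active = set(names)
        history.append(names)
    return history, active, retained


# The three updates from the Background section.
UPDATES = [
    [["AA", "BB"], ["CC", "DD"]],
    [["CC"], ["DD", "EE"]],
    [["AA", "BB"], ["CC", "DD"]],
]

if __name__ == "__main__":
    history, active, retained = simulate(UPDATES, cache_enabled())
    print(history, sorted(active), sorted(retained))
```

With the cache enabled, the sketch ends with `child1` and `child3` active and `child0` and `child2` still retained, even though neither retained name can ever be assigned again. With the cache disabled, only the two active children remain.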