Skip to content

xds: Move backend service label plumbing to CDS#12882

Open
becomeStar wants to merge 2 commits into
grpc:masterfrom
becomeStar:xds-backend-service-plumbing-cds
Open

xds: Move backend service label plumbing to CDS#12882
becomeStar wants to merge 2 commits into
grpc:masterfrom
becomeStar:xds-backend-service-plumbing-cds

Conversation

@becomeStar

Copy link
Copy Markdown
Contributor

Addresses #12431.

This moves backend service metric label plumbing from xds_cluster_impl to cds, matching the A75/A89 aggregate cluster behavior.

Previously, xds_cluster_impl added the grpc.lb.backend_service pick-details label and propagated NameResolver.ATTR_BACKEND_SERVICE to its child policy. That works for simple clusters, but it is the wrong layer after A75 because aggregate clusters need the leaf CDS cluster name, not the aggregate root cluster name.

This change makes CDS responsible for backend service plumbing:

  • non-aggregate CDS adds the grpc.lb.backend_service pick-details label using the CDS cluster name
  • non-aggregate CDS propagates NameResolver.ATTR_BACKEND_SERVICE to its child policy, so WRR can consume the backend service context
  • aggregate root CDS does not add the backend service attribute or pick-details label
  • leaf CDS instances under an aggregate cluster add their own leaf cluster name
  • xds_cluster_impl no longer owns backend service pick-details label or child attribute propagation

The existing InternalEquivalentAddressGroup.ATTR_BACKEND_SERVICE path is left unchanged for endpoint/subchannel metrics.

Tests cover:

  • non-aggregate CDS propagating NameResolver.ATTR_BACKEND_SERVICE
  • non-aggregate CDS adding grpc.lb.backend_service
  • aggregate root CDS not adding the root cluster name
  • aggregate leaf CDS instances propagating leaf cluster names
  • xds_cluster_impl no longer adding backend service labels or attributes
  • picker tests using a non-null PickDetailsConsumer, matching the real PickSubchannelArgs contract

A75 changed aggregate cluster handling so aggregate clusters are
 represented as priority children whose children are CDS policies.
In that structure, A89 expects backend service metric plumbing to go
 through CDS instead of xds_cluster_impl.

Set NameResolver.ATTR_BACKEND_SERVICE from CdsLoadBalancer2 for
 non-aggregate leaf clusters so WRR metrics can read the backend service.
Also add the grpc.lb.backend_service pick details label from
 the leaf CDS picker wrapper.

Do not add the aggregate root cluster as backend_service. Remove the
 old backend_service attribute and pick details label handling from
 ClusterImplLoadBalancer.

@Test
// Both priorities will get tried using real priority LB policy.
public void discoverAggregateCluster_testChildCdsLbPolicyParsing() {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also test triggering a subchannel pick using a mock PickDetailsConsumer for the aggregate cluster and verify that the leaf instance actually adds the rpc.lb.backend_service label during a pick?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review.

Done. I added a test that uses the real priority policy to instantiate the leaf CDS path under an aggregate cluster, then triggers a pick through the top-level picker with a mock PickDetailsConsumer.

The test verifies that grpc.lb.backend_service is added with the leaf cluster name and not the aggregate root cluster name.

I also renamed the aggregate-root negative test from "Or" to "And" since it checks both the backend service attribute and the pick-details label.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR moves backend service metric label plumbing from xds_cluster_impl to cds so that aggregate clusters use the correct leaf CDS cluster name (per A75/A89 behavior), and removes the now-incorrect plumbing from xds_cluster_impl.

Changes:

  • CdsLoadBalancer2 now (for non-aggregate clusters) propagates NameResolver.ATTR_BACKEND_SERVICE downstream and injects the per-RPC pick-details label grpc.lb.backend_service using the CDS cluster name; aggregate-root CDS explicitly does not inject these.
  • ClusterImplLoadBalancer no longer sets NameResolver.ATTR_BACKEND_SERVICE for its child policy and no longer adds the grpc.lb.backend_service pick-details label.
  • Tests are updated to validate the new propagation/labeling behavior and to use a non-null PickDetailsConsumer in picker invocations (matching the PickSubchannelArgs contract).

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated no comments.

Show a summary per file
File Description
xds/src/test/java/io/grpc/xds/ClusterResolverLoadBalancerTest.java Updates picker test helper to provide a non-null PickDetailsConsumer.
xds/src/test/java/io/grpc/xds/ClusterImplLoadBalancerTest.java Adjusts assertions to confirm xds_cluster_impl no longer propagates backend service attributes or pick-details labels.
xds/src/test/java/io/grpc/xds/CdsLoadBalancer2Test.java Adds/updates tests for non-aggregate vs aggregate-root behavior and ensures picker args include a non-null PickDetailsConsumer.
xds/src/main/java/io/grpc/xds/ClusterImplLoadBalancer.java Removes backend service attribute propagation and pick-details label injection from xds_cluster_impl.
xds/src/main/java/io/grpc/xds/CdsLoadBalancer2.java Adds backend service attribute propagation and pick-details label injection at the CDS layer, with aggregate-root disabling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Add coverage for the aggregate cluster path to verify that a pick through
 the aggregate LB stack adds grpc.lb.backend_service from the leaf CDS
 instance, not from the aggregate root.

Also rename the aggregate-root negative test to use "And" because it
 checks both the backend service attribute and pick-details label.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants