Skip to content

Service updates don't always reconcile when service routing is used.#75

Merged
bpdohall merged 3 commits intoteleportfrom
brendan/dont-reconcile-services-clusterip
Mar 21, 2025
Merged

Service updates don't always reconcile when service routing is used.#75
bpdohall merged 3 commits intoteleportfrom
brendan/dont-reconcile-services-clusterip

Conversation

@bpdohall
Copy link
Collaborator

This PR adds on to the behaviour change in #72.

Modifies the predicate filters on the the service watch so update events for services without endpoint routing will only trigger a reconcile if the cluster IP changes.

Adjusts the resource tree builder so endpoints are not collected when a service does not use endpoint routing.

@bpdohall bpdohall self-assigned this Mar 18, 2025
@bpdohall bpdohall requested review from dboslee and sclevine March 18, 2025 18:12
@bpdohall
Copy link
Collaborator Author

I did some test runs with extra logging in dev-blue while I was figuring out how all this should work. #74 (comment)

return false
}

if r.validateServiceOwnedByGateway(newSvc) {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this redundant since the other service predicate already check this? If so then there is no need to make any changes to validateServiceForReconcile.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bit unclear to me.

Doespredicate.And always run both validateServiceForReconcile and validateServiceUpdateForReconcile? If not, we'd need to make sure this function returns true when the service is owned by gateway controller so the other predicate has to run.

I made a minor effort to distinguish the behaviour with my debug builds, but I didn't make a clear determination.

Copy link
Collaborator Author

@bpdohall bpdohall Mar 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've reverted all changes to validateServiceForReconcile and modified validateServiceUpdateForReconcile to check if the service is owned by a gateway and return true in that case. 8774127

From what I see, predicate.And will only iterate predicate functions until one is false: https://github.com/kubernetes-sigs/controller-runtime/blob/5dfe3216fb7fd7f5afb61d6d0f8956c7bec8df62/pkg/predicate/predicate.go#L309-L316

Iteration of a predicate slice is indeterminate. This means we can only return false in the specific scenario we want to exclude from reconciliation when we're modifying the existing behaviour using predicate.And.

@bpdohall bpdohall force-pushed the brendan/dont-reconcile-services-clusterip branch 2 times, most recently from d0e0fbc to e580781 Compare March 19, 2025 15:56
add tests
invert service conditions in validateServiceUpdateForReconcile
@bpdohall bpdohall force-pushed the brendan/dont-reconcile-services-clusterip branch from e580781 to 8774127 Compare March 19, 2025 15:58
@bpdohall bpdohall requested a review from dboslee March 19, 2025 16:09
@bpdohall
Copy link
Collaborator Author

bpdohall commented Mar 19, 2025

Ran a debug build (v1.2.4-teleport.2-3-g684f3115-dirty) in my dev env. When the controller starts we see validateServiceForReconcile returning "true":

2025-03-19T20:31:58.025Z        INFO    config-loader   loader/configloader.go:105      running hook
2025-03-19T20:31:58.025Z        INFO    config-loader   loader/configloader.go:47       watching for changes to the EnvoyGateway configuration  {"path": "/config/envoy-gateway.yaml"}
2025-03-19T20:31:58.025Z        INFO    admin   admin/server.go:34      starting admin server   {"address": "127.0.0.1:19000", "enablePprof": false}
2025-03-19T20:31:58.025Z        INFO    metrics metrics/register.go:171 initialized metrics pull endpoint       {"address": "0.0.0.0:19001", "endpoint": "/metrics"}
2025-03-19T20:31:58.025Z        INFO    cmd/server.go:59        Setup runners
2025-03-19T20:31:58.025Z        INFO    metrics metrics/register.go:60  starting metrics server {"address": "0.0.0.0:19001"}
...snip...
2025-03-19T20:31:58.257Z        INFO    provider        controller/controller.go:175    Starting EventSource    {"runner": "provider", "controller": "gatewayapi-1742416318", "source": "kind source: *v1alpha1.HTTPRouteFilter"}
2025-03-19T20:31:58.257Z        INFO    provider        controller/controller.go:183    Starting Controller     {"runner": "provider", "controller": "gatewayapi-1742416318"}
2025-03-19T20:31:58.357Z        INFO    provider        kubernetes/predicates.go:41     gatewayclass has matching controller name, processing   {"runner": "provider", "name": "tenant-gateway"}
2025-03-19T20:31:58.377Z        INFO    provider        kubernetes/controller.go:1373   --- predicate --- validateServiceForReconcile   {"runner": "provider", "service": "brendan-us-west-2-teleport-proxy-client", "result": true}
2025-03-19T20:31:58.377Z        INFO    provider        kubernetes/controller.go:1373   --- predicate --- validateServiceForReconcile   {"runner": "provider", "service": "brendan-us-west-2-teleport-proxy", "result": true}

When I edit the service to change the pod selector we see the new predicate blocking reconciliation of the service update event:

2025-03-19T20:32:22.352Z        INFO    provider.tenant-gateway.gateway-system  kubernetes/status_updater.go:109        status unchanged, bypassing update      {"runner": "provider"}
2025-03-19T20:32:22.352Z        INFO    xds-server      runner/runner.go:151    received an update      {"runner": "xds-server"}
2025-03-19T20:32:22.358Z        INFO    xds-server      v3/simple.go:569        open delta watch ID:4 for type.googleapis.com/envoy.config.listener.v3.Listener Resources:map[] from nodeID: "envoy-gateway-system-tenant-gateway-c791e292-8595f5b66-rbscg",  version "2"
2025-03-19T20:33:26.593Z        INFO    provider        kubernetes/controller.go:1364   *** predicate *** validateServiceUpdateForReconcile     {"runner": "provider", "service": "brendan-us-west-2-teleport-proxy-client", "result": false}
2025-03-19T20:33:26.771Z        INFO    provider        kubernetes/controller.go:1364   *** predicate *** validateServiceUpdateForReconcile     {"runner": "provider", "service": "brendan-us-west-2-teleport-proxy-client", "result": false}

@bpdohall bpdohall merged commit 34f2938 into teleport Mar 21, 2025
6 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants