Canonicalize the IPv6 Addresses when they exist #1011

Closed
barbacbd wants to merge 2 commits into kubernetes:master from barbacbd:OCPBUGS-79354

@barbacbd barbacbd commented Mar 23, 2026

Canonicalize the IPv6 Addresses when they exist.

providers/gce/gce_instances.go:
IPv6 Address short notation is treated differently by cluster components. Canonicalize the addresses to ensure
a consistent comparison.
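
For illustration only, here is a minimal Go sketch of what canonicalizing an address could look like. `canonicalizeIP` is a hypothetical helper, not the code from this PR's diff, and the choice of the standard `net/netip` package is an assumption.

```go
package main

import (
	"fmt"
	"net/netip"
)

// canonicalizeIP is a hypothetical helper (not taken from this PR).
// It parses s and returns its canonical text form: for IPv6 that is the
// compressed, lowercase notation produced by netip.Addr.String.
// Input that does not parse as an IP address is returned unchanged.
func canonicalizeIP(s string) string {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return s
	}
	return addr.String()
}

func main() {
	// Fully expanded IPv6 notation is rewritten to the short form.
	fmt.Println(canonicalizeIP("2001:0db8:0000:0000:0000:0000:0000:0001")) // 2001:db8::1
	// IPv4 addresses and non-IP strings pass through as they are.
	fmt.Println(canonicalizeIP("10.0.0.1"))    // 10.0.0.1
	fmt.Println(canonicalizeIP("not-an-ip"))   // not-an-ip
}
```

With this sketch, two spellings of the same IPv6 address yield the same string, which is the property the description relies on for a consistent comparison.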

Tests are created with Claude and updated by @barbacbd.

This addresses OCPBUGS-79354.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 23, 2026
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If the repository maintainers determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 23, 2026
@k8s-ci-robot
Contributor

Hi @barbacbd. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot requested review from cici37 and jpbetz March 23, 2026 16:05
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: barbacbd
Once this PR has been reviewed and has the lgtm label, please assign mrhohn for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Mar 23, 2026
@barbacbd
Author

Originally found by @tthvo

/cc @tthvo

@k8s-ci-robot
Contributor

@barbacbd: GitHub didn't allow me to request PR reviews from the following users: tthvo.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

@hdp617
Contributor

hdp617 commented Mar 23, 2026

Thanks! Would you mind adding unit tests?

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 23, 2026
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 23, 2026
providers/gce/gce_instances.go:
IPv6 Address short notation is treated differently by cluster components. Canonicalize the addresses to ensure
a consistent comparison.
Note: The tests are generated by Claude.
@barbacbd barbacbd changed the title from "Ensure that the IPv6 short notation is excepted." to "Canonicalize the IPv6 Addresses when they exist" Mar 23, 2026
@barbacbd
Author

@hdp617 thank you, it should be updated to include tests now.

@k8s-ci-robot
Contributor

@barbacbd: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| cloud-provider-gcp-e2e-full | 6f4e89b | link | true | `/test cloud-provider-gcp-e2e-full` |
| pull-cloud-provider-gcp-scenario-kops-simple | 6f4e89b | link | false | `/test pull-cloud-provider-gcp-scenario-kops-simple` |
| cloud-provider-gcp-tests | 6f4e89b | link | true | `/test cloud-provider-gcp-tests` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

@aojea
Member

aojea commented Mar 24, 2026

This can create controller hot loops. We got bitten by this in Kubernetes multiple times, and most of the time it is a bug in the client rather than on the server side.

I do not have access to the bug but I want to understand better this problem and why this solution.

/hold
/assign @marqc

@k8s-ci-robot
Contributor

@aojea: GitHub didn't allow me to assign the following users: marqc.

Note that only kubernetes members with read permissions, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 24, 2026
@tthvo

tthvo commented Mar 24, 2026

> This can create controller hot loops. We got bitten by this in Kubernetes multiple times, and most of the time it is a bug in the client rather than on the server side.
>
> I do not have access to the bug but I want to understand better this problem and why this solution.

@aojea Indeed, we came across this "problem" when one of the OpenShift cluster components assumed the node IPv6 addresses to be in canonical format. In our case, the etcd-operator made such an assumption and ran into a loop of IP mismatches (comparing the full form against the canonical form), causing constant cert rotation --> constant pod rollout --> unhealthy etcd.

We should probably fix it there. But, looking at the cloud-provider-aws approach, which writes the canonical IPv6 address, I was wondering whether the canonical/short form is the standard way and whether cloud-provider-gcp should follow it too (hence this PR). WDYT?
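
To make the mismatch concrete, here is a minimal sketch of a consumer-side comparison that parses both addresses before comparing them, so the full and short spellings are treated as equal. `sameIP` is a hypothetical name; this is not the etcd-operator code, and using `net/netip` is an assumption.

```go
package main

import (
	"fmt"
	"net/netip"
)

// sameIP is a hypothetical helper: it reports whether a and b denote the
// same IP address regardless of how each one is spelled. If either input
// does not parse as an IP address, it falls back to plain string equality.
func sameIP(a, b string) bool {
	pa, errA := netip.ParseAddr(a)
	pb, errB := netip.ParseAddr(b)
	if errA != nil || errB != nil {
		return a == b
	}
	// netip.Addr values are comparable, so == compares the parsed addresses.
	return pa == pb
}

func main() {
	full := "2001:0db8:0000:0000:0000:0000:0000:0001"
	short := "2001:db8::1"

	fmt.Println(full == short)       // false: the strings differ
	fmt.Println(sameIP(full, short)) // true: the addresses are the same
}
```

A comparison along these lines does not depend on which notation the cloud provider happens to write to the node.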

@tthvo

tthvo commented Mar 24, 2026

> I was just wondering if canonical/short form is the standard way and cloud-provider-gcp should also follow (hence this PR).

Hmm, just as I wrote this, I saw that cloud-provider-azure does not canonicalize the IPv6 address at all. So maybe we really should fix this on our consumer side.

Though, do you know if there is/should-be a standard or enforced way for this at all?

@aojea
Member

aojea commented Mar 25, 2026

> Though, do you know if there is/should-be a standard or enforced way for this at all?

The server side should not work around client problems: if a client writes A, it should expect to read A, not f(A). Kubernetes has this canonicalize hook on the apiserver, and it is not used because the few APIs that used it ended up with hot-loop problems like the one you are describing.
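
The "write A, read f(A)" failure mode can be sketched with a toy model. Everything below is hypothetical (`fakeServer`, `reconcile`, the round limit) and is not Kubernetes apiserver or controller code. It only shows why a client that compares strings never settles once the server rewrites what it stores.

```go
package main

import (
	"fmt"
	"net/netip"
)

// fakeServer is a hypothetical stand-in for an API that stores one address.
// When canonicalize is true it stores f(A) instead of the written value A.
type fakeServer struct {
	canonicalize bool
	stored       string
}

func (s *fakeServer) Write(ip string) {
	if s.canonicalize {
		if a, err := netip.ParseAddr(ip); err == nil {
			ip = a.String()
		}
	}
	s.stored = ip
}

func (s *fakeServer) Read() string { return s.stored }

// reconcile models a client that rewrites desired whenever the stored
// string differs from it. It returns the number of writes issued within
// maxRounds; hitting maxRounds means the loop never converged.
func reconcile(s *fakeServer, desired string, maxRounds int) int {
	writes := 0
	for i := 0; i < maxRounds; i++ {
		if s.Read() == desired {
			break
		}
		s.Write(desired)
		writes++
	}
	return writes
}

func main() {
	desired := "2001:0db8:0000:0000:0000:0000:0000:0001"

	// Server returns exactly what was written: one write, then stable.
	fmt.Println(reconcile(&fakeServer{canonicalize: false}, desired, 5)) // 1

	// Server returns f(A): the client rewrites on every round.
	fmt.Println(reconcile(&fakeServer{canonicalize: true}, desired, 5)) // 5
}
```

In this model the canonicalizing server is what keeps the client writing on every round, which is the hot-loop behavior described above.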

@barbacbd barbacbd closed this Mar 25, 2026
@barbacbd
Author

Closed in favor of openshift/cluster-etcd-operator#1577. Rather than potentially upload a breaking change, we make the change on our "client" side.

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

5 participants