NE-2421: Support dual-stack IngressController on AWS#1376
NE-2421: Support dual-stack IngressController on AWS#1376alebedev87 wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Important Review skippedAuto reviews are limited based on label configuration. 🚫 Review skipped — only excluded labels are configured. (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds dual‑stack handling across the ingress and DNS codepaths for AWS: LoadBalancer service creation and reconciliation now set and propagate 🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/operator/controller/ingress/load_balancer_service_test.go (1)
1367-1381: Add a focused test for CLB dual-stack progressing semantics.You added good update-detection coverage, but there’s still no assertion around the new CLB dual-stack path in
loadBalancerServiceIsProgressing. A dedicated test would lock in whether this is advisory-only or should affect Progressing status.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/controller/ingress/load_balancer_service_test.go` around lines 1367 - 1381, Add a focused unit test in load_balancer_service_test.go that asserts the CLB dual-stack code path in loadBalancerServiceIsProgressing: create a Service fixture, mutate it to represent the CLB dual-stack scenario (set .Spec.IPFamilies to both IPv4 and IPv6 and/or .Spec.IPFamilyPolicy to RequireDualStack as in the existing table entries), call loadBalancerServiceIsProgressing(originalSvc, mutatedSvc) and assert the expected boolean (true if this should mark Progressing or false if it's advisory-only). Reference the existing test table and the function loadBalancerServiceIsProgressing to add one explicit case that checks the CLB dual-stack semantics. Ensure the test name and description clearly state it targets "CLB dual-stack progressing semantics."
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@pkg/operator/controller/ingress/load_balancer_service_test.go`:
- Around line 1367-1381: Add a focused unit test in
load_balancer_service_test.go that asserts the CLB dual-stack code path in
loadBalancerServiceIsProgressing: create a Service fixture, mutate it to
represent the CLB dual-stack scenario (set .Spec.IPFamilies to both IPv4 and
IPv6 and/or .Spec.IPFamilyPolicy to RequireDualStack as in the existing table
entries), call loadBalancerServiceIsProgressing(originalSvc, mutatedSvc) and
assert the expected boolean (true if this should mark Progressing or false if
it's advisory-only). Reference the existing test table and the function
loadBalancerServiceIsProgressing to add one explicit case that checks the CLB
dual-stack semantics. Ensure the test name and description clearly state it
targets "CLB dual-stack progressing semantics."
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 7d5b6069-ac69-45ba-bcc7-e17d7c92622b
📒 Files selected for processing (2)
pkg/operator/controller/ingress/load_balancer_service.gopkg/operator/controller/ingress/load_balancer_service_test.go
ff1332c to
0e7d3a5
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
pkg/operator/controller/ingress/load_balancer_service.go (1)
900-902:⚠️ Potential issue | 🟠 MajorDon't treat CLB dual-stack fallback as a progressing error.
At Line 900, this warning is appended to
errs, so it behaves as a persistent progress blocker instead of advisory guidance. That can leave status permanently progressing for an intentionally supported fallback (IPv4-only CLB).Suggested fix
- if wantLBType == operatorv1.AWSClassicLoadBalancer && platform.AWS != nil && isAWSDualStack(platform.AWS.IPFamily) { - errs = append(errs, fmt.Errorf("Classic Load Balancers do not support dual-stack. The IngressController %q will use IPv4-only despite that the cluster is configured as %q. Use an NLB type to support dual-stack networking.", ic.Name, platform.AWS.IPFamily)) - } + if wantLBType == operatorv1.AWSClassicLoadBalancer && platform.AWS != nil && isAWSDualStack(platform.AWS.IPFamily) { + log.Info("Classic Load Balancer does not support dual-stack; falling back to IPv4-only. Use NLB for dual-stack.", + "ingresscontroller", ic.Name, + "ipFamily", platform.AWS.IPFamily) + }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/controller/ingress/load_balancer_service.go` around lines 900 - 902, The check that currently appends a dual-stack fallback message to errs (the fmt.Errorf appended when wantLBType == operatorv1.AWSClassicLoadBalancer && platform.AWS != nil && isAWSDualStack(platform.AWS.IPFamily)) should not produce a progressing/blocking error; change it to emit a non-blocking advisory instead (e.g., log via the controller logger, record an Event, or append to a dedicated warnings/advisories slice or status condition) so the CLB IPv4-only fallback is informational only and does not leave the IngressController permanently progressing. Ensure you update the code that previously consumed errs (if any) to surface these advisories appropriately instead of treating them as errors.
🧹 Nitpick comments (2)
pkg/operator/controller/ingress/load_balancer_service_test.go (1)
1017-1052: Consider deduplicatingnlbStrategywith the existingnlbhelper in the same file.The new
nlbStrategyhelper function (lines 1017-1031) appears to duplicate the existingnlbhelper defined withinTest_desiredLoadBalancerService(lines 70-79). Consider extracting thenlbhelper from the existing test to the package level and reusing it, or refactoring both into a single shared helper to reduce duplication.Example refactor
The existing
nlbfunction at line 70-79 inTest_desiredLoadBalancerServicecould be moved to package level and renamed tonlbStrategy, then reused in both test functions:// At package level (outside any test function) func nlbStrategy(scope operatorv1.LoadBalancerScope) *operatorv1.EndpointPublishingStrategy { return &operatorv1.EndpointPublishingStrategy{ Type: operatorv1.LoadBalancerServiceStrategyType, LoadBalancer: &operatorv1.LoadBalancerStrategy{ Scope: scope, ProviderParameters: &operatorv1.ProviderLoadBalancerParameters{ Type: operatorv1.AWSLoadBalancerProvider, AWS: &operatorv1.AWSLoadBalancerParameters{ Type: operatorv1.AWSNetworkLoadBalancer, }, }, }, } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/controller/ingress/load_balancer_service_test.go` around lines 1017 - 1052, The nlbStrategy function duplicates the existing nlb helper inside Test_desiredLoadBalancerService; remove the duplication by moving the nlb helper to package scope (or rename the package-level nlbStrategy to the same helper) and have both tests call that single helper (reference helpers: nlb, nlbStrategy and the test Test_desiredLoadBalancerService) so there is one shared function returning the AWSNetworkLoadBalancer EndpointPublishingStrategy used by both tests.pkg/operator/controller/ingress/load_balancer_service.go (1)
733-752: Consider reconciling bothIPFamiliesandIPFamilyPolicytogether to avoid partial state.The current logic reconciles
IPFamiliesandIPFamilyPolicyindependently. If only one field is set on the expected service but not the other, this could result in a partial configuration. For consistency, consider ensuring both fields are set together when either is present.Additionally, the pattern of repeatedly checking
!changedand callingcurrent.DeepCopy()is repeated here. While this matches the existing codebase style, extracting this into a helper could reduce duplication in the future.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/controller/ingress/load_balancer_service.go` around lines 733 - 752, The code reconciles expected.Spec.IPFamilies and expected.Spec.IPFamilyPolicy independently which can leave the Service in a partial state; change the logic in the reconciliation block that currently references expected.Spec.IPFamilies, expected.Spec.IPFamilyPolicy, current.Spec.IPFamilies, current.Spec.IPFamilyPolicy, changed, and updated to reconcile them together: if either expected.Spec.IPFamilies is non-empty or expected.Spec.IPFamilyPolicy is non-nil, compare both fields to current and if any differ, set changed = true once, call updated = current.DeepCopy() once, and set both updated.Spec.IPFamilies = expected.Spec.IPFamilies and updated.Spec.IPFamilyPolicy = expected.Spec.IPFamilyPolicy (handling nil/empty appropriately) so the Service is updated atomically; keep the rest of the surrounding update flow unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@pkg/operator/controller/ingress/load_balancer_service.go`:
- Around line 900-902: The check that currently appends a dual-stack fallback
message to errs (the fmt.Errorf appended when wantLBType ==
operatorv1.AWSClassicLoadBalancer && platform.AWS != nil &&
isAWSDualStack(platform.AWS.IPFamily)) should not produce a progressing/blocking
error; change it to emit a non-blocking advisory instead (e.g., log via the
controller logger, record an Event, or append to a dedicated warnings/advisories
slice or status condition) so the CLB IPv4-only fallback is informational only
and does not leave the IngressController permanently progressing. Ensure you
update the code that previously consumed errs (if any) to surface these
advisories appropriately instead of treating them as errors.
---
Nitpick comments:
In `@pkg/operator/controller/ingress/load_balancer_service_test.go`:
- Around line 1017-1052: The nlbStrategy function duplicates the existing nlb
helper inside Test_desiredLoadBalancerService; remove the duplication by moving
the nlb helper to package scope (or rename the package-level nlbStrategy to the
same helper) and have both tests call that single helper (reference helpers:
nlb, nlbStrategy and the test Test_desiredLoadBalancerService) so there is one
shared function returning the AWSNetworkLoadBalancer EndpointPublishingStrategy
used by both tests.
In `@pkg/operator/controller/ingress/load_balancer_service.go`:
- Around line 733-752: The code reconciles expected.Spec.IPFamilies and
expected.Spec.IPFamilyPolicy independently which can leave the Service in a
partial state; change the logic in the reconciliation block that currently
references expected.Spec.IPFamilies, expected.Spec.IPFamilyPolicy,
current.Spec.IPFamilies, current.Spec.IPFamilyPolicy, changed, and updated to
reconcile them together: if either expected.Spec.IPFamilies is non-empty or
expected.Spec.IPFamilyPolicy is non-nil, compare both fields to current and if
any differ, set changed = true once, call updated = current.DeepCopy() once, and
set both updated.Spec.IPFamilies = expected.Spec.IPFamilies and
updated.Spec.IPFamilyPolicy = expected.Spec.IPFamilyPolicy (handling nil/empty
appropriately) so the Service is updated atomically; keep the rest of the
surrounding update flow unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d9fc3866-31dd-496e-82b7-b8c34041d7de
📒 Files selected for processing (2)
pkg/operator/controller/ingress/load_balancer_service.gopkg/operator/controller/ingress/load_balancer_service_test.go
0e7d3a5 to
fc38005
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/operator/controller/ingress/load_balancer_service.go (1)
499-512: LGTM with optional refactor to reduce duplication.The dual-stack configuration logic is correct. The
ipFamilyPolicyvariable is identically set in both switch cases and could be hoisted out.♻️ Optional refactor to reduce duplication
// Set ipFamilies and ipFamilyPolicy for dual-stack clusters. // Only NLB supports dual-stack; CLB does not. - if platform.AWS != nil && isAWSNLB(lbStatus) { - switch platform.AWS.IPFamily { - case configv1.DualStackIPv4Primary: - ipFamilyPolicy := corev1.IPFamilyPolicyRequireDualStack - service.Spec.IPFamilyPolicy = &ipFamilyPolicy - service.Spec.IPFamilies = []corev1.IPFamily{corev1.IPv4Protocol, corev1.IPv6Protocol} - case configv1.DualStackIPv6Primary: - ipFamilyPolicy := corev1.IPFamilyPolicyRequireDualStack - service.Spec.IPFamilyPolicy = &ipFamilyPolicy - service.Spec.IPFamilies = []corev1.IPFamily{corev1.IPv6Protocol, corev1.IPv4Protocol} - } + if platform.AWS != nil && isAWSNLB(lbStatus) && isAWSDualStack(platform.AWS.IPFamily) { + ipFamilyPolicy := corev1.IPFamilyPolicyRequireDualStack + service.Spec.IPFamilyPolicy = &ipFamilyPolicy + if platform.AWS.IPFamily == configv1.DualStackIPv4Primary { + service.Spec.IPFamilies = []corev1.IPFamily{corev1.IPv4Protocol, corev1.IPv6Protocol} + } else { + service.Spec.IPFamilies = []corev1.IPFamily{corev1.IPv6Protocol, corev1.IPv4Protocol} + } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/controller/ingress/load_balancer_service.go` around lines 499 - 512, The ipFamilyPolicy variable is duplicated in both switch cases; hoist its declaration and assignment before the switch so you set service.Spec.IPFamilyPolicy once (e.g., create ipFamilyPolicy := corev1.IPFamilyPolicyRequireDualStack and assign service.Spec.IPFamilyPolicy = &ipFamilyPolicy before switch), then inside the switch (where isAWSNLB and platform.AWS.IPFamily are used) only set service.Spec.IPFamilies to the appropriate slice (corev1.IPv4Protocol, corev1.IPv6Protocol) or (corev1.IPv6Protocol, corev1.IPv4Protocol) for DualStackIPv4Primary and DualStackIPv6Primary respectively to preserve behavior while removing duplication.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/operator/controller/ingress/load_balancer_service.go`:
- Around line 733-752: The reconciler must treat a change in the primary IP
family (the first element of spec.IPFamilies) as a recreate condition; update
shouldRecreateLoadBalancer() to return true when both current.Spec.IPFamilies
and expected.Spec.IPFamilies are non-empty and their [0] entries differ. Locate
shouldRecreateLoadBalancer() and add a check like: if
len(current.Spec.IPFamilies) > 0 && len(expected.Spec.IPFamilies) > 0 &&
current.Spec.IPFamilies[0] != expected.Spec.IPFamilies[0] { return true } so the
controller will recreate the Service rather than attempting an update that the
API server will reject. Ensure this logic complements the existing
ipFamilies/ipFamilyPolicy update handling in the load balancer reconciliation.
---
Nitpick comments:
In `@pkg/operator/controller/ingress/load_balancer_service.go`:
- Around line 499-512: The ipFamilyPolicy variable is duplicated in both switch
cases; hoist its declaration and assignment before the switch so you set
service.Spec.IPFamilyPolicy once (e.g., create ipFamilyPolicy :=
corev1.IPFamilyPolicyRequireDualStack and assign service.Spec.IPFamilyPolicy =
&ipFamilyPolicy before switch), then inside the switch (where isAWSNLB and
platform.AWS.IPFamily are used) only set service.Spec.IPFamilies to the
appropriate slice (corev1.IPv4Protocol, corev1.IPv6Protocol) or
(corev1.IPv6Protocol, corev1.IPv4Protocol) for DualStackIPv4Primary and
DualStackIPv6Primary respectively to preserve behavior while removing
duplication.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: d5d71000-16e8-46e3-93d9-762b30f7c264
📒 Files selected for processing (2)
pkg/operator/controller/ingress/load_balancer_service.gopkg/operator/controller/ingress/load_balancer_service_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/operator/controller/ingress/load_balancer_service_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/dns/aws/dns.go`:
- Around line 114-117: The code currently uses IPFamily to decide creating AAAA
records; change the logic to require both a dual-stack cluster (IPFamily ==
IPv6DualStack) AND that the resolved load balancer target is IPv6-capable before
publishing AAAA aliases. Locate the code using the IPFamily field and the
variable named target (and the Route53 record creation code around the same
block and at the later similar block referencing IPFamily) and add a check
against the resolved target's capability (e.g., a property or method on the
resolved target like IsIPv6Capable / IsDualStack / Type != "classic" or presence
of IPv6 addresses) so Classic ELBs remain IPv4-only while ELBv2/NLB/ALB may get
AAAA records when both checks pass.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c968171d-cb12-44e2-b98e-eafa45780990
📒 Files selected for processing (3)
pkg/dns/aws/dns.gopkg/operator/controller/dns/controller.gopkg/util/aws/dualstack.go
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/dns/aws/dns.go (2)
561-569:⚠️ Potential issue | 🟠 MajorPersist the LB IP address type in the fallback metadata.
When the lookup on Line 561 fails, the fallback restores only
targetHostedZoneID.targetIPAddressTypestays empty, so the delete path on Line 600 can only remove the A alias and will leave the AAAA alias behind for a dual-stack NLB.Suggested fix
const ( targetHostedZoneIdAnnotationKey = "ingress.operator.openshift.io/target-hosted-zone-id" + targetIPAddressTypeAnnotationKey = "ingress.operator.openshift.io/target-ip-address-type" )if err != nil { err = fmt.Errorf("failed to get hosted zone for load balancer target %q: %v", target, err) if v, ok := record.Annotations[targetHostedZoneIdAnnotationKey]; !ok { return err } else { log.Error(err, "falling back to the "+targetHostedZoneIdAnnotationKey+" annotation", "value", v) targetHostedZoneID = v + targetIPAddressType = record.Annotations[targetIPAddressTypeAnnotationKey] } }- } else if _, ok := current.Annotations[targetHostedZoneIdAnnotationKey]; !ok { + } else if current.Annotations[targetHostedZoneIdAnnotationKey] != targetHostedZoneID || + current.Annotations[targetIPAddressTypeAnnotationKey] != targetIPAddressType { updated := current.DeepCopy() if updated.Annotations == nil { updated.Annotations = map[string]string{} } updated.Annotations[targetHostedZoneIdAnnotationKey] = targetHostedZoneID + updated.Annotations[targetIPAddressTypeAnnotationKey] = targetIPAddressType if err := m.config.Client.Update(context.TODO(), updated); err != nil {Also applies to: 585-595, 600-600
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/dns/aws/dns.go` around lines 561 - 569, When getLBHostedZoneAndIPAddressType(target) fails and you fall back to using the annotation value for targetHostedZoneID, also persist the load‑balancer IP address type into targetIPAddressType so delete logic can remove both A and AAAA records for dual‑stack NLBs; specifically, in the same fallback block where you read record.Annotations[targetHostedZoneIdAnnotationKey], also read record.Annotations[targetIPAddressTypeAnnotationKey] (or set a sensible default if absent) and assign it to targetIPAddressType, and apply the same change to the other similar fallback blocks around the delete path (the other places handling getLBHostedZoneAndIPAddressType failures) so the IP address type is always available.
641-675:⚠️ Potential issue | 🟠 MajorDon't batch A and AAAA deletes when one record set may not exist.
This code deletes A and AAAA in a single Route53 change batch. Route53 treats change batches as atomic: if validation fails for any change (for example, a DELETE referring to a non-existent record set), the entire batch is canceled and no changes are applied.
If only the A record exists (not AAAA), the DELETE for the AAAA record fails, which cancels the entire batch. The A record remains. However, the error handler at lines 667–673 catches the resulting
"not found"error and returnsnil, falsely reporting success. The record is silently left behind.Fix: Either delete records individually per type, or first verify that both record sets exist before batching deletes.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/dns/aws/dns.go` around lines 641 - 675, The delete logic currently batches A and AAAA changes (the changes slice and input.ChangeBatch) which is atomic and can fail if one record type doesn't exist; update the code path that sets action == string(deleteAction) so deletes are performed per-record-type instead of in one batch: build separate ChangeBatch inputs for route53.RRTypeA and route53.RRTypeAaaa (using the same aliasTarget and zoneID/domain) and call m.route53.ChangeResourceRecordSets for each type individually, and keep the existing per-call error handling that checks awserr.Error and "not found" so a missing AAAA won't block A deletion; reference symbols: changes, input.ChangeBatch, m.route53.ChangeResourceRecordSets, deleteAction, action, route53.RRTypeA, route53.RRTypeAaaa, targetIPAddressType.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Outside diff comments:
In `@pkg/dns/aws/dns.go`:
- Around line 561-569: When getLBHostedZoneAndIPAddressType(target) fails and
you fall back to using the annotation value for targetHostedZoneID, also persist
the load‑balancer IP address type into targetIPAddressType so delete logic can
remove both A and AAAA records for dual‑stack NLBs; specifically, in the same
fallback block where you read
record.Annotations[targetHostedZoneIdAnnotationKey], also read
record.Annotations[targetIPAddressTypeAnnotationKey] (or set a sensible default
if absent) and assign it to targetIPAddressType, and apply the same change to
the other similar fallback blocks around the delete path (the other places
handling getLBHostedZoneAndIPAddressType failures) so the IP address type is
always available.
- Around line 641-675: The delete logic currently batches A and AAAA changes
(the changes slice and input.ChangeBatch) which is atomic and can fail if one
record type doesn't exist; update the code path that sets action ==
string(deleteAction) so deletes are performed per-record-type instead of in one
batch: build separate ChangeBatch inputs for route53.RRTypeA and
route53.RRTypeAaaa (using the same aliasTarget and zoneID/domain) and call
m.route53.ChangeResourceRecordSets for each type individually, and keep the
existing per-call error handling that checks awserr.Error and "not found" so a
missing AAAA won't block A deletion; reference symbols: changes,
input.ChangeBatch, m.route53.ChangeResourceRecordSets, deleteAction, action,
route53.RRTypeA, route53.RRTypeAaaa, targetIPAddressType.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 8c6a6ca5-0156-4bd3-a221-6b12bb81f180
📒 Files selected for processing (1)
pkg/dns/aws/dns.go
|
CodeRabbit's outside of the diff comment: Fair point, I'm going to address it. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
0d6ccb5 to
8bcbd2b
Compare
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
9af1a8c to
53c3acd
Compare
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/assign @davidesalerno |
53c3acd to
ae1dd7b
Compare
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| lbStatus.ProviderParameters.Type == operatorv1.AWSLoadBalancerProvider && | ||
| lbStatus.ProviderParameters.AWS != nil && | ||
| lbStatus.ProviderParameters.AWS.Type == operatorv1.AWSNetworkLoadBalancer | ||
| } |
There was a problem hiding this comment.
Grant recently added LB check from the service, when one exists. I'm not sure about the moment checking the LB type is needed, so wondering if the same approach should be applied here as well.
cluster-ingress-operator/pkg/operator/controller/ingress/controller.go
Lines 1310 to 1320 in 128729e
This is the relevant thread about the subject
#1349 (comment)
There was a problem hiding this comment.
You are right about raising this point because it's a tricky one. Here we are constructing the desired state of the load balancer service. And I stick to the existing approach (approach number 1 as described in the defaulting function which is called before ensuring of ingresscontroller). To make things more consistent with the existing checks of the lb type (like this one), let me do the same check and add a switch. This will remove the need for a dedicated isAWSNLB function too.
| expectedIPFamilies: []corev1.IPFamily{corev1.IPv4Protocol}, | ||
| expectedIPFamilyPolicy: ipFamilyPolicyPtr(corev1.IPFamilyPolicySingleStack), | ||
| }, | ||
| } |
There was a problem hiding this comment.
Another useful tests are configuring aws dual stack without a platform type, another one using a platform type other than aws. Both shouldn't configure IP families.
There was a problem hiding this comment.
Added a new test case for the platform different than AWS.
Another useful tests are configuring aws dual stack without a platform type
This should not happen, the platform type is required in Infrastructure API and should be set by the installer.
ae1dd7b to
0404bad
Compare
jcmoraisjr
left a comment
There was a problem hiding this comment.
Just a minor, wondering if my understanding makes sense, other than that lgtm.
Configure the publishing load balancer service for dual-stack AWS clusters based on the Infrastructure CR's ipFamily field (`status.platformStatus.aws.ipFamily`). For NLB-type load balancers, set `ipFamilyPolicy` to `RequireDualStack` and `ipFamilies` to match the cluster's IP family ordering. For CLB, explicitly set SingleStack/IPv4 since CLB only forwards IPv4 traffic. Without this, `DualStackIPv6Primary` clusters would default the CLB service to SingleStack/IPv6, causing OVN to refuse IPv4 traffic on the service's NodePort. When the cluster IP family is dual-stack, the AWS DNS provider creates both Alias A and Alias AAAA Route53 records for IngressController wildcard domains. AAAA records are always created regardless of LB type because stale AAAA records cannot be easily cleaned up when the LB type changes from NLB to CLB — the old NLB is deleted before the new CLB is created, so the DNS provider can no longer look up the target hostname to delete the AAAA record. For CLBs the AAAA alias simply won't resolve. When the LB type is changed to CLB on a dual-stack cluster, a warning note is appended to the effectuation message on the cluster operator status indicating that CLBs do not support dual-stack. Co-authored-by: Claude claude-opus-4-6 <noreply@anthropic.com>
0404bad to
82dc195
Compare
|
/lgtm |
|
@alebedev87: This pull request references NE-2421 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/retest |
|
@alebedev87: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
davidesalerno
left a comment
There was a problem hiding this comment.
Just a couple of minor questions
The code in general looks good.
| ResourceRecordSet: &route53.ResourceRecordSet{ | ||
| Name: aws.String(domain), | ||
| Type: aws.String(route53.RRTypeAaaa), | ||
| AliasTarget: aliasTarget, |
There was a problem hiding this comment.
Both ResourceRecordSet entries point to the same aliasTarget struct. While the AWS SDK v1 is unlikely to mutate it, this is a subtle aliasing risk. If the SDK ever serialises changes concurrently or modifies the struct, both records could be corrupted. Consider copying the AliasTarget for the AAAA entry or creating each inline.
There was a problem hiding this comment.
Yeah, something like a C++ pointer to a constant would avoid this confusion, but Golang is way simpler. I think it's fine to use a pointer to the same object. AWS SDK should not mess with the input parameters especially from the same change batch. If it does (unlikely), we may have problems with other variables too, like action and domain whose pointers are taken too.
| } | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
There is no else statement so if lbStatus is nil or its AWS sub-field is nil (e.g., initial reconcile before status is populated), the IP family configuration is silently skipped.
Are we fine or would it be better to at least log this event?
There was a problem hiding this comment.
We are fine with this, it's consistent with the rest of desiredLoadBalancerService function and as a matter of fact with most of other desired* functions. The conditions in these functions should be clear and tested, the logging is reserved mostly for errors and warning to avoid any flooding.
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: davidesalerno The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/testwith openshift/installer/main/e2e-aws-ovn-dualstack-ipv6-primary-techpreview openshift/cloud-provider-aws#135 |
|
@patrickdillon, |
|
/testwith openshift/installer/main/e2e-aws-ovn-dualstack-ipv6-primary-techpreview openshift/cloud-provider-aws#135 openshift/installer#10380 |
|
@patrickdillon : btw, note that there is a dedicated origin test I created for AWSDualStackInstall featuregate. It makes sense to run the dualstack-tp job in there too. |
Configure the publishing load balancer service for dual-stack AWS clusters based on the Infrastructure CR's ipFamily field (
status.platformStatus.aws.ipFamily).For NLB-type load balancers, set
ipFamilyPolicytoRequireDualStackandipFamiliesto match the cluster's IP family ordering. For CLB, explicitly set SingleStack/IPv4 since CLB only forwards IPv4 traffic. Without this,DualStackIPv6Primaryclusters would default the CLB service to SingleStack/IPv6, causing OVN to refuse IPv4 traffic on the service's NodePort.When the cluster IP family is dual-stack, the AWS DNS provider creates both Alias A and Alias AAAA Route53 records for IngressController wildcard domains. AAAA records are always created regardless of LB type because stale AAAA records cannot be easily cleaned up when the LB type changes from NLB to CLB — the old NLB is deleted before the new CLB is created, so the DNS provider can no longer look up the target hostname to delete the AAAA record. For CLBs the AAAA alias simply won't resolve.
When the LB type is changed to CLB on a dual-stack cluster, a warning note is appended to the effectuation message on the cluster operator status indicating that CLBs do not support dual-stack.
Enhancement proposal: openshift/enhancements#1940.
Related upstream cloud-provider-aws change: kubernetes/cloud-provider-aws#1313.
E2E testing will be implemented as part of the featuregate test coverage: openshift/origin#30904.
Manual test
The test can only be manual for the moment while CCM changes are still in progress. Clusterbot command used:
DualStackIPv4Primary
Install config:
Note that the load balancer type must be
NLBfor dual stack IP family, this is a requirement.Infrastructure CR:
IngressController CR:
Publishing load balancer service:
DNSRecord CR:
DNS query:
e2e connectivity (any IPv6 ingress rule is still missing in CCM implementation though, TBC):
Changing the LB type to Classic:
DNS query after a change to CLB:
Connectivity after a change to CLB:
Clean up of DNS (new shard to showcase the cleanup):
DualStackIPv6Primary
Install config:
Publishing service:
DNS check:
Connectivity check:
CLB shard
CLB connectivity check: