[ARO-22145] Bump to Azure Linux 3.0 #4766
Skipping CI for Draft Pull Request.
273c839 to d499bc1
Please rebase pull request.
d499bc1 to 454c1c8
Podman 5.x on Azure Linux 3 requires crun (OCI runtime), netavark (network stack), and aardvark-dns explicitly installed. Without these, az acr login fails with "could not find netavark" on RP and gateway VMSS. Made-with: Cursor
aardvark-dns is not a separate package in Azure Linux 3 repos. DNS functionality is bundled with netavark on this platform. Made-with: Cursor
On Azure Linux 3, nftables is the default and native firewall backend. Forcing iptables causes firewalld to crash with a DBus NoReply error because the iptables backend is not functional on this platform. Made-with: Cursor
…e Linux 3) Made-with: Cursor
…ackages
- Use block list for nginx command in route/loadbalancer e2e manifests
- Rename dnf_*_pkgs to tdnf_*_pkgs and use tdnf consistently with extended repo
- Regenerate gateway and rp production deploy assets
Made-with: Cursor
Add a file-level comment to util-packages.sh clarifying that the RP and gateway VMSS bootstrap uses tdnf exclusively (extended repo, update, and install), consistent with the dev-env Azure Linux migration in PR #4777. Made-with: Cursor
…red gallery The Mariner 2 FIPS marketplace SKU was absent from the platform-image allowlist for VMSS Automatic OS Upgrades, so ARO used the non-FIPS image and configured FIPS manually at boot. Azure Linux 3 FIPS is referenced via the 1P Shared Gallery, which uses the gallery-based automatic upgrade path and is not subject to that allowlist restriction. Addresses reviewer question from PR #4777. Made-with: Cursor
Made-with: Cursor
Switch configure_repo_azurelinux_extended to use dnf instead of tdnf, and update the default argument fallback from 1 to empty string. Made-with: Cursor
Made-with: Cursor
- Replace dnf with tdnf in configure_repo_azurelinux_extended in util-packages.sh to prevent VMSS bootstrapping failure on Azure Linux 3 where dnf is not present
- Replace yum with tdnf in devProxyVMSS.sh weekly cron job to prevent silent failures; rename cron file from yumupdate to tdnfupdate
- Regenerate assets after changes
Made-with: Cursor
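The firewalld commit above says that forcing the iptables backend crashes firewalld on Azure Linux 3. A minimal sketch of a check for that condition follows. It assumes the backend is selected through the FirewallBackend key in /etc/firewalld/firewalld.conf (firewalld's standard configuration file); the PR text does not show how the scripts set it, and both function names are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# firewall_backend [CONF]
# Print the last FirewallBackend value set in a firewalld config file.
# Prints nothing when the key is absent, in which case firewalld's
# built-in default applies.
# (Hypothetical helper, not part of the PR.)
firewall_backend() {
    local conf="${1:-/etc/firewalld/firewalld.conf}"
    sed -n 's/^FirewallBackend=//p' "$conf" | tail -n 1
}

# backend_is_forced_iptables [CONF]
# Succeed only when the config explicitly selects the iptables backend.
# (Hypothetical helper, not part of the PR.)
backend_is_forced_iptables() {
    [[ "$(firewall_backend "${1:-/etc/firewalld/firewalld.conf}")" == "iptables" ]]
}
```

A bootstrap script could call backend_is_forced_iptables before starting firewalld and log a warning when it succeeds.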
a40c5fb to f31a8ea
Yes. Implemented that change.
Pull request overview
Copilot reviewed 17 out of 19 changed files in this pull request and generated 1 comment.
Pull request overview
Copilot reviewed 17 out of 19 changed files in this pull request and generated no new comments.
Comments suppressed due to low confidence (1)
pkg/deploy/generator/scripts/util-packages.sh:58
- tdnf_install_pkgs builds the command by feeding "${pkgs[@]}" through mapfile with a space delimiter. With a here-string this typically leaves a trailing newline in the last element (and relies on word-splitting semantics), which can produce an invalid package name and make installs flaky. Prefer appending the array directly to cmd (e.g., cmd+=("${pkgs[@]}") ) so each package is passed as its own argv element with no delimiter/newline issues.
# Reference: https://www.shellcheck.net/wiki/SC2206
# append pkgs array to cmd
mapfile -O $(( ${#cmd[@]} + 1 )) -d ' ' cmd <<< "${pkgs[@]}"
local -r cmd
log "Attempting to install packages: ${pkgs[*]}"
retry cmd "$2" "${3:-}"
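The difference this suppressed comment describes can be reproduced in isolation. The following is a minimal sketch, assuming bash 4.4 or later (for `mapfile -d`). The base command and package names are placeholders, not the contents of util-packages.sh.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Placeholder package list for illustration only.
pkgs=(crun netavark)

# Approach quoted in the review: join the array into a here-string and
# split it again with mapfile. Without -t each element keeps its
# delimiter, and the here-string adds a final newline.
via_mapfile=(tdnf -y install)
mapfile -O $(( ${#via_mapfile[@]} + 1 )) -d ' ' via_mapfile <<< "${pkgs[@]}"

# Suggested approach: append the array directly, one argv element per package.
via_append=(tdnf -y install)
via_append+=("${pkgs[@]}")

# printf %q makes trailing whitespace and newlines visible.
printf 'mapfile: %q\n' "${via_mapfile[@]}"
printf 'append:  %q\n' "${via_append[@]}"
```

Comparing the two printed lists shows which elements carry stray whitespace. Direct appending also avoids the index gap that `-O $(( ${#cmd[@]} + 1 ))` leaves in the array.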
Pull request overview
Copilot reviewed 17 out of 19 changed files in this pull request and generated 2 comments.
    log "FIPS mode is enabled"
}

# fips_configure
#
# Configures VM to run with fips mode enabled
#
# Taken and refactored from https://eng.ms/docs/products/azure-linux/features/security/fips
# TODO remove this once sku cbl-mariner-2-gen2-fips is supported by automatic OS updates
#   * Reference: https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/virtual-machine-scale-sets-automatic-upgrade#supported-os-images
fips_configure() {
    # shellcheck disable=SC2034
    local boot_uuid
    get_boot_dev_uuid boot_uuid

    local grub2_env
    if grub2_env="$(grub2-editenv - list | grep kernelopts)"; then
        grub2-editenv - set "$grub2_env fips=1 $boot_uuid"
    else
        grubby --update-kernel=ALL --args="fips=1 $boot_uuid"
    fi

    # fips mode verification will fail until after the vm has been rebooted
    # fips_verify
}

# configure_sshd
#
With fips_configure removed, get_boot_dev_uuid appears to be unused (no remaining references found). Consider removing it (and the related outdated comment block) to reduce dead code and avoid future confusion about how FIPS is expected to be configured on these images.
 configure_rpm_repos() {
     log "starting"

-    configure_repo_mariner_extended "$1" "${2:-1}"
+    configure_repo_azurelinux_extended "$1" "${2:-}"
 }
configure_rpm_repos now forwards the caller-provided retry count into configure_repo_azurelinux_extended. Since rpVMSS/gatewayVMSS pass pkg_retry_count=60, this can make enabling the extended repo retry for up to ~30 minutes (60 * 30s) before failing. Consider keeping a small fixed retry budget for repo enablement (e.g., default to 1–5) and reserving the larger retry count for package update/install operations.
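The arithmetic in this comment (60 retries at a 30-second wait is 1800 seconds) and the suggested split of retry budgets can be sketched as follows. This is a minimal illustration, not the PR's code: `clamp_retries`, `worst_case_seconds`, and the cap of 5 are hypothetical, and only the count of 60 and the 30-second wait come from the comment.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical cap for repo enablement; not a value from the PR.
REPO_RETRY_MAX=5

# clamp_retries REQUESTED MAX
# Print REQUESTED capped at MAX; an empty or missing REQUESTED falls back to 1.
clamp_retries() {
    local requested="${1:-1}" max="$2"
    if (( requested > max )); then
        echo "$max"
    else
        echo "$requested"
    fi
}

# worst_case_seconds RETRIES WAIT
# Print the upper bound on time spent waiting between attempts.
worst_case_seconds() {
    echo $(( $1 * $2 ))
}

pkg_retry_count=60   # value the comment says rpVMSS/gatewayVMSS pass
wait_seconds=30      # wait per attempt, as stated in the comment

repo_retries="$(clamp_retries "$pkg_retry_count" "$REPO_RETRY_MAX")"
echo "repo enablement: ${repo_retries} attempts, up to $(worst_case_seconds "$repo_retries" "$wait_seconds")s"
echo "package install: ${pkg_retry_count} attempts, up to $(worst_case_seconds "$pkg_retry_count" "$wait_seconds")s"
```

With a cap like this, a misconfigured repo fails within a few minutes, while package installs keep the larger budget for transient mirror problems.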
Which issue this PR addresses:
Fixes ARO-22145 — Migrate Azure Red Hat OpenShift RP/Gateway VMSS from Azure Linux 2.0 (EOL July 31, 2025) to Azure Linux 3.0.
What this PR does / why we need it:
Test plan for issue:
Verify RP and Gateway VMSS boot and run successfully on Azure Linux 3 FIPS images
Is there any documentation that needs to be updated for this PR?
How do you know this will function as expected in production?
INT and Canary Testing
Refer to the attached screenshots in https://redhat.atlassian.net/browse/ARO-22197