metis: add glibc floor qualification test target to Makefile#1036
k8s-ci-robot merged 9 commits into kubernetes:master
Adds a GitHub Actions workflow to qualify glibc floor compatibility on ubuntu-22.04 runners for the metis CNI. Adds a new test-glibc-floor make target to run the verification locally inside a container, ensuring safety for the glibc 2.35 floor.
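As a rough illustration of the local check described above, the following sketch lists the Docker commands such a target might run: build the image, copy the binary out, and execute it inside a stock `ubuntu:22.04` container. It is not the contents of the actual Makefile. The image tag, the binary path inside the image, the container name, the working directory, and the `.` build context are all assumptions made for illustration.

```python
import shlex


def floor_test_commands(image, binary_in_image,
                        floor_image="ubuntu:22.04",
                        workdir="/tmp/glibc-floor"):
    """Return the docker commands for a build / extract / run-on-floor check.

    Every name here is hypothetical: `image`, `binary_in_image`, the
    container name, and `workdir` are placeholders, not values from the PR.
    `workdir` must be an absolute path that already exists on the host.
    """
    container = "metis-glibc-extract"  # hypothetical scratch container name
    local_binary = f"{workdir}/metis"
    return [
        # 1. Build the image (assumes the build context is the current dir).
        ["docker", "build", "-t", image, "."],
        # 2. Create a stopped container so the binary can be copied out.
        ["docker", "create", "--name", container, image],
        ["docker", "cp", f"{container}:{binary_in_image}", local_binary],
        ["docker", "rm", container],
        # 3. Run the binary against the floor image's glibc. A binary that
        #    needs a newer glibc fails here with a "version not found" error.
        ["docker", "run", "--rm",
         "-v", f"{local_binary}:/metis:ro",
         floor_image, "/metis", "--help"],
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch is safe to try.
    for cmd in floor_test_commands("metis:dev", "/metis"):
        print(shlex.join(cmd))
```

Printing the commands rather than executing them keeps the sketch free of side effects; passing each list to `subprocess.run(cmd, check=True)` would turn it into a real check, assuming Docker is installed.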
PTAL: @YifeiZhuang @gnossen
/ok-to-test
YifeiZhuang left a comment:
Thanks for adding the sanity checks for the glibc version skew issue! I don't see why we cannot use both GitHub Actions and Prow.
But it is easier to maintain if we keep it consistent with Prow in this repo: https://github.com/kubernetes/test-infra/tree/master/config/jobs/kubernetes/cloud-provider-gcp
# Use ubuntu as base image to package the binary
# CAUTION: The Metis binary leverages CGO and links against the host's C library.
# To prevent runtime panics on baseline GKE fleet nodes, this image must remain
Remove GKE - this is for non-GKE clusters as well.
Oh, I see that this experiment is mainly just for GKE, although the issue affects everyone.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: arvindbr8, gnossen, YifeiZhuang
Enforce a qualification check for the `metis` CNI binary to ensure it remains compatible with the GKE fleet's `glibc 2.35` floor (Ubuntu 22.04 / COS Milestone 117).

Context: Why `glibc 2.35`?

Because the Metis CNI is executed natively on the host OS by the kubelet (rather than inside a container namespace), it is strictly bound by the host's C standard library.

Our oldest supported GKE node pools currently run Ubuntu 22.04 LTS and COS Milestone 117, both of which provide `glibc 2.35` or newer. This makes `2.35` the lowest common denominator across our fleet. If the CGO binary links against a `glibc` version higher than 2.35, it will fail immediately with a `version not found` error when scheduled on these nodes. See the Container-Optimized OS Release Notes and GKE Release Notes for the milestone baselines (Ubuntu 22.04 / COS Milestone 117).

Fleet floor verification
To confirm that `glibc 2.35` is the correct floor, we provisioned an ephemeral GKE cluster (1.30.14-gke.2250000) with two node pools reflecting our oldest supported fleet OS images:

- `COS_CONTAINERD` (COS Milestone 117)
- `UBUNTU_CONTAINERD` (Ubuntu 22.04 LTS)

Using debug pods, I queried the host OS's C standard library. The results show that `2.35` is our hard floor, dictated by the Ubuntu nodes:

1. Ubuntu 22.04 Node Pool (`UBUNTU_CONTAINERD`):

2. COS Node Pool (`COS_CONTAINERD`):

    $ kubectl debug node/gke-glibc-test-clust-cos-verification-82afa5a0-sfv3 -it --image=ubuntu --profile=sysadmin
    root@gke-glibc-test-clust-cos-verification-82afa5a0-sfv3:/# chroot /host /lib64/libc.so.6 | head -n 1
    GNU C Library (Gentoo 2.37-r15 p12) stable release version 2.37.

Changes
Component: `metis/Makefile`

- `test-glibc-floor` target: Builds the image, extracts the binary, and runs `--help` natively inside a vanilla `ubuntu:22.04` container to guarantee runtime compatibility regardless of host OS.

Component: GitHub Actions

- [NEW] `.github/workflows/metis-glibc-floor-test.yml`: Pre-submit guardrail that runs the extraction test on an OS representing the fleet floor (`ubuntu-22.04`).

Note

The GitHub Actions workflow file (`metis-glibc-floor-test.yml`) was removed from this PR. The test will be run as a Prow presubmit job (to be submitted to `kubernetes/test-infra`). kubernetes/test-infra#36769

Verification Results
1. Symbol Analysis Proof

`readelf -V` analysis of the binary built on standard `golang:1.25.8` (Bookworm) confirms the highest required version is `GLIBC_2.34` (safe for 2.35):

2. Local Extraction Test

Running `make test-glibc-floor` succeeded without linkage errors on `ubuntu:22.04`:

The GitHub Actions workflow will be run as part of this PR's presubmit checks (I think!)
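The reasoning in this description reduces to two comparisons: the fleet floor is the minimum glibc version across node images, and the binary is safe when the highest `GLIBC_x.y` symbol version it requires does not exceed that floor. The sketch below shows one way to automate both, assuming the `libc.so.6` banner and the `readelf -V` output have been captured as text. The function names are hypothetical and this is not the tooling used in the PR.

```python
import re

# Matches the version at the end of a libc.so.6 banner line.
_BANNER_RE = re.compile(r"release version (\d+)\.(\d+)")
# Matches versioned symbol requirements such as GLIBC_2.34. A patch
# component (as in GLIBC_2.2.5) is ignored; major.minor decides the floor.
_SYMVER_RE = re.compile(r"GLIBC_(\d+)\.(\d+)")


def banner_version(banner):
    """Parse (major, minor) from the first line printed by libc.so.6."""
    match = _BANNER_RE.search(banner)
    if match is None:
        raise ValueError(f"no glibc version found in: {banner!r}")
    return (int(match.group(1)), int(match.group(2)))


def max_required_glibc(readelf_output):
    """Return the highest GLIBC_x.y version named in `readelf -V` text."""
    versions = [(int(a), int(b)) for a, b in _SYMVER_RE.findall(readelf_output)]
    return max(versions) if versions else None


def is_floor_safe(readelf_output, floor):
    """True when no required symbol version exceeds the fleet floor."""
    required = max_required_glibc(readelf_output)
    return required is None or required <= floor


if __name__ == "__main__":
    # COS banner as quoted in this PR; Ubuntu 22.04 is taken as 2.35 per the text.
    cos = banner_version(
        "GNU C Library (Gentoo 2.37-r15 p12) stable release version 2.37."
    )
    floor = min((2, 35), cos)
    print(floor)  # (2, 35)
```

Comparing versions as integer tuples avoids the string-ordering trap where `"2.4"` would sort above `"2.35"`. Feeding `is_floor_safe` the captured `readelf -V` text with `floor` from above gives a pass/fail result that a presubmit job could act on.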