metis: add glibc floor qualification test target to Makefile #1036

Merged
k8s-ci-robot merged 9 commits into kubernetes:master from arvindbr8:presubmit-guard-rail
Apr 6, 2026

Conversation

@arvindbr8
Contributor

@arvindbr8 arvindbr8 commented Apr 3, 2026

Enforce a qualification check for the metis CNI binary to ensure it remains compatible with the GKE fleet's glibc 2.35 floor (Ubuntu 22.04 / COS Milestone 117).

Context: Why glibc 2.35?

Because the Metis CNI is executed natively on the host OS by the Kubernetes Kubelet (rather than inside a container namespace), it is strictly bound by the host's C standard library.

Our oldest supported GKE node pools currently run Ubuntu 22.04 LTS and COS Milestone 117. Ubuntu 22.04 provides glibc 2.35, and COS Milestone 117 provides a newer release (2.37 in the verification below), which makes 2.35 the lowest common denominator across our fleet. If the CGO binary links against a glibc symbol version higher than 2.35, the dynamic loader will refuse to start it on these nodes with a version `GLIBC_2.xx' not found error. See the Container-Optimized OS Release Notes and GKE Release Notes for the milestone baselines (Ubuntu 22.04 / COS Milestone 117).

Fleet floor verification

To confirm that glibc 2.35 is the correct floor, we provisioned an ephemeral GKE cluster (1.30.14-gke.2250000) with two node pools reflecting our oldest supported fleet OS images:

  1. COS_CONTAINERD (COS Milestone 117)
  2. UBUNTU_CONTAINERD (Ubuntu 22.04 LTS)

Using debug pods, I queried the host OS's C standard library on each node. The results show that 2.35 is our hard floor, dictated by the Ubuntu nodes:

1. Ubuntu 22.04 Node Pool (UBUNTU_CONTAINERD):

$ kubectl debug node/gke-glibc-test-clust-ubuntu-verificat-36dc13b9-vw47 -it --image=ubuntu --profile=sysadmin
root@gke-glibc-test-clust-ubuntu-verificat-36dc13b9-vw47:/# chroot /host /usr/bin/ldd --version | head -n 1

ldd (Ubuntu GLIBC 2.35-0ubuntu3.13) 2.35  <-- The Fleet Floor

2. COS Node Pool (COS_CONTAINERD):

$ kubectl debug node/gke-glibc-test-clust-cos-verification-82afa5a0-sfv3 -it --image=ubuntu --profile=sysadmin
root@gke-glibc-test-clust-cos-verification-82afa5a0-sfv3:/# chroot /host /lib64/libc.so.6 | head -n 1

GNU C Library (Gentoo 2.37-r15 p12) stable release version 2.37.
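The comparison of the two banners above can be sketched in shell. This is a minimal illustration, not part of the PR: the function names `glibc_version` and `fleet_floor` are hypothetical, and it assumes `grep -o` and GNU `sort -V` are available.

```shell
# Sketch only: helper names are hypothetical, not taken from the PR.

# Print the glibc version (e.g. 2.35) found at the end of a banner line,
# such as the first line of `ldd --version` or of running libc.so.6.
glibc_version() {
  # Collect every X.Y token on the line and keep the last one, which is
  # the release version in both banner formats shown above.
  printf '%s\n' "$1" | grep -oE '[0-9]+\.[0-9]+' | tail -n 1
}

# Print the lowest version among the arguments (the fleet floor).
# Assumes GNU sort, whose -V flag compares 2.4 < 2.35 correctly.
fleet_floor() {
  printf '%s\n' "$@" | sort -V | head -n 1
}
```

For example, `fleet_floor "$(glibc_version "$ubuntu_banner")" "$(glibc_version "$cos_banner")"` would print the lower of the two versions, where `ubuntu_banner` and `cos_banner` hold the two output lines captured from the debug pods.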

Changes

Component: metis/Makefile

  • [NEW] test-glibc-floor target: Builds image, extracts binary, and runs --help natively inside a vanilla ubuntu:22.04 container to guarantee runtime compatibility regardless of host OS.
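The build, extract, and run steps described for this target could look roughly like the shell sketch below. It is one plausible shape, not the Makefile's actual recipe: the image tag, the binary's path inside the image, the scratch container name, and the `DRY_RUN` switch are all assumptions made for illustration.

```shell
# Sketch only. Every default below is a hypothetical placeholder,
# not a value taken from the PR's Makefile.
IMAGE="${IMAGE:-metis:candidate}"             # assumed image tag
BIN_IN_IMAGE="${BIN_IN_IMAGE:-/metis}"        # assumed binary path in image
OUT="${OUT:-bin/metis-candidate}"             # extraction destination
FLOOR_IMAGE="${FLOOR_IMAGE:-ubuntu:22.04}"    # glibc 2.35 userland
EXTRACT_NAME="${EXTRACT_NAME:-metis-extract}" # assumed scratch container name

# Run a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf '%s\n' "$*"
  else
    "$@"
  fi
}

# Build the image, copy the binary out, and execute it on the floor image.
# A loader error such as "version `GLIBC_2.xx' not found" makes the last
# step, and therefore the whole function, fail.
test_glibc_floor() {
  run mkdir -p "$(dirname "$OUT")" &&
  run docker build -t "$IMAGE" . &&
  run docker create --name "$EXTRACT_NAME" "$IMAGE" &&
  run docker cp "$EXTRACT_NAME:$BIN_IN_IMAGE" "$OUT" &&
  run docker rm "$EXTRACT_NAME" &&
  run docker run --rm -v "$(pwd)/$OUT:/metis:ro" "$FLOOR_IMAGE" /metis --help
}
```

The bind mount uses an absolute host path because Docker treats a relative `-v` source as a volume name rather than a directory. Setting `DRY_RUN=1` before calling `test_glibc_floor` prints the six commands without needing a Docker daemon.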

Component: GitHub Actions

  • [NEW] .github/workflows/metis-glibc-floor-test.yml: Pre-submit guardrail that runs the extraction test on an OS representing the fleet floor (ubuntu-22.04).

Note

The GitHub Actions workflow file (metis-glibc-floor-test.yml) was removed from this PR. The test will instead run as a Prow presubmit job, to be submitted to kubernetes/test-infra (kubernetes/test-infra#36769).


Verification Results

1. Symbol Analysis Proof

readelf -V analysis of the binary built on standard golang:1.25.8 (Bookworm) confirms the highest required version is GLIBC_2.34 (safe for 2.35):

Version needs section '.gnu.version_r' contains 1 entry:
  Name: GLIBC_2.34  Flags: none  Version: 2
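The same symbol check can be automated along the following lines. This is a hedged sketch rather than anything in the PR: the function names are hypothetical, and it assumes binutils `readelf` plus GNU `sort -V` are installed.

```shell
# Sketch only: function names are hypothetical, not taken from the PR.

# Read `readelf -V` output on stdin and print the highest GLIBC_x.y
# version it mentions. Non-numeric tags like GLIBC_PRIVATE are ignored.
max_glibc_required() {
  grep -oE 'GLIBC_[0-9]+(\.[0-9]+)+' | sed 's/^GLIBC_//' | sort -V | tail -n 1
}

# Succeed when the required version ($1) does not exceed the floor ($2).
within_floor() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | tail -n 1)" = "$2" ]
}

# Check one binary ($1) against a floor ($2, default 2.35).
check_glibc_floor() {
  required=$(readelf -V "$1" | max_glibc_required)
  if [ -z "$required" ]; then
    echo "no versioned glibc symbols found in $1"
    return 0
  fi
  if within_floor "$required" "${2:-2.35}"; then
    echo "OK: $1 needs GLIBC_$required"
  else
    echo "FAIL: $1 needs GLIBC_$required, above the floor ${2:-2.35}"
    return 1
  fi
}
```

A call such as `check_glibc_floor bin/metis-candidate 2.35` would then report whether the extracted binary stays at or below the floor. A static check like this complements, but does not replace, actually running the binary on the floor image.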

2. Local Extraction Test

Running make test-glibc-floor succeeded without linkage errors on ubuntu:22.04:

Successfully copied 14.5MB to bin/metis-candidate
docker run --rm -v bin/metis-candidate:/metis ubuntu:22.04 /metis --help
Usage of /metis:
  -alsologtostderr
  ...

The GitHub Actions workflow will be run as part of this PR's presubmit checks (I think!)

Adds a GitHub Actions workflow to qualify glibc floor compatibility on ubuntu-22.04 runners for the metis CNI.

Adds a new test-glibc-floor make target to run the verification locally inside a container, ensuring safety for the glibc 2.35 floor.
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Apr 3, 2026
@k8s-ci-robot
Contributor

This issue is currently awaiting triage.

If the repository maintainers determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Apr 3, 2026
@k8s-ci-robot
Contributor

Hi @arvindbr8. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work.

Tip

We noticed you've done this a few times! Consider joining the org to skip this step and gain /lgtm and other bot rights. We recommend asking approvers on your previous PRs to sponsor you.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 3, 2026
@arvindbr8
Contributor Author

PTAL: @YifeiZhuang @gnossen

@YifeiZhuang
Contributor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 3, 2026
@arvindbr8 arvindbr8 changed the title metis: add glibc floor qualification test and makefile target metis: add glibc floor qualification test target to Makefile Apr 6, 2026
Contributor

@YifeiZhuang YifeiZhuang left a comment

Thanks for adding the sanity checks for the glibc version skew issue! I don't see why we cannot use both GitHub Actions and Prow.
But it is easier to maintain if we keep it consistent with the Prow jobs for this repo: https://github.com/kubernetes/test-infra/tree/master/config/jobs/kubernetes/cloud-provider-gcp


# Use ubuntu as base image to package the binary
# CAUTION: The Metis binary leverages CGO and links against the host's C library.
# To prevent runtime panics on baseline GKE fleet nodes, this image must remain
Contributor

Remove GKE - this is for non-GKE cluster as well.

Contributor

Oh, I see that this experiment is mainly for GKE, although the issue applies to everyone.

@YifeiZhuang
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 6, 2026
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: arvindbr8, gnossen, YifeiZhuang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 6, 2026
@k8s-ci-robot k8s-ci-robot merged commit 08eef84 into kubernetes:master Apr 6, 2026
8 checks passed

Labels

  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
  • lgtm: "Looks good to me", indicates that a PR is ready to be merged.
  • needs-triage: Indicates an issue or PR lacks a `triage/foo` label and requires one.
  • ok-to-test: Indicates a non-member PR verified by an org member that is safe to test.
  • size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.

4 participants