
[1.4] libct: prepareCgroupFD: fall back to container init cgroup #5117

Merged
lifubang merged 4 commits into opencontainers:release-1.4 from kolyshkin:1.4-fix-exec
Feb 12, 2026
Conversation

@kolyshkin
Contributor

This is a backport of #5101, fixing #5089 for 1.4. Original description follows.


Previously, when prepareCgroupFD could not open the container's cgroup
(as configured in config.json and saved to state.json), it returned
a fatal error, as we presumed a container can't exist without its own
cgroup.

Apparently, it can. When a container is configured without a cgroup
namespace (i.e. it uses the host's cgroups), and /sys/fs/cgroup is mounted
read-write, a rootful container's init can move itself to an entirely
different cgroup (even a new one it has just created), after which the
original container cgroup is removed by the kernel (or systemd?) since
it has no processes left. At that point, from systemd's point of view
the container is gone. And yet it is still there, and users want
runc exec to work!

And it worked, thanks to the "let's try container init's cgroup"
fallback added by commit c91fe9a ("cgroup2: exec: join the
cgroup of the init process on EBUSY"). The fallback was added for
an entirely different reason, but it happened to cover this very
case, too.

This behavior was broken with the introduction of CLONE_INTO_CGROUP
support.

While it is debatable whether a container moving itself into a different
cgroup is a valid scenario, this very setup is used by e.g.
buildkitd running in a privileged Kubernetes container (see issue #5089).

To restore the way things are expected to work, add the same "try
container init's cgroup" fallback into prepareCgroupFD.

A test case (reproducing the issue in #5089) is added. It fails before
the fix (see #5102) and succeeds here.

Separate initProcessCgroupPath code out of addIntoCgroupV2.
To be used by the next patch.

While at it, describe the new scenario in which the container's
configured cgroup might not be available.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 94133fa)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
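A helper like the separated initProcessCgroupPath presumably derives the init process's current cgroup from /proc/&lt;pid&gt;/cgroup. Below is a minimal sketch of that parsing for the unified hierarchy; parseCgroupV2Path is a hypothetical name, and runc's actual helper handles more cases.

```go
package main

import (
	"fmt"
	"strings"
)

// parseCgroupV2Path extracts the unified-hierarchy cgroup path from the
// contents of /proc/<pid>/cgroup. On cgroup v2 the file contains a single
// line of the form "0::/some/path". Simplified sketch only.
func parseCgroupV2Path(procCgroup string) (string, error) {
	for _, line := range strings.Split(procCgroup, "\n") {
		if path, ok := strings.CutPrefix(line, "0::"); ok {
			return path, nil
		}
	}
	return "", fmt.Errorf("cgroup v2 entry not found")
}

func main() {
	// Example contents of /proc/<init-pid>/cgroup after the container's
	// init moved itself into a new cgroup.
	path, err := parseCgroupV2Path("0::/moved-by-init/sub\n")
	if err != nil {
		panic(err)
	}
	fmt.Println(path) // /moved-by-init/sub
}
```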
1. Refactor addIntoCgroupV2 in an attempt to simplify it.

2. Fix the bug of not trying the init cgroup fallback when
   rootlessCgroup is set. This is a bug because rootlessCgroup
   means cgroup join errors should be ignored, not that the
   fallback should never be tried.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 1d030fa)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>

While at it, simplify the code flow.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 6c07a37)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
Add a test case to reproduce runc issue #5089.

Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
(cherry picked from commit 1fdbab8)
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
@kolyshkin kolyshkin added this to the 1.4.1 milestone Feb 11, 2026
@kolyshkin kolyshkin changed the base branch from main to release-1.4 February 11, 2026 20:07
@kolyshkin kolyshkin added the backport/1.4-pr A backport PR to release-1.4 label Feb 11, 2026
@kolyshkin
Contributor Author

close/reopen to kick ci (I used the wrong base branch again 🤦🏻 )

@kolyshkin kolyshkin closed this Feb 11, 2026
@kolyshkin kolyshkin reopened this Feb 11, 2026
@kolyshkin kolyshkin marked this pull request as ready for review February 12, 2026 00:55
@kolyshkin kolyshkin requested review from AkihiroSuda, cyphar, lifubang and rata and removed request for rata February 12, 2026 01:11
@lifubang lifubang merged commit 2120bfa into opencontainers:release-1.4 Feb 12, 2026
37 checks passed