
depends_on with condition: service_healthy does not wait for healthy dependency #1422

@owst

Description

Describe the bug

When a service depends_on another with condition: service_healthy the service is immediately started, instead of waiting for the depended-on service to become healthy.

To Reproduce

For example in the following, we have a "db" service and a "server" service that depends on "db".

services:
  db:
    image: busybox
    command: ["sh", "-c", 'echo "DB started"; sleep 5; echo "DB healthy"; touch ready; sleep 5; echo "DB exit"']
    healthcheck:
      test: ["CMD", "sh -c '[[ -f ready ]]'"]
      interval: 1s
  server:
    image: busybox
    command: ["sh", "-c", 'echo "server started"; sleep 2; echo "server exit"']
    depends_on:
      db:
        condition: service_healthy

Actual output:

$ podman-compose up
[db]     | DB started
[server] | server started # Shouldn't happen until DB is healthy!
[server] | server exit
[db]     | DB healthy
[db]     | DB exit

Expected output:

$ podman-compose up
[db]     | DB started
[db]     | DB healthy
[server] | server started
[server] | server exit
[db]     | DB exit

Expected behavior

The server service should not start until the db service is healthy.

Actual behavior
The server service starts immediately and doesn't wait for the db service to become healthy.

Environment:

  • OS: macOS Tahoe 26.3
  • podman version: 5.6.2
  • podman compose version: 1.5.0

Additional context

It looks like this might be caused by podman wait --condition=healthy returning 0 for a created-but-not-started container.

For example:

$ podman create --name=example_db_1 --healthcheck-command "[\"sh -c '[[ -f ready ]]'\"]" --healthcheck-interval 1s busybox sh -c 'echo "DB started"; sleep 5; echo "DB healthy"; touch ready; sleep 5; echo "DB exit"'
c35e25c22895940755b0b30c687a07db75f180e175dcfdd05da485590a681502

$ podman wait --condition=healthy example_db_1
0

but if I start the container, then:

$ podman start example_db_1
example_db_1

$ podman wait --condition=healthy example_db_1
-1 # printed after 5s

It looks like check_dep_conditions assumes that podman wait will block and then return -1 once the dependency is healthy, but this is seemingly not true for a created-but-not-started container.

https://github.com/containers/podman-compose/blame/f7eeda1a3db10952424af6a5b0501c269ebe3f0d/podman_compose.py#L3056-L3061

With this change and using podman compose --verbose up:

                    output = await compose.podman.output(
                        [], "wait", [f"--condition={condition.value}"] + deps_cd
                    )
                    log.debug("podman wait output %s", output)

I see

DEBUG:podman_compose:podman wait output b'0\n'
DEBUG:podman_compose:dependencies for condition healthy have been fulfilled on containers comp_db_1

which indicates that the immediate 0 output for the not-yet-started container has fooled compose into thinking the db container is already healthy.

However, instead with this change:

                    res = await compose.podman.output(
                        [], "wait", [f"--condition={condition.value}"] + deps_cd
                    )
                    if res.strip() != b"-1":
                        continue

I see the desired output:

[db]     | DB started
[db]     | DB healthy
[server] | server started
[server] | server exit
[db]     | DB exit

but I'm not sure it's the right fix - happy to create a PR with it, if it is right!
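For what it's worth, the interpretation step behind that change can be isolated into a tiny helper (a sketch with a name of my own, assuming -1 really is the only "healthy reached" sentinel, as the experiments above suggest):

```python
def healthy_wait_succeeded(wait_output: bytes) -> bool:
    """Interpret the raw output of `podman wait --condition=healthy`.

    Based on the observations above: podman prints -1 once a running
    container turns healthy, but prints 0 immediately for a container
    that was created and never started, so 0 must not be treated as
    "dependency fulfilled".
    """
    return wait_output.strip() == b"-1"


# The retry loop would then re-issue `podman wait` until this returns True,
# instead of accepting the first exit of the command.
```

Having it as a separate predicate would also make the 0-vs-minus-1 behavior easy to pin down in a unit test, independent of podman itself.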
