Skip to content

chore: promote staging to staging-promote/d63601ea-24420891050 (2026-04-14 21:15 UTC)#2476

Merged
henrypark133 merged 1 commit intomainfrom
staging-promote/82d341d6-24423313121
Apr 18, 2026
Merged

chore: promote staging to staging-promote/d63601ea-24420891050 (2026-04-14 21:15 UTC)#2476
henrypark133 merged 1 commit intomainfrom
staging-promote/82d341d6-24423313121

Conversation

@ironclaw-ci
Copy link
Copy Markdown
Contributor

@ironclaw-ci ironclaw-ci bot commented Apr 14, 2026

Auto-promotion from staging CI

Batch range: a53eac5c2dec6b6cd5c08189086093fde64aa9cb..82d341d6282165f87387fcd5ba1e2ecac610885b
Promotion branch: staging-promote/82d341d6-24423313121
Base: staging-promote/d63601ea-24420891050
Triggered by: Staging CI batch at 2026-04-14 21:15 UTC

Commits in this batch (40):

Current commits in this promotion (0)

Current base: main
Current head: staging-promote/82d341d6-24423313121
Current range: origin/main..origin/staging-promote/82d341d6-24423313121

  • (no non-merge commits in range)

Auto-updated by staging promotion metadata workflow

Waiting for gates:

  • Tests: pending
  • E2E: pending
  • Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow

* fix(sandbox): try Docker socket before CLI binary check

The sandbox detection checked `which docker` first and returned
NotInstalled if the CLI binary was absent — even when the Docker
daemon was reachable via a bind-mounted socket. This broke
container-in-container deployments (e.g., Nomad shards with
/var/run/docker.sock mounted) where bollard can talk to the daemon
but no CLI is installed in the slim image.

Reorder check_docker() to try connect_docker() (bollard socket ping)
first. If the daemon responds, return Available immediately. The CLI
check is now only used as a fallback for error-message quality when
the socket connection fails.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(sandbox): skip slow daemon ping when Docker is clearly absent

check_docker() called connect_docker() before checking whether Docker
was even present, causing a 120s bollard timeout on hosts with an
unreachable DOCKER_HOST and no Docker installation. Add a fast-path
that checks for the docker binary, DOCKER_HOST env var, and socket
files on disk before attempting the daemon ping. This preserves DinD
support (bind-mounted socket, no CLI binary) while avoiding the
latency regression for non-Docker hosts.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* test(sandbox): add regression tests for check_docker fast-path

Extract should_skip_daemon_ping() predicate from check_docker() and
add unit tests covering all combinations: skip when no binary, no
DOCKER_HOST, and no socket (the bug scenario); no skip when any of
the three signals is present (DinD socket, DOCKER_HOST, CLI binary).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added scope: sandbox Docker sandbox size: M 50-199 changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Apr 14, 2026
@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Code review

Found 6 issues:

  1. [HIGH:92] Blocking filesystem calls without timeout on NFS/slow mounts
    https://github.com/anthropics/ironclaw/blob/82d341d6282165f87387fcd5ba1e2ecac610885b/src/sandbox/detect.rs#L196-L220

any_docker_socket_exists() performs unbounded synchronous .exists() calls on 6+ paths without timeout. On systems with slow/unresponsive mounts (NFS, Samba, stale mounts), this can block for 10-30s per path. The fast-path optimization could regress on such systems. Add tokio::time::timeout() wrapper around socket checks or cache availability asynchronously.

  1. [MEDIUM:85] Test Through Caller rule violation — missing async caller-level test
    https://github.com/anthropics/ironclaw/blob/82d341d6282165f87387fcd5ba1e2ecac610885b/src/sandbox/detect.rs#L290-L297

Per .claude/rules/testing.md, when a helper gates a side effect (here: any_docker_socket_exists() gates the 120s connect_docker() call), a unit test of the helper alone is insufficient. Add a test that drives check_docker().await through the full decision path and verifies the side effect is skipped on the fast-path.

  1. [MEDIUM:85] TOCTOU race condition between socket existence check and daemon connection
    https://github.com/anthropics/ironclaw/blob/82d341d6282165f87387fcd5ba1e2ecac610885b/src/sandbox/detect.rs#L124-L142

After any_docker_socket_exists() returns true, the socket could be deleted before connect_docker() attempts to connect. While not a security issue, this defeats the optimization goal and could cause the 120s timeout in edge cases despite the fast-path checks. Consider accepting stale socket detection as acceptable risk, or add defensive retry logic.

  1. [MEDIUM:90] UID environment variable unsanitized in path format string
    https://github.com/anthropics/ironclaw/blob/82d341d6282165f87387fcd5ba1e2ecac610885b/src/sandbox/container.rs#L717-L720

The UID env var is used in format!("/run/user/{uid}/docker.sock") with only an empty-string filter. Malicious UID=../../../etc/passwd could construct unexpected paths. Partially mitigated by .exists() check, but validate UID contains only numeric characters: uid.chars().all(|c| c.is_numeric()).

  1. [MEDIUM:80] API visibility boundary leakage — unix_socket_candidates() should not be public
    https://github.com/anthropics/ironclaw/blob/82d341d6282165f87387fcd5ba1e2ecac610885b/src/sandbox/container.rs#L686

Making unix_socket_candidates() pub(crate) exposes an internal detail of connect_docker()'s fallback strategy. Per CLAUDE.md, keep implementation details encapsulated. Either inline it into any_docker_socket_exists() or move socket discovery entirely to detect.rs to maintain clear abstraction boundaries.

  1. [MEDIUM:78] Windows fallback broken — always attempts 120s timeout despite optimization goal
    https://github.com/anthropics/ironclaw/blob/82d341d6282165f87387fcd5ba1e2ecac610885b/src/sandbox/detect.rs#L212-L219

On Windows, any_docker_socket_exists() unconditionally returns true, bypassing the fast-path check. For Windows users without Docker, the code will always attempt connect_docker() (120s timeout). The PR's stated goal of "avoiding bollard's 120s timeout" doesn't hold for Windows. Either: (1) validate named pipes via WMI/Windows API, or (2) return false here to let should_skip_daemon_ping() evaluate binary_found (the primary fast-path on Windows per the comment).


All findings are from independent agent review. Most are MEDIUM and represent correctness/optimization gaps rather than critical flaws.

Base automatically changed from staging-promote/d63601ea-24420891050 to main April 18, 2026 00:59
@henrypark133 henrypark133 merged commit 82d341d into main Apr 18, 2026
57 of 70 checks passed
@henrypark133 henrypark133 deleted the staging-promote/82d341d6-24423313121 branch April 18, 2026 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: medium Business logic, config, or moderate-risk modules scope: sandbox Docker sandbox size: M 50-199 changed lines staging-promotion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants