enforces SSRF checks on the actual browser request chain #272
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review.
📝 Walkthrough

Adds per-navigation request interception and hop-by-hop SSRF validation to the screenshot tool: seeds a per-navigation DNS allow-cache, validates each redirect target (scheme and private-host checks), enforces redirect hop limits and loop detection, adjusts timeouts, and cleans up interception state. Includes unit tests for new validators.
Sequence Diagram

```mermaid
sequenceDiagram
    participant User
    participant ScreenshotTool
    participant Browser
    participant Interceptor as Network Interceptor
    participant Validator as URL Validator
    participant DNS as DNS Checker
    User->>ScreenshotTool: Request screenshot(target URL)
    ScreenshotTool->>DNS: Preflight DNS check for entry host (seed allow-cache)
    DNS-->>ScreenshotTool: Allowed / Blocked
    ScreenshotTool->>Browser: Create empty page
    ScreenshotTool->>Interceptor: Enable request interception
    ScreenshotTool->>Browser: Navigate to target URL
    loop for each paused request / redirect hop
        Browser->>Interceptor: EventRequestPaused
        Interceptor->>Validator: validate_browser_request_url(paused request)
        Validator->>DNS: Resolve/check host (per-navigation allow-cache)
        DNS-->>Validator: Safe / Blocked
        alt safe
            Validator-->>Interceptor: Resume/ContinueRequest
            Interceptor->>Browser: ContinueRequest/ResumeRequest
        else blocked or non-http(s)
            Validator-->>Interceptor: FailRequest
            Interceptor->>Browser: FailRequest (AccessDenied)
        end
    end
    Browser->>ScreenshotTool: Navigation finished / page ready
    ScreenshotTool->>Browser: Capture screenshot
    ScreenshotTool->>Interceptor: Disable interception (cleanup)
    ScreenshotTool-->>User: Return screenshot
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
🧹 Nitpick comments (1)
src/tools/screenshot.rs (1)
232-234: Consider pre-populating the DNS cache with the initial entry URL check.

The entry URL undergoes DNS resolution here (line 234), but the `dns_allow_cache` is created later (line 327) inside the timeout block. This means the first browser request will re-resolve DNS for the entry URL. This is minor since it's defense-in-depth and the early check provides fast-fail before browser launch. However, if you want to avoid the redundant lookup:
♻️ Optional: Pre-populate cache to skip redundant DNS lookup
```diff
 // Initial DNS SSRF check for the entry URL.
 resolve_and_check_host(&parsed).await?;
+
+// Pre-populate cache with the entry host to avoid redundant lookup.
+let mut dns_allow_cache: HashMap<String, ()> = HashMap::new();
+if let Some(host) = parsed.host_str() {
+    dns_allow_cache.insert(host.to_ascii_lowercase(), ());
+}

 // ---- Parse optional parameters ----
```

Then remove line 327 and pass the pre-populated cache into the timeout block.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/screenshot.rs` around lines 232 - 234, The initial DNS resolution via resolve_and_check_host(&parsed).await? is performed before dns_allow_cache is created, causing a redundant DNS lookup later; to fix, perform the DNS cache population immediately after resolve_and_check_host by creating and inserting the entry URL's resolved result into dns_allow_cache (create a dns_allow_cache instance there), then remove the dns_allow_cache instantiation inside the timeout block and instead pass the pre-populated dns_allow_cache into the timeout-handling code so subsequent browser requests reuse the cached resolution.
🧹 Nitpick comments (2)
src/tools/screenshot.rs (2)
89-110: LGTM!

The function correctly parses, validates, and caches DNS lookups. The `Vacant` entry pattern is idiomatic for cache-miss handling. Minor style note: `HashSet<String>` would be slightly more expressive than `HashMap<String, ()>`, but this is purely cosmetic.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/screenshot.rs` around lines 89 - 110, The parameter dns_allow_cache in validate_browser_request_url is using HashMap<String, ()> where a HashSet<String> is more expressive; change the function signature to accept a &mut HashSet<String>, replace the Entry::Vacant check with a contains/insert flow (e.g., if !dns_allow_set.contains(&host) { resolve_and_check_host(&parsed).await?; dns_allow_set.insert(host); }), and update all callers to pass a HashSet<String> instead of HashMap<String, ()> so caching behavior remains identical but the type better conveys intent.
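The suggested `HashSet` swap can be sketched as a small self-contained function. Note that `resolve_and_check_host` below is a synchronous stand-in for the tool's async DNS/SSRF check, and the blocked host is purely illustrative:

```rust
use std::collections::HashSet;

// Illustrative stand-in for the tool's async resolve_and_check_host,
// which resolves the host and rejects private/link-local addresses.
fn resolve_and_check_host(host: &str) -> Result<(), String> {
    if host == "169.254.169.254" {
        Err(format!("blocked host: {host}"))
    } else {
        Ok(())
    }
}

// Suggested contains/insert flow with HashSet<String> instead of
// HashMap<String, ()>: only hosts that passed the check are cached.
fn check_host_cached(host: &str, allow: &mut HashSet<String>) -> Result<(), String> {
    let key = host.to_ascii_lowercase();
    if !allow.contains(&key) {
        resolve_and_check_host(&key)?;
        allow.insert(key);
    }
    Ok(())
}

fn main() {
    let mut allow = HashSet::new();
    assert!(check_host_cached("Example.com", &mut allow).is_ok());
    assert!(allow.contains("example.com")); // cached after first check
    assert!(check_host_cached("169.254.169.254", &mut allow).is_err());
    println!("ok");
}
```

Caching behavior stays identical; the set type simply makes the "membership only" intent explicit.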
334-349: Consider adding a redirect hop limit to prevent excessive redirect chains.The PR objectives specifically mention: "Enforce redirect limits and detect redirect loops or excessive hops with clear errors." While each redirect hop is correctly validated for SSRF, there's no counter to prevent an attacker from creating a long chain of valid redirects (e.g., 100+ hops to public hosts) that could cause resource exhaustion or timing issues.
Consider tracking hop count and failing after a reasonable limit (e.g., 20 hops, consistent with typical HTTP client defaults):
♻️ Proposed enhancement
```diff
+const MAX_REDIRECT_HOPS: usize = 20;
+let mut redirect_count: usize = 0;
+
 let capture_result = loop {
     tokio::select! {
         captured = &mut screenshot_future => {
             break captured;
         }
         maybe_event = paused_events.next() => {
             let Some(event) = maybe_event else {
                 break Err(ZeptoError::Tool(
                     "Browser request interception stream ended unexpectedly".to_string()
                 ));
             };
+            redirect_count += 1;
+            if redirect_count > MAX_REDIRECT_HOPS {
+                let _ = page.execute(FailRequestParams::new(
+                    event.request_id.clone(),
+                    ErrorReason::AccessDenied,
+                )).await;
+                break Err(ZeptoError::SecurityViolation(
+                    format!("Exceeded maximum redirect hops ({})", MAX_REDIRECT_HOPS)
+                ));
+            }
             handle_paused_browser_request(&page, event.as_ref(), &mut dns_allow_cache).await?;
         }
     }
 };
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/screenshot.rs` around lines 334 - 349, The redirect handling loop around capture_result/screenshot_future should enforce a hop counter to prevent excessive redirect chains: add an integer hop_count (start 0) and increment it each time you handle a redirect via handle_paused_browser_request (or wherever a redirect response is followed), and if hop_count exceeds a defined constant (e.g., MAX_REDIRECT_HOPS = 20) return Err(ZeptoError::Tool("Too many redirects"…)) from the loop; update the logic that breaks with captured to also check/propagate this error, and ensure dns_allow_cache, paused_events, and screenshot_future flow are unchanged except for the added count check so you detect redirect loops/excessive hops while preserving SSRF validation.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/tools/screenshot.rs (1)
353-385: ⚠️ Potential issue | 🟠 Major: Ensure interception/browser cleanup runs on all error paths.

Current `?`/`??` flow can return before `FetchDisable`, `drop(browser)`, and `handler_handle.abort()`, which risks leaving background handler/browser work running longer than intended.

🔧 Suggested control-flow fix
```diff
-    let capture_result = loop {
+    let capture_result = loop {
         tokio::select! {
@@
-                handle_paused_browser_request(&page, event.as_ref(), &mut dns_allow_cache).await?;
+                if let Err(err) = handle_paused_browser_request(&page, event.as_ref(), &mut dns_allow_cache).await {
+                    break Err(err);
+                }
             }
         }
     };
@@
-        .await
-        .map_err(|_| {
+        .await;
+
+    // Always cleanup browser resources, even on timeout/error.
+    drop(browser);
+    handler_handle.abort();
+
+    let screenshot_result = screenshot_result.map_err(|_| {
         ZeptoError::Tool(format!(
             "Screenshot timed out after {}s for '{}'",
             timeout_secs, url_str
         ))
-    })??;
-
-    // Clean up browser resources.
-    drop(browser);
-    handler_handle.abort();
+    })??;
```
Verify each finding against the current code and only fix it if needed. In `@src/tools/screenshot.rs` around lines 353 - 385, The current early-return paths can bypass the interception cleanup (page.execute(FetchDisableParams::default())), browser drop and handler_handle.abort(); factor an async cleanup function (e.g., async fn cleanup_interception(page: Page, browser: Browser, handler_handle: JoinHandle<...>)) that runs page.execute(FetchDisableParams::default()).await, aborts handler_handle, and drops browser, then call that cleanup in every return path: after the capture loop before returning capture_result, and in the error mapping/early-return branches (the map_err closure and any place using ?/??) so cleanup_interception always runs regardless of success or failure. Use the existing identifiers (screenshot_future, paused_events, handle_paused_browser_request, page.execute, drop(browser), handler_handle.abort) to locate where to insert the calls.
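The control-flow fix above boils down to one shape: capture the fallible result first, run cleanup unconditionally, then propagate the error. A minimal sketch with hypothetical `Browser`/`capture` stand-ins (the real code also disables Fetch interception and aborts the handler task):

```rust
// Hypothetical stand-in for the browser session held during a capture.
struct Browser;

impl Browser {
    fn shut_down(&mut self) {
        // In the real tool: disable interception, abort the handler task,
        // and drop the browser handle.
    }
}

// Stand-in for the screenshot future; `fail` simulates a timeout.
fn capture(fail: bool) -> Result<&'static str, String> {
    if fail { Err("timed out".into()) } else { Ok("png-bytes") }
}

fn screenshot(fail: bool) -> Result<&'static str, String> {
    let mut browser = Browser;
    let result = capture(fail);
    // Cleanup happens before `?`, so it runs on success and failure alike.
    browser.shut_down();
    result
}

fn main() {
    assert_eq!(screenshot(false), Ok("png-bytes"));
    assert!(screenshot(true).is_err()); // cleanup still ran on this path
    println!("ok");
}
```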
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/tools/screenshot.rs`:
- Around line 111-116: The dns_allow_cache lookup around parsed.host_str()
bypasses resolve_and_check_host(&parsed) for subsequent requests to the same
hostname, weakening rebinding protection; remove or bypass the dns_allow_cache
check in the paused-browser request path so resolve_and_check_host(&parsed) is
always awaited for each navigation (i.e., do not short-circuit when
dns_allow_cache.entry(host) is Vacant/Occupied), and ensure the same change is
applied to the other identical block referenced (the block around lines 255-260)
so every request revalidates DNS via resolve_and_check_host rather than relying
on a hostname-only cache.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/tools/screenshot.rs (1)
345-401: Add direct tests for hop-limit and loop-detection branches.

The security-critical branches at line 375 (`MAX_SCREENSHOT_REDIRECT_HOPS`) and line 389 (duplicate URL loop detection) are not directly exercised by the added tests. Extracting redirect-state updates into a small pure helper would make these regression cases unit-testable.

As per coding guidelines, "Prefer small, composable functions over large blocks in Rust code."
Also applies to: 695-727
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/screenshot.rs` around lines 345 - 401, The loop in screenshot capture updates redirect state via document_redirect_hops and seen_document_redirect_urls and short-circuits with MAX_SCREENSHOT_REDIRECT_HOPS or duplicate URL checks; extract that logic into a small pure helper function (e.g., fn advance_document_redirect_state(hops: &mut usize, seen: &mut HashSet<String>, new_url: &str, max_hops: usize) -> Result<(), RedirectError>) that performs the hop increment, max-hop check and duplicate detection and returns an error variant instead of directly calling page.execute/FailRequestParams/ZeptoError; call this helper from the capture loop (where document_redirect_hops, seen_document_redirect_urls, event_ref.request.url are used) and create direct unit tests for the helper to cover the MAX_SCREENSHOT_REDIRECT_HOPS and duplicate-URL loop branches; apply the same extraction for the analogous redirect logic in the other block (the second occurrence around the later capture code) so both sites can be unit-tested.
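The proposed pure helper might look like the sketch below; `RedirectError` and the exact signature are assumptions, not the tool's actual types, and the caller would map the error variants onto `FailRequestParams`/`ZeptoError`:

```rust
use std::collections::HashSet;

// Hypothetical error type standing in for the tool's ZeptoError variants.
#[derive(Debug, PartialEq)]
enum RedirectError {
    TooManyHops(usize),
    Loop(String),
}

// Pure redirect-state update: bump the hop count, enforce the cap,
// and detect a revisited document URL.
fn advance_document_redirect_state(
    hops: &mut usize,
    seen: &mut HashSet<String>,
    new_url: &str,
    max_hops: usize,
) -> Result<(), RedirectError> {
    *hops += 1;
    if *hops > max_hops {
        return Err(RedirectError::TooManyHops(max_hops));
    }
    // insert returns false when the URL was already seen: a loop.
    if !seen.insert(new_url.to_string()) {
        return Err(RedirectError::Loop(new_url.to_string()));
    }
    Ok(())
}

fn main() {
    let mut hops = 0usize;
    let mut seen = HashSet::new();
    seen.insert("https://a.example/".to_string()); // entry URL seeded
    // First hop to a new URL is fine.
    assert!(advance_document_redirect_state(&mut hops, &mut seen, "https://b.example/", 20).is_ok());
    // Redirecting back to a seen URL is detected as a loop.
    assert_eq!(
        advance_document_redirect_state(&mut hops, &mut seen, "https://a.example/", 20),
        Err(RedirectError::Loop("https://a.example/".to_string()))
    );
    println!("ok");
}
```

Because the helper is pure, both branches become one-line unit tests with no browser involved.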
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/tools/screenshot.rs`:
- Around line 345-347: The seed and comparisons for redirect loop detection use
raw url_str vs browser-emitted URLs which can differ in representation; update
the logic around document_redirect_hops and seen_document_redirect_urls to
canonicalize/normalize URLs before inserting or comparing (e.g., parse url_str
and browser-emitted URLs with url::Url or a shared normalize_url helper and use
the normalized string) so the initial insert of url_str and subsequent checks
use the same canonical form and reliably detect loops.
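A shared normalization step could be as simple as the sketch below. A real implementation would parse with `url::Url`; this pure-std version only canonicalizes scheme/host case and an empty path, just to illustrate why the seeded `url_str` and the browser-emitted URLs must both pass through the same function before comparison:

```rust
// Minimal, illustrative normalizer: lowercase the scheme and host,
// and give a bare authority an explicit "/" path. Ports, userinfo,
// and percent-encoding are deliberately out of scope for this sketch.
fn normalize_url(raw: &str) -> String {
    let (scheme, rest) = match raw.split_once("://") {
        Some((s, r)) => (s.to_ascii_lowercase(), r),
        None => return raw.to_string(), // not a hierarchical URL; leave as-is
    };
    let (host, path) = match rest.split_once('/') {
        Some((h, p)) => (h.to_ascii_lowercase(), format!("/{p}")),
        None => (rest.to_ascii_lowercase(), "/".to_string()),
    };
    format!("{scheme}://{host}{path}")
}

fn main() {
    // These three spellings of the entry URL now collapse to one key,
    // so loop detection compares like with like.
    assert_eq!(normalize_url("HTTPS://Example.com"), "https://example.com/");
    assert_eq!(normalize_url("https://example.com/"), "https://example.com/");
    assert_eq!(normalize_url("https://EXAMPLE.com/Path"), "https://example.com/Path");
    println!("ok");
}
```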
Summary

- Enforces SSRF checks on the actual browser request chain for `web_screenshot` via CDP Fetch interception.
- Blocks unsafe requests with `Fetch.failRequest` and keeps the initial entry-URL SSRF checks as defense in depth.

Related Issue
Closes #271
Scope

- `upstream/main`
- `cargo test` passes
- `cargo clippy -- -D warnings` passes
- `cargo fmt --check` passes

Test Plan

- `cargo fmt -- --check`
- `cargo clippy -- -D warnings`
- `cargo test`
- `cargo clippy --features screenshot -- -D warnings`
- `cargo test --features screenshot --lib screenshot::tests::` (24 passed)

Summary by CodeRabbit
Bug Fixes
Tests