fix: 5 critical/high-priority bugs (auth bypass, relay failures, unbounded recursion, context growth)#1083
Conversation
- nearai#1033: reject webhook requests when secret is cleared at runtime via update_secret(None), preventing auth bypass through SIGHUP hot-swap - nearai#908: reset consecutive_failures counter on successful SSE stream reconnection in relay channel, so circuit breaker counts truly consecutive failures - nearai#975: add depth limit (16) to validate_tool_schema() to prevent stack overflow on deeply nested schemas - nearai#974: add depth limit (8) to resolve_nested() to prevent stack overflow on deeply nested capabilities wrappers - nearai#826: truncate oversized tool outputs (>8KB) in routine lightweight loop to prevent unbounded context growth across iterations Each fix includes a regression test. Closes nearai#1033, nearai#908, nearai#975, nearai#974, nearai#826 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Summary of ChangesHello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request delivers significant stability and security enhancements by resolving several critical and high-priority bugs. It introduces robust measures for webhook authentication, ensures reliable operation of relay channels, and prevents potential denial-of-service attacks through unbounded recursion in schema and capability parsing. Additionally, it optimizes resource usage by truncating large tool outputs in automated routines and modularizes core safety features into a new, dedicated crate. Highlights
Changelog
Ignored Files
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This is an excellent pull request that addresses several critical and high-priority bugs while also introducing significant security hardening and reliability improvements. The fixes for the five documented bugs (auth bypass, relay failures, unbounded recursion, and context growth) are well-implemented. I'm particularly impressed with the security enhancements, including the robust SSRF/DNS-rebinding protection in the http tool, the move to a dedicated ironclaw_safety crate, the implementation of a Content Security Policy in the web UI, and the additional safeguards for FullAccess sandbox policy. The code quality is high, and the refactoring improves modularity. I have one minor suggestion to further improve the SSRF protection.
| || v4.is_link_local() | ||
| || v4.is_multicast() | ||
| || v4.is_unspecified() | ||
| || *v4 == Ipv4Addr::new(169, 254, 169, 254) |
There was a problem hiding this comment.
The SSRF protection is missing a check for the Carrier-Grade NAT (CGNAT) address space (100.64.0.0/10). While not strictly a private range, it's not globally routable and should be blocked to prevent potential SSRF to internal carrier services. A similar check was correctly added in src/setup/channels.rs.
|| *v4 == Ipv4Addr::new(169, 254, 169, 254)
|| (v4.octets()[0] == 100 && (v4.octets()[1] & 0xC0) == 64)References
- To prevent DNS-based SSRF vulnerabilities, resolve hostnames to IP addresses, validate all resolved IPs against restricted ranges, and pin the validated addresses for the connection to prevent TOCTOU race conditions.
…its) - nearai#1077: recompute next_fire_at when re-enabling cron routines via web toggle, mirroring CLI behavior so cron ticker picks them up - nearai#1076: refresh event trigger cache after web toggle/delete operations so event/system_event routines reflect changes immediately - nearai#892: remove Stuck from check_signals() stop-states in JobDelegate since Stuck is recoverable (Stuck -> InProgress via self-repair) - nearai#976: truncate oversized description strings in CapabilitiesFile to 4KB to prevent memory abuse from malicious capabilities files - nearai#977: drop oversized parameters schema JSON (>64KB) in CapabilitiesFile to prevent unbounded memory growth Each fix includes regression tests where applicable. Closes nearai#1077, nearai#1076, nearai#892, nearai#976, nearai#977 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- nearai#825: use RegexBuilder with 64KB size limit when compiling user-supplied event trigger patterns, both at creation time (routine tool) and at cache refresh (routine engine) Note: Rust's regex crate already guarantees O(n) matching, so the size limit prevents excessive memory use during compilation rather than catastrophic backtracking at match time. Closes nearai#825 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…ed-in-issues-and-prioritize-and # Conflicts: # src/channels/http.rs # src/channels/web/handlers/routines.rs
| } | ||
| let expected_secret = webhook_secret | ||
| .as_ref() | ||
| .expect("checked is_none above") |
There was a problem hiding this comment.
expect is production code
ilblackdragon
left a comment
There was a problem hiding this comment.
Approved, but need to fix expect in prod code.
…unded recursion, context growth) (#1083) * fix: address 5 critical and high-priority bugs from issue tracker - #1033: reject webhook requests when secret is cleared at runtime via update_secret(None), preventing auth bypass through SIGHUP hot-swap - #908: reset consecutive_failures counter on successful SSE stream reconnection in relay channel, so circuit breaker counts truly consecutive failures - #975: add depth limit (16) to validate_tool_schema() to prevent stack overflow on deeply nested schemas - #974: add depth limit (8) to resolve_nested() to prevent stack overflow on deeply nested capabilities wrappers - #826: truncate oversized tool outputs (>8KB) in routine lightweight loop to prevent unbounded context growth across iterations Each fix includes a regression test. Closes #1033, #908, #975, #974, #826 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: 5 more high-priority bugs (routine cache, job signals, input limits) - #1077: recompute next_fire_at when re-enabling cron routines via web toggle, mirroring CLI behavior so cron ticker picks them up - #1076: refresh event trigger cache after web toggle/delete operations so event/system_event routines reflect changes immediately - #892: remove Stuck from check_signals() stop-states in JobDelegate since Stuck is recoverable (Stuck -> InProgress via self-repair) - #976: truncate oversized description strings in CapabilitiesFile to 4KB to prevent memory abuse from malicious capabilities files - #977: drop oversized parameters schema JSON (>64KB) in CapabilitiesFile to prevent unbounded memory growth Each fix includes regression tests where applicable. Closes #1077, #1076, #892, #976, #977 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent ReDoS in event trigger regex patterns - #825: use RegexBuilder with 64KB size limit when compiling user-supplied event trigger patterns, both at creation time (routine tool) and at cache refresh (routine engine) Note: Rust's regex crate already guarantees O(n) matching, so the size limit prevents excessive memory use during compilation rather than catastrophic backtracking at match time. Closes #825 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Harden HTTP SSRF IP filtering * Apply rustfmt after staging merge --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…unded recursion, context growth) (nearai#1083) * fix: address 5 critical and high-priority bugs from issue tracker - nearai#1033: reject webhook requests when secret is cleared at runtime via update_secret(None), preventing auth bypass through SIGHUP hot-swap - nearai#908: reset consecutive_failures counter on successful SSE stream reconnection in relay channel, so circuit breaker counts truly consecutive failures - nearai#975: add depth limit (16) to validate_tool_schema() to prevent stack overflow on deeply nested schemas - nearai#974: add depth limit (8) to resolve_nested() to prevent stack overflow on deeply nested capabilities wrappers - nearai#826: truncate oversized tool outputs (>8KB) in routine lightweight loop to prevent unbounded context growth across iterations Each fix includes a regression test. Closes nearai#1033, nearai#908, nearai#975, nearai#974, nearai#826 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: 5 more high-priority bugs (routine cache, job signals, input limits) - nearai#1077: recompute next_fire_at when re-enabling cron routines via web toggle, mirroring CLI behavior so cron ticker picks them up - nearai#1076: refresh event trigger cache after web toggle/delete operations so event/system_event routines reflect changes immediately - nearai#892: remove Stuck from check_signals() stop-states in JobDelegate since Stuck is recoverable (Stuck -> InProgress via self-repair) - nearai#976: truncate oversized description strings in CapabilitiesFile to 4KB to prevent memory abuse from malicious capabilities files - nearai#977: drop oversized parameters schema JSON (>64KB) in CapabilitiesFile to prevent unbounded memory growth Each fix includes regression tests where applicable. Closes nearai#1077, nearai#1076, nearai#892, nearai#976, nearai#977 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent ReDoS in event trigger regex patterns - nearai#825: use RegexBuilder with 64KB size limit when compiling user-supplied event trigger patterns, both at creation time (routine tool) and at cache refresh (routine engine) Note: Rust's regex crate already guarantees O(n) matching, so the size limit prevents excessive memory use during compilation rather than catastrophic backtracking at match time. Closes nearai#825 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Harden HTTP SSRF IP filtering * Apply rustfmt after staging merge --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
…unded recursion, context growth) (nearai#1083) * fix: address 5 critical and high-priority bugs from issue tracker - nearai#1033: reject webhook requests when secret is cleared at runtime via update_secret(None), preventing auth bypass through SIGHUP hot-swap - nearai#908: reset consecutive_failures counter on successful SSE stream reconnection in relay channel, so circuit breaker counts truly consecutive failures - nearai#975: add depth limit (16) to validate_tool_schema() to prevent stack overflow on deeply nested schemas - nearai#974: add depth limit (8) to resolve_nested() to prevent stack overflow on deeply nested capabilities wrappers - nearai#826: truncate oversized tool outputs (>8KB) in routine lightweight loop to prevent unbounded context growth across iterations Each fix includes a regression test. Closes nearai#1033, nearai#908, nearai#975, nearai#974, nearai#826 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: 5 more high-priority bugs (routine cache, job signals, input limits) - nearai#1077: recompute next_fire_at when re-enabling cron routines via web toggle, mirroring CLI behavior so cron ticker picks them up - nearai#1076: refresh event trigger cache after web toggle/delete operations so event/system_event routines reflect changes immediately - nearai#892: remove Stuck from check_signals() stop-states in JobDelegate since Stuck is recoverable (Stuck -> InProgress via self-repair) - nearai#976: truncate oversized description strings in CapabilitiesFile to 4KB to prevent memory abuse from malicious capabilities files - nearai#977: drop oversized parameters schema JSON (>64KB) in CapabilitiesFile to prevent unbounded memory growth Each fix includes regression tests where applicable. Closes nearai#1077, nearai#1076, nearai#892, nearai#976, nearai#977 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * fix: prevent ReDoS in event trigger regex patterns - nearai#825: use RegexBuilder with 64KB size limit when compiling user-supplied event trigger patterns, both at creation time (routine tool) and at cache refresh (routine engine) Note: Rust's regex crate already guarantees O(n) matching, so the size limit prevents excessive memory use during compilation rather than catastrophic backtracking at match time. Closes nearai#825 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com> * Harden HTTP SSRF IP filtering * Apply rustfmt after staging merge --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Summary
Fixes 11 bugs from the issue tracker, prioritized by severity.
Batch 1: Critical security and stability fixes
update_secret(None)- rejects requests with 503consecutive_failuresnot reset on successful SSE reconnectionvalidate_tool_schema()#975 (HIGH):validate_tool_schema()unbounded recursion - added depth limit of 16resolve_nested()#974 (HIGH):resolve_nested()unbounded recursion - added depth limit of 8Batch 2: Routine cache, job signals, and input validation
next_fire_atwhen re-enabling cron routines via web toggleStuckfromcheck_signals()stop-states (recoverable via self-repair)Batch 3: Security hardening
Test plan
webhook_rejects_when_secret_cleared_at_runtime- auth bypass fixtest_validate_schema_depth_limit- schema recursion limittest_resolve_nested_depth_limit- capabilities recursion limittest_description_truncated_at_limit- description size limittest_oversized_parameters_schema_dropped- schema size limitCloses #1033, #908, #975, #974, #826, #1077, #1076, #892, #976, #977, #825
Generated with Claude Code