Skip to content

chore: promote staging to staging-promote/82d341d6-24423313121 (2026-04-14 23:15 UTC)#2478

Merged
henrypark133 merged 1 commit intomainfrom
staging-promote/2dc78b2d-24427647240
Apr 18, 2026
Merged

chore: promote staging to staging-promote/82d341d6-24423313121 (2026-04-14 23:15 UTC)#2478
henrypark133 merged 1 commit intomainfrom
staging-promote/2dc78b2d-24427647240

Conversation

@ironclaw-ci
Copy link
Copy Markdown
Contributor

@ironclaw-ci ironclaw-ci bot commented Apr 14, 2026

Auto-promotion from staging CI

Batch range: a53eac5c2dec6b6cd5c08189086093fde64aa9cb..2dc78b2d941e8f00f490c603b77c729d926565d6
Promotion branch: staging-promote/2dc78b2d-24427647240
Base: staging-promote/82d341d6-24423313121
Triggered by: Staging CI batch at 2026-04-14 23:15 UTC

Commits in this batch (41):

Current commits in this promotion (0)

Current base: main
Current head: staging-promote/2dc78b2d-24427647240
Current range: origin/main..origin/staging-promote/2dc78b2d-24427647240

  • (no non-merge commits in range)

Auto-updated by staging promotion metadata workflow

Waiting for gates:

  • Tests: pending
  • E2E: pending
  • Claude Code review: pending (will post comments on this PR)

Auto-created by staging-ci workflow

* feat(db): add per-user CachedSettingsStore decorator

SettingsStore methods hit the database on every call. The v2 engine
path (effect_adapter) and the dispatcher's per-turn tool permission
loading both called get_all_settings() without caching, adding
unnecessary DB round-trips on every agentic loop iteration.

Add a write-through CachedSettingsStore decorator that caches
get_all_settings() results per user_id. Write operations (set_setting,
delete_setting, set_all_settings) delegate to the inner store then
invalidate that user's cache entry. The write lock is held across DB
loads to prevent stale-data races from concurrent invalidations.

Wire the cache into TenantScope via a new settings_store field on
AgentDeps, so all settings reads in the agent loop go through the
cache. Remove the per-turn cached_tool_permissions Mutex hack from
ChatDelegate that was working around the missing cache layer.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address PR review feedback

- Store Arc<HashMap> in cache instead of bare HashMap to avoid cloning
  the full settings map on every cache hit. get_setting/has_settings now
  only clone the single requested value or check emptiness through the Arc.
- Route get_setting_with_admin_fallback() through self.settings() instead
  of self.inner so both the per-user and admin lookups go through the cache.
- Update settings section comment to accurately describe which methods
  delegate through settings().

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address all PR review feedback

- Use `crate::db::` imports instead of `super::` (convention fix)
- Add `wrap()` factory fn to CachedSettingsStore, simplify app.rs construction
- Store `Arc<HashMap>` in cache to avoid full map clones on hits
- Expose `invalidate_user()` and `flush()` public methods
- Wire `flush()` into SIGHUP handler via concrete `settings_cache` on AppComponents
- Wire `settings_store` into GatewayState and route all settings handlers
  through it so web UI writes invalidate the cache (critical fix)
- Route `get_setting_with_admin_fallback()` through `self.settings()`
- Add error-path test (FailingStore mock, cache not poisoned on error)
- Add concurrent-access test (8 concurrent readers, inner store hit once)
- Add TenantScope caller-level test (read/write through cache wiring)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: reuse resolve_settings_store() in settings_tools_set_handler

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address all review feedback on CachedSettingsStore

- Add TTL (300s) and max-entries cap (1000) to bound cache staleness
  and memory growth. Entries expire after TTL; cache clears when cap
  exceeded.
- Route admin tool_policy GET/PUT through resolve_settings_store() so
  writes invalidate the __admin__ cache entry.
- Route settings_export_handler and settings_tools_list_handler through
  resolve_settings_store() (were bypassing cache on reads).
- Wire invalidate_user() into users_delete_handler and
  users_suspend_handler so deleted/suspended users' settings are evicted.
- Replace GatewayState.settings_store (trait object) with
  settings_cache (concrete CachedSettingsStore) — single field for both
  trait dispatch and cache management, no desync risk.
- Add settings_override to ExtensionManager with with_settings_store()
  builder. All settings reads/writes in ExtensionManager now route
  through the cached store when available.
- Make ExtensionManager::settings_store() pub(crate); update
  AuthManager to call it instead of database(), closing the auth
  descriptor cache bypass.
- Remove unused wrap() method; merge redundant invalidate/invalidate_user.
- Add tracing::debug on SIGHUP cache flush.
- Expand module docs with design assumptions, known bypass paths, TTL
  and eviction semantics.
- Add tests: expired_entry_triggers_reload, fresh_entry_does_not_reload,
  max_entries_cap_triggers_eviction.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: collapse nested if into filter to satisfy clippy

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel scope: db Database trait / abstraction scope: extensions Extension management size: XL 500+ changed lines risk: medium Business logic, config, or moderate-risk modules contributor: core 20+ merged PRs labels Apr 14, 2026
@claude
Copy link
Copy Markdown

claude bot commented Apr 14, 2026

Code review

Found 8 issues:

  1. [CRITICAL:95] Catastrophic cache eviction policy: Entire cache clears when any user load exceeds max_entries (1000), causing thundering herd of DB queries under multi-user load. Worse than no cache.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L130-L140

  1. [HIGH:90] RwLock::write held across async DB I/O: Write lock held during get_all_settings() call blocks all readers. Slow DB calls serialize subsequent cache hits in multi-user scenarios.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L118-L140

  1. [HIGH:85] Memory unbounded in cache entries: No per-entry size limits. Users with 10,000+ large JSON settings could consume 100+MB per entry, reaching 100GB total with 1000 max entries.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L1-L50

  1. [HIGH:75] Cache bypass paths allow stale settings: Code paths in app.rs, import/openclaw/mod.rs, tools/mcp/config.rs, setup/wizard.rs write directly to DB without cache invalidation. Tool permissions could serve stale values for 5 minutes.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L21-L27

  1. [MEDIUM:95] Missing handler-level integration tests: Unit tests for CachedSettingsStore exist but no integration tests through actual handlers. Per CLAUDE.md testing rules, should test end-to-end from handler through cache hits.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L1200-L1300

  1. [MEDIUM:80] Metadata reads bypass cache: get_setting_full() and list_settings() pass through to inner store. Stale data could be served if called after cache-invalidating write with concurrent load between invalidation and pass-through.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L176-L204

  1. [MEDIUM:70] Settings cache not fully wired: AuthManager sometimes bypasses cache and hits raw DB, creating inconsistent write paths and 5-minute staleness window.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/bridge/auth_manager.rs#L40-L50

  1. [LOW:55] Unnecessary HashMap clone: Line 212 clones entire settings map even though already Arc-wrapped. Arc copy sufficient; clone defeats caching benefit for large maps.

https://github.com/anthropics/ironclaw/blob/2dc78b2d941e8f00f490c603b77c729d926565d6/src/db/cached_settings.rs#L210-L215

@railway-app railway-app bot temporarily deployed to venice-ironclaw / production April 14, 2026 23:25 Inactive
@railway-app railway-app bot temporarily deployed to cosmose-ironclaw / production April 14, 2026 23:27 Inactive
@railway-app railway-app bot temporarily deployed to ironclaw-nearai / production April 14, 2026 23:28 Inactive
Base automatically changed from staging-promote/82d341d6-24423313121 to main April 18, 2026 00:59
@henrypark133 henrypark133 merged commit 2dc78b2 into main Apr 18, 2026
38 of 45 checks passed
@henrypark133 henrypark133 deleted the staging-promote/2dc78b2d-24427647240 branch April 18, 2026 01:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: medium Business logic, config, or moderate-risk modules scope: agent Agent core (agent loop, router, scheduler) scope: channel/web Web gateway channel scope: db Database trait / abstraction scope: extensions Extension management size: XL 500+ changed lines staging-promotion

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant