Skip to content

fix(e2e): multi-tenant widget isolation + portfolio nudge recovery#2790

Merged
serrrfirat merged 2 commits intostagingfrom
fix/e2e-widget-portfolio-tests
Apr 21, 2026
Merged

fix(e2e): multi-tenant widget isolation + portfolio nudge recovery#2790
serrrfirat merged 2 commits intostagingfrom
fix/e2e-widget-portfolio-tests

Conversation

@serrrfirat
Copy link
Copy Markdown
Collaborator

Summary

  • Widget customization (3 tests): Tests expected multi-tenant behavior (CSS/widget/CSP isolation) but ran against the single-tenant default ironclaw_server. Added a session-scoped multi_tenant_gateway_server fixture with AGENT_MULTI_TENANT=true and its own libSQL database, and rewired the three failing tests to use it.
  • Portfolio chat (2 tests): The mock LLM's nudge response swallowed portfolio context — when the engine sent a tool-intent nudge ("You said you would perform an action..."), match_response() returned the generic "I found the information you requested." instead of a portfolio-relevant reply. Added context-aware nudge recovery that checks prior user messages for portfolio/wallet keywords. Also fixed word boundaries on the hello|hi|hey canned pattern to prevent "hi" from matching inside "this".

Companion to #2744 which fixes the other 8 real E2E failures on staging.

Test plan

🤖 Generated with Claude Code

…olio nudge recovery

Widget customization: three tests expected multi-tenant behavior (CSS/widget/CSP
isolation) but ran against the single-tenant default server. Add a session-scoped
`multi_tenant_gateway_server` fixture with AGENT_MULTI_TENANT=true and its own
libSQL database, and rewire the three failing tests to use it.

Portfolio: the mock LLM's nudge response ("I found the information you
requested.") swallowed portfolio context when the engine sent a tool-intent
nudge. Add context-aware nudge recovery in match_response() that checks prior
user messages for portfolio/wallet keywords before falling through to the
generic nudge pattern. Also add word boundaries to the hello|hi|hey canned
pattern to prevent "hi" from matching inside "this".

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions Bot added size: XS < 10 changed lines (excluding docs) risk: low Changes to docs, tests, or low-risk modules contributor: core 20+ merged PRs labels Apr 21, 2026
Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request enhances the E2E testing infrastructure by refining the mock LLM's response matching and introducing a dedicated multi-tenant gateway server for widget customization tests. Specifically, it adds word boundaries to greeting patterns and implements a 'nudge recovery' mechanism to preserve portfolio context when the LLM fails to call a tool. The test suite is updated to use a specialized multi-tenant fixture, ensuring better isolation for isolation-specific test cases. Feedback focuses on optimizing the nudge recovery logic by avoiding redundant regex compilation and improving context lookup efficiency by iterating through messages in reverse.

Comment thread tests/e2e/mock_llm.py
Comment on lines +795 to +813
_nudge_re = re.compile(
r"You said you would perform an action|You expressed intent",
re.IGNORECASE,
)
if _nudge_re.search(content):
for msg in messages:
if msg.get("role") == "user":
msg_text = _message_text(msg)
if re.search(r"portfolio|defi|rebalance|yield.*positions", msg_text, re.IGNORECASE):
return (
"I'll analyze your DeFi portfolio. The portfolio skill is active and I can scan "
"your wallet addresses across chains to discover positions, check yields, and "
"suggest rebalancing opportunities."
)
if re.search(r"0x[a-fA-F0-9]{40}", msg_text, re.IGNORECASE):
return (
"I found your wallet address. Let me scan your portfolio across all supported "
"chains to discover DeFi positions and classify them against known protocols."
)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The regex _nudge_re is compiled inside the match_response function, which is called frequently during E2E tests. This is inefficient as it recompiles the regex on every call. Additionally, the loop iterates through messages from the beginning, whereas it's generally more efficient and robust to search for context starting from the most recent message. To improve performance, avoid redundant computations inside frequently called functions or loops. Consider using re.search with the string pattern directly (which Python caches internally) and iterating through reversed(messages).

Suggested change
_nudge_re = re.compile(
r"You said you would perform an action|You expressed intent",
re.IGNORECASE,
)
if _nudge_re.search(content):
for msg in messages:
if msg.get("role") == "user":
msg_text = _message_text(msg)
if re.search(r"portfolio|defi|rebalance|yield.*positions", msg_text, re.IGNORECASE):
return (
"I'll analyze your DeFi portfolio. The portfolio skill is active and I can scan "
"your wallet addresses across chains to discover positions, check yields, and "
"suggest rebalancing opportunities."
)
if re.search(r"0x[a-fA-F0-9]{40}", msg_text, re.IGNORECASE):
return (
"I found your wallet address. Let me scan your portfolio across all supported "
"chains to discover DeFi positions and classify them against known protocols."
)
if re.search(r"You said you would perform an action|You expressed intent", content, re.IGNORECASE):
for msg in reversed(messages):
if msg.get("role") == "user":
msg_text = _message_text(msg)
if re.search(r"portfolio|defi|rebalance|yield.*positions", msg_text, re.IGNORECASE):
return (
"I'll analyze your DeFi portfolio. The portfolio skill is active and I can scan "
"your wallet addresses across chains to discover positions, check yields, and "
"suggest rebalancing opportunities."
)
if re.search(r"0x[a-fA-F0-9]{40}", msg_text, re.IGNORECASE):
return (
"I found your wallet address. Let me scan your portfolio across all supported "
"chains to discover DeFi positions and classify them against known protocols."
)
References
  1. To improve performance, avoid redundant computations inside loops or frequently called functions. For example, pre-calculate values or rely on internal caching instead of repeated expensive operations.

Forward cargo-llvm-cov env vars in multi_tenant_gateway_server fixture
so code coverage from the 3 rewired widget tests is captured in CI.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@serrrfirat serrrfirat merged commit e29429d into staging Apr 21, 2026
17 checks passed
@serrrfirat serrrfirat deleted the fix/e2e-widget-portfolio-tests branch April 21, 2026 19:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

contributor: core 20+ merged PRs risk: low Changes to docs, tests, or low-risk modules size: XS < 10 changed lines (excluding docs)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant