Add OBO token caching with proactive refresh for AG-UI OBO forwarding #11610

Draft
mingshl wants to merge 2 commits into opensearch-project:main from o19s:main-refresh-obo-token

mingshl commented Mar 27, 2026

Description

  • Cache OBO tokens per user in memory with proactive refresh 30 seconds before expiry, using the actual durationSeconds from the security plugin API response
  • Extract the authenticated username via getPrincipalsFromRequest (backed by cookie credentials through HttpAuth) to key the cache safely per user
  • When username cannot be resolved, skip caching entirely to prevent cross-user token sharing — a fresh token is minted each time instead
  • Evict expired cache entries on each cache miss to bound memory growth
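
The four points above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `OboTokenCache`, `MintedOboToken`, `getToken`, and the injectable `now` clock are hypothetical names, and the `mint` callback stands in for whatever call actually requests a token from the security plugin.

```typescript
// Hypothetical sketch of a per-user OBO token cache with proactive refresh.
// None of these names are taken from the PR itself.
interface MintedOboToken {
  token: string;
  durationSeconds: number; // lifetime reported by the token API response
}

interface CachedOboToken {
  token: string;
  expiresAtMs: number; // absolute expiry computed from durationSeconds
}

const REFRESH_MARGIN_MS = 30_000; // refresh 30 seconds before expiry

class OboTokenCache {
  private readonly entries = new Map<string, CachedOboToken>();
  private readonly now: () => number;

  // The clock is injectable so the behavior can be tested deterministically.
  constructor(now: () => number = () => Date.now()) {
    this.now = now;
  }

  async getToken(
    username: string | undefined,
    mint: () => Promise<MintedOboToken>
  ): Promise<string> {
    // Without a resolved username there is no safe cache key, so never cache.
    if (!username) {
      return (await mint()).token;
    }

    const nowMs = this.now();
    const cached = this.entries.get(username);
    if (cached && cached.expiresAtMs - nowMs > REFRESH_MARGIN_MS) {
      return cached.token; // still comfortably valid
    }

    // Cache miss (or token inside the refresh margin): drop expired entries,
    // then mint and store a replacement.
    this.evictExpired(nowMs);
    const minted = await mint();
    this.entries.set(username, {
      token: minted.token,
      expiresAtMs: this.now() + minted.durationSeconds * 1000,
    });
    return minted.token;
  }

  get size(): number {
    return this.entries.size;
  }

  private evictExpired(nowMs: number): void {
    this.entries.forEach((entry, key) => {
      if (entry.expiresAtMs <= nowMs) {
        this.entries.delete(key);
      }
    });
  }
}
```

Note that eviction in this sketch only removes entries that have already expired, so it bounds growth by token lifetime rather than by entry count.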

Context

Follow-up to #11524. OBO tokens have a hard max TTL of 10 minutes (default 5 min) and there is no refresh endpoint — a new token must be minted each time. The cookie-stored credentials (available via asCurrentUser) act as the long-lived "refresh token" to mint short-lived OBO tokens on demand.
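
To make the timing concrete, here is a small worked sketch of when a token would be replaced under the 30-second margin from the description. The helper names `refreshAfterSeconds` and `needsRefresh` are hypothetical, and clamping the delay at zero for very short tokens is an assumption rather than something the PR states.

```typescript
// Hypothetical helpers; only the 30-second margin and the 5/10-minute TTLs
// come from the PR text.
const REFRESH_MARGIN_SECONDS = 30;

/** Seconds after minting at which a token should be replaced. */
function refreshAfterSeconds(durationSeconds: number): number {
  // Assumption: never return a negative delay for tokens shorter than the margin.
  return Math.max(0, durationSeconds - REFRESH_MARGIN_SECONDS);
}

/** True once the token is within the refresh margin of its expiry. */
function needsRefresh(
  mintedAtMs: number,
  durationSeconds: number,
  nowMs: number
): boolean {
  return nowMs >= mintedAtMs + refreshAfterSeconds(durationSeconds) * 1000;
}

console.log(refreshAfterSeconds(300)); // default 5-minute TTL: 270
console.log(refreshAfterSeconds(600)); // 10-minute maximum TTL: 570
```

With the default TTL, a token minted at time zero would therefore be reused for the first 270 seconds and replaced on the first request after that.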

Issues Resolved

Screenshot

Testing the changes

Changelog

Check List

  • All tests pass
    • yarn test:jest
    • yarn test:jest_integration
  • New functionality includes testing.
  • New functionality has been documented.
  • Update CHANGELOG.md
  • Commits are signed per the DCO using --signoff

github-actions bot commented

ℹ️ Manual Changeset Creation Reminder

Please ensure manual commit for changeset file 11610.yml under folder changelogs/fragments to complete this PR.

If you want to use the available OpenSearch Changeset Bot App to avoid manual creation of changeset file you can install it in your forked repository following this link.

For more information about formatting of changeset files, please visit OpenSearch Auto Changeset and Release Notes Tool.

github-actions bot commented Mar 27, 2026

PR Code Analyzer ❗

AI-powered 'Code-Diff-Analyzer' found issues on commit ada87e0.

| Path | Line | Severity | Description |
| --- | --- | --- | --- |
| src/plugins/chat/server/routes/index.ts | 35 | medium | Module-level `oboTokenCache` stores OBO authentication tokens as unbounded process-level state. The cache has no maximum size limit, only expired-entry eviction, meaning a high-cardinality user base could cause unbounded memory growth. Additionally, OBO tokens (short-lived auth credentials) persisting in a global Map for the full server lifetime increases the window of exposure if process memory is ever inspected or leaked. |
| src/plugins/chat/server/routes/index.ts | 80 | low | Cache key in `getValidOboToken` is username only, not scoped by `agUiUrl`. If multiple AG-UI endpoints exist, a token minted for one endpoint is silently reused for another endpoint belonging to the same user. While OBO tokens are typically audience-agnostic within a cluster, this assumption is not validated and could result in unintended credential forwarding to a different downstream service. |

The table above displays the top 10 most important findings.

Total: 2 | Critical: 0 | High: 0 | Medium: 1 | Low: 1
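
One possible way to address both findings, sketched under assumptions: cap the number of entries and scope the key by endpoint as well as user. `BoundedTokenMap`, `cacheKey`, and the `MAX_ENTRIES` value are hypothetical and do not appear in the PR; eviction here removes the oldest-written entry, which is simpler than a true LRU.

```typescript
// Hypothetical sketch, not the PR's code.
const MAX_ENTRIES = 1000; // assumed cap; the right value depends on deployment

/** Builds a key scoped by both user and AG-UI endpoint. */
function cacheKey(username: string, agUiUrl: string): string {
  // Serializing a tuple avoids ambiguity if either part contains a separator.
  return JSON.stringify([username, agUiUrl]);
}

class BoundedTokenMap<V> {
  private readonly map = new Map<string, V>();
  private readonly maxEntries: number;

  constructor(maxEntries: number = MAX_ENTRIES) {
    this.maxEntries = maxEntries;
  }

  set(key: string, value: V): void {
    // Deleting first moves a rewritten key to the end of insertion order.
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // A Map iterates in insertion order, so the first key is the oldest.
      const oldest = this.map.keys().next().value;
      if (oldest !== undefined) {
        this.map.delete(oldest);
      }
    }
  }

  get(key: string): V | undefined {
    return this.map.get(key);
  }

  get size(): number {
    return this.map.size;
  }
}
```

Whether endpoint scoping is needed depends on whether OBO tokens are in fact audience-agnostic in the target cluster, which the analyzer flags as an unvalidated assumption.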


Pull Requests Author(s): Please update your Pull Request according to the report above.

Repository Maintainer(s): You can bypass diff analyzer by adding label skip-diff-analyzer after reviewing the changes carefully, then re-run failed actions. To re-enable the analyzer, remove the label, then re-run all actions.


⚠️ Note: The Code-Diff-Analyzer helps protect against potentially harmful code patterns. Please ensure you have thoroughly reviewed the changes beforehand.

Thanks.

github-actions bot commented

Failed to generate code suggestions for PR

github-actions bot commented Mar 27, 2026

✅ All unit and integration tests passing

🔗 Workflow run · commit ada87e09c9ab2782bfb1e246ae74ff18a879c6c6

mingshl added 2 commits March 27, 2026 09:48
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
Signed-off-by: Mingshi Liu <mingshl@amazon.com>
mingshl force-pushed the main-refresh-obo-token branch from 2aef47d to ada87e0 on March 27, 2026 16:50
mingshl commented Mar 27, 2026

@cwperks would you please review this change, which refreshes the OBO token before it expires?

cwperks commented Mar 27, 2026

@mingshl I don't have a good mental model of how agUI uses the token. Can you point me to any relevant documentation?

Is there any signal that OSD gets from the OpenSearch MCP Tools Service that lets it know that it has an expired token?

mingshl commented Mar 27, 2026

> @mingshl I don't have a good mental model of how agUI uses the token. Can you point me to any relevant documentation?
>
> Is there any signal that OSD gets from the OpenSearch MCP Tools Service that lets it know that it has an expired token?

Here is the RFC, which includes the high-level design: opensearch-project/OpenSearch#20602. The AG-UI server passes the token through to the MCP server, and the token then reaches OpenSearch, where it is authenticated. The agent server does not authenticate the token itself.

In this PR, when I generate the OBO token, I keep the token string and also record its duration. I track the time remaining and, before the duration elapses, generate a new OBO token for the streaming request.
