feat(beta): add ToolPolicyMiddleware for tool access control#2533
feat(beta): add ToolPolicyMiddleware for tool access control#2533amabito wants to merge 8 commits intoag2ai:mainfrom
Conversation
|
@amabito hi! Thank you for the middlewares contribution, but I don't see a sense of the current one
|
|
Hey @Lancetnik, thanks for the pushback -- both points are worth addressing directly. "Just don't pass the tool to context"Static exclusion at registration time works when the policy is fixed for the session, and for that case you're right that a middleware adds nothing. Where it stops working:
If even the audit case feels out of scope for AG2's middleware layer and you'd rather see this as a dedicated capability or event hook rather than a middleware, I'm open to that -- say the word and I'll repackage it. But I don't think "don't pass the tool" covers these cases. Counters and the leak concernThe counters at L90-L91 are instance attributes on I'd rather just drop the counters. They were a convenience for testing and a half-hearted observability hook, and neither justifies stateful middleware. Concrete change:
Landing it in the next revision alongside the constructor simplification from #2531 / #2532. |
Address review feedback from @Lancetnik on ag2ai#2533: - Flatten constructor API: pass blocked_tools/allowed_tools/on_blocked directly instead of requiring callers to build a ToolPolicyConfig - Drop stateful counters (_total_tool_calls, _total_blocked, _lock) and their accessor properties -- the middleware is now stateless - Add optional on_blocked callback so callers can wire their own observability (metrics, audit log) without the library retaining any per-factory state The ToolPolicyConfig dataclass is kept as an internal struct for immutability (tuple freezing in __post_init__).
Fair, it seems the commonest case for such feature - but I don't see a public API covers it. Can we introduce a some? |
There was a problem hiding this comment.
I prefer end-to-end test here using public API (or at least do not use non-public objects). Such tests are not so implementation-tied and not so fragile. Also, they cover real usecases and could be used as a some kind of documentation
Please, try to implement each test using TestClient / TracingClient. Reference: https://github.com/ag2ai/ag2/blob/main/test/beta/middleware/test_tool_execution.py
Address review feedback from @Lancetnik on ag2ai#2533: - Flatten constructor API: pass blocked_tools/allowed_tools/on_blocked directly instead of requiring callers to build a ToolPolicyConfig - Drop stateful counters (_total_tool_calls, _total_blocked, _lock) and their accessor properties -- the middleware is now stateless - Add optional on_blocked callback so callers can wire their own observability (metrics, audit log) without the library retaining any per-factory state The ToolPolicyConfig dataclass is kept as an internal struct for immutability (tuple freezing in __post_init__).
Address @Lancetnik follow-up on ag2ai#2533: the dynamic-policy justification needs a public API to be useful. Adds: - ToolPolicyMiddleware.update_policy(config: ToolPolicyConfig) Atomically replaces the active policy. Subsequent tool calls see the new config; per-invocation instances already inside on_tool_execution keep their captured snapshot, so in-flight executions are never torn across a config change. - ToolPolicyMiddleware.config property Read-only access to the active ToolPolicyConfig. Intended hooks: rate-limit kill switches, per-tenant rule updates from an ops dashboard, role changes during a long-running session -- all without rebuilding the agent. Thread-safety: _policy is assigned last so readers in __call__ never observe a state where _config is new but _policy is stale (Python attribute assignment is atomic for single references). Tests: 19/19 pass, covering update-then-permit, in-flight-snapshot isolation, restriction clearing, and the config property.
3253b29 to
5984625
Compare
|
@Lancetnik Thanks -- fair point, "it's in the shape" is not a public API. Added one in 5984625.
|
|
Sorry for the rapid-fire -- ran the middleware through another review pass and wanted to send the consolidated state. Since the last update:
25 tests, all passing. Pre-commit clean. |
Single-responsibility middleware that blocks disallowed tool calls based on configurable allow/block lists. Extracted from GovernanceMiddleware to align with beta v2 composable middleware design. - ToolPolicyConfig: blocked_tools, allowed_tools - blocked_tools overrides allowed_tools - allowed_tools=None means no restriction, [] means deny all - Immutable config (lists frozen to tuples)
Address review feedback from @Lancetnik on ag2ai#2533: - Flatten constructor API: pass blocked_tools/allowed_tools/on_blocked directly instead of requiring callers to build a ToolPolicyConfig - Drop stateful counters (_total_tool_calls, _total_blocked, _lock) and their accessor properties -- the middleware is now stateless - Add optional on_blocked callback so callers can wire their own observability (metrics, audit log) without the library retaining any per-factory state The ToolPolicyConfig dataclass is kept as an internal struct for immutability (tuple freezing in __post_init__).
Address @Lancetnik follow-up on ag2ai#2533: the dynamic-policy justification needs a public API to be useful. Adds: - ToolPolicyMiddleware.update_policy(config: ToolPolicyConfig) Atomically replaces the active policy. Subsequent tool calls see the new config; per-invocation instances already inside on_tool_execution keep their captured snapshot, so in-flight executions are never torn across a config change. - ToolPolicyMiddleware.config property Read-only access to the active ToolPolicyConfig. Intended hooks: rate-limit kill switches, per-tenant rule updates from an ops dashboard, role changes during a long-running session -- all without rebuilding the agent. Thread-safety: _policy is assigned last so readers in __call__ never observe a state where _config is new but _policy is stale (Python attribute assignment is atomic for single references). Tests: 19/19 pass, covering update-then-permit, in-flight-snapshot isolation, restriction clearing, and the config property.
…eware at top level TP-1 (update_policy atomicity): The previous docstring said "Assign _policy last so readers in __call__ never observe a state where _config is new but _policy is stale", but the assignment order was _config then _policy -- which is exactly right. The comment was misleading. Clarify: _config is assigned first so the config property never lags behind the active policy; under CPython's GIL each assignment is atomic at the bytecode level, so __call__ readers cannot observe a mismatched pair. TP-2 (export gap): autogen.beta.middleware top-level __init__ did not re-export ToolPolicyMiddleware or ToolPolicyConfig. Add imports and __all__ entries so callers can reach them from the canonical import path. Add test_tool_policy_importable_from_middleware_top_level regression test.
…ersarial tests - Wrap on_blocked callback in contextlib.suppress(Exception) so a failing audit observer does not prevent the ToolErrorEvent from being returned to the caller. - Add test_on_blocked_callback_exception_does_not_prevent_denial. - Add test_concurrent_update_policy_consistent_decisions: 20 parallel coroutines interleaved with a background policy-writer; asserts every result is a valid ToolErrorEvent or ToolResultEvent. - Add test_empty_tool_name_denied_by_allowlist. - Add test_empty_allowlist_blocks_all_via_public_interface covering ToolPolicyMiddleware(allowed_tools=[]) through the public interface.
5043ecc to
2cd9dd9
Compare
Codecov Report✅ All modified and coverable lines are covered by tests.
... and 48 files with indirect coverage changes 🚀 New features to boost your workflow:
|
Why are these changes needed?
Pulls the tool allow/block list logic out of GovernanceMiddleware (#2501,
closed) into its own middleware. Sits on
on_tool_execution, checks thetool name against blocked and allowed lists, returns ToolErrorEvent if denied.
blocked_tools overrides allowed_tools. allowed_tools=None means no
restriction; allowed_tools=[] means deny all. Config is frozen after
construction.
Related issue number
Replaces the tool policy portion of #2501 (closed).
Related: #2531 (BudgetMiddleware), #2532 (CircuitBreakerMiddleware).
Checks