[CHORE][PLUGINS]: Test, load test, document, and harden security and resilience plugins #3735

@crivetimihai

🔧 Chore Summary

Test, load test, document, and harden the core security and resilience plugins to ensure they are production-ready for 1.0.0. These plugins sit on the critical path for safe MCP tool execution and need thorough validation beyond unit tests — including integration tests, load/stress testing, documentation, and edge-case hardening.


🧱 Area Affected

  • Other: Plugin framework — security, resilience, and compliance plugins

⚙️ Context / Rationale

These plugins enforce security invariants, protect against data exfiltration, and provide resilience guarantees. They must be battle-tested before GA. Current state varies — some have unit tests but lack integration/load tests, documentation may be incomplete, and edge cases (malformed input, high concurrency, large payloads) may not be covered.


📦 Plugins In Scope

Priority 1 — Critical path, must be fully validated

| Plugin | Directory | Purpose |
|---|---|---|
| Secrets Detection | plugins/secrets_detection/ | Detect and block secrets (API keys, tokens, passwords) in tool inputs/outputs |
| Output Length Guard | plugins/output_length_guard/ | Enforce output size limits to prevent context window exhaustion and exfiltration |
| Retry with Backoff | plugins/retry_with_backoff/ | Resilient retry logic with exponential backoff for transient failures |
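
For reviewers less familiar with the retry behavior under test, here is a minimal sketch of exponential backoff with optional jitter. It is not the code in plugins/retry_with_backoff/; `TransientError`, `backoff_delays`, `retry`, and all parameter names and defaults are hypothetical and chosen only for illustration.

```python
import random
import time
from typing import Callable, List, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_delays(max_retries: int, base: float = 0.1,
                   factor: float = 2.0, cap: float = 5.0) -> List[float]:
    """Delay before each retry: base * factor**attempt, capped at `cap` seconds."""
    return [min(cap, base * factor ** attempt) for attempt in range(max_retries)]


def retry(call: Callable[[], T], max_retries: int = 3, base: float = 0.1,
          jitter: bool = True, sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `call`, retrying on TransientError up to `max_retries` times."""
    delays = backoff_delays(max_retries, base)
    for attempt in range(max_retries + 1):
        try:
            return call()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last failure
            delay = delays[attempt]
            if jitter:
                # Full jitter: pick a random delay in [0, delay] to avoid
                # synchronized retries from many callers.
                delay = random.uniform(0, delay)
            sleep(delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter lets tests record the delays instead of waiting, which keeps backoff tests fast and deterministic when `jitter=False`.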

Priority 2 — Important, should be validated

| Plugin | Directory | Purpose |
|---|---|---|
| Rate Limiter | plugins/rate_limiter/ | Per-user/per-tool rate limiting to prevent abuse |
| Encoded Exfil Detection | plugins/encoded_exfil_detection/ | Detect base64/hex/URL-encoded exfiltration attempts |
| PII Filter | plugins/pii_filter/ | Detect and redact personally identifiable information |
| Cedar Policy Plugin | plugins/external/ | External policy-as-code enforcement via Cedar |

📋 Acceptance Criteria

For each plugin in scope:

Testing

  • Unit tests: Verify existing unit tests pass and cover core logic (happy path + error cases)
  • Integration tests: Test plugin within the gateway pipeline (registered, activated, processes real tool calls)
  • Edge cases: Malformed input, empty payloads, unicode/binary content, extremely large payloads
  • Bypass resistance: Verify the plugin cannot be trivially bypassed (e.g., encoding tricks for secrets detection, chunked exfil for output guard)
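
As one possible shape for a bypass-resistance test, the sketch below checks that a detector catches a secret both in the clear and when base64-encoded. The detector here is a stand-in written for this example, not the secrets_detection plugin: `contains_secret`, the AWS-style key pattern, and the 16-character candidate threshold are all assumptions.

```python
import base64
import binascii
import re

# Hypothetical pattern for illustration: AWS-style access key IDs.
SECRET_PATTERN = re.compile(r"AKIA[0-9A-Z]{16}")
# Runs of base64 alphabet long enough to hide a key, with optional padding.
BASE64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def contains_secret(text: str) -> bool:
    """Return True if text holds a secret in the clear or base64-encoded."""
    if SECRET_PATTERN.search(text):
        return True
    for candidate in BASE64_CANDIDATE.findall(text):
        try:
            raw = base64.b64decode(candidate, validate=True)
        except (binascii.Error, ValueError):
            continue  # not valid base64, so nothing to inspect
        if SECRET_PATTERN.search(raw.decode("utf-8", errors="ignore")):
            return True
    return False


def test_encoded_secret_is_detected() -> None:
    key = "AKIAABCDEFGHIJKLMNOP"  # fake key, matches the pattern only
    encoded = base64.b64encode(key.encode()).decode()
    assert contains_secret(f"token={key}")
    assert contains_secret(f"data: {encoded}")
    assert not contains_secret("hello world, nothing to see here")
```

A real test suite would point the same assertions at the plugin itself and extend the encodings covered (hex, URL encoding, nested encodings, secrets split across chunks).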

Load Testing

  • Throughput: Measure latency overhead per plugin under normal load
  • Stress test: Validate behavior under high concurrency (100+ concurrent tool calls with plugin active)
  • Memory/CPU: Confirm no memory leaks or excessive CPU usage under sustained load
  • Graceful degradation: Plugin failures should not crash the gateway — verify fail-open/fail-closed behavior is correct and documented
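
The stress-test criterion above could start from something like the following sketch, which fires 100 concurrent calls at a hook and summarizes latency. `example_plugin_hook` is a hypothetical stand-in; the real test would call the plugin through the gateway pipeline, and the function names and returned keys are assumptions made for this example.

```python
import asyncio
import statistics
import time
from typing import Awaitable, Callable, Dict

Hook = Callable[[str], Awaitable[str]]


async def example_plugin_hook(payload: str) -> str:
    """Hypothetical stand-in for a plugin hook; swap in the real plugin call."""
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    return payload.upper()


async def timed_call(hook: Hook, payload: str) -> float:
    """Return the wall-clock seconds a single hook call took."""
    start = time.perf_counter()
    await hook(payload)
    return time.perf_counter() - start


async def stress(hook: Hook, concurrency: int = 100) -> Dict[str, float]:
    """Run `concurrency` calls at once and summarize their latencies."""
    latencies = await asyncio.gather(
        *(timed_call(hook, f"call-{i}") for i in range(concurrency))
    )
    ordered = sorted(latencies)
    return {
        "calls": len(ordered),
        "p50_s": statistics.median(ordered),
        "p95_s": ordered[int(0.95 * (len(ordered) - 1))],
        "max_s": ordered[-1],
    }


def run_stress(concurrency: int = 100) -> Dict[str, float]:
    return asyncio.run(stress(example_plugin_hook, concurrency))
```

Running the same harness with the plugin enabled and disabled gives a rough per-plugin latency overhead. Memory and CPU checks under sustained load need a longer-running tool and are not covered by this sketch.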

Documentation

  • Plugin README: Each plugin has a README with description, configuration, examples, and limitations
  • Configuration reference: All configurable parameters documented with defaults and valid ranges
  • Architecture docs: Update docs/ if plugin behavior affects overall system guarantees

Hardening

  • Input validation: All plugin inputs validated at boundary
  • Error handling: Exceptions caught and handled — no unhandled exceptions propagating to caller
  • Logging: Appropriate log levels — no sensitive data in logs, sufficient detail for debugging
  • Configuration defaults: Secure defaults (e.g., fail-closed for security plugins, sane limits for rate limiter)
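
To make the hardening items concrete, here is a minimal sketch of a wrapper that validates input at the boundary, catches exceptions, defaults to fail-closed, and logs without including the payload. `Verdict`, `guarded`, and the logger name are hypothetical and do not reflect the plugin framework's actual interfaces.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("plugin_guard")  # hypothetical logger name


@dataclass
class Verdict:
    allowed: bool
    reason: str


def guarded(check: Callable[[str], bool],
            fail_closed: bool = True) -> Callable[[object], Verdict]:
    """Wrap a check so errors never propagate to the caller."""

    def wrapper(payload: object) -> Verdict:
        # Input validation at the boundary: reject anything that is not text.
        if not isinstance(payload, str):
            return Verdict(False, "invalid input type")
        try:
            ok = check(payload)
        except Exception as exc:
            # Log the exception type and payload size only, never the payload,
            # so secrets or PII cannot leak into logs.
            logger.error("plugin check failed: %s (payload_len=%d)",
                         type(exc).__name__, len(payload))
            # Fail-closed blocks the call on error; fail-open lets it through.
            return Verdict(not fail_closed, "plugin error")
        return Verdict(ok, "ok" if ok else "blocked by policy")

    return wrapper
```

With the secure default, `guarded(check)` denies the call whenever `check` raises. Passing `fail_closed=False` is the explicit opt-in for plugins where availability matters more than enforcement, and that choice is what the documentation criterion asks to be recorded.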

Overall

  • CI passes with no regressions
  • Load test results documented (can be a summary in the PR)

🧩 Additional Context

Priority 1 plugins are blocking for 1.0.0 — they enforce core security and resilience guarantees that customers depend on.

Priority 2 plugins are important but have more tolerance for incremental hardening post-GA, though they should still have basic integration tests and documentation.

For the Cedar Policy Plugin (plugins/external/), testing may require additional setup (Cedar policy engine). Document any external dependencies and test with mock policies at minimum.

Relevant references:

  • plugins/AGENTS.md — Plugin development guidelines
  • plugins/config.yaml — Plugin configuration
  • plugins/install.yaml — Plugin installation manifest

Labels: MUST, chore, planned, plugins, security, testing, wxo
