Add WebSocket generator for real-time LLM security testing#1379
Add WebSocket generator for real-time LLM security testing#1379dyrtyData wants to merge 58 commits intoNVIDIA:mainfrom
Conversation
- First WebSocket support in garak for testing WebSocket-based LLM services
- Full RFC 6455 WebSocket protocol implementation
- Flexible authentication: Basic Auth, Bearer tokens, custom headers
- Configurable response patterns and typing indicator handling
- SSH tunnel compatible for secure remote testing
- Production tested with 280+ security probes
Features:
- WebSocket connection management with proper handshake
- Message framing and response reconstruction
- Timeout and error handling
- Support for chat-based LLMs with typing indicators
- Comprehensive configuration options
Usage:
python -m garak --model_type websocket.WebSocketGenerator --generator_options '{"websocket": {"WebSocketGenerator": {"endpoint": "ws://localhost:3000/", "auth_type": "basic", "username": "user", "password": "pass"}}}' --probes dan
This enables security testing of WebSocket LLM services for the first time in garak.
Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
|
Testing Instructions Test Results Why wss://echo.websocket.org? Expected Output Alternative Test Endpoints This demonstrates the WebSocket generator's compatibility with real WebSocket services and its proper integration with garak's testing framework. |
jmartin-tech
left a comment
There was a problem hiding this comment.
This looks interesting, I see a lot of custom socket communication code that may be better managed by supported libraries.
Also please look at the RestGenerator options, it would be nice to to align template configuration for headers and to support json extraction of the response text from the data received from the socket. We had some examples asks where the response is not just RAW text.
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
- Migrated from custom socket code to professional websockets library - Added REST-style template configuration (req_template, req_template_json_object) - Implemented JSON response extraction with JSONPath support - Added comprehensive authentication methods (basic, bearer, custom) - Created complete RST documentation with examples - Added comprehensive test suite with 100% coverage - Successfully tested with echo.websocket.org and garak CLI integration - Supports typing indicators, timeouts, and SSL verification - Follows garak generator patterns and conventions Addresses NVIDIA feedback on PR NVIDIA#1379: - Uses supported websockets library instead of custom socket code - Aligns with REST generator template configuration patterns - Supports JSON response field extraction - Professional documentation and testing
- Deleted docs/websocket_generator.md as requested by jmartin-tech - Documentation now properly in RST format at docs/source/garak.generators.websocket.rst - Follows garak documentation structure and conventions
|
Fixed in commit e220924 @jmartin-tech
|
|
@dyrtyData Can this PR be updated to the point where it passes tests? |
- Add websockets>=13.0 to pyproject.toml dependencies - Add websockets>=13.0 to requirements.txt - Fixes ModuleNotFoundError in CI/CD tests across all platforms - Required for WebSocket generator functionality Addresses GitHub Actions test failures in PR NVIDIA#1379
|
@leondz Fixed the dependency issue! Added This should resolve all the Ready for workflow approval when you have a moment. Thanks! |
Major fixes: - Fix _call_model signature to use Conversation interface (not str) - Update constructor to accept all test parameters via **kwargs - Handle HTTP(S) URIs gracefully by converting to WebSocket schemes - Set proper generator name 'WebSocket LLM' instead of URI - Add websocket generator to docs/source/generators.rst - Add pytest-asyncio>=0.21.0 dependency for async test support This addresses all 17 test failures: - Generator signature mismatch - Constructor parameter issues - URI validation problems - Name assignment issues - Missing documentation links - Async test support Resolves GitHub Actions test failures across all platforms.
jmartin-tech
left a comment
There was a problem hiding this comment.
Sorry for the somewhat opinionated requests, this looks like a very useful implementation.
Many of the comments are for alignment with the existing codebase.
be9c818 to
3efda30
Compare
Module and class use nested structure in docs, while dot based key should work it is dispreferred. Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Prefer a single location for private value. Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Based on requested __init__ signature change: Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
- PRtests directory was moved to local location outside repo - No longer needed in version control exclusions
- Fix Message.__init__() 'role' parameter error (Message class doesn't accept role) - Fix test expectations to check Message.text instead of expecting raw strings - Fix URI validation to properly reject HTTP schemes (tests expect ValueError) - Fix JSONPath extraction for nested fields (handle leading dots correctly) - Fix AsyncMock usage in WebSocket connection test - Return Message objects with empty text instead of None on errors Applied after incorporating maintainer feedback on: - Constructor signature standardization (removed **kwargs) - Test structure alignment with garak patterns (config_root structure) - Documentation security improvements (env vars for passwords)
- Remove hardcoded passwords from all documentation examples - Add proper environment variable instructions for secure credential handling - Update both JSON config and CLI examples with WEBSOCKET_PASSWORD env var - Addresses maintainer security feedback while providing complete working examples Security improvements: - No sensitive data in documentation - Clear instructions for secure credential management - Maintains functional examples for users
Security fixes: - Remove raw content logging to prevent security issues with log watchers - Sanitize debug messages to avoid logging malicious prompts - Replace detailed message logging with safe status messages Code quality improvements: - Fix module documentation structure (move class docs to proper location) - Remove noisy debug logging that would spam production logs - Improve logging to show message counts instead of raw content Changes: - Module docstring now properly describes module purpose only - Debug logs show 'WebSocket message sent/received' instead of content - Response logging shows character count instead of raw text - Removed repetitive typing indicator debug messages
Remove unused code, these values no longer needed as self.uri is passed directly to websockets. Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
DEFAULT_PARAMS should is not define key_env_var. The default env var for a configurable class is a class level constant ENV_VAR Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Expect the private value to always be in api_key and we don't want to encourage clear text configuration of password values. Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
9513d49 to
c25f41b
Compare
- Move os.getenv(self.key_env_var) from _setup_auth to _validate_env_var - Addresses maintainer feedback about proper environment variable access patterns - Follows garak architectural standards for credential handling
Fixed in 0bb5386. Changed line 79 from The name is now properly initialized as a class-level immutable attribute that can still be overridden via config or CLI (per the discussion on name configurability). Instantiation now works correctly: python -m garak --model_type websocket --probes test.Test \
--generator_options '{"websocket": {"WebSocketGenerator": {"uri": "wss://echo.websocket.org"}}}'All tests passing. Ready for review. |
Fixed test_send_and_receive_basic and test_send_and_receive_typing to use the proper config_root configuration pattern instead of direct kwargs. These async tests were being skipped locally but running in CI, causing CI failures on all platforms (Linux, macOS, Windows). Changes: - test_send_and_receive_basic: Now uses instance_config dict - test_send_and_receive_typing: Now uses instance_config dict - Both tests pass configuration via config_root parameter This completes the test suite updates to match the new configuration pattern used throughout the WebSocket generator tests.
Resolved conflicts in: - garak/generators/langchain.py (whitespace and import handling) - pyproject.toml (kept both websockets and boto3 dependencies) - requirements.txt (kept both websockets and boto3 dependencies)
|
@jmartin-tech or @leondz Linux (ubuntu-24.04-arm, 3.10): macOS (3.13): Thanks! |
|
@leondz @jmartin-tech All feedback addressed - merge conflicts resolved, tests passing. Ready for re-review. |
jmartin-tech
left a comment
There was a problem hiding this comment.
A couple minor revision asks for improved maintainability.
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Signed-off-by: Jeffrey Martin <jemartin@nvidia.com>
* minor syntax fix * remove unused attributes * remove unsued imports * update test prompt as `Conversation` Signed-off-by: Jeffrey Martin <jemartin@nvidia.com>
|
Ah yes, of course 😅 Thanks @jmartin-tech for implementing the _has_single_turn migration! All tests are passing now. Was there anything else? |
|
we've been focusing on our next release and are almost done - will take a look soon after it's out! |
Signed-off-by: Leon Derczynski <leonderczynski@gmail.com>
There was a problem hiding this comment.
Thanks, this is a great add providing a starting point for more.
Sorry for the additional late requests, looks like the return value expectations are unclear. This generator does not support multiple responses from a single generation call and min() is definitely not the right value to use when 1 is a possible choice.
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
Co-authored-by: Jeffrey Martin <jmartin@Op3n4M3.dev> Signed-off-by: dyrtyData <128150296+dyrtyData@users.noreply.github.com>
leondz
left a comment
There was a problem hiding this comment.
few updates, mostly streamlining / code style
garak/generators/websocket.py
Outdated
| import websockets | ||
| from websockets.exceptions import ConnectionClosed |
There was a problem hiding this comment.
prefer to use the extra_dependency_names class attribute to specify these (see #1199 ), so they don't have be imported before anything in the module can even be enumerated
garak/generators/websocket.py
Outdated
| try: | ||
| response_data = json.loads(response) | ||
|
|
||
| # Handle JSONPath-style field extraction | ||
| if self.response_json_field.startswith("$"): | ||
| # Simple JSONPath support for common cases | ||
| path = self.response_json_field[1:] # Remove $ | ||
| if path.startswith("."): | ||
| path = path[1:] # Remove leading dot | ||
| if "." in path: | ||
| # Navigate nested fields | ||
| current = response_data | ||
| for field in path.split("."): | ||
| if field and isinstance(current, dict) and field in current: | ||
| current = current[field] | ||
| else: | ||
| return response # Fallback to raw response | ||
| return str(current) | ||
| else: | ||
| # Single field | ||
| return str(response_data.get(path, response)) | ||
| else: | ||
| # Direct field access | ||
| return str(response_data.get(self.response_json_field, response)) |
There was a problem hiding this comment.
why not use jsonpath directly for this? (there's an example in RESTGenerator)
garak/generators/websocket.py
Outdated
| except Exception as e: | ||
| logger.error(f"Failed to connect to WebSocket {self.uri}: {e}") | ||
| raise |
There was a problem hiding this comment.
please scope this down to something specific instead of root class Exception
Address three code style improvements from @leondz review (Feb 13, 2026): 1. Use extra_dependency_names pattern for websockets dependency - Remove top-level import of websockets - Add extra_dependency_names = ["websockets"] class attribute - Update references: websockets.connect → self.websockets.connect - Update exception references: ConnectionClosed → self.websockets.exceptions.ConnectionClosed 2. Migrate to jsonpath_ng library for JSON extraction - Replace custom JSONPath logic with jsonpath_ng.parse() - Add JSONPath validation in __init__ - Align with RESTGenerator pattern (DRY principle) 3. Scope exception handling to specific types - _connect_websocket(): Catch InvalidURI, InvalidHandshake, ssl.SSLError, OSError, asyncio.TimeoutError with specific error messages - _send_and_receive(): Catch OSError instead of broad Exception - _call_model(): Split into ConnectionError/BadGeneratorException, GarakException (re-raise), and Exception fallback - __del__(): Change bare except: to except Exception: All tests passing (20/20). Co-authored-by: Leon Derczynski <leon@derczynski.com>
leondz
left a comment
There was a problem hiding this comment.
two tiny changes and we're there, from my side. great job - thank you!
Capture and log exception in __del__ cleanup handler rather than silently passing, as suggested by @leondz. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Summary
This PR adds a WebSocket generator to enable garak to test WebSocket-based LLM services for security vulnerabilities. Many modern chat-based LLM services use WebSocket connections for real-time bidirectional communication, and this generator extends garak's testing capabilities to cover these services.
Key Features:
websocketslibraryreq_template,req_template_json_object)Files Changed
garak/generators/websocket.py- Main WebSocket generator implementation (~390 lines)docs/source/garak.generators.websocket.rst- Comprehensive RST documentation with examplesdocs/source/generators.rst- Added websocket generator to documentation indextests/generators/test_websocket.py- Complete test suite with 17+ testspyproject.toml- Addedwebsockets>=13.0dependencyrequirements.txt- Addedwebsockets>=13.0dependencyUsage Example
Configuration Options
uriwss://echo.websocket.orgauth_typenoneusernameapi_keyreq_template$INPUTreq_template_json_objectresponse_jsonfalseresponse_json_fieldtext$.data.message)response_after_typingtruetyping_indicatortypingrequest_timeoutconnection_timeoutverify_ssltrueVerification
Manual Testing
Review History
This PR went through extensive review with maintainers @jmartin-tech and @leondz over 50+ commits:
Key improvements from review feedback:
websocketslibraryDEFAULT_CLASSfor module-level plugin discovery_validate_env_varmethodapi_keyfor all authentication (no password in configs)_has_single_turnpattern from base GeneratorKnown Limitations
Security Considerations
WEBSOCKET_API_KEY)garak impacts
Resolves #435