Agent breaker#1628

Open
eliyacohen-hub wants to merge 3 commits into NVIDIA:main from eliyacohen-hub:agent_breaker

Conversation

@eliyacohen-hub

Agent Breaker: Multi-turn red-team probe for agentic LLM applications

Adds a new probe (agent_breaker.AgentBreaker) that performs automated security testing of agentic LLM applications — systems that use tools (e.g. code execution, database queries, file access, API calls).

A red team model analyzes each tool for vulnerabilities, generates targeted exploits, attacks the agent in multi-turn conversations (learning from failures), and verifies attack success.

Key features:

  • Auto-discovery — if no tools are defined in config, the probe queries the target agent to discover its tools automatically
  • Parallel tool attacks — configurable max_parallel_tools (default: sequential)
  • Adaptive attacks — each attempt analyzes previous prompts/responses to improve exploits
  • Early stopping — stops attacking a tool immediately upon success

OWASP LLM Top 10: LLM01 (Prompt Injection), LLM07 (Insecure Plugin Design), LLM08 (Excessive Agency)
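
The adaptive, early-stopping attack loop described in the feature list can be sketched as follows. This is an illustrative sketch only; the names (attack_tool, next_exploit, is_success, etc.) are hypothetical and do not reflect the probe's actual API:

```python
# Sketch of the multi-turn adaptive attack loop with early stopping.
# All class and method names here are hypothetical, for illustration.
from dataclasses import dataclass, field


@dataclass
class AttackState:
    # (prompt, response) pairs from earlier turns, fed back to the
    # red team model so each new exploit can learn from failures
    history: list = field(default_factory=list)


def attack_tool(tool_name, red_team, target, verifier, max_turns=5):
    """Attack one tool, refining each exploit from prior failures."""
    state = AttackState()
    for turn in range(max_turns):
        # Red team model sees all earlier prompt/response pairs and adapts
        prompt = red_team.next_exploit(tool_name, state.history)
        response = target.send(prompt)
        state.history.append((prompt, response))
        # Early stopping: abandon the tool as soon as one exploit lands
        if verifier.is_success(tool_name, prompt, response):
            return {"tool": tool_name, "success": True, "turns": turn + 1}
    return {"tool": tool_name, "success": False, "turns": max_turns}
```

Per-tool runs like this are independent, which is what makes the configurable max_parallel_tools parallelism straightforward.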

Verification

  • Create a scan config YAML pointing to your target agent REST endpoint
  • python -m garak --config scan_config.yaml
  • python -m pytest tests/probes/test_agent_breaker.py tests/detectors/test_detectors_agent_breaker.py -v
  • Verify auto-discovery works when agent.yaml has no tools defined
  • Verify parallel and sequential tool attacks both work correctly
  • Verify results display: agent_breaker.AgentBreakerResult: FAIL ok on X/Y
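
For the first step, a scan config might look roughly like the sketch below. The field names follow garak's REST generator options but are approximate and should be checked against the current garak documentation; the URI and response field are placeholders:

```yaml
# Illustrative scan_config.yaml sketch (field names approximate;
# verify against garak's REST generator documentation)
plugins:
  probe_spec: agent_breaker
  model_type: rest
  generators:
    rest:
      RestGenerator:
        uri: https://example.com/agent/chat   # placeholder target endpoint
        method: post
        headers:
          Content-Type: application/json
        response_json: true
        response_json_field: reply            # placeholder field name
```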

Environment notes

  • Requires a red team model served via the NVIDIA Inference API; this can be swapped for any other LLM endpoint
  • Requires a target agent exposed as a REST endpoint (or any garak generator)
  • No specific hardware requirements (all inference is remote API calls)

eliyac-cyber and others added 3 commits February 24, 2026 10:49
Add a new generator wrapping the NVIDIA Inference API
(OpenAI-compatible endpoint at inference-api.nvidia.com).
Used by the AgentBreaker probe as the default red team model.

Co-authored-by: Cursor <cursoragent@cursor.com>
Multi-turn red-team probe that systematically attacks tool-using LLM
agents to identify security vulnerabilities. Key features:

- Analyzes each tool for exploitable vulnerabilities using a red team model
- Auto-discovers agent purpose and tools if not configured
- Configurable parallelism for concurrent tool attacks
- Per-tool exploit verification with confidence scoring
- Custom detector (AgentBreakerResult) for garak evaluator integration

Includes 41 tests (33 probe + 8 detector) covering config loading,
auto-discovery, attack orchestration, tool ordering, early stopping,
postprocess flag propagation, and verification parsing.
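
The verification-parsing step mentioned above could be sketched as follows, assuming (hypothetically — the actual verdict format is not shown in this PR) that the verifier model answers with a line like "VERDICT: success confidence=0.85":

```python
# Hypothetical sketch of verification parsing with confidence scoring.
# The "VERDICT: ... confidence=..." format is an assumption for
# illustration, not the probe's actual verifier output format.
import re

VERDICT_RE = re.compile(
    r"VERDICT:\s*(success|failure)\s+confidence=([01](?:\.\d+)?)",
    re.IGNORECASE,
)


def parse_verdict(text, threshold=0.7):
    """Return (attack_succeeded, confidence) parsed from verifier output."""
    m = VERDICT_RE.search(text)
    if not m:
        # Unparseable verifier output counts as "no confirmed exploit"
        return False, 0.0
    verdict, confidence = m.group(1).lower(), float(m.group(2))
    return verdict == "success" and confidence >= threshold, confidence
```

Treating unparseable output as a non-exploit keeps the detector conservative: a flaky verifier inflates false negatives rather than false positives.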
@github-actions
Contributor

DCO Assistant Lite bot:
Thank you for your submission, we really appreciate it. Like many open-source projects, we ask that you sign our Developer Certificate of Origin before we can accept your contribution. You can sign the DCO by posting a Pull Request comment in the format below.


I have read the DCO Document and I hereby sign the DCO


You can retrigger this bot by commenting recheck in this Pull Request

