docs: guide to running garak faster (#1463)
Conversation
mikemckiernan left a comment:
Thanks a bunch for the heads up. Some is def applicable to Auditor. I appreciate it!
LMK if I can clarify any word nerd speak.
docs/source/faster.md
Outdated
As you might be able to guess by now, there are some good advantages to using remote endpoint generators, and some notable disadvantages to local model generators. We strongly recommend using remote generators if you're trying to do things quickly. Not least because it enables parallelisation.

Parallelisation within garak
----------------------------
Garak offers a couple of options for parallelisation, directly available on the CLI or via config.
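As a rough sketch of what those options can look like in practice (flag and config names here follow this thread's discussion; the model name is illustrative, and everything should be verified against `garak --help` and the configuration reference):

```shell
# Hedged sketch: enable parallelisation directly on the CLI.
garak --model_type openai --model_name gpt-3.5-turbo \
      --probes dan \
      --parallel_attempts 16

# Or the same setting via a YAML config file. The "system" plane path is an
# assumption based on garak's config layout; check the configuration docs.
cat > fast.yaml <<'EOF'
system:
  parallel_attempts: 16
EOF
garak --config fast.yaml --model_type openai --model_name gpt-3.5-turbo --probes dan
```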
British spelling is hardly wrong. Just checking if that's the direction you want or are willing to entertain American parallelization.
Definitely willing to entertain, and even prefer for this project! I was dismayed to learn that "s" only becomes "z" sometimes in Americanisation, because it means that I can't write US English, only read it.
docs/source/faster.md
Outdated
First, garak doesn't have to do orchestration.
Second, it can be possible for multiple instances of the target to run in parallel without garak or you having to worry about it.
Third, the people running the endpoint have often done some quality checking and testing to make sure that the endpoint runs well, reducing the chance of the target crashing weirdly.
Fourth, because orchestration (i.e. getting models to be loaded, and to run) happens remotely, if there is a failure, the solution can be as simple (from garak's point of view) as re-sending the inference request to the target endpoint. Garak is generally pretty gentle but tenacious when it comes to dealing with endpoint failure - we know that runs can take a while and we want to mitigate the need to "babysit" them, by having garak politely try to recover the target.
nit: Generally prefer to avoid parens and Latinisms. Maybe "...because orchestration--loading and serving models--happens remotely,.." or ", such as loading and serving models,"
docs/source/faster.md
Outdated
Third, the people running the endpoint have often done some quality checking and testing to make sure that the endpoint runs well, reducing the chance of the target crashing weirdly.
Fourth, because orchestration (i.e. getting models to be loaded, and to run) happens remotely, if there is a failure, the solution can be as simple (from garak's point of view) as re-sending the inference request to the target endpoint. Garak is generally pretty gentle but tenacious when it comes to dealing with endpoint failure - we know that runs can take a while and we want to mitigate the need to "babysit" them, by having garak politely try to recover the target.

As you might be able to guess by now, there are some good advantages to using remote endpoint generators, and some notable disadvantages to local model generators. We strongly recommend using remote generators if you're trying to do things quickly. Not least because it enables parallelisation.
nit: "We" can read slightly awkwardly in tech docs. My sugg is to replace with "NVIDIA" or "Garak maintainers".
docs/source/faster.md
Outdated
Setting parallel_requests higher than generations also has the same effect as setting parallel_requests equal to generations.

Parallel_requests and parallel_attempts are mutually exclusive, so you have to choose between them.
We find that using parallel_attempts usually gives a faster run completion time - especially when the number of generations is lower than the number of different prompts from a probe, which is more often the case than not in a default garak run.
sugg: I respect the goal of completeness, but if the recommendation is to use parallel_attempts, then my sugg is to remove mention of parallel_requests on the entire page. The benefit is to avoid mental load.
The sentiment that parallel_attempts typically delivers a faster run duration can be captured in the reference doc for the parallel_attempts argument.
This makes sense, moved out
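For reference, the option the thread settles on can be sketched as follows (flag names as discussed above; the target model is a placeholder, and the value 32 is only an illustration):

```shell
# Hedged sketch: parallelise at the attempt level, the option the reviewers
# recommend keeping on this page. parallel_attempts and parallel_requests
# are mutually exclusive, so only one of them is set here.
garak --model_type nim --model_name meta/llama-3.1-8b-instruct \
      --probes lmrc \
      --parallel_attempts 32
```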
docs/source/faster.md
Outdated
Prompt cap
----------

The config item run.soft_probe_prompt_cap names the max number of prompts that probes which follow this cap should generate.
sugg: s/names/specifies/ (?)
I concede limited knowledge, but I'm unclear about what "prompts that probes which follow this cap" means.
Time passed. I read the next sentence.
Sugg: "specifies a soft cap for the maximum number of prompts that a probe should generate. This setting is a soft cap..."
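To make the setting being discussed concrete, a minimal config sketch (the `run` plane path follows the config item named above, `run.soft_probe_prompt_cap`; the value 64 and the probe choice are illustrative only):

```shell
# Hedged sketch: cap co-operating probes at 64 prompts each via config.
# Probes that honour the soft cap should generate at most this many prompts.
cat > capped.yaml <<'EOF'
run:
  soft_probe_prompt_cap: 64
EOF
garak --config capped.yaml --probes lmrc
```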
docs/source/faster.md
Outdated
Aggregation with lower generations
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

When it's not enough to split runs up by probe - perhaps there's a slow probe, or slow model - one can also use aggregation and multiple runs to simulate the effect of the generations parameter.
I think this section may not yet be fully supported in aggregation; currently, each file aggregated is expected to provide a unique set of probes. I believe, and need to test this further, that digest creation and data aggregation will need to be expanded to merge or combine generations for the same probe across multiple report.jsonl files.
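The workflow this section intends can be sketched as two smaller runs whose reports are combined afterwards. Note the caveat raised above: merging generations for the same probe across report.jsonl files may not be fully supported yet, so this is the intended shape, not a guaranteed recipe. Flag names (`-g`/`--generations`, `--report_prefix`) follow garak's CLI; verify against `garak --help`.

```shell
# Hedged sketch: simulate generations=10 with two runs of 5 generations each.
garak --probes dan -g 5 --report_prefix runA
garak --probes dan -g 5 --report_prefix runB
# runA.report.jsonl and runB.report.jsonl would then be fed to the
# aggregation tooling, once same-probe merging is supported.
```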
…p.; give config param paths
add garak docs page with info on running garak faster, incl parallelisation, aggregation, target types, and limits