fix: stop atkgen turn count variation in test relying on fixed turn count #1226

Merged
leondz merged 1 commit into NVIDIA:main from leondz:fix/atkgen_toggle_repetition_tolerance
May 21, 2025
@leondz (Collaborator) commented May 21, 2025

failure

see

cause identified

atkgen stopped early if the target repeated itself. This is more likely when test.Repeat is the target, because the attack model's output is echoed back as its next input, creating a feedback loop. The early stopping caused transient failures in tests that count activity within atkgen.

fix

  • Make this repetition-based early stopping behaviour configurable
  • Disable it in the failing test
  • Recommendation: prefer test.Lipsum for atkgen testing
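
The effect of making the early stop configurable can be sketched with a toy conversation loop. This is a minimal illustration, not garak's code: `run_conversation`, `stop_on_repetition`, `echo_target`, and `stuck_attacker` are hypothetical names, and the stuck attacker is a deterministic stand-in for the feedback loop that only sometimes occurs with a real attack model.

```python
from typing import Callable, List


def run_conversation(
    attacker: Callable[[str], str],
    target: Callable[[str], str],
    max_turns: int = 5,
    stop_on_repetition: bool = True,  # hypothetical toggle, not garak's parameter name
) -> List[str]:
    """Run up to max_turns attacker/target exchanges and return the target replies."""
    replies: List[str] = []
    last_reply = ""
    for _ in range(max_turns):
        prompt = attacker(last_reply)
        reply = target(prompt)
        # Early stop: the target gave exactly the same reply as on the previous turn.
        if stop_on_repetition and replies and reply == replies[-1]:
            break
        replies.append(reply)
        last_reply = reply
    return replies


def echo_target(prompt: str) -> str:
    """Stand-in for an echoing target: returns the prompt unchanged."""
    return prompt


def stuck_attacker(previous_reply: str) -> str:
    """Stand-in attack model that re-emits whatever it last received."""
    return previous_reply or "hello"


if __name__ == "__main__":
    # With early stopping on, the echo loop ends the conversation after one reply.
    print(len(run_conversation(stuck_attacker, echo_target)))
    # With it off, the turn count is fixed at max_turns, which a counting test can rely on.
    print(len(run_conversation(stuck_attacker, echo_target, stop_on_repetition=False)))
```

A test that asserts a fixed number of turns would pass only in the second configuration, which is why the failing test disables the behaviour.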

@leondz added this to the release 0.11.0 milestone May 21, 2025
@leondz requested a review from jmartin-tech May 21, 2025 09:53
@leondz added the bug, probes, and tests labels May 21, 2025
@jmartin-tech (Collaborator) left a comment

This makes sense, thank you for tracking that down.

@leondz merged commit 62c2bb5 into NVIDIA:main May 21, 2025
14 checks passed
@github-actions bot locked and limited conversation to collaborators May 21, 2025