
fix: 100% pass rate should not give 'elevated risk' for relative scores #1579

Merged
jmartin-tech merged 3 commits into NVIDIA:main from leondz:fix/clean_sheet_relative_score_comment
Jan 28, 2026


@leondz leondz commented Jan 28, 2026

A clean-sheet score (100% pass rate, absolute_score 1.0) could still yield an inappropriate relative_comment:

{
    "detector_name": "web_injection.MarkdownExfilContent",
    "detector_descr": "As the basic detector, MarkdownExfilBasic, but did we exfiltrate anything?",
    "absolute_score": 1.0,
    "absolute_defcon": 5,
    "absolute_comment": "minimal risk",
    "relative_score": 0.0,
    "relative_defcon": 5,
    "relative_comment": "elevated risk",
    "detector_defcon": 5,
    "calibration_used": true
}

this PR

  • locks relative_comment to the max level too, instead of locking just relative_defcon & absolute_defcon
  • slightly re-orders logic around comment setting
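
The idea behind these two changes can be sketched roughly as follows. This is a minimal illustration, not garak's actual code: the function names, the Z-score cutoffs, and the comment wording are all assumptions made up for the example. The point it shows is that the comment is looked up only after the clean-sheet lock has been applied, so the DEFCON level and the comment cannot disagree.

```python
# Hypothetical names throughout; this is not garak's actual code.
MAX_DEFCON = 5

# Assumed comment wording, keyed by DEFCON level (1 = worst, 5 = best).
COMMENTS = {
    1: "immediate risk",
    2: "critical risk",
    3: "elevated risk",
    4: "medium risk",
    5: "minimal risk",
}

# Assumed Z-score cutoffs separating the five relative bands.
Z_CUTOFFS = (-1.0, -0.125, 0.125, 1.0)


def zscore_to_defcon(z_score: float) -> int:
    """Map a Z-score to a DEFCON level using the assumed cutoffs."""
    return 1 + sum(z_score >= cutoff for cutoff in Z_CUTOFFS)


def relative_summary(absolute_score: float, z_score: float) -> tuple[int, str]:
    """Return (relative_defcon, relative_comment) for one detector."""
    relative_defcon = zscore_to_defcon(z_score)
    # Clean sheet: a 100% pass rate locks the level to the maximum.
    if absolute_score == 1.0:
        relative_defcon = MAX_DEFCON
    # Derive the comment only after the lock, so it always matches the level.
    return relative_defcon, COMMENTS[relative_defcon]
```

With absolute_score 1.0 and relative_score 0.0, as in the report entry above, this sketch pairs DEFCON 5 with the DEFCON 5 comment instead of the comment for the unlocked Z-score band.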

maybe later we could

  • move calibration.defcon_and_comment() to just calibration.defcon(), or even
  • drop both calibration.defcon_and_comment() and report_digest.map_absolute_score() and have a single score-to-defcon map function referencing a boundary class from analyze.__init__
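
One possible shape for that second follow-up, sketched under assumptions: a small boundary class holding the cutoffs and a single mapping function shared by absolute and relative scores. The class name, the cutoff values, and the function name are hypothetical and do not come from garak.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundaries:
    """Four ascending cutoffs that split a score range into five bands."""
    cutoffs: tuple[float, float, float, float]


# Assumed values for illustration only.
ABSOLUTE_BOUNDS = Boundaries((0.05, 0.4, 0.8, 0.99))
RELATIVE_BOUNDS = Boundaries((-1.0, -0.125, 0.125, 1.0))


def score_to_defcon(score: float, bounds: Boundaries) -> int:
    """Map a score to DEFCON 1 (worst) through 5 (best).

    A score exactly on a cutoff falls into the higher band.
    """
    return bisect_right(bounds.cutoffs, score) + 1
```

Keeping the cutoffs in one place would mean the absolute and relative paths differ only in the boundary object they pass in, which is the kind of consolidation the bullet describes.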

@leondz leondz added the bug Something isn't working label Jan 28, 2026
@leondz leondz added the reporting Reporting, analysis, and other per-run result functions label Jan 28, 2026

@jmartin-tech jmartin-tech left a comment


Looks good; the follow-on ideas also make sense to consider in a future iteration.


@erickgalinkin erickgalinkin left a comment


lgtm

@jmartin-tech jmartin-tech merged commit 52c2ef7 into NVIDIA:main Jan 28, 2026
15 checks passed
@github-actions github-actions bot locked and limited conversation to collaborators Jan 28, 2026

Labels

bug: Something isn't working
reporting: Reporting, analysis, and other per-run result functions


3 participants