Detector/Evaluator output indexing mismatch in multi-turn conversations #1430

@saichandrapandraju

There's a mismatch between what detectors return and what evaluators expect when dealing with multi-turn conversations:

  • Detectors are expected to return a list of length len(all_outputs) (all assistant turns in the conversation)
  • Evaluators index into attempt.outputs (only the last assistant turn's output)

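The issue manifests in garak/evaluators/base.py:81, where messages.append(attempt.outputs[idx]) assumes that attempt.outputs is aligned index-for-index with the detector results. However, the detector results have length len(attempt.all_outputs), which is greater than len(attempt.outputs) in a multi-turn setting, so idx can point at the wrong output or run past the end of attempt.outputs.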
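To make the mismatch concrete, here is a minimal, self-contained sketch. It does not use garak's real classes: `FakeAttempt`, `toy_detector`, and `collect_hits` are hypothetical stand-ins, and the sketch assumes a single generation per attempt, with `all_outputs` covering every assistant turn and `outputs` covering only the last one, as described above.

```python
from dataclasses import dataclass


@dataclass
class FakeAttempt:
    """Hypothetical stand-in for an attempt; NOT garak's real Attempt class."""

    turns: list  # list of (role, text) tuples in conversation order

    @property
    def all_outputs(self):
        # Every assistant turn in the conversation.
        return [text for role, text in self.turns if role == "assistant"]

    @property
    def outputs(self):
        # Only the last assistant turn (assumed single generation).
        return self.all_outputs[-1:]


def toy_detector(attempt):
    # Hypothetical detector: one score per assistant turn.
    return [1.0 if "unsafe" in text else 0.0 for text in attempt.all_outputs]


def collect_hits(attempt, scores, threshold=0.5):
    # Mirrors the indexing pattern in question: detector-score positions
    # are used to index attempt.outputs.
    messages = []
    for idx, score in enumerate(scores):
        if score >= threshold:
            messages.append(attempt.outputs[idx])
    return messages


# Case 1: the hit is on the first assistant turn.
# idx 0 silently resolves to the LAST turn's text instead.
early_hit = FakeAttempt(
    turns=[
        ("user", "q1"),
        ("assistant", "unsafe reply"),
        ("user", "q2"),
        ("assistant", "fine"),
    ]
)
wrong_message = collect_hits(early_hit, toy_detector(early_hit))

# Case 2: the hit is on the second assistant turn.
# idx 1 is out of range for a one-element outputs list.
late_hit = FakeAttempt(
    turns=[
        ("user", "q1"),
        ("assistant", "fine"),
        ("user", "q2"),
        ("assistant", "unsafe reply"),
    ]
)
try:
    collect_hits(late_hit, toy_detector(late_hit))
    raised_index_error = False
except IndexError:
    raised_index_error = True
```

Under these assumptions, the first case attributes the hit to the wrong message and the second raises IndexError. Indexing attempt.all_outputs instead, or having detectors score only attempt.outputs, would restore alignment in this sketch; which of the two matches garak's intended design is the open question here.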

Is this the expected behavior?

Labels: question (Further information is requested)
