Fix mmlu bpb bug only scoring answer=A questions by OyvindTafjord · Pull Request #718 · allenai/OLMo

OyvindTafjord · 2024-09-06T16:13:59Z

This is the same problem that #712 fixed for the oe-eval tasks; I forgot that MMLU has separate handling.

With the previous code, only questions whose gold answer is A were counted in the bpb evaluations. With this change, all questions should be counted.
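To make the described failure mode concrete, here is a minimal, self-contained sketch. It is not the OLMo code: the record layout, `bits_per_byte`, and `mean_bpb` are hypothetical names invented for illustration, and the log-likelihood values are made up. It assumes bpb is computed as the negative log-likelihood of the gold continuation (in nats) divided by its byte length times ln 2, and shows how a filter that only keeps label 0 shrinks the set of scored questions.

```python
import math


def bits_per_byte(logprob_nats: float, num_bytes: int) -> float:
    """Convert a summed log-likelihood in nats into bits per byte."""
    return -logprob_nats / (num_bytes * math.log(2))


# Hypothetical questions: gold label index (0 = A, 1 = B, ...), the summed
# log-likelihood of the gold continuation, and that continuation's byte length.
questions = [
    {"label": 0, "gold_logprob": -2.0, "gold_bytes": 4},
    {"label": 1, "gold_logprob": -6.0, "gold_bytes": 4},
    {"label": 2, "gold_logprob": -3.0, "gold_bytes": 6},
    {"label": 0, "gold_logprob": -1.0, "gold_bytes": 2},
]


def mean_bpb(records, only_label_zero: bool):
    """Average bpb over scored questions; also return how many were scored.

    only_label_zero=True mimics the behavior described in this PR, where
    only questions with gold answer A end up contributing to the metric.
    """
    scored = [r for r in records if not only_label_zero or r["label"] == 0]
    values = [bits_per_byte(r["gold_logprob"], r["gold_bytes"]) for r in scored]
    return sum(values) / len(values), len(scored)


buggy_bpb, buggy_count = mean_bpb(questions, only_label_zero=True)
fixed_bpb, fixed_count = mean_bpb(questions, only_label_zero=False)
print(f"answer=A only: {buggy_count} questions scored, bpb={buggy_bpb:.3f}")
print(f"all questions: {fixed_count} questions scored, bpb={fixed_bpb:.3f}")
```

In this toy data the two averages differ because the questions with gold answers B and C are dropped in the first case; the real effect on MMLU would depend on the model and the label distribution.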

liujch1998 · 2024-09-06T16:35:05Z

Thanks Oyvind! Approving this PR.

That said, it seems to me that not resetting the label to 0 is fine for MMLU. MMLU's prep_examples(), which is inherited from ICLMultiChoiceTaskDataset, does not skip cases where label_id and cont_id mismatch when metric=bpb (whereas OEEvalTask does), and ICLMetric.compute() does not use label_id when metric=bpb. So questions with any label_id should already have been included in the metric computation.

Fix mmlu bpb bug only scoring answer=A questions

138cfe9

OyvindTafjord requested review from epwalsh and liujch1998 September 6, 2024 16:15

liujch1998 approved these changes Sep 6, 2024

epwalsh approved these changes Sep 6, 2024

OyvindTafjord merged commit 0b92077 into main Sep 6, 2024

OyvindTafjord deleted the ot-fix-mmlu-bpb branch September 6, 2024 17:42
