[Feature] Added Predicted Win Probabilities to CompassArenaBradleyTerrySummarizer#1815

Merged
acylam merged 1 commit into open-compass:main from acylam:add_bt_predicted_win_rates on Jan 10, 2025

@acylam acylam commented Jan 10, 2025

Motivation

Added an option to report the predicted win rates instead of ELO scale ratings in CompassArenaBradleyTerrySummarizer

Modification

  • Added an option to report the predicted win rates (win probabilities against the baseline model) instead of the ELO scale ratings.
  • report_pred_win_rates is set to True by default, so the summarize method returns the predicted win rates to downstream processes.
  • Both the predicted win rates and the ELO scale ratings are saved in the summary files regardless of the value of report_pred_win_rates. The parameter only affects what the summarize method returns.
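
As a rough illustration of the relationship between the two metrics, the sketch below converts ELO scale ratings into win probabilities against a baseline and then picks which one to return based on the flag. It assumes the conventional Elo logistic form with base 10 and scale 400; the helper names predicted_win_rate and select_reported_scores and the example ratings are hypothetical and are not taken from the summarizer's code, which may use different constants.

```python
from typing import Dict


def predicted_win_rate(rating: float, baseline_rating: float,
                       scale: float = 400.0, base: float = 10.0) -> float:
    """Elo-style expected win probability of a model against the baseline.

    scale=400 and base=10 are the conventional Elo constants (an assumption
    here, not confirmed from the summarizer implementation).
    """
    return 1.0 / (1.0 + base ** ((baseline_rating - rating) / scale))


def select_reported_scores(elo_ratings: Dict[str, float], baseline: str,
                           report_pred_win_rates: bool = True) -> Dict[str, float]:
    """Hypothetical helper mirroring the flag: both metrics exist,
    the flag only chooses which one is returned."""
    baseline_rating = elo_ratings[baseline]
    win_rates = {
        name: predicted_win_rate(rating, baseline_rating)
        for name, rating in elo_ratings.items()
    }
    return win_rates if report_pred_win_rates else dict(elo_ratings)


# Made-up ratings, for illustration only.
ratings = {"baseline": 1000.0, "model_a": 1100.0, "model_b": 900.0}
print(select_reported_scores(ratings, "baseline"))
print(select_reported_scores(ratings, "baseline", report_pred_win_rates=False))
```

Under this form the baseline always scores 0.5 against itself, and models rated above the baseline score above 0.5.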

BC-breaking (Optional)

CompassArenaBradleyTerrySummarizer now defaults to returning the predicted win rates instead of the ELO scale ratings it returned previously. To switch the output back to ELO ratings, set report_pred_win_rates to False.

Use cases (Optional)

Perform subjective evaluation with the updated evaluation config:

opencompass configs/eval_subjective_bradleyterry.py --mode=all

…s with an option to switch between win rates and elo ratings
@acylam acylam merged commit 7f2aeef into open-compass:main Jan 10, 2025
stephen-nju pushed a commit to stephen-nju/opencompass that referenced this pull request May 14, 2025
…s with an option to switch between win rates and elo ratings (open-compass#1815)
zyc140345 pushed a commit to zyc140345/opencompass that referenced this pull request Oct 23, 2025
…s with an option to switch between win rates and elo ratings (open-compass#1815)
iamkaia pushed a commit to iamkaia/opencompass that referenced this pull request Feb 4, 2026
…s with an option to switch between win rates and elo ratings (open-compass#1815)