Hi,
In your current implementation of change_scores.py, you load the dataset with a prompt column. However, neither the original hh_rlhf_harmless dataset nor the version reformatted with your provided code has such a column. How exactly did you evaluate the change scores?