In the paper, the score is defined as follows. Is it calculated by summing the log-probability of each token of the ground-truth y conditioned on (e, x)? If so, the score would not lie between 0 and 1 as shown in the picture. In that case, what threshold is used to identify positive and negative samples? Appreciate it!
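
To make the question concrete, here is a minimal sketch of how I assume the score might be computed. This is my guess, not the paper's confirmed method. The per-token probabilities are made-up illustrative values, and the helper names `sequence_logprob` and `sequence_prob` are hypothetical. It shows that a sum of token log-probabilities is never positive, while exponentiating that sum gives a value in (0, 1].

```python
import math

def sequence_logprob(token_logprobs):
    """Sum of per-token log-probabilities, i.e. log p(y | e, x). Always <= 0."""
    return sum(token_logprobs)

def sequence_prob(token_logprobs):
    """Exponentiated sum, i.e. p(y | e, x). Always in (0, 1]."""
    return math.exp(sequence_logprob(token_logprobs))

# Hypothetical log-probs for a 3-token ground-truth y (illustrative values only).
token_logprobs = [math.log(0.9), math.log(0.5), math.log(0.8)]

log_score = sequence_logprob(token_logprobs)   # negative number
prob_score = sequence_prob(token_logprobs)     # product of the token probabilities

print(f"sum of log-probs: {log_score:.4f}")
print(f"exp of the sum:   {prob_score:.4f}")
```

If the plotted score is the exponentiated (or otherwise normalized) version, a threshold in [0, 1] would make sense; if it is the raw sum, it would not.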

