Different scores for one sentence: sentence_bleu() vs. command-line --sentence-level

**SacreBLEU version:** 1.4.12
**Python:** 3.7

**Issue:** For a single sentence, Python's `sentence_bleu()` and the command line with the `--sentence-level` flag (or `-sl`) give different scores.
**Question:** How can I make `sentence_bleu()` in Python return the same score as the command line with `--sentence-level` (or `-sl`)?

## From the Command Line

```
echo "Producția de zahăr primă va fi exprimată în ceea ce privește zahărul alb;" | sacrebleu -sl <(echo "producţia de zahăr brut se exprimă în zahăr alb;")
```
**Output:**
```
BLEU+case.mixed+numrefs.1+smooth.exp+tok.13a+version.1.4.12 = 8.5 35.7/15.4/4.2/2.3 (BP = 1.000 ratio = 1.400 hyp_len = 14 ref_len = 10)
```

## From Python
**Using `sentence_bleu()`**
```
import sacrebleu

ref = "producţia de zahăr brut se exprimă în zahăr alb;"
sys = "Producția de zahăr primă va fi exprimată în ceea ce privește zahărul alb;"

# Score one hypothesis against one reference
bleu = sacrebleu.sentence_bleu(sys, ref)
print(bleu.score)
```
**Output:**
```
0.0
```
**Using `corpus_bleu()`**
```
import sacrebleu

# One reference stream containing one sentence, and one system output
ref = [["producţia de zahăr brut se exprimă în zahăr alb;"]]
sys = ["Producția de zahăr primă va fi exprimată în ceea ce privește zahărul alb;"]

bleu = sacrebleu.corpus_bleu(sys, ref)
print(bleu.score)
```
**Output:**
```
8.493098745313148
```
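
**Possible cause (unconfirmed):** The command-line signature above reports `smooth.exp`, and the third and fourth precisions it prints (4.2 and 2.3) look like smoothed values, not real matches. One plausible explanation is that `sentence_bleu()` uses a different default smoothing than the command line, so a zero trigram or 4-gram count drives the score to 0.0. The sketch below is a plain-Python reimplementation of single-reference BLEU, not SacreBLEU's code. It assumes the sentences are already tokenized with `;` split off (which matches the reported `hyp_len = 14` and `ref_len = 10`), and the function name `sentence_bleu_sketch` and the `smooth_exp` switch are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu_sketch(hyp_tokens, ref_tokens, smooth_exp, max_order=4):
    """Single-reference BLEU on pre-tokenized input, as a score in [0, 100].

    smooth_exp=True replaces each zero match count with 1 / (2^k * total),
    where k counts the zero-match orders seen so far (exponential smoothing).
    smooth_exp=False applies no smoothing, so any zero count yields 0.0.
    """
    log_sum = 0.0
    smooth = 1.0
    for n in range(1, max_order + 1):
        hyp_counts = ngrams(hyp_tokens, n)
        ref_counts = ngrams(ref_tokens, n)
        total = sum(hyp_counts.values())
        if total == 0:
            return 0.0  # simplification: hypothesis shorter than n tokens
        # Clipped matches: each hypothesis n-gram counts at most as often
        # as it occurs in the reference.
        correct = sum((hyp_counts & ref_counts).values())
        if correct == 0:
            if not smooth_exp:
                return 0.0
            smooth *= 2
            precision = 1.0 / (smooth * total)
        else:
            precision = correct / total
        log_sum += math.log(precision)
    # Brevity penalty only applies when the hypothesis is shorter.
    if len(hyp_tokens) >= len(ref_tokens):
        bp = 1.0
    else:
        bp = math.exp(1 - len(ref_tokens) / len(hyp_tokens))
    return 100 * bp * math.exp(log_sum / max_order)


# Hand-tokenized versions of the two sentences from this report
hyp_tokens = "Producția de zahăr primă va fi exprimată în ceea ce privește zahărul alb ;".split()
ref_tokens = "producţia de zahăr brut se exprimă în zahăr alb ;".split()

print(round(sentence_bleu_sketch(hyp_tokens, ref_tokens, smooth_exp=True), 1))   # 8.5
print(sentence_bleu_sketch(hyp_tokens, ref_tokens, smooth_exp=False))            # 0.0
```

With exponential smoothing the sketch reproduces the 8.5 from the command line and `corpus_bleu()`; without smoothing it gives the 0.0 seen from `sentence_bleu()`. If the installed version of `sentence_bleu()` accepts a `smooth_method` argument, passing `'exp'` may be worth trying, but that should be checked against the installed version.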
