Some fixes staged for 1.5 by ozancaglayan · Pull Request #133 · mjpost/sacrebleu

ozancaglayan · 2021-01-12T17:27:22Z

Hello,

I always end up doing this in a sub-optimal way: this PR bundles changes that should probably have been separate PRs. Maybe we can merge without squashing this time?

`--short` is now working correctly with TER

Previously, the default setting of `floor` smoothing behaved as if no smoothing were applied (#98). This commit changes the default to 0.1, as in the original paper (cited in the code) (#129).
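To illustrate why a floor value of 0.0 behaves like no smoothing, here is a minimal sketch of floor smoothing over n-gram match counts. It is not the sacreBLEU implementation: the function name `floor_smoothed_bleu` and the toy counts are assumptions made up for this example, and the brevity penalty is simply passed in.

```python
import math

def floor_smoothed_bleu(correct, total, smooth_value, brevity_penalty=1.0):
    """Geometric mean of n-gram precisions with floor smoothing.

    Illustrative only: `correct` and `total` hold per-order match and
    candidate n-gram counts, and every entry of `total` is assumed > 0.
    """
    log_sum = 0.0
    for matches, count in zip(correct, total):
        if matches == 0:
            # Floor smoothing: replace a zero match count with smooth_value.
            matches = smooth_value
        if matches == 0:
            # A floor of 0.0 leaves the precision at zero, so the whole
            # geometric mean collapses to zero (same as no smoothing).
            return 0.0
        log_sum += math.log(matches / count)
    return 100.0 * brevity_penalty * math.exp(log_sum / len(correct))

# Toy sentence-level counts with no 3-gram or 4-gram matches.
correct = [3, 1, 0, 0]
total = [5, 4, 3, 2]

score_no_floor = floor_smoothed_bleu(correct, total, smooth_value=0.0)
score_floor = floor_smoothed_bleu(correct, total, smooth_value=0.1)
print(score_no_floor, score_floor)
```

With the old default the score is zero whenever any n-gram order has no matches, whereas a floor of 0.1 keeps the score positive.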

compat: preserve raw_corpus_bleu() behavior by setting its floor default to 0

Since the previous commits also changed the default value of 'floor' smoothing from 0.0 to 0.1, users of the compat API will already see a behavior change in their future sentence BLEU scores. This commit also prioritizes behavioral consistency between the API and the CLI (#98) by changing the default smoothing method of 'sentence_bleu()' to 'exp', as in the CLI. Because of this difference, scores from the API and from the sacreBLEU utility with the -sl option were never comparable. Users who want to preserve comparability with their previously computed sentence BLEU scores should therefore call this method with smooth_method='floor' and smooth_value=0.0 in the future.
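For users who need the old sentence-level behavior, the call suggested above might be wrapped as follows. This is a sketch, assuming sacreBLEU >= 1.5 is installed and that `sentence_bleu()` accepts the `smooth_method` and `smooth_value` keyword arguments named in this commit message; the wrapper name `legacy_sentence_bleu` and the constant `LEGACY_SMOOTHING` are hypothetical.

```python
# Smoothing settings that the commit message recommends for comparability
# with sentence BLEU scores computed before sacreBLEU 1.5.
LEGACY_SMOOTHING = {"smooth_method": "floor", "smooth_value": 0.0}

def legacy_sentence_bleu(hypothesis, references):
    """Score one hypothesis string against a list of reference strings.

    Hypothetical helper: it only forwards the legacy smoothing settings.
    """
    # Imported lazily so this module loads even where sacreBLEU is absent.
    import sacrebleu
    return sacrebleu.sentence_bleu(hypothesis, references, **LEGACY_SMOOTHING)
```

Keeping the settings in one named constant makes it harder to mix old-style and new-style scores by accident.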

Use GitHub links for IWSLT test/dev sets. Also, perform an MD5 check if dataset.py is launched from the CLI, i.e. 'python sacrebleu/dataset.py'.

martinpopel

Great. Thanks. Just a minor question/suggestion on moving __repr__ to BaseScorer.

mjpost

LGTM, pending the small comments.

mjpost · 2021-01-12T21:07:06Z

                    ref_streams,
-                    smooth_value=None) -> BLEUScore:
+                    smooth_value=0) -> BLEUScore:
    """Convenience function that wraps corpus_bleu().


I think we should factor out the 0 here into a global constant, e.g., SMOOTH_VALUE_DEFAULT.

The defaults for BLEU are already defined in BLEU.SMOOTH_DEFAULTS. The 0 is hard-coded here to preserve backward compatibility with the raw_corpus_bleu of older sacreBLEU versions. But since we're breaking the API here anyway, I can revert this to None so that it falls back to the new internal default of 0.1?

The alternative is to remove this "convenience" function completely and expect people to use corpus_bleu with arguments that they'd like to use.

So raw_corpus_bleu hard-codes a smooth_method of floor. Since the API allows the value to be changed, I would suggest that instead of smooth_value=0, we use smooth_value=BLEU.SMOOTH_DEFAULTS["floor"]. That way the value is explicit, instead of potentially getting lost here. What do you think of this?

I'd like to keep the convenience function. I think Sockeye might be using it.

smooth_value=BLEU.SMOOTH_DEFAULTS["floor"] is equivalent to (although more explicit than) the unmodified signature, which was smooth_value=None (a None fetches the default from that very same dictionary internally in the BLEU class).

The question here is whether we want sacreBLEU >= 1.5 to give the same sentence bleu scores as sacreBLEU < 1.5 when using this function, or not. I think we don't care as of now since we already introduced other behavior differences as well.

So I'll make this BLEU.SMOOTH_DEFAULTS["floor"] to be more readable.

The reason I brought up this function is that it again falls back to floor, although all other sentence-BLEU methods are now synchronized to use exp smoothing. The choices here are kind of weird =)
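The None-versus-explicit-default point in this thread can be sketched as below. This is not sacreBLEU code: `SMOOTH_DEFAULTS` is a stand-in for BLEU.SMOOTH_DEFAULTS holding only the 'floor' entry mentioned above, and `resolve_smooth_value` and `raw_corpus_bleu_smooth_value` are hypothetical names.

```python
from typing import Optional

# Stand-in for BLEU.SMOOTH_DEFAULTS; only the 'floor' value (0.1 after this
# PR) comes from the discussion, the rest of the real dictionary is omitted.
SMOOTH_DEFAULTS = {"floor": 0.1}

def resolve_smooth_value(smooth_method: str,
                         smooth_value: Optional[float] = None) -> Optional[float]:
    """Return the caller's value, or the method's default when it is None."""
    # Compare against None rather than testing truthiness, so that an
    # explicit 0.0 from the caller is not replaced by the default.
    if smooth_value is None:
        return SMOOTH_DEFAULTS.get(smooth_method)
    return smooth_value

def raw_corpus_bleu_smooth_value(
        smooth_value: Optional[float] = SMOOTH_DEFAULTS["floor"]) -> Optional[float]:
    """Mimics a signature whose default is spelled out explicitly."""
    return resolve_smooth_value("floor", smooth_value)

print(raw_corpus_bleu_smooth_value())      # explicit default in the signature
print(raw_corpus_bleu_smooth_value(None))  # None falls back to the same value
print(raw_corpus_bleu_smooth_value(0.0))   # caller-supplied value is preserved
```

Both spellings resolve to the same number; the explicit default only makes the value visible in the signature.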

mjpost · 2021-01-14T14:59:07Z

I think I'm happy with merging this, subject to the few stray comments. Does one of you want to handle this, or do you prefer I do it? I think we should do a squash merge per our earlier conversation, but take care to create a tidy commit message.

ozancaglayan · 2021-01-14T16:54:02Z

I don't know how to merge the other variable-bleu PR into this. If you want, we can merge this one first, and then rebase the other PR, add the Changelog entry and merge afterwards?

mjpost · 2021-01-15T14:27:59Z

Sure, that's fine. It doesn't particularly matter to me.

ozancaglayan added 14 commits January 12, 2021 13:41

TER: Correctly handle the --short option (#131)

1a14695

metrics: add __repr__() for bleu and ter

c069c12

sacrebleu: use correct method with sacrelogger

19ca228

Update Changelog

7c6e652

Update docstrings for bleu methods

4a28139

BLEU: Change default value for floor smoothing to 0.1

c183417

compat: preserve raw_corpus_bleu() behavior by setting its floor default to 0

c52762b

test: Check raw_corpus_bleu arguments

9c62396

compat: update docstrings, remove unused import

0711fb1

Bleu: add smoothing value to signature (#98)

4d344b1

Bump to 1.5, Update Changelog

914a43f

dataset: Fix IWSLT links (#128)

3b770e2

update Changelog

c88e960

ozancaglayan requested review from martinpopel and mjpost January 12, 2021 17:27

martinpopel approved these changes Jan 12, 2021

Comment thread sacrebleu/metrics/bleu.py

mjpost reviewed Jan 12, 2021

mjpost mentioned this pull request Jan 12, 2021

Allow variable number of references for BLEU via API #132

Merged

ozancaglayan added 2 commits January 13, 2021 12:59

API: move __repr__() to BaseScore

540cfe1

Change version to 1.5.0, update Changelog according to reviews

8c4996b

ozancaglayan added 2 commits January 14, 2021 19:57

compat: let raw_corpus_bleu() use the floor default value

c116136

test_bleu: fix test case for the floor param changes

90eeedc

ozancaglayan merged commit 907e9fb into master Jan 18, 2021

ozancaglayan deleted the fixes-2021 branch February 18, 2021 08:00
