Skip to content

Allow evaluation on concatenation of multiple test sets#42

Merged
mjpost merged 8 commits intomjpost:masterfrom
martinpopel:testset-concat
Sep 4, 2019
Merged

Allow evaluation on concatenation of multiple test sets#42
mjpost merged 8 commits intomjpost:masterfrom
martinpopel:testset-concat

Conversation

@martinpopel
Copy link
Copy Markdown
Collaborator

This PR is just about the last commit (2c0e788), the previous commits are part of #41 (and should be discussed there).

I used bash for the test to keep consistency with the existing test.sh and also because I can test few other things (e.g. #31) more easily than with pytest.
I think @cfedermann is considering to rewrite all the tests into Python.

ArgumentParser deletes all newlines are multiple spaces by default.

The script is now called "sacrebleu", not "sacreBLEU"
and most users should have it installed in PATH, so "./" is not needed.
… sets.

The list was included twice in the "sacrebleu -h" output.
The list is getting quite long, so when printed on single line
it is not much readable anyway.
I think a separate --list option is better (it prints 1 test set per line).

However, if anyone (@mjpost) wants to print the list within sacreble -h,
I suggest to use the ArgumentParser's epilog.

I know that `choices=DATASETS.keys()` does also validation
in addition to documentation, but this functionality is already included
(again, with a nicer multi-line listing of available test sets).
`args.langpair is None or args.langpair not in DATASETS[args.test_set])`
this was unnecessarily duplicated.
Also, I think the new structure is more readable.
SacreBLEU is not versioned in sockeye_contrib anymore.

While `sacrebleu` with no arguments still prints all the test set names,
it prints also the error message
`sacreBLEU: I need either a predefined test set (-t) or a list of references'
and ends with non-zero exit code.
Using `sacrebleu --list` seems a tiny bit more user friendly.
@mjpost
Copy link
Copy Markdown
Owner

mjpost commented Sep 1, 2019

This looks very nice. Can you merge on master so I can see the changes better here?

@martinpopel
Copy link
Copy Markdown
Collaborator Author

martinpopel commented Sep 3, 2019

Anything else I should do?

I tried to separate each PR into several commits which are easier to review and understand their motivation based on the commit log. The later PRs are based on the earlier ones, so you can either merge the oldest PRs first or you can merge just the last PR (#44) which includes all the commits, if you consider everything OK.

@mjpost
Copy link
Copy Markdown
Owner

mjpost commented Sep 4, 2019

Sorry, just returned from lots of traveling, trying to make my way through things. I'd prefer to do these one by one and remerge on master after each one, if you don't mind. Thanks very much for your patience! I will commit to looking thoroughly at one of these each night this week.

# which do not currently support multiple references, so the example is hypothetical.
if args.test_set is None:
concat_ref_files = [args.refs]
else:
Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't right—we do have multiple references for some test sets, see wmt17/tworefs.

Copy link
Copy Markdown
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there anything to say here? I think this is the only blocker on this one.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are right, I checked that -t wmt16/tworefs,wmt17/tworefs -l en-fi works as expected and deleted the misleading comment.
(I had thought that multiple refs can be only specified as multiple files on the command line or a single file with tabs, but now I see that it is supported also for the internal test sets.)

`-t wmt16/tworefs,wmt17/tworefs -l en-fi` works as expected.
@mjpost mjpost merged commit 6146c8e into mjpost:master Sep 4, 2019
@martinpopel martinpopel deleted the testset-concat branch September 4, 2019 18:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants