Wer tracker #414

Merged
mpariente merged 7 commits into asteroid-team:master from JorisCos:wer_tracker
Feb 2, 2021

@JorisCos
Collaborator

About this PR

  • This PR makes it possible to keep track of the transcriptions made by the ASR models.

The file containing the transcriptions is a .json file that looks like this:
[Screenshot: all_trans_ex]

  • This PR also applies a jiwer transformation to the transcriptions before the WER is computed. It removes punctuation and converts everything to lowercase, which leads to a more accurate WER.
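
To illustrate why this normalization matters, here is a minimal sketch that uses only the standard library. It is not the code from this PR and does not call jiwer: `normalize` and `word_error_rate` are hypothetical helpers that approximate lowercasing plus punctuation removal (ASCII punctuation only) and a word-level edit distance.

```python
import string
from typing import List


def normalize(sentence: str) -> List[str]:
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    lowered = sentence.lower()
    no_punct = lowered.translate(str.maketrans("", "", string.punctuation))
    return no_punct.split()


def word_error_rate(reference: List[str], hypothesis: List[str]) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must contain at least one word")
    # previous[j] is the edit distance between the reference words seen
    # so far (before the current one) and hypothesis[:j].
    previous = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(reference)


reference = "Hello, world. How are you?"
hypothesis = "hello world how are you"

# Without normalization, casing and punctuation count as word errors.
raw_wer = word_error_rate(reference.split(), hypothesis.split())
# With normalization, the two sentences contain the same words.
normalized_wer = word_error_rate(normalize(reference), normalize(hypothesis))
print(raw_wer, normalized_wer)
```

In this toy case the ASR output is correct word for word, yet the raw score penalizes four of the five reference words for casing or punctuation alone.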

Collaborator

@mpariente mpariente left a comment

The estimates seem a bit off 😂 but this was clearly missing, thanks!

@JorisCos
Collaborator Author

Haha the estimates are wrong because the model was trained on noisy inputs and the enhancement is making things worse...

@mpariente
Collaborator

Let's include the wav normalization in the WERTracker class?

@popcornell
Collaborator

What is the ID field in the .json annotation?

@JorisCos
Collaborator Author

JorisCos commented Jan 26, 2021

> Let's include the wav normalization in the WERTracker class?

We can, but we have to do it anyway before saving the files in eval.py.

> What is the ID field in the .json annotation?

In general, the ID is something we introduced for LibriMix to match transcriptions and wav files. For this specific screenshot, the annotations and IDs are taken from CHIME 4. (I will open a PR soon.)

@popcornell
Collaborator

> In general, the ID is something we introduced for LibriMix to match transcriptions and wav files. For this specific screenshot, the annotations and IDs are taken from CHIME 4. (I will open a PR soon.)

Maybe call them UtteranceID or ExampleID? We might also need speaker IDs.

@mpariente
Collaborator

Also, please make the JSON field names all lowercase: "text_0", "utt_id_0", etc.
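
As a rough sketch of this request (not the code from the PR), the snippet below lowercases the keys of one transcription record before it is serialized. Only the target names "text_0" and "utt_id_0" come from this thread; the mixed-case input keys, the placeholder values, and the helper `lowercase_keys` are assumptions made for illustration.

```python
import json


def lowercase_keys(record: dict) -> dict:
    """Return a copy of the record with every key lowercased."""
    return {key.lower(): value for key, value in record.items()}


# Hypothetical record: the original key casing and the values are made up.
record = {"Text_0": "placeholder transcription", "Utt_ID_0": "placeholder-id"}

normalized = lowercase_keys(record)
# Serialize the way a transcription file might be written.
serialized = json.dumps([normalized], indent=2)
print(serialized)
```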

self.mix_counter = Counter()
self.clean_counter = Counter()
self.est_counter = Counter()
self.transformation = jiwer.Compose([jiwer.ToLowerCase(), jiwer.RemovePunctuation()])
Collaborator

Is this transformation enough?
The default is

[<jiwer.transforms.RemoveMultipleSpaces at 0x7fbc79a75df0>,
 <jiwer.transforms.Strip at 0x7fbc79a75e20>,
 <jiwer.transforms.SentencesToListOfWords at 0x7fbc79a75f10>,
 <jiwer.transforms.RemoveEmptyStrings at 0x7fbc7aa17bb0>]

Collaborator Author

When I tested on CHIME4, these were the two that made a difference, but you are right, let's add the others. It doesn't cost that much anyway.
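
To show what the extra steps contribute, the sketch below chains plain-Python stand-ins for the two transformations from the PR and the four default ones listed above. It does not use jiwer: `compose` and the step functions are hypothetical approximations, and the order of the steps is a choice made for this example.

```python
import re
import string


def compose(steps):
    """Build a function that applies each step in order."""
    def apply(value):
        for step in steps:
            value = step(value)
        return value
    return apply


def to_lower(text):
    return text.lower()


def remove_punctuation(text):
    # ASCII punctuation only, which is an approximation.
    return text.translate(str.maketrans("", "", string.punctuation))


def remove_multiple_spaces(text):
    return re.sub(r" {2,}", " ", text)


def strip(text):
    return text.strip()


def sentence_to_words(text):
    # Split on a single space so the effect of the cleanup steps is visible.
    return text.split(" ")


def remove_empty_strings(words):
    return [word for word in words if word]


two_steps = compose([to_lower, remove_punctuation, sentence_to_words])
all_steps = compose([
    to_lower,
    remove_punctuation,
    remove_multiple_spaces,
    strip,
    sentence_to_words,
    remove_empty_strings,
])

sentence = "  Hello ,  world !  "
short_result = two_steps(sentence)
full_result = all_steps(sentence)
print(short_result)
print(full_result)
```

With only the two original steps, stray whitespace leaves empty strings in the word list, and each of those would count as a word when the error rate is computed.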

Comment on lines +298 to +299
def all_transcriptions(self):
    return dict(transcriptions=self.transcriptions)
Collaborator

I don't really see the point of a dict with a single field that just wraps the list.
I'd remove this method entirely.

remove all_transcriptions method
@mpariente mpariente mentioned this pull request Feb 2, 2021
@mpariente
Collaborator

/lint

@mpariente mpariente merged commit cc2602e into asteroid-team:master Feb 2, 2021