mpariente left a comment
The estimates seem a bit off 😂 but this was clearly missing, thanks!
Haha, the estimates are wrong because the model was trained on noisy inputs and the enhancement is making things worse...
Should we include the wav normalization in the WERTracker class?
What is the ID field in the .json annotation?
We can but we have to do it anyway before saving the files in
In general, the ID is something we introduced for librimix to match transcriptions with wav files. For this specific screenshot, the annotations and IDs are taken from CHIME 4. (I will open a PR soon.)
Maybe call them UtteranceID or ExampleID? We might also need speaker IDs.
Also, please make all the fields in the JSON lower case: "text_0", "utt_id_0", etc.
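As a rough illustration of the lower-case naming requested above, here is a minimal sketch that normalizes the keys of one annotation entry. Only the key names "text_0" and "utt_id_0" come from this thread; the helper name `lowercase_keys`, the mixed-case input keys, and the example values are hypothetical and are not taken from the PR.

```python
def lowercase_keys(annotation: dict) -> dict:
    """Return a copy of one annotation entry with every key lower-cased.

    Hypothetical helper for illustration only; it is not part of asteroid.
    """
    return {key.lower(): value for key, value in annotation.items()}


# Made-up entry: the keys and values below are placeholders, not real data.
entry = {"Text_0": "placeholder transcription", "Utt_ID_0": "placeholder_id"}
normalized = lowercase_keys(entry)
print(sorted(normalized))
```

Doing this once, when the annotation file is written, would keep the reader code free of case handling.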
asteroid/metrics.py
Outdated
    self.mix_counter = Counter()
    self.clean_counter = Counter()
    self.est_counter = Counter()
    self.transformation = jiwer.Compose([jiwer.ToLowerCase(), jiwer.RemovePunctuation()])
Is this transformation enough?
The default is
[<jiwer.transforms.RemoveMultipleSpaces at 0x7fbc79a75df0>,
<jiwer.transforms.Strip at 0x7fbc79a75e20>,
<jiwer.transforms.SentencesToListOfWords at 0x7fbc79a75f10>,
<jiwer.transforms.RemoveEmptyStrings at 0x7fbc7aa17bb0>]
When I tested on CHIME4, these were the two that made a difference, but you are right, let's add the others. It doesn't cost much anyway.
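To make the combined effect of the transforms discussed above concrete, here is a plain-Python sketch of the same sequence of steps: lower-casing, punctuation removal, whitespace cleanup, splitting into words, and dropping empty strings. It is a stand-in written with the standard library only, not jiwer's implementation. The function name `normalize_transcript` is hypothetical, and the punctuation step here only covers ASCII punctuation, which may differ from what jiwer removes.

```python
import string
from typing import List

# Translation table that deletes ASCII punctuation (assumption: ASCII only).
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize_transcript(text: str) -> List[str]:
    """Normalize a transcription into a list of words.

    Hypothetical stand-in mirroring the steps named in this thread;
    it does not call jiwer.
    """
    text = text.lower()                  # lower-case everything
    text = text.translate(_PUNCT_TABLE)  # remove punctuation
    words = text.split()                 # collapse repeated spaces, strip, split
    return [w for w in words if w]       # drop empty strings (defensive)


print(normalize_transcript("  Hello,   World!  "))
```

With this ordering, a reference and an estimate that differ only in casing, punctuation, or spacing normalize to the same word list, so those differences would not count as word errors.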
asteroid/metrics.py
Outdated
    def all_transcriptions(self):
        return dict(transcriptions=self.transcriptions)
I don't really see the point of a dict with a single field, as opposed to returning the list directly.
I'd remove this method entirely.
remove all_transcriptions method
/lint
About this PR
The file containing the transcriptions is a .json that looks like this: