Currently, Arabic numerals and symbols in the Whisper transcript cannot be aligned; they first need to be spelled out as words.
This requires the inverse of the normalization in https://github.com/m-bain/whisperX/blob/main/whisperx/normalizers/english.py,
so that numbers and currencies are converted to their spoken word form, e.g.
"$300" -> "three hundred dollars"
The wav2vec alignment is then performed on the spelled-out text.
Afterwards, the words are converted back to symbol form and the timestamps are assigned to the original tokens.
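
A rough sketch of that round trip is below. It is illustrative only and not part of whisperX: `number_to_words`, `spell_out`, `expand`, and `collapse` are hypothetical helpers, the currency table is an assumption, and only plain integers up to six digits with an optional leading currency symbol are handled (no decimals, ordinals, or trailing punctuation). The per-word times would come from the wav2vec alignment step, which is not shown.

```python
import re

ONES = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
        "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
        "sixteen", "seventeen", "eighteen", "nineteen"]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]
# Assumed symbol table: (singular, plural) spoken forms.
CURRENCY = {"$": ("dollar", "dollars"), "£": ("pound", "pounds"), "€": ("euro", "euros")}


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999_999 in English words."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, rest = divmod(n, 10)
        return TENS[tens] + ("" if rest == 0 else " " + ONES[rest])
    if n < 1000:
        hundreds, rest = divmod(n, 100)
        return ONES[hundreds] + " hundred" + ("" if rest == 0 else " " + number_to_words(rest))
    thousands, rest = divmod(n, 1000)
    return number_to_words(thousands) + " thousand" + ("" if rest == 0 else " " + number_to_words(rest))


def spell_out(token: str) -> list[str]:
    """Return the spoken words for one transcript token; other tokens pass through."""
    m = re.fullmatch(r"([$£€]?)(\d{1,6})", token)
    if not m:
        return [token]
    symbol, digits = m.groups()
    value = int(digits)
    words = number_to_words(value).split()
    if symbol:
        singular, plural = CURRENCY[symbol]
        words.append(singular if value == 1 else plural)
    return words


def expand(transcript: str):
    """Spell out every token and remember which original token each word came from."""
    tokens = transcript.split()
    words, owners = [], []
    for i, tok in enumerate(tokens):
        for w in spell_out(tok):
            words.append(w)
            owners.append(i)
    return tokens, words, owners


def collapse(tokens, owners, word_times):
    """Merge per-word (start, end) times back onto the original tokens."""
    result = []
    for i, tok in enumerate(tokens):
        spans = [t for t, o in zip(word_times, owners) if o == i]
        # A token spans from the start of its first word to the end of its last word.
        result.append((tok, spans[0][0], spans[-1][1]))
    return result
```

With this approach, `expand("it cost $300 today")` yields the word list to align, and `collapse` gives "$300" a single timestamp spanning "three", "hundred", and "dollars". Keeping an explicit owner index avoids having to re-run the forward normalizer to recover the symbol form.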