Phoneme-level timestamps and confidence #38
-
Hi @MahmoudAshraf97, I'm looking for a tool that could provide phoneme-level timestamps as well as confidence data (about the recognized phoneme, not the timestamp). Is your tool capable of returning this data? I'm trying to build a tool that can provide feedback on pronunciation at the phoneme level. I've looked into MFA and WhisperX but they don't seem to be able to provide this data. Thank you!
-
This tool provides letter-level timestamps and confidence scores, which are later aggregated into word- or segment-level timestamps and confidence. You can use custom splitting and aggregation to adapt this to the phoneme level, so it's doable, but you'll need to write custom code.
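To make the "custom aggregation" idea concrete, here is a rough sketch of merging letter-level spans into phoneme-level ones. Everything in it is an assumption for illustration: the `Span` type, the `aggregate_spans` helper, and the letter-to-phoneme grouping are hypothetical and not part of the tool's API, and averaging the letter scores is just one possible way to combine them.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Span:
    """Hypothetical container for one aligned unit (not the tool's own type)."""
    label: str
    start: float  # seconds
    end: float    # seconds
    score: float  # raw confidence score of the unit


def aggregate_spans(letters: List[Span], groups: List[Tuple[str, int]]) -> List[Span]:
    """Merge consecutive letter spans into phoneme spans.

    `groups` lists, in order, each phoneme label and how many letters it covers.
    The grouping itself has to come from your own grapheme-to-phoneme mapping.
    """
    if any(n < 1 for _, n in groups):
        raise ValueError("each phoneme must cover at least one letter")
    if sum(n for _, n in groups) != len(letters):
        raise ValueError("groups must cover every letter exactly once")

    phonemes = []
    i = 0
    for label, n in groups:
        chunk = letters[i:i + n]
        # Assumption: average the letter scores so long and short phonemes are comparable.
        mean_score = sum(s.score for s in chunk) / n
        phonemes.append(Span(label, chunk[0].start, chunk[-1].end, mean_score))
        i += n
    return phonemes


# Made-up letter-level output for the word "ship".
letters = [
    Span("s", 0.00, 0.10, -0.2),
    Span("h", 0.10, 0.20, -0.4),
    Span("i", 0.20, 0.35, -0.1),
    Span("p", 0.35, 0.50, -0.3),
]
phonemes = aggregate_spans(letters, [("SH", 2), ("IH", 1), ("P", 1)])
for p in phonemes:
    print(p.label, p.start, p.end, round(p.score, 3))
```

This only works when each phoneme maps to a contiguous run of letters. Silent letters or one letter producing several phonemes would need a different splitting rule.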
The scores are not normalized; they are just the aggregated raw scores of the individual letters, so the maximum is 0 and the minimum is -inf.
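A range of -inf to 0 is what you would expect from log-probabilities. If that is what the raw scores are (an assumption here, worth checking against the tool's code), a rough way to map them onto a 0 to 1 scale is to exponentiate. The helper names below are hypothetical and not part of the tool.

```python
import math
from typing import Sequence


def to_probability(raw_score: float) -> float:
    """Map a log-probability-like score in [-inf, 0] to [0, 1]."""
    return math.exp(raw_score)


def length_normalized(raw_scores: Sequence[float]) -> float:
    """Average the raw scores before exponentiating (a geometric mean),
    so units made of more letters are not penalized just for being longer."""
    if not raw_scores:
        raise ValueError("need at least one score")
    return math.exp(sum(raw_scores) / len(raw_scores))


print(to_probability(0.0))            # best possible score
print(to_probability(float("-inf")))  # worst possible score
print(length_normalized([math.log(0.5), math.log(0.5)]))
```

Whether a plain sum or a per-letter average is the right aggregate depends on how you want to compare units of different lengths.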