Hi,
My colleagues and I have faced the problem of NaNs appearing in the CTC loss, and of training being interrupted by too many infinite gradients, when training Zipformer on our data.
After some debugging we found the reason for this behaviour, and I would like to share the finding so it can be fixed in all related recipes.
In the training script there is a piece of code which filters out utterances whose pronunciations (token sequences) are too long to be aligned with their feature sequences (icefall/egs/librispeech/ASR/zipformer/train.py, line 1326 at commit f84270c):
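For reference, the check at that location looks roughly like the sketch below. This is written from memory, not copied from the repository: the function name `remove_short_and_long_utt`, the duration thresholds, and the surrounding logging may differ in the current code; `sp` is the SentencePiece processor and `c` is a lhotse Cut.

```python
def remove_short_and_long_utt(c):
    # Drop utterances that are too short or too long overall.
    if c.duration < 1.0 or c.duration > 20.0:
        return False
    # T is the number of acoustic frames left after the Zipformer's
    # convolutional subsampling of the input features.
    T = ((c.num_frames - 7) // 2 + 1) // 2
    tokens = sp.encode(c.supervisions[0].text, out_type=str)
    # Only the raw (non-blank) token count is compared against T here.
    if T < len(tokens):
        return False
    return True
```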
This check counts only the non-blank tokens produced by the SentencePiece tokenizer. However, when two or more identical tokens occur in a row, they must be separated by a blank token for CTC alignment, so even when the condition above is satisfied the CTC loss computation can still fail. We therefore suggest fixing it as follows:
```python
T = ((cut.num_frames - 7) // 2 + 1) // 2
tokens = sp.encode(cut.supervisions[0].text, out_type=str)
num_tokens = len(tokens)
# Each pair of identical adjacent tokens needs a blank between them,
# so it consumes one extra frame in any valid CTC alignment.
for i in range(1, len(tokens)):
    if tokens[i] == tokens[i - 1]:
        num_tokens += 1
if T < num_tokens:
    return False  # the utterance cannot be aligned; filter it out
```
After this correction no NaNs appear in the CTC loss anymore.
It seems that a similar bug exists in most of the training scripts related to CTC-based training.
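To make the failure mode concrete, here is a small self-contained illustration (not from the issue itself; it just calls `torch.nn.CTCLoss` directly). A target with one adjacent repeated token cannot be aligned when T only equals the raw token count, and the loss comes out infinite:

```python
import torch

torch.manual_seed(0)
vocab_size = 5                          # index 0 is the CTC blank
targets = torch.tensor([[1, 2, 2, 3]])  # one adjacent repeat: (2, 2)
target_lengths = torch.tensor([4])
ctc = torch.nn.CTCLoss(blank=0)

# T = 4 equals the raw token count but is one frame short of the
# minimum alignable length (4 tokens + 1 blank for the repeat = 5),
# so the loss is infinite; T = 5 gives a finite loss.
for T in (4, 5):
    log_probs = torch.randn(T, 1, vocab_size).log_softmax(-1)
    loss = ctc(log_probs, targets, torch.tensor([T]), target_lengths)
    print(f"T={T}: loss={loss.item()}")
```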
Sorry, I don't really know how to make pull requests (
The audio is not very short; we faced this when working with a grapheme tokenizer, where the number of graphemes is sometimes indeed comparable to T.
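As a hypothetical illustration of the grapheme case (the text and helper below are made up, not from the issue): with one token per character, a word like "hello" already contains a doubled "ll" and therefore needs one extra blank frame beyond its length.

```python
def min_ctc_frames(tokens):
    """Minimum number of frames needed to CTC-align `tokens`:
    one frame per token plus one blank per adjacent repeat."""
    repeats = sum(1 for a, b in zip(tokens, tokens[1:]) if a == b)
    return len(tokens) + repeats

print(min_ctc_frames(list("hello")))  # 6: 5 graphemes + 1 blank for "ll"
```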