After fine-tuning paraformer, the model file grows, and inference on a wav file that works with the base model raises an error #2239

Open
YouTwoMeToo opened this issue Nov 27, 2024 · 1 comment
Labels
bug Something isn't working

Comments

@YouTwoMeToo

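After fine-tuning the long-audio version of the paraformer model, the saved pt file grew from a little over 800 MB for the base model to nearly 2.6 GB.
In addition, running inference on the same wav file that the base model handles raises an error. The error message is as follows: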

Traceback (most recent call last):
File "/wind/aispace/train/source/src/FunASR/examples/industrial_data_pretraining/paraformer-zh-spk/tasks_bin.py", line 220, in
results_left = asr_batch_infer(output_left_folder,paraformer_model)
File "/wind/aispace/train/source/src/FunASR/examples/industrial_data_pretraining/paraformer-zh-spk/tasks_bin.py", line 124, in asr_batch_infer
res = paraformer_model.generate(input=audio_binary,fs=8000)
File "/wind/aispace/train/source/src/FunASR/funasr/auto/auto_model.py", line 300, in generate
return self.inference(input, input_len=input_len, **cfg)
File "/wind/aispace/train/source/src/FunASR/funasr/auto/auto_model.py", line 342, in inference
res = model.inference(**batch, **kwargs)
File "/wind/aispace/train/source/src/FunASR/funasr/models/bicif_paraformer/model.py", line 351, in inference
postprocess_utils.sentence_postprocess(token, timestamp)
File "/wind/aispace/train/source/src/FunASR/funasr/utils/postprocess_utils.py", line 235, in sentence_postprocess
word_lists, ts_lists = abbr_dispose(word_lists, ts_lists)
File "/wind/aispace/train/source/src/FunASR/funasr/utils/postprocess_utils.py", line 131, in abbr_dispose
begin = time_stamp[ts_nums[num]][0]
IndexError: list index out of range
0%|

funasr is the latest version.
What could be causing this problem? Could it be related to the fine-tuning data?
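
On the file size: one common cause of a roughly threefold increase is that a training checkpoint stores optimizer state next to the weights. With an Adam-style optimizer, two moment buffers per parameter would add about 2 x 800 MB, giving around 2.4 GB, which is close to the reported 2.6 GB. The issue does not confirm this, so the following is only a minimal diagnostic sketch. It assumes the pt file loads as a dict; `tensor_bytes` and `summarize_checkpoint` are hypothetical helper names, and keys such as `state_dict` or `optimizer` are guesses about what the checkpoint might contain.

```python
from typing import Any, Dict


def tensor_bytes(obj: Any) -> int:
    """Recursively sum the bytes of tensor-like objects in nested containers.

    Anything exposing numel() and element_size() (e.g. a torch.Tensor)
    is counted; other leaves such as ints or strings count as zero.
    """
    if hasattr(obj, "numel") and hasattr(obj, "element_size"):
        return obj.numel() * obj.element_size()
    if isinstance(obj, dict):
        return sum(tensor_bytes(v) for v in obj.values())
    if isinstance(obj, (list, tuple)):
        return sum(tensor_bytes(v) for v in obj)
    return 0


def summarize_checkpoint(ckpt: Dict[str, Any]) -> Dict[str, float]:
    """Map each top-level checkpoint key to its tensor payload in MiB."""
    return {key: tensor_bytes(value) / 2**20 for key, value in ckpt.items()}


# Usage with a real checkpoint (the path is a placeholder):
#
#   import torch
#   ckpt = torch.load("model.pt", map_location="cpu")
#   for key, mib in summarize_checkpoint(ckpt).items():
#       print(f"{key}: {mib:.1f} MiB")
```

If most of the size sits under an optimizer-like key, saving only the model weights for inference should bring the file back near the base model size. If the file is a flat state dict instead, the summary lists one entry per parameter and the cause lies elsewhere.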

@YouTwoMeToo YouTwoMeToo added the bug Something isn't working label Nov 27, 2024
@YouTwoMeToo
Author

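Hello, I tested again today. After adding some shorter segments to the training data, the results improved, but a small number of test cases still raise the error above.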
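
The failing expression in the traceback, `time_stamp[ts_nums[num]][0]`, indexes the timestamp list by a token-derived position, so the error is consistent with the fine-tuned model emitting fewer timestamps than tokens for some inputs. A small check can help identify which test cases are affected before post-processing runs. This is a sketch only: `find_timestamp_mismatch` is a hypothetical helper, not part of funasr, and it assumes one `[begin, end]` pair per token, which the traceback suggests but does not confirm.

```python
from typing import Sequence, Tuple


def find_timestamp_mismatch(
    tokens: Sequence[str],
    timestamps: Sequence[Sequence[float]],
) -> Tuple[bool, str]:
    """Return (ok, message) describing whether tokens and timestamps line up.

    Assumption: each token should have exactly one [begin, end] pair.
    """
    n_tok, n_ts = len(tokens), len(timestamps)
    if n_tok != n_ts:
        return False, f"{n_tok} tokens but {n_ts} timestamps"
    for i, ts in enumerate(timestamps):
        # A well-formed entry has two values with begin <= end.
        if len(ts) != 2 or ts[0] > ts[1]:
            return False, f"malformed timestamp at index {i}: {ts!r}"
    return True, "ok"
```

Logging the message for each failing wav file would show whether the mismatch correlates with segment length, which would fit the observation that adding shorter training segments reduced the error rate.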
