Notice: To get issues resolved more efficiently, please follow the issue template and include details.
Running the streaming example from the Hugging Face page, there is a problem: when the current chunk is fed into the VAD model and a `value` result comes back, e.g. a `start` timestamp, that start point actually lies 3 to 4 chunks in the past. Is there a way to improve this? I would like each chunk, once processed, to yield the state of that chunk itself: either the whole chunk contains speech, or it contains none, or I get the start/end timestamps within it.
I also tried setting `is_final=True` in `model.generate` during streaming, but the results are worse than non-streaming processing. Is there a better solution?
Many thanks!
```python
from funasr import AutoModel
import soundfile

chunk_size = 200  # ms
model = AutoModel(model="fsmn-vad", model_revision="v2.0.4")

wav_file = f"{model.model_path}/example/vad_example.wav"
speech, sample_rate = soundfile.read(wav_file)
chunk_stride = int(chunk_size * sample_rate / 1000)  # samples per chunk

cache = {}
total_chunk_num = int((len(speech) - 1) / chunk_stride + 1)
for i in range(total_chunk_num):
    speech_chunk = speech[i * chunk_stride:(i + 1) * chunk_stride]
    is_final = i == total_chunk_num - 1
    res = model.generate(input=speech_chunk, cache=cache,
                         is_final=is_final, chunk_size=chunk_size)
    if len(res[0]["value"]):
        print(res)
```
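One workaround, rather than forcing `is_final=True`, is to accept that the start timestamp arrives late and relabel the earlier chunks once it does. The sketch below is not part of the FunASR API; it is a hypothetical post-processing helper, assuming the model emits segments in milliseconds as `[start, -1]` at speech onset (possibly several chunks in the past) and `[-1, end]` at speech offset, with `[]` for chunks that report no event:

```python
def per_chunk_speech(vad_outputs, chunk_size_ms):
    """Derive a speech/silence label for every chunk, applying late
    start timestamps retroactively.

    vad_outputs: one entry per chunk -- [] (no event), [[start_ms, -1]]
    (speech onset, possibly chunks in the past), or [[-1, end_ms]]
    (speech offset).
    """
    labels = [False] * len(vad_outputs)
    in_speech = False
    for i, segs in enumerate(vad_outputs):
        if not segs:
            # No event in this chunk: carry the current state forward.
            labels[i] = in_speech
            continue
        for start, end in segs:
            if end == -1:
                # Onset reported: mark every chunk the late start covers.
                in_speech = True
                for j in range(start // chunk_size_ms, i + 1):
                    labels[j] = True
            else:
                # Offset: this chunk is speech only if `end` falls inside it.
                in_speech = False
                labels[i] = end > i * chunk_size_ms
    return labels
```

For example, with 200 ms chunks and speech spanning 100-700 ms that the model only reports at chunk 3, `per_chunk_speech([[], [], [], [[100, -1]], [[-1, 700]], []], 200)` relabels chunks 0-3 as speech and chunks 4-5 as silence. The cost is latency: a chunk's final label is only trustworthy once a later onset event can no longer reach back to it.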