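The project's reconstruct and redecoder reconstruct scripts seem to work only with the pretrained files, i.e. the .bin files. Can the .pth files produced by train be used for inference? Also, in which file is the method implemented that trains disentangled audio attributes without using any labels? Thanks in advance.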
Apologies for not resolving this sooner. The inference script has now been updated to support loading custom checkpoints.
As for .pth versus .bin, there is no essential difference. Only the file extension differs, and torch.load returns the same result for both.
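To illustrate the point about extensions, here is a minimal sketch that saves the same state dict under both suffixes and compares what `torch.load` returns. The `nn.Linear` model and the file names are hypothetical stand-ins, not this project's codec classes or checkpoint paths.

```python
import os
import tempfile

import torch
from torch import nn

# Hypothetical stand-in model; the project's real codec classes are not used here.
model = nn.Linear(4, 2)

with tempfile.TemporaryDirectory() as tmp:
    pth_path = os.path.join(tmp, "checkpoint.pth")  # illustrative file name
    bin_path = os.path.join(tmp, "checkpoint.bin")  # illustrative file name

    # torch.save does not look at the suffix; both files hold the same data.
    torch.save(model.state_dict(), pth_path)
    torch.save(model.state_dict(), bin_path)

    state_pth = torch.load(pth_path, map_location="cpu")
    state_bin = torch.load(bin_path, map_location="cpu")

# True when every tensor matches between the two loaded state dicts.
same = all(torch.equal(state_pth[k], state_bin[k]) for k in state_pth)
print(same)
```

One thing to check in practice: a training checkpoint may wrap the weights in a larger dict (for example alongside optimizer state). That is an assumption about typical training loops, not something confirmed for this repository, so inspect the keys of the loaded object before passing it to `load_state_dict`.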
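For the principle behind the disentanglement, please refer to the original NS3 paper. Roughly speaking, gradient reversal is used so that the encoder is encouraged to extract content-related information while being prevented from extracting speaker-related information. The original paper uses frame-level phoneme labels and speaker labels. Here they are replaced with pseudo labels from a pretrained CTC ASR model and a pretrained speaker verification model. Because the prediction targets only serve as guidance, using pseudo labels has almost no effect on the performance of the final trained codec.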
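The gradient reversal idea described above can be sketched roughly as follows. This is not the repository's implementation: the layer sizes, the names `GradReverse`, `grad_reverse`, `content_head`, and `speaker_head`, the random pseudo labels, and the use of plain frame-level cross-entropy are all assumptions made for illustration.

```python
import torch
from torch import nn


class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; negates and scales gradients in the backward pass."""

    @staticmethod
    def forward(ctx, x, scale):
        ctx.scale = scale
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip the sign so the encoder is pushed to make the classifier worse.
        return -ctx.scale * grad_output, None


def grad_reverse(x, scale=1.0):
    return GradReverse.apply(x, scale)


torch.manual_seed(0)

# Hypothetical toy modules standing in for the content encoder and its heads.
encoder = nn.Linear(8, 4)
content_head = nn.Linear(4, 10)  # predicts pseudo phoneme labels
speaker_head = nn.Linear(4, 3)   # predicts pseudo speaker labels

frames = torch.randn(5, 8)
# Random placeholders for pseudo labels that would come from pretrained
# ASR and speaker verification models.
pseudo_phonemes = torch.randint(0, 10, (5,))
pseudo_speakers = torch.randint(0, 3, (5,))

latent = encoder(frames)

# Content branch: ordinary gradients encourage the encoder to keep content.
content_loss = nn.functional.cross_entropy(content_head(latent), pseudo_phonemes)

# Speaker branch: the head learns normally, but the reversed gradient
# discourages the encoder from keeping speaker information.
speaker_loss = nn.functional.cross_entropy(
    speaker_head(grad_reverse(latent)), pseudo_speakers
)

(content_loss + speaker_loss).backward()
```

The speaker head itself still receives ordinary gradients and keeps trying to classify the speaker. Only the gradient flowing back into the encoder is negated, which is what makes the setup adversarial.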
Thank you very much.