Code detail questions #8

Open
check-777 opened this issue Jun 25, 2024 · 4 comments

Comments

@check-777

Hello. In FAcodec/modules/quantize.py, the forward_v2 function of FApredictors has this line commented out:
spk_pred = self.timbre_predictor(timbre)[0]
As a result, the timbre prediction is None, which breaks this later code:

     spk_pred_logits = preds['timbre']
     spk_loss = F.cross_entropy(spk_pred_logits, spk_labels)

Here spk_pred_logits is None, so an error is raised. Is this a bug?

@Plachtaa
Owner

Thanks for pointing this out. The version of the code uploaded earlier was wrong; it has now been fixed.

@check-777
Author

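> Thanks for pointing this out. The version of the code uploaded earlier was wrong; it has now been fixed.

spk_pred = self.timbre_predictor(timbre)[0]
The [0] here should be removed; otherwise the dimensions do not match the labels.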

@Plachtaa
Owner

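> > Thanks for pointing this out. The version of the code uploaded earlier was wrong; it has now been fixed.
>
> spk_pred = self.timbre_predictor(timbre)[0] The [0] here should be removed; otherwise the dimensions do not match the labels.

Right, this should be a Linear layer. It has been fixed.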
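The dimension mismatch discussed above can be reproduced in isolation. This is only a minimal sketch, not the repository's code: the sizes, the random inputs, and the stand-in `timbre_predictor` are assumptions made for illustration. It shows that a Linear layer returns one row of logits per utterance, so indexing with `[0]` keeps only the first utterance's logits and no longer lines up with a batch of labels.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
batch_size, embed_dim, n_speakers = 4, 8, 10

# Stand-in for the timbre predictor: a single Linear layer.
timbre_predictor = nn.Linear(embed_dim, n_speakers)

timbre = torch.randn(batch_size, embed_dim)               # one timbre vector per utterance
spk_labels = torch.randint(0, n_speakers, (batch_size,))  # one speaker id per utterance

logits = timbre_predictor(timbre)  # shape (batch_size, n_speakers)
first_only = logits[0]             # shape (n_speakers,): first utterance only

# With the full logits, the batch dimension matches the labels.
spk_loss = F.cross_entropy(logits, spk_labels)

print(tuple(logits.shape), tuple(first_only.shape), tuple(spk_labels.shape))
```

Indexing the output with `[0]` only makes sense when a module returns a tuple and the logits are its first element; for a plain Linear layer it drops the batch dimension.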

@check-777
Author

check-777 commented Jun 26, 2024

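I have one more question, about how the data is processed in meldatasets:

```python
import torch
import torchaudio

# MEL_PARAMS and SPECT_PARAMS are not shown in this snippet.
to_mel = torchaudio.transforms.MelSpectrogram(
    n_mels=MEL_PARAMS['n_mels'], **SPECT_PARAMS)
mean, std = -4, 4

def preprocess(wave):
    # wave = wave.unsqueeze(0)
    wave_tensor = torch.from_numpy(wave).float()
    mel_tensor = to_mel(wave_tensor)
    mel_tensor = (torch.log(1e-5 + mel_tensor.unsqueeze(0)) - mean) / std
    return mel_tensor
```

I noticed that you use a sample rate of 24 kHz by default, while the default sample rate of torchaudio.transforms.MelSpectrogram is 16 kHz. What was the reasoning behind this?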
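The thread does not record an answer to this question. To show what such a mismatch would mean numerically, here is a rough pure-Python sketch. It assumes that `sample_rate` is not passed through `SPECT_PARAMS` (the snippet does not show its contents), that the filterbank uses the HTK mel formula with `f_max` equal to half the sample rate, and a hypothetical `n_mels` of 80. In `MelSpectrogram`, `sample_rate` only affects how the mel filters are placed on the FFT bins, so a filterbank built for 16 kHz and applied to 24 kHz audio still spans the full spectrum, but every filter sits at 1.5 times its nominal frequency.

```python
import math

def hz_to_mel(f):
    # HTK mel formula
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centers(sample_rate, n_mels, f_min=0.0, f_max=None):
    """Center frequencies (Hz) of n_mels filters spaced evenly on the mel scale."""
    if f_max is None:
        f_max = sample_rate / 2
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_mels + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_mels)]

assumed_sr = 16000   # rate the filterbank is built for
actual_sr = 24000    # rate of the audio
n_mels = 80          # hypothetical value, not taken from the thread

nominal = mel_centers(assumed_sr, n_mels)     # where the filters are meant to sit
scale = actual_sr / assumed_sr                # FFT bins really lie 1.5x higher
effective = [f * scale for f in nominal]      # where they sit on 24 kHz audio
intended = mel_centers(actual_sr, n_mels)     # filterbank built for 24 kHz

for i in (0, n_mels // 2, n_mels - 1):
    print(i, round(nominal[i], 1), round(effective[i], 1), round(intended[i], 1))
```

Under these assumptions the features remain a consistent warped-frequency representation, just not a true mel scale for 24 kHz audio. Whether that matters depends on how the features are used downstream.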
