Hi, I'm trying to use this work to predict segments of audio. The features are extracted by the conv encoder, the parameter wav_len is computed from the conv layers' outputs, and wav_len equals the number of frames.
When I used the pretrained model to get segments from a mel spectrogram, I found that the number of frames differed between the mel spectrogram and the encoder output. For example, for an audio of 258560 samples, the conv layers' output has 1613 frames, but the mel spectrogram has 1617.
How can I avoid this difference?
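For reference, the 1617 likely comes from torchaudio's default center padding: with center=True, the STFT frame count is floor(n_samples / hop_length) + 1, and assuming 16 kHz audio (so a 10 ms hop is 160 samples), 258560 / 160 + 1 = 1617. A conv stack without matching padding yields fewer frames. A minimal sketch of both length formulas (the sample rate and the conv kernel/stride values below are assumptions, not the model's actual architecture):

```python
def stft_frames(n_samples: int, n_fft: int, hop: int, center: bool = True) -> int:
    # torchaudio/librosa-style frame count: with center=True the signal is
    # padded by n_fft // 2 on both sides, so the count depends only on hop;
    # with center=False the window must fit entirely inside the signal.
    if center:
        return n_samples // hop + 1
    return (n_samples - n_fft) // hop + 1

def conv_out_len(n_in: int, kernel: int, stride: int, pad: int = 0) -> int:
    # Standard Conv1d output-length formula (dilation = 1).
    return (n_in + 2 * pad - kernel) // stride + 1

# Assuming 16 kHz: a 30 ms window is n_fft = 480, a 10 ms hop is 160 samples.
print(stft_frames(258560, n_fft=480, hop=160, center=True))   # 1617
print(stft_frames(258560, n_fft=480, hop=160, center=False))  # 1614
```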
I use torchaudio to compute the mel spectrogram with these parameters:
win: 30ms
hop: 10ms
n_mel: 80
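For what it's worth, here is a minimal sketch of one way to align the two lengths: compute the mel spectrogram with torchaudio and truncate it to the encoder's wav_len. The 16 kHz sample rate and the truncate-rather-than-pad choice are assumptions on my part, not something this repository prescribes:

```python
import torch
import torchaudio

SAMPLE_RATE = 16000                       # assumed; adjust to your data
mel_transform = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE,
    n_fft=int(0.030 * SAMPLE_RATE),       # 30 ms window -> 480 samples
    hop_length=int(0.010 * SAMPLE_RATE),  # 10 ms hop    -> 160 samples
    n_mels=80,
)

waveform = torch.randn(1, 258560)         # stand-in for the real audio
mel = mel_transform(waveform)             # (1, 80, 1617) with default center=True

wav_len = 1613                            # frames reported by the conv encoder
# Crude alignment: truncate the mel frames to the encoder's length so the
# per-frame segment predictions line up. (Alternatively, pad the shorter side.)
mel = mel[..., :wav_len]
print(mel.shape)                          # torch.Size([1, 80, 1613])
```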