Training time Conv-TasNet for Different Encoder / Decoder #582
The settings you used may lead to a very fine time resolution and therefore a large feature space. For example, win_len=16, stride=8, nfilters=512 at an 8 kHz sampling rate means that a 512-dim feature is generated every 1 ms. For STFT, win_len=512, stride=256, nfilters=512 is reasonable, which yields a 512-dim feature every 32 ms, i.e. 1/32 as many frames as the free encoder.
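To make the arithmetic above concrete, here is a minimal sketch that computes the hop duration and frame count for both settings. The helper name `frame_stats` and the 4-second clip length are assumptions for illustration only, and the frame count assumes no padding; it is not taken from the toolkit's code.

```python
def frame_stats(sample_rate, win_len, stride, n_filters, duration_s=4.0):
    """Return (hop in ms, number of frames, total feature values) for one clip.

    Hypothetical helper: assumes no padding, so
    n_frames = (n_samples - win_len) // stride + 1.
    """
    hop_ms = 1000.0 * stride / sample_rate
    n_samples = int(duration_s * sample_rate)
    n_frames = (n_samples - win_len) // stride + 1
    return hop_ms, n_frames, n_frames * n_filters


# Settings quoted in the answer, at an 8 kHz sampling rate.
free = frame_stats(8000, win_len=16, stride=8, n_filters=512)
stft = frame_stats(8000, win_len=512, stride=256, n_filters=512)

print("free: hop = %.1f ms, frames = %d, values = %d" % free)
print("stft: hop = %.1f ms, frames = %d, values = %d" % stft)
print("frame ratio (free / stft): %.1f" % (free[1] / stft[1]))
```

Since the separator runs over every encoder frame, roughly 32 times more frames per clip goes a long way toward explaining a large gap in epoch time, although the exact slowdown depends on the model and hardware.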
Answer selected by
cemox35
Hello there,
When I use the "free" enc/dec, one epoch takes nearly 1 hour; with the "STFT" enc/dec, it takes only 10 minutes. Is that normal, and why is there so much difference?