Why is the output of Wavenet generate() always the same? #11
Comments
Have you tried the generate_fast() method? I think there is probably a bug in the generate() function. I will try to fix it, but you shouldn't really be using it anyway since it's painfully slow.
Thank you for your reply! I will try it later. Actually, I rewrote your code based on my understanding of it, so I use generate() just because I can understand it clearly; I can't understand generate_fast() as well. You said there is probably a bug in the generate() function. Although I can't use generate() to get what I want, I have no idea what's wrong with it. Could you explain it more clearly?
The problem in the generate() function was simply that I didn't do one-hot encoding of the input. I have fixed it now, but let me know if there's something I can do to help your understanding!
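For anyone hitting the same bug, a minimal sketch of what the fix amounts to (the variable names are illustrative, not the repo's actual code):

```python
import torch

# The network expects a one-hot input of shape (N, C, L),
# not the raw class index of the previous sample.
num_classes = 256          # e.g. 256 bins after mu-law quantization
sample_index = 133         # class index of the previously generated sample

x = torch.zeros(1, num_classes, 1)   # (batch, channels, time)
x[0, sample_index, 0] = 1.0          # one-hot encode the class index
# x can now be fed to the model for the next autoregressive step
```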
Thanks again! I just found that I have a big misunderstanding of the wavenet and I'm trying to correct it, so I'm afraid I may only be able to discuss the generate_fast() method with you after a day or two. I'm sorry about that.
I wonder why you need dilate() in wavenet_modules.py instead of just using the 'dilation' parameter of nn.Conv1d?
In your code there is '(N, C, L), where N is the input dilation', but according to nn.Conv1d, N is the batch size, so I don't understand why N would be the input dilation.
Here I answered the question regarding the dilate() function. The convolution is executed in parallel for every index in the first dimension, which in the wavenet architecture is both the dilation and the batch number. So, to be exact, N is the dilation multiplied by the batch size.
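To illustrate the folding trick that dilate() implements, here is a simplified sketch (the hypothetical dilate_sketch below ignores the zero-padding the repo's version performs when the length doesn't divide evenly):

```python
import torch

def dilate_sketch(x, new_dilation, old_dilation=1):
    # Fold time steps into the batch dimension so that a plain
    # stride-1 convolution acts like a dilated convolution.
    n, c, l = x.size()
    factor = new_dilation // old_dilation
    assert l % factor == 0, "this sketch assumes the length divides evenly"
    # (n, c, l) -> (n * factor, c, l // factor): samples that are
    # `factor` time steps apart end up adjacent in the new layout
    x = x.permute(1, 2, 0).contiguous()     # (c, l, n)
    x = x.view(c, l // factor, n * factor)  # regroup time into batch
    x = x.permute(2, 0, 1).contiguous()     # (n * factor, c, l // factor)
    return x

x = torch.arange(8.).view(1, 1, 8)  # one batch item, one channel, 8 samples
print(dilate_sketch(x, 2))          # batches [0,2,4,6] and [1,3,5,7]
```

This is why the first dimension carries both meanings: after folding, a stride-1 convolution over the shortened time axis touches samples that are `factor` steps apart in the original signal.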
Thanks! I also have a question about the item length. In your code, item_length = receptive_field + output_length - 1, and I found that your output_length is always some small number like 32, 48, or 16. What I used to do at the training stage is set item_length to a large number, for example 21600 (because I seem to remember DeepMind mentioning in their paper that they need 2 minutes of data to generate 1 second), which would correspond to a very large output_length or a deeper wavenet in your code. Then I just use the output of length 17507 (if receptive_field is 4093, then 21600 - 4093 = 17507) for the cross entropy. I want to know whether my idea is reasonable or not.
Intuitively it makes sense for the output_length to have the same order of magnitude as the receptive field of the model. Currently I use an output length of 4096 most of the time (you can see the stuff I'm working on in the parallel branch). If the output length is longer, the computation time increases linearly, and it would be better to use bigger mini batches instead.
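To make the arithmetic concrete, a small worked example based on the formula above (numbers are from this discussion; treat it as a sketch, not code from the repo):

```python
receptive_field = 4093
output_length = 4096   # the value mentioned above
item_length = receptive_field + output_length - 1
print(item_length)     # 8188 samples per training item

# With item_length = 21600 instead, the same formula gives
# output_length = 21600 - 4093 + 1 = 17508 usable outputs
# (the 17507 above omits the +1). Each of those outputs sees a full
# receptive field, so the cross entropy can cover all of them; the
# extra length only increases computation time linearly.
print(21600 - receptive_field + 1)  # 17508
```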
Wow! You are awesome! I happen to be learning how to make a conditional wavenet over the next few months, so I think I will bother you a lot. Could you tell me what your conditioning input is? Types of music or something else?
I'm trying to make the model learn the structure of a piece/song and condition the wavenet on a local time embedding. Hopefully this allows it to generate longer and more musically interesting sequences. It's a bit complicated; if it works I will write a blog post about it.
That's great! I am looking forward to it.
It seems that the problem I mentioned 2 days ago has nothing to do with the generating function. I can use your code to generate a sine wave very well, but if I use my own dataset, I get nothing but a straight line, even though the training loss is 1e-08.
Are there any tricks for how to train a wavenet? No matter how I change the wavenet's parameters, I get nothing but a straight line. Do you have any suggestions? @vincentherrmann
I had the same problem, not only with the generate function but also with the training result. I always get raw audio output such as [20 20 20 20 20 20 20 20...], and I don't know why. I have checked my code very carefully, but it doesn't work. I would really appreciate it if someone could help me.
I think maybe you could increase the values of mu, residual, and skip. For example, mu=64, skip=64 and residual=512. I solved my problem this way.
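As a hedged illustration, this is roughly what those settings could look like with this repo's WaveNetModel. The keyword names are my reading of wavenet_model.py and should be checked against the actual signature, especially since the commenter above was using their own rewrite:

```python
from wavenet_model import WaveNetModel

# Mapping the suggestion to (assumed) constructor arguments:
# "mu" -> number of quantization classes, "residue" -> residual channels,
# "skip" -> skip channels. Verify these names before copying.
model = WaveNetModel(layers=10,
                     blocks=3,
                     dilation_channels=32,
                     residual_channels=512,  # "residual=512"
                     skip_channels=64,       # "skip=64"
                     classes=64,             # "mu=64"
                     output_length=16)
```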
Thank you for your advice. I have tried different combinations of these parameters, but it doesn't seem to work. The input, of shape [batch_size, classes, length], goes through all the conv layers, and then I find that every column of the one-hot output [batch_size, classes, output_length] is very similar, so after de-one-hot the raw audio output [batch_size, 1, output_length] is all the same value. Do you have any other suggestions?
Hello, I ran into the same problem as you. Did you manage to solve it?
When I use the code to train a model, it seems fine. However, when I use the trained model to generate data, I get a sequence of numbers that are all the same value. For example, if I input a 1×5000 vector [2, 99, 34, ..., 45, 27, 33] and then use generate() to generate data, I get [2, 99, 34, ..., 45, 27, 33, 33, 33, 33, ..., 33, 33, 33]. As you can see, I get a sequence of numbers that are all the same value, and what is stranger is that they are all equal to the last number of the input. I can't find what's wrong with the code; I would appreciate it if someone could give me some advice.
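One frequent cause of exactly this symptom (not confirmed for this repo) is greedy decoding: taking the argmax of the output distribution at every step can let the model lock onto one class and repeat it forever. Sampling from the softmax instead, as in this hedged sketch, often breaks the loop; `logits` stands in for whatever your network outputs at one step:

```python
import torch
import torch.nn.functional as F

def sample_next(logits, temperature=1.0):
    # Draw the next sample's class index from the (tempered) softmax
    # distribution instead of taking the argmax.
    probs = F.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).item()

logits = torch.randn(256)                 # placeholder network output
next_class = sample_next(logits, temperature=1.0)
```

It is also worth re-checking the one-hot encoding of the fed-back sample, since that was the original bug in generate() discussed at the top of this thread.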