Using the default settings, it seems that MusicLM will always output a tensor of length 163840. This is a bit of a strange number, as it isn't evenly divisible by 44100, the standard sample rate it would presumably be trained at.
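For reference, here's a quick back-of-the-envelope check of what that length corresponds to in seconds. The sample rates and the 320x downsampling factor below are assumptions on my part, not values I've verified against the library defaults:

```python
# Sanity check of the 163840-sample output length.
# The sample rates and the 320x codec stride are assumptions, not verified library defaults.
output_len = 163840

for sr in (16000, 24000, 44100):
    print(f"at {sr} Hz this is {output_len / sr:.2f} s of audio")

# 163840 == 512 * 320, so the length would also be consistent with
# 512 codec frames at a 320x downsampling factor, if that is the codec configuration.
print(output_len == 512 * 320)  # True
```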
I've found that it's possible to pass a max_length argument when calling MusicLM, which gets passed on to AudioLM. But this argument only controls how many semantic tokens are generated - the coarse and fine token sequences and the output tensor remain the same size.
For now I've hacked a solution together by additionally passing a max_length to the self.coarse.generate() call in audiolm_pytorch:1628, but I'm wondering if this is the correct way to do it.
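Roughly, the workaround looks like this. This is only a sketch of what I'm doing, not a proper patch - `musiclm` is an already-constructed MusicLM instance, and the exact signatures inside audiolm_pytorch may differ:

```python
# Sketch of the workaround described above; signatures are approximate.
# Calling MusicLM with max_length (forwarded to AudioLM) currently only limits the semantic stage:
generated_wave = musiclm(
    'melodic techno with a driving bassline',  # example prompt
    num_samples = 1,
    max_length = 1024,  # by default this only shortens the semantic token sequence
)

# The hack: inside AudioLM's generate path (around audiolm_pytorch:1628),
# also pass the same limit to the coarse stage, e.g.
#
#     coarse_token_ids = self.coarse.generate(
#         ...,                      # existing arguments unchanged
#         max_length = max_length,  # <- added so the coarse stage is shortened too
#     )
```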
What's the best way to generate outputs of different lengths with this model?
@Lunariz yea, i'm not too familiar with that myself, let's keep this open to remind me to look into it
ideally each model has knowledge of the sampling frequency, and one can just specify the length one wants in friendly human time (seconds), and it does the rest
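roughly, the bookkeeping each stage would need is just this (the function name and example numbers are hypothetical, not anything in the library yet):

```python
# rough sketch of the seconds -> token-steps conversion; the values passed in
# would come from the codec's configuration, the names here are made up
def seconds_to_frames(duration_seconds, sample_hz, downsample_factor):
    # number of codec frames (and hence coarse / fine token steps) for the requested duration
    num_samples = round(duration_seconds * sample_hz)
    return num_samples // downsample_factor

# e.g. 10 seconds at 24 kHz with a 320x codec stride
print(seconds_to_frames(10., 24000, 320))  # 750
```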
The correct way to generate outputs of different lengths with the MusicLM model is to modify the max_length parameter in the generate function. This parameter controls the maximum length of the generated sequence, and you can set it to a value that is appropriate for your use case. The generated sequence will have the same number of frames as the specified max_length, and the remaining frames will be padded with zeros.