Formally speaking, the shape of the variable `y_cut_mask` created here may not match the shape of the variable `y_cut` in the last dimension (which is always `out_size` for `y_cut`).
To see why, take a look at the function `sequence_mask`, which is invoked to create `y_cut_mask`. Since the parameter `max_length` is not provided, the length dimension of the mask gets size `max(length)` (see here). Thus, if all sequences in a batch passed to `GradTTS.forward(...)` are shorter than `out_size`, the last dimension of `y_cut_mask` will not match the last dimension of `y_cut`.
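For reference, the helper looks roughly like this (paraphrased from memory; the exact code in the repository may differ slightly):

```python
import torch

def sequence_mask(length, max_length=None):
    # When max_length is None, the mask width defaults to the longest
    # length in the batch -- which can be smaller than out_size.
    if max_length is None:
        max_length = length.max()
    x = torch.arange(int(max_length), dtype=length.dtype, device=length.device)
    return x.unsqueeze(0) < length.unsqueeze(1)
```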
An easy experiment exposes the issue: start training GradTTS with `batch_size==1`. In that case, as soon as a sequence shorter than `out_size` is encountered, training fails with a shape mismatch.
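A minimal sketch of the failure, using the `sequence_mask` shown above (the concrete numbers are mine for illustration: 80 mel bins, `out_size=172`, one sequence of 150 frames):

```python
import torch

out_size = 172                                  # fixed crop width used in training
y_cut_lengths = torch.LongTensor([150])         # batch_size == 1, shorter than out_size

y_cut = torch.zeros(1, 80, out_size)            # last dim is always out_size (172)
y_cut_mask = sequence_mask(y_cut_lengths).unsqueeze(1)  # last dim is max(length) == 150

# Broadcasting 172 against 150 fails:
masked = y_cut * y_cut_mask                     # RuntimeError: 172 vs. 150 at dim 2
```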
The fix I suggest is elementary: pass the parameter `max_length=out_size` when calling `sequence_mask` here.
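In code, the one-line change would look roughly like this (the surrounding `.unsqueeze(1).to(y_mask)` chain is quoted from memory and may not match the repository exactly):

```python
# before: mask width follows max(y_cut_lengths), not out_size
y_cut_mask = sequence_mask(y_cut_lengths).unsqueeze(1).to(y_mask)

# after: mask width is pinned to out_size, matching y_cut's last dimension
y_cut_mask = sequence_mask(y_cut_lengths, max_length=out_size).unsqueeze(1).to(y_mask)
```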
Moreover, we had better skip cropping the mel entirely when all sequences in a batch passed to `GradTTS.forward(...)` are shorter than `out_size`. Concretely, I suggest adding the condition `y_max_length > out_size` here.
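A sketch of the guard (the variable names mirror those in the issue text; treat the exact surrounding code as an assumption):

```python
# Crop only when cropping is both requested and possible; otherwise the
# full-length y, attn and y_mask are already consistent with each other.
if out_size is not None and y_max_length > out_size:
    ...  # existing random-crop logic producing y_cut, attn_cut, y_cut_mask
```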