In noisyspeech_synthesizer.py, an array of audio samples is read from a noise file (line 78). On line 81, a slice of the noise array is taken from index 0 to len(clean):
noise = noise[0:len(clean)]
Because the slice always starts at index 0, when the clean speech arrays are all roughly the same length (~16000 samples, as in the speech commands case), the number of unique noise arrays we see is effectively equal to the number of noise files.
Even if we have one noise file with 10 hours of audio, we may only ever use the first 1 second of it.
It would be better to pick a random starting index within the noise array from which to take a slice. For example (assuming the noise array is at least as long as clean):
start_idx = np.random.randint(low=0, high=len(noise) - len(clean) + 1)
noise = noise[start_idx : start_idx + len(clean)]
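As a slightly fuller sketch of the same idea, the helper below picks a random window and also handles a noise file that is shorter than the clean clip. The function name `random_noise_segment`, the `rng` parameter, and the choice to tile short noise are my own assumptions for illustration; none of them come from noisyspeech_synthesizer.py.

```python
import numpy as np


def random_noise_segment(noise, target_len, rng=None):
    """Return a randomly positioned slice of `noise` with `target_len` samples.

    Hypothetical helper, not part of noisyspeech_synthesizer.py.
    """
    rng = np.random.default_rng() if rng is None else rng
    noise = np.asarray(noise)
    if len(noise) == 0:
        raise ValueError("noise array is empty")

    # Assumption: if the noise is shorter than the clean clip, repeat it
    # until it is long enough. The original script may handle this differently.
    if len(noise) < target_len:
        reps = -(-target_len // len(noise))  # ceiling division
        noise = np.tile(noise, reps)

    # The upper bound is exclusive, so add 1 to allow the last valid start.
    start_idx = int(rng.integers(0, len(noise) - target_len + 1))
    return noise[start_idx:start_idx + target_len]
```

With this in place, line 81 could become something like `noise = random_noise_segment(noise, len(clean))`. Passing a seeded `np.random.default_rng(seed)` as `rng` keeps the generated dataset reproducible.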