Transcription error: wav file is empty #11

GregoryBetsey · 2021-03-26T08:02:56Z

Hello

I am running the Voice-Cloning-App.exe on Windows 10. I have a GeForce RTX 2060 Graphics Card with the GeForce Game Ready Driver Version 461.92.

When I attempt to build the dataset, the Windows console stops after the following:

[12644] WARNING: file already exists but should not: C:\Users\GREGOR1\AppData\Local\Temp_MEI126442\torch_C.cp38-win_amd64.pyd
Server initialized for threading.
Server initialized for threading.
pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available.
torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to pytorch/audio#903 for the detail.
INFO:matplotlib.font_manager:Generating new fontManager, this may take some time...
[nltk_data] Downloading package wordnet to C:\Users\GREGOR1\AppData\L
[nltk_data] ocal\Temp_MEI126442\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
WARNING:werkzeug:WebSocket transport not available. Install eventlet or gevent and gevent-websocket for improved performance.

Serving Flask app "main" (lazy loading)
Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
Debug mode: off
INFO:werkzeug: * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:56:25] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:56:57] "POST / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:56:57] "GET /static/error.css HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:56:57] "GET /favicon.ico HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:57:11] "GET / HTTP/1.1" 200 -
Starting Thread
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:57:42] "POST / HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet OPEN data {'sid': 'qINJoZN0iSsAW66FAAAA', 'upgrades': [], 'pingTimeout': 5000, 'pingInterval': 25000}
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet OPEN data {'sid': 'qINJoZN0iSsAW66FAAAA', 'upgrades': [], 'pingTimeout': 5000, 'pingInterval': 25000}
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:57:42] "GET /socket.io/?EIO=4&transport=polling&t=NXjKmkr HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Received packet MESSAGE data 0/voice,
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet MESSAGE data 0/voice,
qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 0/voice,{"sid":"hvDlhnRAa1GAVtomAAAB"}
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 0/voice,{"sid":"hvDlhnRAa1GAVtomAAAB"}
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:57:42] "POST /socket.io/?EIO=4&transport=polling&t=NXjKmlA&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:57:42] "GET /socket.io/?EIO=4&transport=polling&t=NXjKmlB&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading audio from data\datasets\JamesEarlJones\audio.mp3..."}]
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:57:47] "GET /socket.io/?EIO=4&transport=polling&t=NXjKmlb&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading audio from data\datasets\JamesEarlJones\audio.mp3..."}]
INFO:voice:Loading audio from data\datasets\JamesEarlJones\audio.mp3...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading script from data\datasets\JamesEarlJones\text.txt..."}]
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:02] "GET /socket.io/?EIO=4&transport=polling&t=NXjKnxd&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading script from data\datasets\JamesEarlJones\text.txt..."}]
INFO:voice:Loading script from data\datasets\JamesEarlJones\text.txt...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Fetching segments..."}]
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Fetching segments..."}]
INFO:voice:Fetching segments...
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:02] "GET /socket.io/?EIO=4&transport=polling&t=NXjKrgH&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:07] "GET /socket.io/?EIO=4&transport=polling&t=NXjKrgS&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:07] "POST /socket.io/?EIO=4&transport=polling&t=NXjKsst&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Transcribing segments..."}]
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:20] "GET /socket.io/?EIO=4&transport=polling&t=NXjKssu&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Transcribing segments..."}]
INFO:voice:Transcribing segments...
Using cache found in C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master
torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to [Announcement] Improving I/O for correct and consistent experience pytorch/audio#903 for the detail.
Exception in thread Thread-13:
Traceback (most recent call last):
File "application\utils.py", line 47, in background_task
max_seqlength = max(max([len(_) for _ in batch]), 12800)
File "application\utils.py", line 32, in create_dataset
if wav.size(0) > 1:
File "dataset\forced_alignment\align.py", line 123, in align
File "dataset\transcribe.py", line 34, in stt
File "dataset\transcribe.py", line 16, in transcribe
File "torch\hub.py", line 370, in load
File "torch\hub.py", line 399, in _load_local
File "C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master\hubconf.py", line 24, in silero_stt
model, decoder = init_jit_model(model_url=models.stt_models.get(language).latest.jit,
File "C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master\utils.py", line 135, in init_jit_model
model = torch.jit.load(model_path, map_location=device)
File "torch\jit_serialization.py", line 161, in load
RuntimeError: [enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "threading.py", line 932, in bootstrap_inner
File "threading.py", line 870, in run
File "application\utils.py", line 50, in background_task
inputs[i, :len(wav)].copy(wav)
NameError: name 'traceback' is not defined
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:32] "GET /socket.io/?EIO=4&transport=polling&t=NXjKvy4&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:32] "POST /socket.io/?EIO=4&transport=polling&t=NXjKyzw&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:57] "GET /socket.io/?EIO=4&transport=polling&t=NXjKyzw.0&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:58:57] "POST /socket.io/?EIO=4&transport=polling&t=NXjL358&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:59:22] "GET /socket.io/?EIO=4&transport=polling&t=NXjL358.0&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:59:22] "POST /socket.io/?EIO=4&transport=polling&t=NXjL9CA&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:59:47] "GET /socket.io/?EIO=4&transport=polling&t=NXjL9CB&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 00:59:47] "POST /socket.io/?EIO=4&transport=polling&t=NXjLFJ8&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:00:12] "GET /socket.io/?EIO=4&transport=polling&t=NXjLFJ8.0&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:00:12] "POST /socket.io/?EIO=4&transport=polling&t=NXjLLQ8&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:00:37] "GET /socket.io/?EIO=4&transport=polling&t=NXjLLQ9&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:00:37] "POST /socket.io/?EIO=4&transport=polling&t=NXjLRWz&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:01:02] "GET /socket.io/?EIO=4&transport=polling&t=NXjLRW-&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:01:02] "POST /socket.io/?EIO=4&transport=polling&t=NXjLXe3&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:01:27] "GET /socket.io/?EIO=4&transport=polling&t=NXjLXe4&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:01:27] "POST /socket.io/?EIO=4&transport=polling&t=NXjLdkv&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:01:52] "GET /socket.io/?EIO=4&transport=polling&t=NXjLdkv.0&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:01:52] "POST /socket.io/?EIO=4&transport=polling&t=NXjLjrp&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:02:17] "GET /socket.io/?EIO=4&transport=polling&t=NXjLjrq&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:02:17] "POST /socket.io/?EIO=4&transport=polling&t=NXjLpyh&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:02:42] "GET /socket.io/?EIO=4&transport=polling&t=NXjLpyh.0&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Sending packet PING data None
qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:engineio.server:qINJoZN0iSsAW66FAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:02:42] "POST /socket.io/?EIO=4&transport=polling&t=NXjLw3g&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1" 200 -
qINJoZN0iSsAW66FAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [26/Mar/2021 01:03:07] "qINJoZN0iSsAW66FAAAA: Received packet CLOSE data
GET /socket.io/?EIO=4&transport=polling&t=NXjLw3g.0&sid=qINJoZN0iSsAW66FAAAA HTTP/1.1qINJoZN0iSsAW66FAAAA: Client is gone, closing socket
Error.txt

BenAAndrew · 2021-03-28T21:19:56Z

@GregoryBetsey It looks like something went wrong when trying to transcribe your audio to build the dataset. Could you first check that you used the latest executable (version 0.3)? The second error should have been fixed in that release.

If you did use that, or the error still occurs, could you upload your audio/text to Google Drive or email it to me at [email protected] so I can run some analysis?

GregoryBetsey · 2021-03-29T04:40:42Z

@BenAAndrew Thanks for responding. I will send a download link to your email address. I did not use the "automatic" audiobook method shown in your YouTube video; rather, I transcribed the text manually.

GregoryBetsey · 2021-04-02T02:31:12Z

Update: I tried the latest release and got this error: [enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory.

Server initialized for threading.
Server initialized for threading.
pydub\utils.py:170: RuntimeWarning: Couldn't find ffmpeg or avconv - defaulting to ffmpeg, but may not work
['C:\Users\GREGOR~~1\AppData\Local\Temp\_MEI104602\base_library.zip', 'C:\Users\GREGOR~~1\AppData\Local\Temp\_MEI104602', 'synthesis/waveglow/', 'C:\Users\Gregory Betsey']
torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available.
torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to pytorch/audio#903 for the detail.
INFO:matplotlib.font_manager:Generating new fontManager, this may take some time...
[nltk_data] Downloading package wordnet to C:\Users\GREGOR~1\AppData\L
[nltk_data] ocal\Temp_MEI104602\nltk_data...
[nltk_data] Package wordnet is already up-to-date!
INSTALLING FFMPEG
VERIFYING FFMPEG INSTALL
WARNING:werkzeug:WebSocket transport not available. Install eventlet or gevent and gevent-websocket for improved performance.

Serving Flask app "main" (lazy loading)
Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.
Debug mode: off
INFO:werkzeug: * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:26:40] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:26:42] "GET /static/main.css HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:26:42] "GET /static/pane.js HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:26:42] "GET /static/favicon/favicon-16x16.png HTTP/1.1" 200 -
Starting Thread
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:09] "POST / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:09] "GET /static/application.js HTTP/1.1" 200 -
CxB55ktHT5jOvFCmAAAA: Sending packet OPEN data {'sid': 'CxB55ktHT5jOvFCmAAAA', 'upgrades': [], 'pingTimeout': 5000, 'pingInterval': 25000}
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet OPEN data {'sid': 'CxB55ktHT5jOvFCmAAAA', 'upgrades': [], 'pingTimeout': 5000, 'pingInterval': 25000}
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:09] "GET /socket.io/?EIO=4&transport=polling&t=NYGQCZ1 HTTP/1.1" 200 -
CxB55ktHT5jOvFCmAAAA: Received packet MESSAGE data 0/voice,
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Received packet MESSAGE data 0/voice,
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 0/voice,{"sid":"aOmhwVQaFaYr50KBAAAB"}
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:09] "GET /socket.io/?EIO=4&transport=polling&t=NYGQCZM&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 0/voice,{"sid":"aOmhwVQaFaYr50KBAAAB"}
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:09] "POST /socket.io/?EIO=4&transport=polling&t=NYGQCZL&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Coverting data\datasets\JamesEarlJones\audio.mp3..."}]
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:14] "GET /socket.io/?EIO=4&transport=polling&t=NYGQCZn&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Coverting data\datasets\JamesEarlJones\audio.mp3..."}]
INFO:voice:Coverting data\datasets\JamesEarlJones\audio.mp3...
ffmpeg version 4.3.2-2021-02-27-essentials_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-zlib --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-sdl2 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libgme --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libtheora --enable-libvo-amrwbenc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-librubberband
libavutil 56. 51.100 / 56. 51.100
libavcodec 58. 91.100 / 58. 91.100
libavformat 58. 45.100 / 58. 45.100
libavdevice 58. 10.100 / 58. 10.100
libavfilter 7. 85.100 / 7. 85.100
libswscale 5. 7.100 / 5. 7.100
libswresample 3. 7.100 / 3. 7.100
libpostproc 55. 7.100 / 55. 7.100
Input #0, mp3, from 'data\datasets\JamesEarlJones\audio.mp3':
Duration: 02:12:54.56, start: 0.025057, bitrate: 96 kb/s
Stream #0:0: Audio: mp3, 44100 Hz, mono, fltp, 96 kb/s
Metadata:
encoder : LAME3.100
Stream mapping:
Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'data\datasets\JamesEarlJones\audio-converted.wav':
Metadata:
ISFT : Lavf58.45.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Metadata:
encoder : Lavc58.91.100 pcm_s16le
size= 343434kB time=02:12:54.52 bitrate= 352.8kbits/s speed=1.3e+03x
video:0kB audio:343434kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: 0.000022%
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading script from data\datasets\JamesEarlJones\text.txt..."}]
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:20] "GET /socket.io/?EIO=4&transport=polling&t=NYGQDj5&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Loading script from data\datasets\JamesEarlJones\text.txt..."}]
INFO:voice:Loading script from data\datasets\JamesEarlJones\text.txt...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Searching text for matching fragments..."}]
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Searching text for matching fragments..."}]
INFO:voice:Searching text for matching fragments...
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:20] "emitting event "logs" to all [/voice]
GET /socket.io/?EIO=4&transport=polling&t=NYGQFEN&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Changing sample rate..."}]
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Changing sample rate..."}]
INFO:voice:Changing sample rate...
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:20] "GET /socket.io/?EIO=4&transport=polling&t=NYGQFF7&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Fetching segments..."}]
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:21] "GET /socket.io/?EIO=4&transport=polling&t=NYGQFFF&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Fetching segments..."}]
INFO:voice:Fetching segments...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Matching segments..."}]
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:23] "GET /socket.io/?EIO=4&transport=polling&t=NYGQFaD&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Matching segments..."}]
INFO:voice:Matching segments...
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Generating segments..."}]
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Generating segments..."}]
INFO:voice:Generating segments...
emitting event "progress" to all [/voice]
INFO:socketio.server:emitting event "progress" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 1","total":"2725"}]
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 1","total":"2725"}]
INFO:voice:Progress - 1/2725
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:23] "GET /socket.io/?EIO=4&transport=polling&t=NYGQG4i&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
Using cache found in C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master
torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to [Announcement] Improving I/O for correct and consistent experience pytorch/audio#903 for the detail.
error logging recieved invalid response
emitting event "error" to all [/voice]
INFO:socketio.server:emitting event "error" to all [/voice]
CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["error",{"type":"RuntimeError","text":"[enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory","stacktrace":"Traceback (most recent call last):\n File "application\utils.py", line 63, in background_task\n File "application\utils.py", line 39, in create_dataset\n File "dataset\clip_generator.py", line 60, in clip_generator\n File "dataset\forced_alignment\align.py", line 69, in process_segments\n File "dataset\transcribe.py", line 16, in transcribe\n File "torch\hub.py", line 370, in load\n File "torch\hub.py", line 399, in _load_local\n File "C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master\hubconf.py", line 24, in silero_stt\n model, decoder = init_jit_model(model_url=models.stt_models.get(language).latest.jit,\n File "C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master\utils.py", line 135, in init_jit_model\n model = torch.jit.load(model_path, map_location=device)\n File "torch\jit\_serialization.py", line 161, in load\nRuntimeError: [enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory\n"}]
INFO:werkzeug:127.0.0.1 - - [01/Apr/2021 20:28:34] "GET /socket.io/?EIO=4&transport=polling&t=NYGQG4q&sid=CxB55ktHT5jOvFCmAAAA HTTP/1.1" 200 -
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet MESSAGE data 2/voice,["error",{"type":"RuntimeError","text":"[enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory","stacktrace":"Traceback (most recent call last):\n File "application\utils.py", line 63, in background_task\n File "application\utils.py", line 39, in create_dataset\n File "dataset\clip_generator.py", line 60, in clip_generator\n File "dataset\forced_alignment\align.py", line 69, in process_segments\n File "dataset\transcribe.py", line 16, in transcribe\n File "torch\hub.py", line 370, in load\n File "torch\hub.py", line 399, in _load_local\n File "C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master\hubconf.py", line 24, in silero_stt\n model, decoder = init_jit_model(model_url=models.stt_models.get(language).latest.jit,\n File "C:\Users\Gregory Betsey/.cache\torch\hub\snakers4_silero-models_master\utils.py", line 135, in init_jit_model\n model = torch.jit.load(model_path, map_location=device)\n File "torch\jit\_serialization.py", line 161, in load\nRuntimeError: [enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory\n"}]
[enforce fail at ..\caffe2\serialize\inline_container.cc:145] . PytorchStreamReader failed reading zip archive: failed finding central directory
CxB55ktHT5jOvFCmAAAA: Sending packet PING data None
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Sending packet PING data None
CxB55ktHT5jOvFCmAAAA: Client is gone, closing socket
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Client is gone, closing socket
CxB55ktHT5jOvFCmAAAA: Client is gone, closing socket
INFO:engineio.server:CxB55ktHT5jOvFCmAAAA: Client is gone, closing socket

BenAAndrew · 2021-04-02T10:37:26Z

@GregoryBetsey If you look at the folder which contains your .exe, is there a file called latest_silero_models.yml?
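
The "failed finding central directory" message means torch.jit.load was handed a file that is not a complete zip archive, which usually points to an interrupted or corrupted model download. A minimal sketch for checking this is below. The function name and the file suffixes are hypothetical, and where the model file actually lives on your machine is an assumption you would need to confirm.

```python
import zipfile
from pathlib import Path


def find_unreadable_archives(directory, suffixes=(".pt", ".jit", ".model")):
    """Return files under `directory` that are not readable zip archives.

    The suffix list is a guess at what a downloaded model might be called;
    adjust it to match the files you actually see on disk.
    """
    root = Path(directory)
    if not root.is_dir():
        return []
    bad = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # is_zipfile looks for the end-of-central-directory record,
            # which a truncated download will be missing.
            if not zipfile.is_zipfile(path):
                bad.append(path)
    return bad
```

Pointing it at the hub cache shown in the traceback (the .cache\torch\hub folder under your user directory) would list any candidate files that fail the check. Deleting such a file so that it is downloaded again on the next run is the usual remedy, although I have not confirmed that this is the cause here.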

GregoryBetsey · 2021-04-02T19:44:16Z

Yes, there is. I ran it through Edge this time and got further than before, but hit a new error: cannot reshape tensor of 0 elements into shape [-1, 0] because the unspecified dimension size -1 can be any value and is ambiguous.

Error.txt

BenAAndrew · 2021-04-02T21:03:18Z

I'll investigate this and get back to you.

BenAAndrew · 2021-04-04T11:37:46Z

@GregoryBetsey In the latest build, 0.4.1, I've added some extra validation to the transcription process which may fix the bug. Could you give it a go?

GregoryBetsey · 2021-04-04T21:03:10Z

Thanks. I tried the latest build today and got stuck at the "Generating segments..." step. I will attach the log file.
Error 4.4.2021.txt

P.S. I am using the same files I sent to you via google drive.

BenAAndrew · 2021-04-06T17:08:30Z

@GregoryBetsey Thank you for the error log. The issue seems to be that the torchaudio library is unable to change the audio sample rate. I will investigate now.

BenAAndrew · 2021-04-07T11:43:12Z

@GregoryBetsey I've removed the code that was raising the error and replaced it with a different library. If you get a minute, could you try release 0.5.1?

GregoryBetsey · 2021-04-07T20:57:21Z

I gave it a go and got a different error this time: The expanded size of the tensor (12800) must match the existing size (0) at non-singleton dimension 0. Target sizes: [12800]. Tensor sizes: [0]
Error.txt

BenAAndrew · 2021-04-08T10:51:53Z

@GregoryBetsey did this error occur with the data source you sent to me?

BenAAndrew · 2021-04-08T11:43:38Z

@GregoryBetsey I haven't been able to replicate the issue, but I have identified what may have caused it and tried to fix it in 0.5.3.

GregoryBetsey · 2021-04-08T20:13:59Z

Yes, I am using the same files I sent you earlier. I will try your latest release and test the results.

GregoryBetsey · 2021-04-09T07:43:03Z

Update: I tried the latest release. I got a different error: data\datasets\JamesEarlJones\wavs\1470_2520.wav wav file is empty

Text.txt

BenAAndrew · 2021-04-09T10:56:59Z

@GregoryBetsey Very interesting, it seems like it can't open that file. Could you find that file and make sure it is playable? If it is, could you email it to me?
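
If playing the clip by ear is inconvenient, a rough check can be scripted with Python's standard wave module. This is only a sketch: the function name is hypothetical, it assumes the clip is plain PCM (which is what the FFmpeg log above reports, pcm_s16le), and it is not necessarily the same check the app performs before reporting "wav file is empty".

```python
import wave


def wav_duration_seconds(path):
    """Return the clip length in seconds, or 0.0 if it holds no audio frames."""
    try:
        with wave.open(str(path), "rb") as clip:
            frames = clip.getnframes()
            rate = clip.getframerate()
    except (wave.Error, EOFError):
        # The header is missing or damaged, so this is not a usable WAV file.
        return 0.0
    return frames / rate if rate else 0.0
```

Calling it on data\datasets\JamesEarlJones\wavs\1470_2520.wav and getting 0.0 back would suggest the clip really was written without audio, rather than the app failing to open a good file.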

GregoryBetsey · 2021-04-09T23:50:54Z

I am using the same audio and text transcript that I sent to you via Google Drive. The audio file is fine. If you need the link again, I can send it to you.

BenAAndrew · 2021-04-10T11:52:42Z

@GregoryBetsey I've produced the dataset, and that clip (1470_2520.wav) is playable and can be transcribed. Just to double-check, did you try playing the original audio or the 1470_2520.wav clip?

BenAAndrew · 2021-04-14T19:34:38Z

@GregoryBetsey, I've been able to reproduce this error once. It seems that FFmpeg (very rarely) corrupts the audio when trimming. Handling for this will be added in an upcoming release.

GregoryBetsey · 2021-04-14T19:44:12Z

Thanks for the update. I haven't got past the error.

BenAAndrew · 2021-04-14T23:52:33Z

Hi @GregoryBetsey, thank you for your patience. This should be handled in 0.6. Please let me know how you get on.

GregoryBetsey · 2021-04-15T19:19:59Z

Thanks for working on this. I don't know if this is progress, but it actually started generating segments this time, except I got a message saying the audio can't be transcribed. [Again, I am using the files from my Google Drive.]

Log.txt

BenAAndrew · 2021-04-15T21:51:39Z

@GregoryBetsey That's interesting. It looks like there's an issue with FFmpeg cutting the clips. Could you do the following:

Check that the audio files listed in the logs exist
Check if there is a folder called 'ffmpeg' in the same directory as the application. If there is, delete it.
Try installing FFmpeg manually, e.g. by following https://www.youtube.com/watch?v=hD9bQE4R6eA

The issue must be to do with FFmpeg, so if those files exist then it is not working correctly
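The FFmpeg checks above could be scripted roughly as follows. This is a minimal sketch, not the app's own code: the helper names `find_executable` and `local_ffmpeg_folder` are hypothetical, and the assumption that a bundled copy lives in a folder named `ffmpeg` next to the application comes only from the steps listed above.

```python
import shutil
from pathlib import Path
from typing import Optional


def find_executable(name: str = "ffmpeg") -> Optional[str]:
    """Return the full path of `name` if it is on PATH, otherwise None."""
    return shutil.which(name)


def local_ffmpeg_folder(app_dir: Path) -> Optional[Path]:
    """Return the 'ffmpeg' folder next to the app if it exists (assumed layout)."""
    candidate = app_dir / "ffmpeg"
    return candidate if candidate.is_dir() else None


if __name__ == "__main__":
    # Shows which FFmpeg (if any) a process started from this directory would see.
    print("ffmpeg on PATH:", find_executable("ffmpeg"))
    print("bundled ffmpeg folder:", local_ffmpeg_folder(Path.cwd()))
```

If the first line prints `None`, the manual install is not visible on PATH; if the second prints a path, a bundled copy is still present.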

GregoryBetsey · 2021-04-17T02:42:09Z

Okay, the app is generating the audio files. I installed ffmpeg to C:\ and it is working. I deleted the FFmpeg folder in the app folder, but I still get errors.

Error.txt

GregoryBetsey · 2021-04-19T16:56:22Z

Yes, it works.

BenAAndrew · 2021-04-19T19:47:52Z

Hmm, this is interesting. You see the app just runs the conversion command and then the trim command which is exactly what you've done here. Have you tried running the app again since reinstalling ffmpeg?

GregoryBetsey · 2021-04-20T03:08:04Z

Okay, the app is generating the audio files and I installed ffmpeg to C:\ and is working. I deleted FFmpeg in the app folder but I still get errors.

Yes, I did that here. The app generates clips. It says it cannot transcribe at the end and then it deletes all the generated wavs.
Error Log.txt

BenAAndrew · 2021-04-20T18:28:11Z

@GregoryBetsey whilst it is running, could you copy one of the generated wav files? It should be saved to data\datasets\dataset_name\wavs, where dataset_name is the name of the dataset. Then could you check if that is playable?

GregoryBetsey · 2021-04-20T23:48:48Z

The wav files can be opened, but since the generated length is 00:00:00 there isn't any sound. [see attachment]
Example.zip

BenAAndrew · 2021-04-21T10:31:57Z

OK, so FFmpeg isn't working when cutting the audio, as all of these clips should be at least 1 second long. I don't understand why the command would work outside of the app but not in it, as both should be using the same FFmpeg and command. I will try to resolve this this week.

RayDAnt3D · 2021-04-21T16:25:14Z

I'm also experiencing this issue exactly as described in #27 (nothing but "Could not transcribe data\datasets..." messages and zero-length wav files, despite having a tested, working ffmpeg install) when attempting to build either my own or the provided demo datasets.

Something I noticed that does seem off is that regardless of whether my source audio file is an mp3 or a wave, the application logfile always says that it is converting from an mp3. eg:

Coverting data\datasets\TestVoice\audio.mp3...
Loading script from data\datasets\TestVoice\text.txt...
Searching text for matching fragments...
Changing sample rate...
Fetching segments...
Matching segments...
Generating segments...
Could not transcribe data\datasets\TestVoice\wavs\1650_2730.wav
Could not transcribe data\datasets\TestVoice\wavs\5850_7680.wav
Could not transcribe data\datasets\TestVoice\wavs\7680_9330.wav
Could not transcribe data\datasets\TestVoice\wavs\9450_10530.wav
Could not transcribe data\datasets\TestVoice\wavs\10560_12720.wav

The audio file being converted above was a wave file named "this_is_a_wave_file.wav". Having said that, the "audio-converted.wav" and "audio-converted-16000.wav" files generated in the dataset's working directory are playable and seemingly in the right format according to VLC Player:

Stream 0 ("audio-converted.wav")
Codec: PCM S16 LE (s16l)
Type: Audio
Channels: Mono
Sample rate: 22050 Hz
Bits per sample: 16

Stream 0 ("audio-converted-16000.wav")
Codec: PCM S16 LE (s16l)
Type: Audio
Channels: Mono
Sample rate: 16000 Hz
Bits per sample: 16

It's just the separated out segments that are inoperable (nothing but 78 bytes of metadata in each one.)
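A quick way to confirm which clips are header-only is to read the frame count of each file. The following is a minimal sketch using Python's standard `wave` module; the helper names are hypothetical, and the 1-second threshold is an assumption taken from the maintainer's remark that every clip should be at least 1 second long.

```python
import wave
from pathlib import Path
from typing import List


def clip_duration_seconds(path: Path) -> float:
    """Return the duration of a PCM WAV file in seconds (0.0 if unreadable or empty)."""
    try:
        with wave.open(str(path), "rb") as wav:
            frames = wav.getnframes()
            rate = wav.getframerate()
    except (wave.Error, EOFError):
        # A truncated or malformed file is treated the same as an empty clip.
        return 0.0
    return frames / rate if rate else 0.0


def find_short_clips(wavs_dir: Path, min_seconds: float = 1.0) -> List[Path]:
    """List the .wav files in wavs_dir whose duration is below min_seconds."""
    return sorted(p for p in wavs_dir.glob("*.wav") if clip_duration_seconds(p) < min_seconds)
```

Calling `find_short_clips(Path("data/datasets/TestVoice/wavs"))` while the dataset is being built would list the affected segments before the app deletes them.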

BenAAndrew · 2021-04-21T17:42:24Z

@RayDAnt3D thank you for this info. This seems to be an issue for several people so it is my number one priority. I'm hoping to have it fixed by Sunday 🤞

BenAAndrew · 2021-04-22T20:23:29Z

@GregoryBetsey @RayDAnt3D I'm struggling to figure out what's causing this issue & I can't get it to replicate locally. The issue must be to do with either the FFmpeg install or one of the commands.

To test this I've produced the following: https://drive.google.com/drive/folders/17zT6fg7V_gu_kMVZs2ERPmfGyFRuDhWg?usp=sharing

In there you'll find a test audio file and a script. Could you try downloading both & running the script. Then check that it produces an audio file called test-final.wav that is playable & 3 seconds long.

Thank you for your patience

arthur465 · 2021-04-22T21:08:21Z

Hey I downloaded it and ran the script. I can confirm it produced a 3 second playable clip called "test-clip.wav"

RayDAnt3D · 2021-04-23T02:47:49Z

Also downloaded/ran the test script and audio clip and got the following tested working audio files generated:

test-clean.wav
test-clean-16000.wav
test-clip.wav

No "test-final.wav" though.

BenAAndrew · 2021-04-23T10:10:03Z

@RayDAnt3D @arthur465 Sorry I meant test-clip.wav. So it sounds like the FFmpeg commands are working for all of you. I'm going to try and create a release today which has improved error logging on the clip building process so we can find out where it is failing in the app

BenAAndrew · 2021-04-23T11:14:05Z

@GregoryBetsey @arthur465 @RayDAnt3D I've created a new release here: https://github.com/BenAAndrew/Voice-Cloning-App/releases/tag/v0.6.2. It won't fix the issue but it might help tell us what the error is. It will now check the output of the FFmpeg commands and will also show it running in the console. Could you give it a go and let me know what happens
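Checking a command's result and echoing its output could look roughly like this. It is a sketch under assumptions, not the code shipped in v0.6.2: the helper name `run_checked` is hypothetical, and the example FFmpeg arguments in the comment are illustrative rather than the app's actual command.

```python
import subprocess
import sys
from typing import List


def run_checked(command: List[str]) -> str:
    """Run a command, echo its diagnostic output, and raise if it exits with an error."""
    result = subprocess.run(command, capture_output=True, text=True)
    # FFmpeg writes its banner and progress to stderr, so that stream is echoed.
    print(result.stderr, file=sys.stderr, end="")
    if result.returncode != 0:
        raise RuntimeError(f"Command failed with exit code {result.returncode}: {command}")
    return result.stderr


# Illustrative call (arguments are assumptions, not taken from the app):
# run_checked(["ffmpeg", "-y", "-i", "audio-converted.wav",
#              "-ss", "00:00:00.060", "-to", "00:00:01.680", "clip.wav"])
```

A successful exit code may not guarantee a non-empty clip, so a separate check on the output file could still be needed.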

arthur465 · 2021-04-23T18:53:41Z

Ok here's the error I get

INFO:voice:Progress - 391/416
INFO:werkzeug:127.0.0.1 - - [23/Apr/2021 11:44:13] "GET /socket.io/?EIO=4&transport=polling&t=Na02_Cm&sid=7KcT1PoIXIKbnCvUAAAA HTTP/1.1" 200 -
ffmpeg version 2021-04-18-git-d43b26b30d-full_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libglslang --enable-vulkan --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 56. 73.100 / 56. 73.100
libavcodec 58.136.101 / 58.136.101
libavformat 58. 78.100 / 58. 78.100
libavdevice 58. 14.100 / 58. 14.100
libavfilter 7.111.100 / 7.111.100
libswscale 5. 10.100 / 5. 10.100
libswresample 3. 10.100 / 3. 10.100
libpostproc 55. 10.100 / 55. 10.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'data\datasets\Arthur 2\audio-converted.wav':
Metadata:
encoder : Lavf58.78.100
Duration: 00:17:31.99, bitrate: 352 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'data\datasets\Arthur 2\wavs\994140_995250.wav':
Metadata:
ISFT : Lavf58.78.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Metadata:
encoder : Lavc58.136.101 pcm_s16le
size= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used)
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
7KcT1PoIXIKbnCvUAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Could not transcribe data\datasets\Arthur 2\wavs\994140_995250.wav"}]
INFO:werkzeug:127.0.0.1 - - [23/Apr/2021 11:44:13] "GET /socket.io/?EIO=4&transport=polling&t=Na02_Cs&sid=7KcT1PoIXIKbnCvUAAAA HTTP/1.1" 200 -
INFO:engineio.server:7KcT1PoIXIKbnCvUAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Could not transcribe data\datasets\Arthur 2\wavs\994140_995250.wav"}]
INFO:voice:Could not transcribe data\datasets\Arthur 2\wavs\994140_995250.wav
emitting event "progress" to all [/voice]
INFO:socketio.server:emitting event "progress" to all [/voice]
7KcT1PoIXIKbnCvUAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 392","total":"416"}]
INFO:engineio.server:7KcT1PoIXIKbnCvUAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 392","total":"416"}]

RayDAnt3D · 2021-04-23T19:33:42Z

Here's what I get for the first sample cutting attempt (and every other thereafter) using the Ayoade dataset assets:

INFO:voice:Generating segments...
ffmpeg version 2021-04-18-git-d43b26b30d-full_build-www.gyan.dev Copyright (c) 2000-2021 the FFmpeg developers
built with gcc 10.2.0 (Rev6, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libdav1d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxvid --enable-libaom --enable-libopenjpeg --enable-libvpx --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libmfx --enable-libglslang --enable-vulkan --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 56. 73.100 / 56. 73.100
libavcodec 58.136.101 / 58.136.101
libavformat 58. 78.100 / 58. 78.100
libavdevice 58. 14.100 / 58. 14.100
libavfilter 7.111.100 / 7.111.100
libswscale 5. 10.100 / 5. 10.100
libswresample 3. 10.100 / 3. 10.100
libpostproc 55. 10.100 / 55. 10.100
Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'data\datasets\Ayoade\audio-converted.wav':
Metadata:
artist : Richard Ayoade
comment : At last, the definitive audiobook about perhaps the best cabin crew dramedy ever filmed: View from the Top starring Gwyneth Paltrow. In Ayoade on Top, Richard Ayoade, perhaps one of the most 'insubstantial' people of our age, takes us on a journey from Pe
copyright : ©2019 Richard Ayoade (P)2019 Audible, Ltd
date : 2019
genre : Audiobook
title : 1 - Ayoade on Top
album : Ayoade on Top
track : 1/1
encoder : Lavf58.78.100
Duration: 04:39:25.09, bitrate: 352 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'data\datasets\Ayoade\wavs\60_1680.wav':
Metadata:
IART : Richard Ayoade
ICMT : At last, the definitive audiobook about perhaps the best cabin crew dramedy ever filmed: View from the Top starring Gwyneth Paltrow. In Ayoade on Top, Richard Ayoade, perhaps one of the most 'insubstantial' people of our age, takes us on a journey from Pe
ICOP : ©2019 Richard Ayoade (P)2019 Audible, Ltd
ICRD : 2019
IGNR : Audiobook
INAM : 1 - Ayoade on Top
IPRD : Ayoade on Top
IPRT : 1/1
ISFT : Lavf58.78.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Metadata:
encoder : Lavc58.136.101 pcm_s16le
size= 1kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used)
Using cache found in C:\Users\gbase/.cache\torch\hub\snakers4_silero-models_master
NpmL9qWiQt7YXzLnAAAA: Sending packet PING data None
INFO:engineio.server:NpmL9qWiQt7YXzLnAAAA: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [23/Apr/2021 15:19:47] "GET /socket.io/?EIO=4&transport=polling&t=Na0B7hC&sid=NpmL9qWiQt7YXzLnAAAA HTTP/1.1" 200 -
NpmL9qWiQt7YXzLnAAAA: Received packet PONG data
INFO:engineio.server:NpmL9qWiQt7YXzLnAAAA: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [23/Apr/2021 15:19:47] "POST /socket.io/?EIO=4&transport=polling&t=Na0B84R&sid=NpmL9qWiQt7YXzLnAAAA HTTP/1.1" 200 -
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
NpmL9qWiQt7YXzLnAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Could not transcribe data\datasets\Ayoade\wavs\60_1680.wav"}]
INFO:engineio.server:NpmL9qWiQt7YXzLnAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Could not transcribe data\datasets\Ayoade\wavs\60_1680.wav"}]
INFO:voice:Could not transcribe data\datasets\Ayoade\wavs\60_1680.wav
emitting event "progress" to all [/voice]
INFO:socketio.server:emitting event "progress" to all [/voice]
NpmL9qWiQt7YXzLnAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 1","total":"5021"}]
INFO:engineio.server:NpmL9qWiQt7YXzLnAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 1","total":"5021"}]
INFO:voice:Progress - 1/5021

For what it's worth, here also is my app log at first startup:

[12568] WARNING: file already exists but should not: C:\Users\gbase\AppData\Local\Temp_MEI125682\torch_C.cp38-win_amd64.pyd
Server initialized for threading.
Server initialized for threading.
torchaudio\extension\extension.py:14: UserWarning: torchaudio C++ extension is not available.
torchaudio\backend\utils.py:63: UserWarning: The interface of "soundfile" backend is planned to change in 0.8.0 to match that of "sox_io" backend and the current interface will be removed in 0.9.0. To use the new interface, do torchaudio.USE_SOUNDFILE_LEGACY_INTERFACE = False before setting the backend to "soundfile". Please refer to pytorch/audio#903 for the detail.
INFO:matplotlib.font_manager:Generating new fontManager, this may take some time...
[nltk_data] Downloading package wordnet to
[nltk_data] C:\Users\gbase\AppData\Local\Temp_MEI125682\nltk_data
[nltk_data] ...
[nltk_data] Package wordnet is already up-to-date!
WARNING:werkzeug:WebSocket transport not available. Install eventlet or gevent and gevent-websocket for improved performance.

Serving Flask app "main" (lazy loading)

Environment: production
WARNING: This is a development server. Do not use it in a production deployment.
Use a production WSGI server instead.

Debug mode: off
INFO:werkzeug: * Running on http://127.0.0.1:5000/ (Press CTRL+C to quit)
INFO:werkzeug:127.0.0.1 - - [23/Apr/2021 15:17:35] "GET / HTTP/1.1" 200 -

RayDAnt3D · 2021-04-23T20:45:10Z

Did some source code snooping and noticed that running:

start_timestamp = datetime.fromtimestamp(start / 1000).strftime("%H:%M:%S.%f")

As appears in dataset\audio_processing.py inside the cut_audio() routine with start=60 (as the first Ayoade clip would be) on the Python commandline like so:

from subprocess import call
from pathlib import Path
from datetime import datetime
from pydub import AudioSegment
import os
datetime.fromtimestamp(60 / 1000).strftime("%H:%M:%S.%f")

results in the following output:

'19:00:00.060000'

Pretty sure that additional '19:00:00.000000' shouldn't be there. The root of the problem may just be a date/time localization mismatch.
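The offset is consistent with `datetime.fromtimestamp()` treating its argument as seconds since the Unix epoch and converting the result to local time: on a machine set to UTC-5, the epoch itself falls at 19:00. A timezone-independent alternative is to compute the fields arithmetically. The sketch below uses a hypothetical helper name, `format_timestamp`, and is not necessarily how the app handles it.

```python
from datetime import datetime, timezone


def format_timestamp(milliseconds: int) -> str:
    """Format a millisecond offset as HH:MM:SS.mmm without any timezone conversion."""
    hours, remainder = divmod(milliseconds, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    seconds, millis = divmod(remainder, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


print(format_timestamp(60))         # 00:00:00.060
print(format_timestamp(1_436_520))  # 00:23:56.520

# Passing an explicit UTC timezone also removes the local-time offset.
print(datetime.fromtimestamp(60 / 1000, tz=timezone.utc).strftime("%H:%M:%S.%f"))  # 00:00:00.060000
```

With a start time of 19:00:00.06, a seek would land far beyond the end of the audio, which would explain FFmpeg reporting that nothing was encoded.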

BenAAndrew · 2021-04-24T13:05:33Z

@RayDAnt3D great find. What time localization do you use?

ironpanther · 2021-04-24T17:29:36Z

I tried the latest version (0.63) just to see if anything was different-----the initial files it creates from my sample mp3------audio.mp3, audio-converted.wav, and audio-converted-16000.wav are all fine, same as before. The many individual clip-wavs inside the folder, are all "empty" files, with length 00:00:00, size 78 bytes. I believe that's the same as before (I stopped it before it auto-deleted them this time, so I could check them)

Error when trying to process are similar to the post above:

Guessed Channel Layout for Input Stream #0.0 : mono
Input #0, wav, from 'data\datasets\Kate\audio-converted.wav':
Metadata:
encoder : Lavf58.76.100
Duration: 04:18:04.84, bitrate: 352 kb/s
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Stream mapping:
Stream #0:0 -> #0:0 (pcm_s16le (native) -> pcm_s16le (native))
Press [q] to stop, [?] for help
Output #0, wav, to 'data\datasets\Kate\wavs\1436520_1438290.wav':
Metadata:
ISFT : Lavf58.76.100
Stream #0:0: Audio: pcm_s16le ([1][0][0][0] / 0x0001), 22050 Hz, mono, s16, 352 kb/s
Metadata:
encoder : Lavc58.134.100 pcm_s16le
size= 0kB time=00:00:00.00 bitrate=N/A speed= 0x
video:0kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Output file is empty, nothing was encoded (check -ss / -t / -frames parameters if used)
emitting event "logs" to all [/voice]
INFO:socketio.server:emitting event "logs" to all [/voice]
MRg0ipT8vkRYLkMJAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Could not transcribe data\datasets\Kate\wavs\1436520_1438290.wav"}]
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 12:19:54] "GET /socket.io/?EIO=4&transport=polling&t=Na4vHg3&sid=MRg0ipT8vkRYLkMJAAAA HTTP/1.1" 200 -
INFO:engineio.server:MRg0ipT8vkRYLkMJAAAA: Sending packet MESSAGE data 2/voice,["logs",{"text":"Could not transcribe data\datasets\Kate\wavs\1436520_1438290.wav"}]
INFO:voice:Could not transcribe data\datasets\Kate\wavs\1436520_1438290.wav
emitting event "progress" to all [/voice]
INFO:socketio.server:emitting event "progress" to all [/voice]
MRg0ipT8vkRYLkMJAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 356","total":"5134"}]
INFO:engineio.server:MRg0ipT8vkRYLkMJAAAA: Sending packet MESSAGE data 2/voice,["progress",{"number":" 356","total":"5134"}]
INFO:voice:Progress - 356/5134

RayDAnt3D · 2021-04-24T18:38:16Z

@BenAAndrew US EST (technically currently EDT.)

BenAAndrew · 2021-04-24T19:15:47Z

@RayDAnt3D @ironpanther @arthur465 @GregoryBetsey I've rewritten the timestamp function to fix this. Added in https://github.com/BenAAndrew/Voice-Cloning-App/releases/tag/v0.7. Please test if you get a chance

arthur465 · 2021-04-24T19:37:26Z

@RayDAnt3D @ironpanther @arthur465 @GregoryBetsey I've rewritten the timestamp function to fix this. Will be added in release 0.7. Please test if you get a chance

It looks like it's working!

KoolenDasheppi · 2021-04-24T19:38:27Z

Release 0.7 fixed it for me (I've been keeping an eye on this repo and this issue so I can know when it got fixed). I'm also excited about the HiFi-GAN addition. Thanks for developing this by the way, you're doing an awesome job!

RayDAnt3D · 2021-04-24T20:11:15Z

It's fixed for me! Currently doing my first training run now.

ironpanther · 2021-04-24T21:00:11Z

The .wav generation/clips seems to work now, and it gets much further, but it's been "stuck in a loop" for a while now----I have:

Coverting data\datasets\Kate\audio.mp3...
Loading script from data\datasets\Kate\text.txt...
Searching text for matching fragments...
Changing sample rate...
Fetching segments...
Matching segments...
Generating segments...

And the cmd window just keeps repeating:
INFO:engineio.server:Fo5fjmXqSKaqAfxdAAAC: Sending packet PING data None
Fo5fjmXqSKaqAfxdAAAC: Received packet PONG data
INFO:engineio.server:Fo5fjmXqSKaqAfxdAAAC: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 15:54:33] "POST /socket.io/?EIO=4&transport=polling&t=Na5gP-I&sid=Fo5fjmXqSKaqAfxdAAAC HTTP/1.1" 200 -
Fo5fjmXqSKaqAfxdAAAC: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 15:54:58] "GET /socket.io/?EIO=4&transport=polling&t=Na5gP-K&sid=Fo5fjmXqSKaqAfxdAAAC HTTP/1.1" 200 -
INFO:engineio.server:Fo5fjmXqSKaqAfxdAAAC: Sending packet PING data None
Fo5fjmXqSKaqAfxdAAAC: Received packet PONG data
INFO:engineio.server:Fo5fjmXqSKaqAfxdAAAC: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 15:54:58] "POST /socket.io/?EIO=4&transport=polling&t=Na5gWB0&sid=Fo5fjmXqSKaqAfxdAAAC HTTP/1.1" 200 -
Fo5fjmXqSKaqAfxdAAAC: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 15:55:23] "GET /socket.io/?EIO=4&transport=polling&t=Na5gWB1&sid=Fo5fjmXqSKaqAfxdAAAC HTTP/1.1" 200 -
INFO:engineio.server:Fo5fjmXqSKaqAfxdAAAC: Sending packet PING data None
Fo5fjmXqSKaqAfxdAAAC: Received packet PONG data
INFO:engineio.server:Fo5fjmXqSKaqAfxdAAAC: Received packet PONG data
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 15:55:23] "POST /socket.io/?EIO=4&transport=polling&t=Na5gcM6&sid=Fo5fjmXqSKaqAfxdAAAC HTTP/1.1" 200 -
Fo5fjmXqSKaqAfxdAAAC: Sending packet PING data None
INFO:werkzeug:127.0.0.1 - - [24/Apr/2021 15:55:48] "GET /socket.io/?

::edit:: saw something new while typing---
INFO:engineio.server:M-50GuX-ODIZG_s_AAAE: Received packet CLOSE data
INFO:engineio.server:M-50GuX-ODIZG_s_AAAE: Client is gone, closing socket

Could a future version, have an option on which browser to open with? I think that being able to choose chrome etc, may work better, as my PC has 4 different browsers, and all behave differently when running scripts.

ironpanther · 2021-04-24T22:05:15Z

Update----I "let the browser window it opened automatically" just sit there, and opened a new browser window but in chrome, and that worked. So I think either "let user choose browser, or default to chrome browser instead of OS browser" is a needed option.

::edit:: Would also suggest that in train.py, "ITERS_PER_CHECKPOINT = 1000" be lowered----currently, that results in only saving approximately once per hour, on my GTX 1080. I could easily lose internet connection etc before it saves again, and lose many iterations. Or if I wish to stop for a while, and do something else with my GPU, after 45 mins of training--that would all be lost, as it wouldn't have saved since then. "Manual save" and/or more frequent checkpoints would also allow more experimenting with determining optimum batch size etc.

BenAAndrew · 2021-04-25T10:40:54Z

@ironpanther thanks for the feedback. Trying to handle the non-default browser is a bit complex and also the browser shouldn't affect performance. Additionally, the app does not need an internet connection to run (despite running in the browser).

As for the "stuck in a loop" I don't think it is, those messages are just logging for the app and not the process itself. It may take a while to finish processing even after the progress bar is done.

Changing the checkpoint frequency is a good idea and I will add it in the future.

BenAAndrew · 2021-04-25T16:27:57Z

Closing as everyone seems happy this particular issue is now fixed. If it has not been fixed please reopen it.

BenAAndrew added the bug label Mar 28, 2021

BenAAndrew self-assigned this Mar 29, 2021

BenAAndrew changed the title ~~Voice-Cloning-App.exe not working on Windows~~ Dataset Generation audio processing Apr 8, 2021

BenAAndrew mentioned this issue Apr 8, 2021

Error: The expanded size of the tensor must match the existing size at non-singleton dimension 0. #17

Closed

BenAAndrew changed the title ~~Dataset Generation audio processing~~ Error: The expanded size of the tensor must match the existing size at non-singleton dimension 0. Apr 8, 2021

BenAAndrew changed the title ~~Error: The expanded size of the tensor must match the existing size at non-singleton dimension 0.~~ Transcription error: wav file is empty Apr 14, 2021

BenAAndrew mentioned this issue Apr 20, 2021

Training: list index out of range #24

Closed

BenAAndrew mentioned this issue Apr 21, 2021

Can't transcribe data when building dataset #27

Closed

BenAAndrew closed this as completed Apr 25, 2021
