-
Notifications
You must be signed in to change notification settings - Fork 5
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
11 changed files
with
124 additions
and
97 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,4 @@ | ||
*__pycache__* | ||
dataset/ | ||
dataset/* | ||
logs/ | ||
test.py |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,104 @@ | ||
# JK-VITS | ||
Bilingual-TTS (Japanese and Korean) | ||
This Repository can speak Japanese even if you train with Korean dataset, and can speak Korean even if you train with Japanese dataset. | ||
By transcribing pronunciation from Japanese to Korean and Korean to Japanese, the unstable voice produced when using the existing multilingual ipa cleaners has been improved. | ||
|
||
|
||
|
||
## Table of Contents | ||
- [Prerequisites](#prerequisites) | ||
- [Installation](#installation) | ||
- [Prepare_Datasets](#Prepare_Datasets) | ||
- [Usage](#usage) | ||
- [Inference](#inference) | ||
- [References](#References) | ||
|
||
|
||
## Pre-requisites | ||
- A Windows/Linux system with a minimum of `16GB` RAM. | ||
- A GPU with at least `12GB` of VRAM. | ||
- Python >= 3.8 | ||
- Anaconda installed. | ||
- PyTorch installed. | ||
- CUDA 11.7 installed. | ||
|
||
|
||
|
||
Pytorch install command: | ||
```sh | ||
pip install torch==1.13.1+cu117 torchvision==0.14.1+cu117 torchaudio==0.13.1 --extra-index-url https://download.pytorch.org/whl/cu117 | ||
``` | ||
CUDA 11.7 Install: | ||
`https://developer.nvidia.com/cuda-11-7-0-download-archive` | ||
CUDNN 11.x Install: | ||
`https://developer.nvidia.com/rdp/cudnn-archive` | ||
|
||
|
||
--- | ||
## Installation | ||
1. **Create an Anaconda environment:** | ||
|
||
```sh | ||
conda create -n jk-vits python=3.8 | ||
``` | ||
|
||
2. **Activate the environment:** | ||
|
||
```sh | ||
conda activate jk-vits | ||
``` | ||
|
||
3. **Clone this repository to your local machine:** | ||
|
||
```sh | ||
git clone https://github.com/kdrkdrkdr/JK-VITS.git | ||
``` | ||
|
||
4. **Navigate to the cloned directory:** | ||
|
||
```sh | ||
cd JK-VITS | ||
``` | ||
|
||
5. **Install the necessary dependencies:** | ||
|
||
```sh | ||
pip install -r requirements.txt | ||
pip install -U pyopenjtalk==0.2.0 --no-build-isolation | ||
``` | ||
--- | ||
|
||
## Preparing Dataset Example | ||
|
||
- Place the audio files as follows. | ||
.wav files are okay. The sample rate of the audio must be 44100 Hz. | ||
|
||
|
||
- Preprocessing (g2p) for your own datasets. Preprocessed phonemes for your dataset. | ||
```sh | ||
python preprocess.py --filelists filelists/train.txt filelists/val.txt | ||
``` | ||
|
||
- Set configs. | ||
If you train with japanese dataset, refer [configs/ja.json](configs/ja.json) | ||
If you train with korean dataset, refer [configs/ko.json](configs/ko.json) | ||
--- | ||
|
||
## Training Exmaple | ||
```sh | ||
python train.py -c configs/ft.json -m ft | ||
``` | ||
|
||
|
||
--- | ||
## Inference Exmaple | ||
See [inference.ipynb](inference.ipynb) | ||
|
||
|
||
|
||
--- | ||
## References | ||
For more information, please refer to the following repositories: | ||
- [jaywalnut310/vits](https://github.com/jaywalnut310/vits.git) | ||
- [MasayaKawamura/MB-iSTFT-VITS](https://github.com/MasayaKawamura/) | ||
- [Kyubyong/g2pK](https://github.com/Kyubyong/g2pK) |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file was deleted.
Oops, something went wrong.
Empty file.
Empty file.
Large diffs are not rendered by default.
Oops, something went wrong.
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,31 +1,17 @@ | ||
# utils | ||
cmake | ||
ffmpeg | ||
|
||
# torch | ||
--extra-index-url https://download.pytorch.org/whl/cu117 | ||
torch==1.13.1+cu117 | ||
torchvision==0.14.1+cu117 | ||
torchaudio==0.13.1 | ||
|
||
# vits | ||
Cython==0.29.21 | ||
librosa==0.8.0 | ||
matplotlib==3.3.1 | ||
numpy==1.18.5 | ||
scipy==1.5.2 | ||
tensorboard==2.3.0 | ||
Unidecode==1.1.1 | ||
pysoundfile==0.9.0.post1 | ||
monotonic-align | ||
g2pk2 | ||
eunjeon | ||
ko_pron==1.3 | ||
jamo==0.4.1 | ||
pyopenjtalk==0.2.0 | ||
# pyopenjtalk==0.2.0 | ||
jaconv | ||
protobuf==3.19.0 | ||
|
||
# # Nuwave2 | ||
# prefetch_generator | ||
# omegaconf==2.0.6 | ||
# pytorch_lightning==1.2.10 | ||
Cython==0.29.21 | ||
librosa==0.8.0 | ||
numpy==1.18.5 | ||
scipy==1.5.2 |
File renamed without changes.