Multi‐Speaker Function Description
After installing a voicebank and selecting it as the singer, you can use the voice color (CLR) parameter to quickly assign a speaker to each note.
If you need to fine-tune the proportion of each speaker, click the gear icon in the lower left corner → "Add all expressions suggested by renderers" to add the speaker control curve parameters (CL01, CL02, etc.) to the project. Each parameter ranges from 0 to 100.
Mechanism: the control curve values of all speakers are summed. If the sum is less than 100%, the remainder is filled by the speaker specified in CLR. If the sum is greater than 100%, all values are scaled down proportionally so that the sum equals 100%.
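The mixing rule above can be sketched in a few lines of Python. This is only an illustration of the described behavior, not the renderer's actual code: the function name `mix_speakers`, the dictionary layout, and the choice to add the remainder on top of the CLR speaker's own curve value are assumptions.

```python
def mix_speakers(curves, clr_speaker):
    """Return per-speaker weights in percent that sum to 100.

    curves: hypothetical mapping of speaker name -> curve value (0 to 100).
    clr_speaker: the speaker selected by the CLR parameter.
    """
    # Clamp every curve value to the documented 0 to 100 range.
    weights = {name: min(max(float(value), 0.0), 100.0)
               for name, value in curves.items()}
    total = sum(weights.values())

    if total < 100.0:
        # Assumption: the missing share is given to the CLR speaker.
        weights[clr_speaker] = weights.get(clr_speaker, 0.0) + (100.0 - total)
    elif total > 100.0:
        # Scale everything down proportionally so the sum is 100.
        weights = {name: value * 100.0 / total
                   for name, value in weights.items()}
    return weights


if __name__ == "__main__":
    print(mix_speakers({"opencpop": 30, "qixuan": 20}, "xiayezi"))
    print(mix_speakers({"opencpop": 80, "qixuan": 120}, "opencpop"))
```

In the first call the curves sum to 50, so the CLR speaker receives the remaining 50. In the second call the clamped curves sum to 180, so both are scaled by 100/180.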
The following packaging steps use the FemaleTriplet voicebank as an example:
Place all the exported emb files in the voicebank folder, alongside dsconfig.yaml. The emb files can be renamed, but the names must be in English.
Then add the following to dsconfig.yaml. Note: all files must be encoded as UTF-8, and the indentation of the yaml file is significant. Editing the yaml file in an editor with syntax checking, such as VSCode, is recommended:
# hidden_size used for training, default is 256
hidden_size: 256
# English name of each speaker, corresponding to the filename of its emb
speakers:
  - opencpop
  - qixuan
  - xiayezi
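Since each entry under speakers has to match an emb file next to dsconfig.yaml, a quick check before packaging can catch typos. The following is a minimal sketch: the helper name `find_emb_problems` is hypothetical, the `.emb` extension is taken from the file names described here, and "in English" is interpreted as ASCII-only, which is an assumption.

```python
from pathlib import Path


def find_emb_problems(folder, speakers):
    """List problems with speaker names and their emb files.

    folder: the voicebank folder that contains dsconfig.yaml.
    speakers: the names listed under `speakers:` in dsconfig.yaml.
    """
    folder = Path(folder)
    problems = []
    for name in speakers:
        # Assumption: "English" names means ASCII-only names.
        if not name.isascii():
            problems.append(f"{name!r}: name contains non-ASCII characters")
        # Each speaker is expected to have <name>.emb in the folder.
        if not (folder / f"{name}.emb").is_file():
            problems.append(f"{name!r}: {name}.emb not found in {folder}")
    return problems


if __name__ == "__main__":
    for line in find_emb_problems(".", ["opencpop", "qixuan", "xiayezi"]):
        print(line)
```

An empty result means every listed speaker has a matching file.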
Add the following to character.yaml:
# Each speaker is a subbank
# color is the speaker's display name, which must start with a two-digit number in the same order as in this file: 01, 02, 03, 04, ...
# suffix is the English name of the speaker, corresponding to the .emb file name
# prefix and tone_ranges are optional
subbanks:
  - color: "01: Opencpop"
    prefix: ''
    suffix: opencpop
    tone_ranges:
      - C1-B7
  - color: "02: 绮萱"
    prefix: ''
    suffix: qixuan
    tone_ranges:
      - C1-B7
  - color: "03: 夏叶子"
    prefix: ''
    suffix: xiayezi
    tone_ranges:
      - C1-B7
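The two rules for character.yaml (colors numbered 01, 02, 03, ... in file order, and each suffix matching a speaker name) can be checked mechanically. The sketch below works on already-parsed data, so it does not depend on any particular YAML library; the function name `check_subbanks` and the message wording are hypothetical.

```python
import re


def check_subbanks(speakers, subbanks):
    """Return a list of problems found in the subbank definitions.

    speakers: names listed under `speakers:` in dsconfig.yaml.
    subbanks: list of dicts parsed from `subbanks:` in character.yaml.
    """
    problems = []
    for index, subbank in enumerate(subbanks, start=1):
        color = str(subbank.get("color", ""))
        suffix = subbank.get("suffix")

        # color must start with a two-digit number matching its position.
        match = re.match(r"(\d{2})", color)
        if match is None:
            problems.append(f"subbank {index}: color {color!r} has no two-digit prefix")
        elif int(match.group(1)) != index:
            problems.append(f"subbank {index}: color {color!r} should start with {index:02d}")

        # suffix must name one of the speakers (and so one of the emb files).
        if suffix not in speakers:
            problems.append(f"subbank {index}: suffix {suffix!r} is not a listed speaker")
    return problems


if __name__ == "__main__":
    speakers = ["opencpop", "qixuan", "xiayezi"]
    subbanks = [
        {"color": "01: Opencpop", "suffix": "opencpop"},
        {"color": "02: 绮萱", "suffix": "qixuan"},
        {"color": "03: 夏叶子", "suffix": "xiayezi"},
    ]
    print(check_subbanks(speakers, subbanks))
```

For the FemaleTriplet example above, the check reports no problems.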