This is an open-source, offline project, and its developers and maintainers (hereinafter referred to as contributors) have no control over how it is used. The contributors have never provided any organization or individual with any form of assistance, including but not limited to dataset extraction, dataset processing, computing support, training support, or inference. The contributors do not and cannot know what users are using the project for. Therefore, all AI models and synthesized audio produced with this project have nothing to do with its contributors. All problems arising therefrom shall be borne by the user.
No longer updated (because of slow download and upload speeds): Vocoder & HiddenUnitBert
Latest repository link: HuggingFace
Notes on exporting ONNX sub-models:
- HuBert: input_names should be ["source"], output_names should be ["embed"], dynamic_axes should be {"source":[0,2],}
- hifigan for Diffusion model: input_names should be ["c", "f0"], output_names should be ["audio"], dynamic_axes should be {"c":[0,1], "f0":[0,1],}
- hifigan for Tacotron2: input_names should be ["x"], output_names should be ["audio"], dynamic_axes should be {"x":[0,1],}
- 1. You use the program at your own risk and bear all consequences arising from its use.
- 2. You may not sell the program or its affiliated sub-models; you are responsible for all consequences resulting from any such sale.
- 3. When using the program, you must abide by local laws and regulations and must not use MoeSS for illegal activities; if you do, you are responsible for all consequences.
- 4. Use in commercial games, low-quality games, or Galgame production is forbidden, except for free, high-quality game production and mod production for other games.
- 5. It is forbidden to use the project, its derivatives, or the released models to make “digital junk” (i.e. games in which most of the content, such as artwork, is generated by AI).
- 6. Use for political purposes is prohibited; you bear the consequences of generating any politically related content.
A: The project is permanently open source and free. If you find a paid version of this software anywhere, please report it to the shopping site immediately and do not buy it. If you would like to support Shirakana by making a donation, you can go to https://afdian.net/a/NaruseMioShirakana for more information.
A: In principle we do not offer this service. Training TTS models is relatively simple and there is no need to spend money on it; just follow the online tutorials step by step.
A: 1. Originality. The proportion of your own work in the overall project (for AI, creations made with models you trained entirely independently belong to you; work generated with someone else's model belongs to someone else). Aspects covered include, but are not limited to, programming, artwork, audio, and design. For example, a game built from a template taken from an engine's marketplace, such as Unity's, is "digital junk".
2. Developer attitude. Whether the author is only trying to make a profit or to satisfy vanity. For example, if a game is promoted with exaggerated adjectives such as "the first" or "the best" to attract attention but turns out to be very bad or mediocre, and the author clearly has no intention of making the game properly, this type of work is "digital junk".
3. We oppose any commercial use of AI models trained from unlicensed datasets.
A: If it can be established that what you are doing is not “digital junk” and is also legally compliant and not heavily political, I will provide some technical support where I can.
A speech synthesis UI written entirely in C++, based on various open-source TTS, VC, and SVS projects
Supported projects:
The image resource used is derived from:
Currently only Windows is supported.
1. Download the software package from the release page and unzip it.
2. Download the appropriate sub-models or additional modules from the [Vocoder & HiddenUnitBert] repository above and place them in the corresponding folders. The correspondence between sub-models and projects is described below.
3. Place the model in the "Mods" folder and select it from the model selection module at the top left. For the standard model structure, refer to "Supported Projects" below.
4. Enter the text to be converted in the input box below and click "Enable Plugin" to run the text Cleaner. Line breaks act as the sentence separator for batch conversion. (SoVits/DiffSvc require the audio path; DiffSinger requires the path of the ds or json project file.)
5. Click "Start Synthesis" and wait for the progress bar to complete. When the audio is ready, you can preview it in the player at the top right, where you can also save the audio file directly.
6. It can also be run from the command line (version 1.X only):
PowerShell: & '.\xxx.exe' "ModDir" "InputText" "outputDir" "Symbol"
CMD: "xxx.exe" "ModDir" "InputText" "outputDir" "Symbol"
where ModDir is the "model path\\model name", e.g. "Mods\\Shiroha\\Shiroha"
InputText is the text to be converted (only spaces, commas and letters are supported)
outputDir is the output file name (not the path, but the file name, no need to add suffixes)
See below for Symbol relevance.
The output file is in the "tmpDir" folder by default.
- The software standardises the model reading module; models are saved in a sub-folder under the Mods folder. A JSON file such as "********.json" is a model configuration file, which declares the model path and its display name. You need to convert the model to an ONNX model before you can use it. The code used to convert these models can be found in the repositories on my GitHub page.
- Folder: The name of the folder where the model is stored
- Name: the name of the model to be displayed in the UI
- Type: type of the model(see below)
- Rate: the sampling rate (must be exactly the same as the rate you set during training)
{
"Folder" : "Atri",
"Name" : "atri-Tacotron2",
"Type" : "Tacotron2",
"Rate" : 22050,
"Symbol" : "_-!'(),.:;? ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz",
"Cleaner" : "JapaneseCleaner",
"Hifigan": "hifigan"
}
//Symbol: the symbol table of the model. If you don't know what it is, look up TTS documentation online. This field is required in the Tacotron2 model's configuration file.
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Hifigan: the Hifigan model name. Required; the hifigan model downloaded from the sub-model repository must be placed in the "hifigan" folder.
{
"Folder" : "SummerPockets",
"Name" : "SummerPocketsReflectionBlue",
"Type" : "Vits",
"Rate" : 22050,
"Symbol" : "_,.!?-~…AEINOQUabdefghijkmnoprstuvwyzʃʧʦ↓↑ ",
"Cleaner" : "JapaneseCleaner",
"Characters" : ["鳴瀬しろは","空門蒼","鷹原うみ","紬ヴェンダース","神山識","水織静久","野村美希","久島鴎","岬鏡子"]
}
//Symbol: the symbol table of the model. If you don't know what it is, look up TTS documentation online. This field is required in the VITS model's configuration file.
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
{
"Folder" : "SummerPockets",
"Name" : "SummerPocketsReflectionBlue",
"Type" : "Pits",
"Rate" : 22050,
"Symbol" : "_,.!?-~…AEINOQUabdefghijkmnoprstuvwyzʃʧʦ↓↑ ",
"Cleaner" : "JapaneseCleaner",
"Characters" : ["鳴瀬しろは","空門蒼","鷹原うみ","紬ヴェンダース","神山識","水織静久","野村美希","久島鴎","岬鏡子"]
}
//Symbol: the symbol table of the model. If you don't know what it is, look up TTS documentation online. This field is required in the Pits model's configuration file.
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
{
"Folder" : "NyaruTaffySo",
"Name" : "NyaruTaffy-SoVits",
"Type" : "SoVits",
"Rate" : 32000,
"Hop" : 320,
"Cleaner" : "",
"Hubert": "hubert",
"SoVits3": true,
"Characters" : ["Taffy","Nyaru"]
}
//Hop: the HopLength of the model. If you don't know what it is, look it up online. This field is required in the SoVits model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Hubert: the Hubert model name. Required; the Hubert model downloaded from the sub-model repository must be placed in the "Hubert" folder.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
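As a rough illustration of why Rate and Hop must match the training values, the sketch below assumes Hop is the usual hop length, i.e. the number of audio samples between consecutive frames. The helper names are hypothetical and not part of MoeSS.

```cpp
#include <cstddef>

// Number of whole frames covered by `sampleCount` samples.
// Assumption: one frame per `hop` samples, rounded down.
std::size_t frameCount(std::size_t sampleCount, std::size_t hop) {
    return hop == 0 ? 0 : sampleCount / hop;
}

// Duration of a single frame in milliseconds for a given Rate and Hop.
double frameMillis(std::size_t rate, std::size_t hop) {
    if (rate == 0) {
        return 0.0;
    }
    return 1000.0 * static_cast<double>(hop) / static_cast<double>(rate);
}
```

With the values from the configuration above (Rate 32000, Hop 320), each frame spans 10 ms and one second of audio yields 100 frames. A mismatched Rate or Hop changes this spacing, so the model would see features on a different time grid from the one it was trained on.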
{
"Folder" : "NyaruTaffySo",
"Name" : "NyaruTaffy-SoVits",
"Type" : "SoVits",
"Rate" : 48000,
"Hop" : 320,
"Cleaner" : "",
"Hubert": "hubert",
"SoVits3": true,
"Characters" : ["Taffy","Nyaru"]
}
//Hop: the HopLength of the model. If you don't know what it is, look it up online. This field is required in the SoVits model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Hubert: the Hubert model name. Required; the Hubert model downloaded from the sub-model repository must be placed in the "Hubert" folder.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
{
"Folder" : "NyaruTaffySo",
"Name" : "NyaruTaffy-SoVits",
"Type" : "SoVits",
"Rate" : 44100,
"Hop" : 512,
"Cleaner" : "",
"Hubert": "hubert4.0",
"SoVits4": true,
"Characters" : ["Taffy","Nyaru"]
}
//Hop: the HopLength of the model. If you don't know what it is, look it up online. This field is required in the SoVits model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Hubert: the Hubert model name. Required; the Hubert model downloaded from the sub-model repository must be placed in the "Hubert" folder.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
{
"Folder" : "DiffShiroha",
"Name" : "白羽",
"Type" : "DiffSvc",
"Rate" : 44100,
"Hop" : 512,
"MelBins" : 128,
"Cleaner" : "",
"Hifigan": "nsf_hifigan",
"Hubert": "hubert",
"Characters" : [],
"Pndm" : 100,
"V2" : true
}
//Hop: the HopLength of the model. If you don't know what it is, look it up online. This field is required in the DiffSvc model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//MelBins: the MelBins of the model. If you don't know what it is, look it up online. This field is required in the DiffSvc model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Hubert: the Hubert model name. Required; the Hubert model downloaded from the sub-model repository must be placed in the "Hubert" folder.
//Hifigan: the Hifigan model name. Required; the nsf-hifigan model downloaded from the sub-model repository must be placed in the "hifigan" folder.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
//Pndm: the acceleration multiplier. Required for V1 models; it must be the acceleration multiplier set at export time.
//V2: set this to true if your DiffSvc model is a V2 model (for example, FishDiffusion SVC models).
{
"Folder" : "utagoe",
"Name" : "utagoe",
"Type" : "DiffSinger",
"Rate" : 44100,
"Hop" : 512,
"Cleaner" : "",
"Hifigan": "singer_nsf_hifigan",
"Characters" : [],
"MelBins" : 128
}
//Hop: the HopLength of the model. If you don't know what it is, look it up online. This field is required in the DiffSinger model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//MelBins: the MelBins of the model. If you don't know what it is, look it up online. This field is required in the DiffSinger model's configuration file. (The value must be the one you set during training and can be found in the configuration file you used to train the model.)
//Cleaner: the name of the plugin. It can be left blank, but if it is filled in, the corresponding Cleaner Dll must be placed in the Cleaner folder; if the Dll does not exist or has an internal error, an error is reported when the model is loaded.
//Hifigan: the Hifigan model name. Required; the nsf-hifigan model downloaded from the sub-model repository must be placed in the "hifigan" folder.
//Characters: for a multi-speaker model, this must be a list of your speakers' names; for a single-speaker model it can be omitted.
// Below are the model files needed by each supported project (they must be placed in the corresponding model folder).
// Tacotron2:
${Folder}_decoder_iter.onnx
${Folder}_encoder.onnx
${Folder}_postnet.onnx
// Vits: Single-speaker VITS
${Folder}_dec.onnx
${Folder}_flow.onnx
${Folder}_enc_p.onnx
${Folder}_dp.onnx
// Vits: multi-speaker VITS
${Folder}_dec.onnx
${Folder}_emb.onnx
${Folder}_flow.onnx
${Folder}_enc_p.onnx
${Folder}_dp.onnx
// SoVits:
${Folder}_SoVits.onnx
// DiffSvc:
${Folder}_diffSvc.onnx
// DiffSvc: V2
${Folder}_encoder.onnx
${Folder}_denoise.onnx
${Folder}_pred.onnx
${Folder}_after.onnx
// DiffSinger: OpenVpiVersion
${Folder}_diffSinger.onnx
// DiffSinger:
${Folder}_encoder.onnx
${Folder}_denoise.onnx
${Folder}_pred.onnx
${Folder}_after.onnx
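The naming scheme above can be sketched as a small lookup, which may help when checking that a model folder is complete before loading it. The function name and its two flags are hypothetical and not part of MoeSS; only the file suffixes are taken from the list above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: lists the ONNX files expected for a model.
// multiSpeaker: adds the speaker embedding file for multi-speaker Vits.
// singleFile:   selects the one-file layout (DiffSvc V1, OpenVPI DiffSinger).
std::vector<std::string> expectedModelFiles(const std::string& folder,
                                            const std::string& type,
                                            bool multiSpeaker = false,
                                            bool singleFile = false) {
    std::vector<std::string> suffixes;
    const std::vector<std::string> splitGraph = {"_encoder", "_denoise", "_pred", "_after"};
    if (type == "Tacotron2") {
        suffixes = {"_decoder_iter", "_encoder", "_postnet"};
    } else if (type == "Vits") {
        suffixes = {"_dec", "_flow", "_enc_p", "_dp"};
        if (multiSpeaker) {
            suffixes.push_back("_emb");
        }
    } else if (type == "SoVits") {
        suffixes = {"_SoVits"};
    } else if (type == "DiffSvc") {
        suffixes = singleFile ? std::vector<std::string>{"_diffSvc"} : splitGraph;
    } else if (type == "DiffSinger") {
        suffixes = singleFile ? std::vector<std::string>{"_diffSinger"} : splitGraph;
    }
    // Unknown types yield an empty list.
    std::vector<std::string> files;
    for (const std::string& suffix : suffixes) {
        files.push_back(folder + suffix + ".onnx");
    }
    return files;
}
```

For instance, `expectedModelFiles("Atri", "Tacotron2")` yields `Atri_decoder_iter.onnx`, `Atri_encoder.onnx`, and `Atri_postnet.onnx`. The order of the returned names carries no meaning.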
For example: _-!'(),.:;? ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz
Open the project you used to train the model, open text\symbol.py, and join the 4 strings above in the order of the underlined list as shown.
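To illustrate why the order of the Symbol string matters, here is a minimal sketch that assumes the convention used by common Tacotron2/VITS training code, where a symbol's id is its position in the symbol list. The function name is hypothetical, and MoeSS's internal handling may differ (for example in how unknown characters are treated).

```cpp
#include <string>
#include <vector>

// Maps each character of `text` to its index in `symbols`.
// Assumption: id == position in the Symbol string; characters that are
// not in the table are skipped rather than reported as errors.
std::vector<int> textToSymbolIds(const std::wstring& symbols,
                                 const std::wstring& text) {
    std::vector<int> ids;
    ids.reserve(text.size());
    for (wchar_t ch : text) {
        const std::wstring::size_type pos = symbols.find(ch);
        if (pos != std::wstring::npos) {
            ids.push_back(static_cast<int>(pos));
        }
    }
    return ids;
}
```

With the example Symbol string above, `A` maps to 12 and `b` maps to 39. If the strings are joined in a different order than the one used in training, every id shifts and the model receives the wrong inputs.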
/*
The Cleaner should be placed in the Cleaners folder in the root directory and must be a dynamic library (.dll) defined as described here. The dll must be named after the Cleaner, and the Cleaner name is the value entered in the "Cleaner" field of the model configuration file.
All plug-in dlls must define the following function; the function name must be PluginMain and the Dll name must be the plug-in name (or Cleaner name).
*/
const wchar_t* PluginMain(const wchar_t*);
// The interface only requires consistent input and output types, not any particular functionality, so you can implement whatever you want in the Dll, such as ChatGpt, translation, etc.
// Using ChatGpt as an example: PluginMain receives an input string, passes it to ChatGpt, passes ChatGpt's output to Clean, and finally returns the cleaned output.
// ChatGpt and Clean are functions you implement yourself; each takes and returns const wchar_t*.
const wchar_t* PluginMain(const wchar_t* input){
    const wchar_t* tmpOutput = ChatGpt(input);
    return Clean(tmpOutput);
}
// Note: Export PluginMain with extern "C" to prevent C++ name mangling.
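A complete plug-in might look like the sketch below. Only the PluginMain signature comes from this document; the CLEANER_EXPORT macro, the NormalizeText helper, and the choice to return a pointer into a static buffer are assumptions made for illustration. In particular, the sketch assumes the host copies the returned string before calling PluginMain again.

```cpp
#include <cwctype>
#include <string>

// __declspec(dllexport) is Windows-specific; the fallback lets the
// sketch compile on other platforms for testing.
#if defined(_WIN32)
#define CLEANER_EXPORT extern "C" __declspec(dllexport)
#else
#define CLEANER_EXPORT extern "C"
#endif

// Hypothetical cleaning step: lowercases the text, collapses runs of
// whitespace into one space, and trims both ends.
static std::wstring NormalizeText(const std::wstring& input) {
    std::wstring out;
    bool lastWasSpace = true;  // true at start so leading whitespace is dropped
    for (wchar_t ch : input) {
        if (std::iswspace(static_cast<std::wint_t>(ch))) {
            if (!lastWasSpace) {
                out.push_back(L' ');
            }
            lastWasSpace = true;
        } else {
            out.push_back(static_cast<wchar_t>(
                std::towlower(static_cast<std::wint_t>(ch))));
            lastWasSpace = false;
        }
    }
    if (!out.empty() && out.back() == L' ') {
        out.pop_back();
    }
    return out;
}

CLEANER_EXPORT const wchar_t* PluginMain(const wchar_t* input) {
    // Assumption: the result only needs to stay valid until the next call.
    static std::wstring buffer;
    buffer = input ? NormalizeText(input) : std::wstring();
    return buffer.c_str();
}
```

Keeping the result in a static buffer avoids handing the host memory it would have to free, at the cost of not being safe to call from several threads at once.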
Any country, region, organization, or individual using this project must comply with the following laws.
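No organization or individual may infringe upon another person's portrait rights by vilifying or defacing the portrait, or by forging it through information technology. Without the consent of the portrait right holder, no one may produce, use, or publish that person's portrait, except as otherwise provided by law. Without the consent of the portrait right holder, the rights holder of a portrait work may not use or publish the portrait by means such as publication, reproduction, distribution, rental, or exhibition. The protection of a natural person's voice is governed, by reference, by the relevant provisions on the protection of portrait rights.
[Right of reputation] Civil subjects enjoy the right of reputation. No organization or individual may infringe upon another person's right of reputation by means such as insult or defamation.
[Infringement of the right of reputation by works] Where a literary or artistic work published by an actor depicts real people and real events or a specific person, and contains insulting or defamatory content that infringes upon another person's right of reputation, the victim has the right to request, in accordance with the law, that the actor bear civil liability. Where a literary or artistic work published by an actor does not depict a specific person, and only its plot resembles the circumstances of that person, the actor bears no civil liability.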