The guqin is a traditional Chinese musical instrument. A distinctive system known as Jianzipu (JZP), which uses reduced Chinese character notation, is employed to record guqin music. Our project aims to use modern artificial intelligence technology to read JZP and play the corresponding music.
We build two systems for Jianzipu: the first converts Jianzipu images into a Jianzi document, and the second generates music from that Jianzi document.
We mainly use Python. The baseline dependencies are listed below; a GPU environment is recommended.
- python 3.10
- pytorch
```shell
pip install -r requirements.txt
```
- JZP notation: We use 五声琴谱 (Wusheng Qinpu) for our OCR dataset. The dataset was made with SuziAI, a tool for notation. Please follow the gui-tool tutorial to make sure you can do the notation work.
- JZP recognition: The JZP recognition model is trained with the following methods. To find out which one works best, we need to evaluate each of them on our data.
- Basic NLP Method: Transformer series model.
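A Transformer-style recognizer emits a jianzi character as a flat token sequence, so the decomposed tree must first be linearized. The sketch below shows one plausible depth-first linearization with bracket tokens; the token names (finger techniques, string numbers) are illustrative assumptions, not the repo's actual vocabulary.

```python
# Hypothetical sketch: linearize a decomposed jianzi tree (nested lists)
# into a bracketed token sequence suitable as a Transformer target.
# Token names below are made up for illustration.

def linearize(node):
    """Depth-first linearization with explicit bracket tokens."""
    if isinstance(node, str):      # leaf: an atomic component
        return [node]
    tokens = ["("]
    for child in node:
        tokens += linearize(child)
    tokens.append(")")
    return tokens

# Example: right-hand technique "tiao" on string 7, left hand at hui 9.
example_tree = ["tiao", ["string", "7"], ["hui", "9"]]
tokens = linearize(example_tree)
```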
We have published a Jianzipu dataset in the datasets folder. This dataset includes a basic Jianzipu finger-technique collection and a jianzi character collection. It is an image-notation dataset: each image has its own decomposed tree notation. We propose a new jianzi character decomposition tree to support our subsequent Jianzipu OCR method.
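One entry of the image-notation dataset pairs an image with its decomposed tree. The field names and nested-list encoding below are assumptions for illustration, not the repo's actual schema:

```python
# Hedged sketch of one image-notation dataset entry. Field names and the
# nested-list tree encoding are hypothetical, not the repo's real schema.
from dataclasses import dataclass, field

@dataclass
class JianziEntry:
    image_path: str                                     # path to the character image
    decomposition: list = field(default_factory=list)   # decomposed tree as nested lists

entry = JianziEntry(
    image_path="datasets/jianzi/0001.png",              # illustrative path
    decomposition=["gou", ["string", "5"]],             # illustrative decomposition
)
```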
- To start the annotation tool, first switch the composer button to the jianzipu button and use Open to open a Jianzipu image folder.
- Press the Auto-Segmentation button to get the annotation boxes from the picture, then press Music(ind.) to annotate the jianzi character.
- Follow the video to annotate the JZP character (images/tutorial.mp4).
- After finishing the annotation: we will soon release a json-string tool that produces a description of each jianzi character. The tool is still in development, to be continued...
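Since the json-string tool is still in development, the following is only a plausible sketch of what serializing a decomposed jianzi tree to a JSON string (and back) might look like; the `"jianzi"` key and the nested-list encoding are assumptions:

```python
import json

# Hedged sketch of a json-string round trip for a decomposed jianzi tree.
# The "jianzi" key and nested-list encoding are hypothetical; the real
# tool is still in development.

def tree_to_json(tree):
    """Serialize a decomposed tree (nested lists) to a JSON string."""
    return json.dumps({"jianzi": tree}, ensure_ascii=False)

def json_to_tree(s):
    """Parse the JSON string back into the nested-list tree."""
    return json.loads(s)["jianzi"]

s = tree_to_json(["tiao", ["string", "7"]])
restored = json_to_tree(s)
```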
The guqin music generation system aims to generate music from a JZP document. Since JZP notation does not include some musical features, we need to train our model with both documents and music. The dataset includes two parts: the music part is collected from guqin exam videos and online video resources, and the sequence part comes from our JZP OCR output.
- basic sound generation model: VAE, VQVAE, Diffusion, Sound Stream...
- Symbolic Music Generation: Muzic
- Music generation from text: MusicLM
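The quantization step shared by VQ-VAE and SoundStream-style codecs can be sketched in a few lines: each latent vector is replaced by the index of its nearest codebook entry. This is a pure-Python toy under assumed toy dimensions, not any model's actual code:

```python
# Toy sketch of vector quantization, the core step in VQ-VAE and
# SoundStream-style codecs: snap each latent vector to the nearest
# codebook entry. Dimensions and values are illustrative only.

def quantize(latent, codebook):
    """Return the index of the codebook vector closest to `latent`."""
    def dist2(a, b):
        # squared Euclidean distance between two vectors
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: dist2(latent, codebook[i]))

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy 2-D codebook
index = quantize([0.9, 0.1], codebook)           # nearest entry is [1.0, 0.0]
```

In the real models the codebook is learned jointly with the encoder and decoder, and gradients flow through the quantizer via a straight-through estimator.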
- Construct the dataset: our dataset needs both JZP images and a JZP representation list. The images come from the Wusheng Qinpu (五声琴谱) scores and the JZP representations come from the gui-tool.
- JZP recognition model training: this model includes two parts, JZP character sequence generation and JZP document generation.
- Guqin music model training.
- A complete system for our work and a novel evaluation system. The evaluation system needs to consider both the human side and the computer-science side.
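On the computer-science side, one standard metric the evaluation system could adopt is character error rate (CER): the edit distance between the predicted and reference jianzi token sequences, divided by the reference length. A minimal sketch, assuming sequences are given as lists of tokens:

```python
# Sketch of a computer-science-side metric: character error rate (CER)
# over jianzi token sequences, using Levenshtein edit distance.

def edit_distance(a, b):
    """Levenshtein distance via single-row dynamic programming."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            cur = dp[j]
            dp[j] = min(dp[j] + 1,            # deletion
                        dp[j - 1] + 1,        # insertion
                        prev + (ca != cb))    # substitution (0 if match)
            prev = cur
    return dp[-1]

def cer(pred, ref):
    """Edit distance normalized by reference length."""
    return edit_distance(pred, ref) / max(len(ref), 1)
```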
If you have a suggestion that would make this better, please fork the repo and create a pull request. You can also simply open an issue with the tag "enhancement". Don't forget to give the project a star! Thanks again!
- Fork the Project
- Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
- Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the Branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Distributed under the MIT License. See `LICENSE.txt` for more information.
Zheng Youcheng - [email protected]
Project Link: https://github.com/AkunaMTT/guqinMM