From a79caf8e69d7309129a76b228a29d7e2983f6eaf Mon Sep 17 00:00:00 2001
From: Yin-Jyun Luo
Date: Sat, 12 Dec 2020 22:52:13 +0800
Subject: [PATCH 1/2] Updated docs

---
 CHANGELOG.md             | 14 ++++-----
 README.md                | 64 +++++++++++++++-------------------------
 docs/source/index.rst    | 23 ++++++++-------
 docs/source/tutorial.rst | 16 +++++-----
 4 files changed, 52 insertions(+), 65 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 8f49471..47db089 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -2,23 +2,23 @@
 
 ## 0.2.0 - 2020-
 
-### Vocal transcription is available now!
-After a long development and experiments, we finally complete the vocal transcription module
-and integrate them into omnizart.
+### Vocal melody transcription at both frame and note levels is live!
+We release the modules for vocal melody transcription after a decent amount of effort.
+Now you can transcribe your favorite singing voice.
 
 ### Features
 - Release `vocal` and `vocal-contour` submodules.
 
 ### Enhancement
 - Improve chord transcription results by filtering out chord predictions with short duration.
-- Unify the way of resolving the transcirption results' output path.
+- Resolve the path for transcription output in a consistent way.
 
 ### Documentation
-- Re-organize the quick start and tutorial page to give a more clean and fluent reading experience.
-- Move the development section origially in README.md to CONTRIBUTING.md.
+- Re-organize the Quick Start and Tutorial pages to improve accessibility.
+- Move the section for development from README.md to CONTRIBUTING.md.
 
 ### Bug Fix
-- Fix bug of passing the wrong parameter to vamp of chroma feature extraction.
+- Fix a bug that passed the wrong parameter to vamp for chroma feature extraction.
 
 ---
 
diff --git a/README.md b/README.md
index 98a09b9..fd4fbbf 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,4 @@
-# omnizart
+# OMNIZART
 
 [![build](https://github.com/Music-and-Culture-Technology-Lab/omnizart/workflows/general-check/badge.svg)](https://github.com/Music-and-Culture-Technology-Lab/omnizart/actions?query=workflow%3Ageneral-check)
 [![docs](https://github.com/Music-and-Culture-Technology-Lab/omnizart/workflows/docs/badge.svg?branch=build_doc)](https://music-and-culture-technology-lab.github.io/omnizart-doc/)
@@ -7,63 +7,47 @@
 [![PyPI - Downloads](https://img.shields.io/pypi/dm/omnizart)](https://pypistats.org/packages/omnizart)
 [![Docker Pulls](https://img.shields.io/docker/pulls/mctlab/omnizart)](https://hub.docker.com/r/mctlab/omnizart)
 
-Omniscient Mozart, being able to transcribe everything in the music, including vocal, drum, chord, beat, instruments, and more.
-Combines all the hard works developed by everyone in MCTLab into a single command line tool. Python package and docker
-image are also available.
+Omnizart is a Python library that aims to democratize automatic music transcription.
+Given polyphonic music, it is able to transcribe pitched instruments, vocal melody, chords, drum events, and beats.
+This is powered by the research outcomes from [Music and Culture Technology (MCT) Lab](https://sites.google.com/view/mctl/home).
 
-### Try omnizart now!! [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://bit.ly/omnizart-colab)
+### Transcribe your favorite songs now in Colab!
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://bit.ly/omnizart-colab)
 
-A quick-start example is as following:
+# Quick start
+
+Visit the [complete documentation](https://music-and-culture-technology-lab.github.io/omnizart-doc/) for detailed guidance.
+
+## Pip
 
 ``` bash
 # Install omnizart
 pip install omnizart
 
-# Download the checkpoints after installation
+# Download the checkpoints
 omnizart download-checkpoints
 
-# Now it's ready for the transcription~
+# Transcribe your songs
 omnizart drum transcribe
 omnizart chord transcribe
 omnizart music transcribe
 ```
 
-Or use the docker image:
+## Docker
 
 ```
 docker pull mctlab/omnizart:latest
 docker run -it mctlab/omnizart:latest bash
 ```
 
-Comprehensive usage and API references can be found in the [official documentation site](https://music-and-culture-technology-lab.github.io/omnizart-doc/).
-
-# About
-[Music and Culture Technology Lab (MCTLab)](https://sites.google.com/view/mctl/home) aims to develop technology for music and relevant applications by leveraging cutting-edge AI techiniques.
-
-# Plan to support
-| Commands | transcribe | train | evaluate | Description |
+# Supported applications
+| Application | Transcription | Training | Evaluation | Description |
 |------------------|--------------------|--------------------|----------|-----------------------------------|
-| music | :heavy_check_mark: | :heavy_check_mark: | | Transcribes notes of instruments. |
-| drum | :heavy_check_mark: | :interrobang: | | Transcribes drum tracks. |
-| vocal | :heavy_check_mark: | :heavy_check_mark: | | Transcribes pitch of vocal. |
-| vocal-contour | :heavy_check_mark: | :heavy_check_mark: | | Transcribes contour of vocal. |
-| chord | :heavy_check_mark: | :heavy_check_mark: | | Transcribes chord progression. |
-| beat | | | | Transcribes beat position. |
-
-**NOTES** Though the implementation of training the drum model is 90% complete, but there still exists some
-invisible bugs that cause the training fails to converge compared to the author's original implementation.
-
-Example usage
-<pre>
-omnizart music transcribe path/to/audio
-omnizart chord transcribe path/to/audio
-omnizart drum transcribe path/to/audio
-</pre>
+| music | :heavy_check_mark: | :heavy_check_mark: | | Transcribe musical notes of pitched instruments. |
+| drum | :heavy_check_mark: | :interrobang: | | Transcribe events of percussive instruments. |
+| vocal | :heavy_check_mark: | :heavy_check_mark: | | Transcribe note-level vocal melody. |
+| vocal-contour | :heavy_check_mark: | :heavy_check_mark: | | Transcribe frame-level vocal melody (F0). |
+| chord | :heavy_check_mark: | :heavy_check_mark: | | Transcribe chord progressions. |
+| beat | | | | Transcribe beat position. |
 
-For training a new model, download the dataset first and follow steps described below.
-<pre>
-# The following command will default saving the extracted feature under the same folder,
-# called train_feature and test_feature
-omnizart music generate-feature -d path/to/dataset
+**NOTES**
+The current implementation for the drum model has unknown bugs, preventing loss convergence when training from scratch.
+Fortunately, you can still enjoy drum transcription with the provided checkpoints.
 
-# Train a new model
-omnizart music train-model -d path/to/dataset/train_feature --model-name My-Model
-</pre>
diff --git a/docs/source/index.rst b/docs/source/index.rst
index 934782c..66deda6 100644
--- a/docs/source/index.rst
+++ b/docs/source/index.rst
@@ -25,19 +25,24 @@ Omnizart provides the main functionalities that construct the life-cycle of deep
 covering from *dataset downloading*, *feature pre-processing*, *model training*, to *transcription*
 and *sonification*. Pre-trained checkpoints are also provided for the immediate usage of transcription.
 
+Demonstration
+#############
 
-Demo
-####
+Colab
+*****
 
 Play with the `Colab notebook <https://bit.ly/omnizart-colab>`_ to transcribe your favorite song almost immediately!
 
-Below is a demonstration of chord and drum transcription.
+Sound samples
+*************
+
+Original song
 
 .. raw:: html
 
-The result of chord transcription
+Chord transcription
 
 .. raw:: html
 
@@ -47,7 +52,7 @@
 
-The result of drum transcription
+Drum transcription
 
 .. raw:: html
 
@@ -57,7 +62,7 @@
 
-The result of vocal transcription.
+Note-level vocal transcription
 
 .. raw:: html
 
@@ -67,7 +72,7 @@
 
-The result of vocal pitch contour transcription.
+Frame-level vocal transcription
 
 .. raw:: html
 
@@ -78,9 +83,7 @@
 
 Source files can be downloaded `here `_.
-You can use *Audacity* to open it.
-
-All works are developed under `MCTLab <https://sites.google.com/view/mctl/home>`_.
+You can use *Audacity* to open the files.
 
 
 .. toctree::
diff --git a/docs/source/tutorial.rst b/docs/source/tutorial.rst
index 23cc4e1..89887f9 100644
--- a/docs/source/tutorial.rst
+++ b/docs/source/tutorial.rst
@@ -32,7 +32,7 @@ Detailed descriptions for the usage of each sub-command can be found in the dedi
 * :doc:`drum/cli`
 * :doc:`chord/cli`
 * :doc:`vocal-contour/cli`
-* vocal *(preparing)*
+* :doc:`vocal/cli`
 * beat *(preparing)*
 
 All the applications share a same set of actions: **transcribe**, **generate-feature**, and **train-model**.
@@ -44,14 +44,14 @@ Transcribe
 **********
 
 As the name suggests, this action transcribes a given input.
 The supported applications are as follows:
 
-* ``music`` - Transcribes polyphonic music, and outputs notes of pitched instruments in MIDI.
-* ``drum`` - Transcribes polyphonic music, and outputs events of percussive instruments in MIDI.
-* ``chord`` - Transcribes polyphonic music, and outputs chord progression in MIDI and CSV.
-* ``vocal`` - Transcribes polyphonic music, and outputs note-level vocal melody.
-* ``vocal-contour`` - Transcribes polyphonic music, and outputs frame-level vocal melody (F0) in text.
-* ``beat`` *(preparing)* - MIDI-domain beat tracking.
+* ``music`` - Transcribe musical notes of pitched instruments in MIDI.
+* ``drum`` - Transcribe events of percussive instruments in MIDI.
+* ``chord`` - Transcribe chord progressions in MIDI and CSV.
+* ``vocal`` - Transcribe note-level vocal melody in MIDI.
+* ``vocal-contour`` - Transcribe frame-level vocal melody (F0) in text.
+* ``beat`` *(preparing)* - Transcribe beat position.
 
-Except ``beat`` which takes as input a MIDI file, all the applications receive audio files in WAV.
+Note that all the applications receive polyphonic music in WAV, except ``beat``, which takes MIDI as input.
 
 Example usage:

From ac001d873988f77e39964ec2035ec1cdb30219c8 Mon Sep 17 00:00:00 2001
From: Derek-Wu
Date: Sat, 12 Dec 2020 23:40:29 +0800
Subject: [PATCH 2/2] Re-format the table in README.md

---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index fd4fbbf..c7a5e6f 100644
--- a/README.md
+++ b/README.md
@@ -38,14 +38,14 @@ docker run -it mctlab/omnizart:latest bash
 ```
 
 # Supported applications
-| Application | Transcription | Training | Evaluation | Description |
-|------------------|--------------------|--------------------|----------|-----------------------------------|
-| music | :heavy_check_mark: | :heavy_check_mark: | | Transcribe musical notes of pitched instruments. |
-| drum | :heavy_check_mark: | :interrobang: | | Transcribe events of percussive instruments. |
-| vocal | :heavy_check_mark: | :heavy_check_mark: | | Transcribe note-level vocal melody. |
-| vocal-contour | :heavy_check_mark: | :heavy_check_mark: | | Transcribe frame-level vocal melody (F0). |
-| chord | :heavy_check_mark: | :heavy_check_mark: | | Transcribe chord progressions. |
-| beat | | | | Transcribe beat position. |
+| Application      | Transcription      | Training           | Evaluation | Description                                      |
+|------------------|--------------------|--------------------|------------|--------------------------------------------------|
+| music            | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe musical notes of pitched instruments. |
+| drum             | :heavy_check_mark: | :interrobang:      |            | Transcribe events of percussive instruments.     |
+| vocal            | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe note-level vocal melody.              |
+| vocal-contour    | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe frame-level vocal melody (F0).        |
+| chord            | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe chord progressions.                   |
+| beat             |                    |                    |            | Transcribe beat position.                        |
 
 **NOTES**
 The current implementation for the drum model has unknown bugs, preventing loss convergence when training from scratch.
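
Taken together, the two patches describe one end-to-end CLI workflow. The sketch below only strings together commands that appear in the patches themselves: `path/to/song.wav` and `path/to/dataset` are placeholder paths, the `vocal transcribe` call is assumed to follow the same `transcribe` pattern the tutorial describes for the other applications, and the `generate-feature`/`train-model` invocations come from the pre-patch README, so their flags may have changed since.

``` bash
# Minimal workflow sketch assembled from the commands documented above.
# Paths are placeholders; the vocal sub-command usage is assumed, not verified.

# One-time setup: install the package and fetch the pre-trained checkpoints.
pip install omnizart
omnizart download-checkpoints

# Transcription: every application exposes the same `transcribe` action.
omnizart music transcribe path/to/song.wav   # notes of pitched instruments -> MIDI
omnizart vocal transcribe path/to/song.wav   # note-level vocal melody -> MIDI
omnizart chord transcribe path/to/song.wav   # chord progressions -> MIDI and CSV

# Training (music application): extract features from a downloaded dataset,
# then train on the resulting train_feature folder.
omnizart music generate-feature -d path/to/dataset
omnizart music train-model -d path/to/dataset/train_feature --model-name My-Model
```

As the NOTES above point out, training the drum model from scratch currently fails to converge, so drum transcription should rely on the shipped checkpoints.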