Merge pull request #10 from Music-and-Culture-Technology-Lab/vocal_frame
Updated docs
BreezeWhite authored Dec 12, 2020
2 parents 34916bb + ac001d8 commit 2e969b3
Showing 4 changed files with 53 additions and 66 deletions.
14 changes: 7 additions & 7 deletions CHANGELOG.md

## 0.2.0 - 2020-

### Vocal melody transcription at both frame and note level is live!
We have released the vocal melody transcription modules after a substantial amount of effort.
Now you can transcribe your favorite singing voice.

### Features
- Release `vocal` and `vocal-contour` submodules.

### Enhancement
- Improve chord transcription results by filtering out chord predictions with short duration.
- Resolve the path for transcription output in a consistent way.

### Documentation
- Re-organize the Quick Start and Tutorial pages to improve accessibility.
- Move the development section from README.md to CONTRIBUTING.md.

### Bug Fix
- Fix a bug where the wrong parameter was passed to vamp for chroma feature extraction.

---

66 changes: 25 additions & 41 deletions README.md
# OMNIZART

[![build](https://github.com/Music-and-Culture-Technology-Lab/omnizart/workflows/general-check/badge.svg)](https://github.com/Music-and-Culture-Technology-Lab/omnizart/actions?query=workflow%3Ageneral-check)
[![docs](https://github.com/Music-and-Culture-Technology-Lab/omnizart/workflows/docs/badge.svg?branch=build_doc)](https://music-and-culture-technology-lab.github.io/omnizart-doc/)
[![PyPI - Downloads](https://img.shields.io/pypi/dm/omnizart)](https://pypistats.org/packages/omnizart)
[![Docker Pulls](https://img.shields.io/docker/pulls/mctlab/omnizart)](https://hub.docker.com/r/mctlab/omnizart)

Omnizart is a Python library that aims to democratize automatic music transcription.
Given polyphonic music, it is able to transcribe pitched instruments, vocal melody, chords, drum events, and beats.
It is powered by research outcomes from the [Music and Culture Technology (MCT) Lab](https://sites.google.com/view/mctl/home).

### Transcribe your favorite songs now in Colab! [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://bit.ly/omnizart-colab)

# Quick start

Visit the [complete documentation](https://music-and-culture-technology-lab.github.io/omnizart-doc/) for detailed guidance.

## Pip
``` bash
# Install omnizart
pip install omnizart

# Download the checkpoints
omnizart download-checkpoints

# Transcribe your songs
omnizart drum transcribe <path/to/audio.wav>
omnizart chord transcribe <path/to/audio.wav>
omnizart music transcribe <path/to/audio.wav>
```

## Docker
```
docker pull mctlab/omnizart:latest
docker run -it mctlab/omnizart:latest bash
```
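
To transcribe a local file without dropping into an interactive shell, one option is to mount the working directory into the container. This is a minimal sketch, assuming the `omnizart` CLI is on the image's default path and `audio.wav` sits in your current directory:

``` bash
# Mount the current directory at /data inside the container,
# then run a drum transcription on the mounted file.
docker run -it -v "$(pwd)":/data mctlab/omnizart:latest \
    omnizart drum transcribe /data/audio.wav
```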

# Supported applications
| Application      | Transcription      | Training           | Evaluation | Description                                       |
|------------------|--------------------|--------------------|------------|---------------------------------------------------|
| music            | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe musical notes of pitched instruments.  |
| drum             | :heavy_check_mark: | :interrobang:      |            | Transcribe events of percussive instruments.      |
| vocal            | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe note-level vocal melody.               |
| vocal-contour    | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe frame-level vocal melody (F0).         |
| chord            | :heavy_check_mark: | :heavy_check_mark: |            | Transcribe chord progressions.                    |
| beat             |                    |                    |            | Transcribe beat position.                         |
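
The two newly released vocal applications follow the same command pattern as the quick-start examples above. A minimal sketch (the audio path is a placeholder):

``` bash
# Note-level vocal melody, exported to MIDI
omnizart vocal transcribe path/to/audio.wav

# Frame-level vocal melody (F0), exported to text
omnizart vocal-contour transcribe path/to/audio.wav
```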

**NOTES**
The current implementation of the drum model has unknown bugs that prevent the loss from converging when training from scratch.
Fortunately, you can still enjoy drum transcription with the provided checkpoints.
23 changes: 13 additions & 10 deletions docs/source/index.rst
Omnizart provides the main functionalities that construct the life-cycle of deep learning research,
covering from *dataset downloading*, *feature pre-processing*, *model training*, to *transcription* and *sonification*.
Pre-trained checkpoints are also provided for the immediate usage of transcription.
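
As a rough sketch of that life-cycle in CLI terms, using the ``music`` application (the dataset path, audio path, and model name below are placeholders; per-application options are documented in each CLI page):

.. code-block:: bash

   # Pre-process a downloaded dataset into training/testing features.
   omnizart music generate-feature -d path/to/dataset

   # Train a new model on the extracted features.
   omnizart music train-model -d path/to/dataset/train_feature --model-name My-Model

   # Or download the pre-trained checkpoints and transcribe directly.
   omnizart download-checkpoints
   omnizart music transcribe path/to/audio.wav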

Demo
####

Colab
*****

Play with the `Colab notebook <https://bit.ly/omnizart-colab>`_ to transcribe your favorite song almost immediately!

Sound samples
*************

Original song

.. raw:: html

<iframe width="560" height="315" src="https://www.youtube-nocookie.com/embed/hjJhweRlE-A" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>

Chord transcription

.. (audio sample of the chord transcription result; embedded player omitted)


Drum transcription

.. (audio sample of the drum transcription result; embedded player omitted)


Note-level vocal transcription

.. (audio sample of the note-level vocal transcription result; embedded player omitted)


Frame-level vocal transcription

.. (audio sample of the frame-level vocal transcription result; embedded player omitted)


Source files can be downloaded `here <https://drive.google.com/file/d/15VqHearznV9L83cyl61ccACsXXJ4vBHo/view?usp=sharing>`_.
You can use *Audacity* to open the files.


.. toctree::
16 changes: 8 additions & 8 deletions docs/source/tutorial.rst
Detailed descriptions for the usage of each sub-command can be found in the dedicated pages:
* :doc:`drum/cli`
* :doc:`chord/cli`
* :doc:`vocal-contour/cli`
* :doc:`vocal/cli`
* beat *(preparing)*

All the applications share the same set of actions: **transcribe**, **generate-feature**, and **train-model**.

Transcribe
**********

As the name suggests, this action transcribes a given input.
The supported applications are as follows:

* ``music`` - Transcribe musical notes of pitched instruments in MIDI.
* ``drum`` - Transcribe events of percussive instruments in MIDI.
* ``chord`` - Transcribe chord progressions in MIDI and CSV.
* ``vocal`` - Transcribe note-level vocal melody in MIDI.
* ``vocal-contour`` - Transcribe frame-level vocal melody (F0) in text.
* ``beat`` *(preparing)* - Transcribe beat position.

Note that all the applications take polyphonic music in WAV as input, except ``beat``, which takes MIDI.

Example usage:
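
The sketch below assumes a WAV file at a placeholder path:

.. code-block:: bash

   # Transcribe the notes of pitched instruments.
   omnizart music transcribe path/to/audio.wav

   # Transcribe the drum events of the same recording.
   omnizart drum transcribe path/to/audio.wav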

