Merge pull request #9 from Music-and-Culture-Technology-Lab/vocal-mix
Add note-level vocal transcription, with integration of vocal-contour module.
yjlolo authored Dec 12, 2020
2 parents b931b4a + 32e1818 commit 34916bb
Showing 45 changed files with 2,288 additions and 245 deletions.
21 changes: 21 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,26 @@
# Changelog

## 0.2.0 - 2020-

### Vocal transcription is available now!
After a long period of development and experimentation, we have finally completed the vocal transcription module
and integrated it into omnizart.

### Features
- Release `vocal` and `vocal-contour` submodules.

### Enhancement
- Improve chord transcription results by filtering out chord predictions with short duration.
- Unify how the output path of the transcription results is resolved.

### Documentation
- Re-organize the quick start and tutorial pages for a cleaner, more fluent reading experience.
- Move the development section originally in README.md to CONTRIBUTING.md.

### Bug Fix
- Fix a bug where the wrong parameter was passed to vamp during chroma feature extraction.

---

## 0.1.1 - 2020-12-01
### Features
4 changes: 2 additions & 2 deletions README.md
@@ -43,8 +43,8 @@ Comprehensive usage and API references can be found in the [official documentati
|------------------|--------------------|--------------------|----------|-----------------------------------|
| music | :heavy_check_mark: | :heavy_check_mark: | | Transcribes notes of instruments. |
| drum | :heavy_check_mark: | :interrobang: | | Transcribes drum tracks. |
| vocal | | | | Transcribes pitch of vocal. |
| vocal-contour | | | | Transcribes contour of vocal. |
| vocal | :heavy_check_mark: | :heavy_check_mark: | | Transcribes pitch of vocal. |
| vocal-contour | :heavy_check_mark: | :heavy_check_mark: | | Transcribes contour of vocal. |
| chord | :heavy_check_mark: | :heavy_check_mark: | | Transcribes chord progression. |
| beat | | | | Transcribes beat position. |

19 changes: 18 additions & 1 deletion docs/source/base.rst
@@ -2,5 +2,22 @@ Base Classes
============

.. automodule:: omnizart.base


Transcription
-------------
.. autoclass:: omnizart.base.BaseTranscription
:members:


Label
-----
.. autoclass:: omnizart.base.Label
:members:
:undoc-members:


Dataset Loader
--------------
.. autoclass:: omnizart.base.BaseDatasetLoader
:members:

9 changes: 8 additions & 1 deletion docs/source/chord/api.rst
@@ -7,7 +7,7 @@ Chord Transcription

App
###
.. automodule:: omnizart.chord.app
.. autoclass:: omnizart.chord.app.ChordTranscription
:members:
:show-inheritance:

@@ -19,6 +19,13 @@ Feature
:undoc-members:


Dataset
#######
.. autoclass:: omnizart.chord.app.McGillDatasetLoader
:members:
:show-inheritance:


Inference
#########
.. automodule:: omnizart.chord.inference
9 changes: 8 additions & 1 deletion docs/source/drum/api.rst
@@ -7,7 +7,14 @@ Drum Transcription

App
###
.. automodule:: omnizart.drum.app
.. autoclass:: omnizart.drum.app.DrumTranscription
:members:
:show-inheritance:


Dataset
#######
.. autoclass:: omnizart.drum.app.PopDatasetLoader
:members:
:show-inheritance:

27 changes: 27 additions & 0 deletions docs/source/index.rst
@@ -57,6 +57,31 @@ The result of drum transcription
</audio>


The result of vocal transcription.

.. raw:: html

<audio controls="controls">
<source src="_audio/high_vocal_synth.mp3" type="audio/mpeg">
Your browser does not support the <code>audio</code> element.
</audio>


The result of vocal pitch contour transcription.

.. raw:: html

<audio controls="controls">
<source src="_audio/high_vocal_contour.mp3" type="audio/mpeg">
Your browser does not support the <code>audio</code> element.
</audio>


Source files can be downloaded `here <https://drive.google.com/file/d/15VqHearznV9L83cyl61ccACsXXJ4vBHo/view?usp=sharing>`_.
You can use *Audacity* to open them.

All works are developed under `MCTLab <https://sites.google.com/view/mctl/home>`_.


.. toctree::
:maxdepth: 2
@@ -67,6 +92,7 @@ The result of drum transcription
music/cli.rst
drum/cli.rst
chord/cli.rst
vocal/cli.rst
vocal-contour/cli.rst


@@ -77,6 +103,7 @@ The result of drum transcription
music/api.rst
drum/api.rst
chord/api.rst
vocal/api.rst
vocal-contour/api.rst
feature.rst
models.rst
9 changes: 9 additions & 0 deletions docs/source/models.rst
@@ -42,6 +42,15 @@ Chord Transformer
:show-inheritance:


Pyramid Net
###########

.. automodule:: omnizart.models.pyramid_net
:members:
:undoc-members:
:show-inheritance:


Utils
#####

9 changes: 8 additions & 1 deletion docs/source/music/api.rst
@@ -7,7 +7,14 @@ Music Transcription

App
###
.. automodule:: omnizart.music.app
.. autoclass:: omnizart.music.app.MusicTranscription
:members:
:show-inheritance:


Dataset
#######
.. autoclass:: omnizart.music.app.MusicDatasetLoader
:members:
:show-inheritance:

46 changes: 24 additions & 22 deletions docs/source/tutorial.rst
@@ -47,8 +47,8 @@ The supported applications are as follows:
* ``music`` - Transcribes polyphonic music, and outputs notes of pitched instruments in MIDI.
* ``drum`` - Transcribes polyphonic music, and outputs events of percussive instruments in MIDI.
* ``chord`` - Transcribes polyphonic music, and outputs chord progression in MIDI and CSV.
* ``vocal`` - Transcribes polyphonic music, and outputs note-level vocal melody.
* ``vocal-contour`` - Transcribes polyphonic music, and outputs frame-level vocal melody (F0) in text.
* ``vocal`` *(preparing)* - Transcribes polyphonic music, and outputs note-level vocal melody.
* ``beat`` *(preparing)* - MIDI-domain beat tracking.

Except for ``beat``, which takes a MIDI file as input, all the applications take audio files in WAV format.
@@ -73,27 +73,29 @@ The processed features will be stored in *<path/to/dataset>/train_feature* and *

The supported datasets for feature processing are application-dependent, summarized as follows:

+-----------+-------+------+-------+------+---------------+
| Module | music | drum | chord | beat | vocal-contour |
+===========+=======+======+=======+======+===============+
| Maestro | O | | | | |
+-----------+-------+------+-------+------+---------------+
| Maps | O | | | | |
+-----------+-------+------+-------+------+---------------+
| MusicNet | O | | | | |
+-----------+-------+------+-------+------+---------------+
| Pop | O | O | | | |
+-----------+-------+------+-------+------+---------------+
| Ext-Su | O | | | | |
+-----------+-------+------+-------+------+---------------+
| BillBoard | | | O | | |
+-----------+-------+------+-------+------+---------------+
| BPS-FH | | | | | |
+-----------+-------+------+-------+------+---------------+
| MIR-1K | | | | | O |
+-----------+-------+------+-------+------+---------------+
| MedleyDB | | | | | O |
+-----------+-------+------+-------+------+---------------+
+-------------+-------+------+-------+------+-------+---------------+
| Module | music | drum | chord | beat | vocal | vocal-contour |
+=============+=======+======+=======+======+=======+===============+
| Maestro | O | | | | | |
+-------------+-------+------+-------+------+-------+---------------+
| Maps | O | | | | | |
+-------------+-------+------+-------+------+-------+---------------+
| MusicNet | O | | | | | |
+-------------+-------+------+-------+------+-------+---------------+
| Pop | O | O | | | | |
+-------------+-------+------+-------+------+-------+---------------+
| Ext-Su | O | | | | | |
+-------------+-------+------+-------+------+-------+---------------+
| BillBoard | | | O | | | |
+-------------+-------+------+-------+------+-------+---------------+
| BPS-FH | | | | | | |
+-------------+-------+------+-------+------+-------+---------------+
| MIR-1K | | | | | O | O |
+-------------+-------+------+-------+------+-------+---------------+
| MedleyDB | | | | | | O |
+-------------+-------+------+-------+------+-------+---------------+
| Tonas | | | | | O | |
+-------------+-------+------+-------+------+-------+---------------+

Before running the commands below, make sure to download the corresponding datasets first.
This can be easily done in :ref:`Download Datasets`.
2 changes: 1 addition & 1 deletion docs/source/vocal-contour/api.rst
@@ -31,7 +31,7 @@ It will be loaded by the class :class:`omnizart.setting_loaders.VocalContourSett
The name of the attributes will be converted to snake-case (e.g. HopSize -> hop_size).
There is also a path transformation when applying the settings into the ``VocalContourSettings`` instance.
For example, the attribute ``BatchSize`` defined in the yaml path *General/Training/Settings/BatchSize* is transformed
to *MusicSettings.training.batch_size*.
to *VocalContourSettings.training.batch_size*.
The level of */Settings* is removed among all fields.

.. literalinclude:: ../../../omnizart/defaults/vocal_contour.yaml
54 changes: 54 additions & 0 deletions docs/source/vocal/api.rst
@@ -0,0 +1,54 @@
Vocal Transcription
===================


.. automodule:: omnizart.vocal


App
###
.. autoclass:: omnizart.vocal.app.VocalTranscription
:members:
:show-inheritance:


Dataset
#######
.. autoclass:: omnizart.vocal.app.VocalDatasetLoader
:members:
:show-inheritance:


Inference
#########
.. automodule:: omnizart.vocal.inference
:members:


Labels
######
.. automodule:: omnizart.vocal.labels
:members:
:undoc-members:


Prediction
##########
.. automodule:: omnizart.vocal.prediction
:members:
:undoc-members:


Settings
########
Below are the default settings for building the vocal model. They are loaded
by the class :class:`omnizart.setting_loaders.VocalSettings`. Attribute names
are converted to snake case (e.g. HopSize -> hop_size). The paths are also
transformed when the settings are applied to the ``VocalSettings`` instance.
For example, the attribute ``BatchSize`` defined at the yaml path
*General/Training/Settings/BatchSize* becomes
*VocalSettings.training.batch_size*. The */Settings* level is removed from
all fields.
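
The name and path transformation described above can be sketched in a few
lines of Python. This is only an illustration, not omnizart's actual loader:
the helper names ``to_snake_case`` and ``build_namespace`` are hypothetical,
the sample layout (including the placement of ``HopSize`` under ``Feature``
and the ``Epoch`` key) and all values are made up, and the ``General`` node is
assumed to map to the root of the settings instance.

```python
import re
from types import SimpleNamespace


def to_snake_case(name: str) -> str:
    """Convert a CamelCase key such as 'HopSize' to 'hop_size'."""
    # Insert an underscore wherever a lowercase letter or digit is
    # directly followed by an uppercase letter, then lowercase everything.
    return re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", name).lower()


def build_namespace(node: dict) -> SimpleNamespace:
    """Turn a nested dict into attributes, dropping every 'Settings' level."""
    result = SimpleNamespace()
    for key, value in node.items():
        if key == "Settings" and isinstance(value, dict):
            # Lift the children of 'Settings' into the parent level.
            for name, child in vars(build_namespace(value)).items():
                setattr(result, name, child)
        elif isinstance(value, dict):
            setattr(result, to_snake_case(key), build_namespace(value))
        else:
            setattr(result, to_snake_case(key), value)
    return result


# Hypothetical excerpt that mimics the yaml layout; values are illustrative only.
raw = {
    "General": {
        "Feature": {"Settings": {"HopSize": 0.02}},
        "Training": {"Settings": {"BatchSize": 64, "Epoch": 10}},
    }
}

# Assumption: the 'General' node corresponds to the settings instance itself.
settings = build_namespace(raw["General"])
print(settings.training.batch_size)
```

With this sketch, *General/Training/Settings/BatchSize* is reachable as
``settings.training.batch_size``, and no ``settings`` attribute remains on the
intermediate levels.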

.. literalinclude:: ../../../omnizart/defaults/vocal.yaml
:language: yaml
25 changes: 25 additions & 0 deletions docs/source/vocal/cli.rst
@@ -0,0 +1,25 @@
omnizart vocal
==============

Lists the available options of each sub-command in detail.


transcribe
##########

.. click:: omnizart.cli.vocal.transcribe:transcribe
:prog: omnizart vocal transcribe


generate-feature
################

.. click:: omnizart.cli.vocal.generate_feature:generate_feature
:prog: omnizart vocal generate-feature


train-model
###########

.. click:: omnizart.cli.vocal.train_model:train_model
:prog: omnizart vocal train-model
2 changes: 1 addition & 1 deletion omnizart/__init__.py
@@ -7,4 +7,4 @@
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'
os.environ['VAMP_PATH'] = os.path.join(MODULE_PATH, "resource", "vamp")

__version__ = "0.1.1"
__version__ = "0.2.0"